RE: [PHP] Re: email validation (no regex)

2005-09-22 Thread Michael Sims
J B wrote:
 On 9/21/05, Michael Sims [EMAIL PROTECTED] wrote:
 Additionally, some mail servers unconditionally accept mail
 addressed to ANY username at their domain, whether that user
 actually exists or not.  This is very bad practice, because it
 usually means the accepting MTA is a dumb host that has to forward
 all incoming mail to an internal mail server which knows which
 accounts exist, and if that server ends up rejecting the message,
 the dumb MTA creates a DSN and sends it back to the envelope
 sender (which is quite often forged).  This causes the so-called
 backscatter which results in innocent people getting bounces for
 messages they didn't send.  Nevertheless, lots of mail servers are
 configured this way, so you cannot simply assume that an account is
 real just because you didn't get a 5xx on RCPT TO.

   Just as a side note, and I do agree that this behaviour is bad
 practice in principle, but I imagine they (the MTAs) do this for the
 same reason that login prompts don't tell you when you enter a bogus
 username and still prompt for the password and give a generic access
 denied error...it prevents username fishing.

There probably are a few people who accept mail to any address at their domain
to foil dictionary attacks, but IMHO the vast majority of servers that are set
up this way are due to mail admins who just don't know any better.  It's not
always easy to set up a border MTA so that it knows about the accounts that
exist on an internal machine...it usually involves custom scripting or
real-time callouts to the internal server and it takes a relatively
knowledgeable admin to implement it (at least that has been my experience).

I had someone else email me privately saying that they did the above precisely
to foil dictionary attacks, but this person configured his server to simply
discard email to nonexistent accounts.  That has its disadvantages (since it
could make legit senders believe their messages are being delivered when they
aren't) but at least it doesn't create any backscatter.  In the default case,
accepting all email unconditionally then later rejecting it is just
irresponsible, since it makes you a vector for abuse, and could eventually get
you blacklisted if other mail servers get sick of receiving bogus bounces from
your domain...

(As a side note, apparently the list software doesn't like the offtopic nature
of this sub-thread (I just received a 550 on this message), so this will be my
last post on the matter.  But since I've gone to the trouble of typing it up
let me throw in the words PHP, web, and Apache, so this will make it through.
:) )

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



RE: [PHP] Re: email validation (no regex)

2005-09-21 Thread Jim Moseby
  So, what is the general thought about validating email 
 addresses in this
  manner?
  
  JM
 There is a good reason why virtually everyone uses regex
 patterns for email validating.

Excellent start!  And that good reason is...?  
How can regex ensure that the email address that is submitted is a valid (ie
working, able to receive email) address?
Why is regex a better way?

JM




RE: [PHP] Re: email validation (no regex)

2005-09-21 Thread bruce
jim...

validating email means different things to different people...

but there's no way you're going to be able to 'throw' together something in
2-3 days that others have taken years to create/refine...

if you only want to determine if an email address is valid, what does that
mean to you? are you following the current/latest rfc 2822 (i think)
standard? or are you just trying to get a quick halfway ok function...

as an example, i was looking at a way of using a regex/function for email
validation for a user input form... i decided that it was simply too tough
to deal with the various nuances, and chickened out, using a combination
perl/php approach...

but you could do what you want to do. however, it's going to be painful if
you want it to match the rfc spec...

good luck...

-bruce

ps. take a look at perl's Email::Valid module if you want to get a feel
for how extensive this task can get...


-Original Message-
From: Jim Moseby [mailto:[EMAIL PROTECTED]
Sent: Wednesday, September 21, 2005 11:01 AM
To: 'Al'; php-general@lists.php.net
Subject: RE: [PHP] Re: email validation (no regex)


  So, what is the general thought about validating email
 addresses in this
  manner?
 
  JM
 There is a good reason why virtually everyone uses regex
 patterns for email validating.

Excellent start!  And that good reason is...?
How can regex ensure that the email address that is submitted is a valid (ie
working, able to receive email) address?
Why is regex a better way?

JM




Re: [PHP] Re: email validation (no regex)

2005-09-21 Thread Ben
Jim Moseby said the following on 09/21/05 11:00:
So, what is the general thought about validating email addresses in this
manner?

JM

There is a good reason why virtually everyone uses regex
patterns for email validating.
 
 
 Excellent start!  And that good reason is...?  
 How can regex ensure that the email address that is submitted is a valid (ie
 working, able to receive email) address?
 Why is regex a better way?

Personally I would go for a combination.  Regex is much faster so if you
can eliminate fake addresses with regex you won't have to waste your
time attempting to look up MX records or connect to mail servers that
don't exist.

My apologies for the line wrapping, but the following is a slightly
modified function I found online and have been using for a while.  It
doesn't actually connect to the remote server and try sending to the
address provided like your function does; it merely checks for a valid
MX for the domain.  The extra time spent attempting a fake send to an
address was deemed not worth the bother as some mail servers (especially
qmail) do not, by default or without patching, block messages from being
sent to nonexistent email addresses.  Instead the message is accepted
and bounced.  Your method will not detect this.

- Ben


function isValidEmail($address, $checkMX = false) {
    // Return true or false depending on whether the email address is valid
    $valid_tlds = array(
        'arpa', 'biz', 'com', 'edu', 'gov', 'int', 'mil', 'net', 'org', 'aero',
        'ad', 'ae', 'af', 'ag', 'ai', 'al', 'am', 'an', 'ao', 'aq', 'ar', 'as', 'at', 'au',
        'aw', 'az', 'ba', 'bb', 'bd', 'be', 'bf', 'bg', 'bh', 'bi', 'bj', 'bm', 'bn', 'bo',
        'br', 'bs', 'bt', 'bv', 'bw', 'by', 'bz', 'ca', 'cc', 'cf', 'cd', 'cg', 'ch', 'ci',
        'ck', 'cl', 'cm', 'cn', 'co', 'cr', 'cs', 'cu', 'cv', 'cx', 'cy', 'cz', 'de', 'dj',
        'dk', 'dm', 'do', 'dz', 'ec', 'ee', 'eg', 'eh', 'er', 'es', 'et', 'fi', 'fj', 'fk',
        'fm', 'fo', 'fr', 'fx', 'ga', 'gb', 'gd', 'ge', 'gf', 'gh', 'gi', 'gl', 'gm', 'gn',
        'gp', 'gq', 'gr', 'gs', 'gt', 'gu', 'gw', 'gy', 'hk', 'hm', 'hn', 'hr', 'ht', 'hu',
        'id', 'ie', 'il', 'in', 'io', 'iq', 'ir', 'is', 'it', 'jm', 'jo', 'jp', 'ke', 'kg',
        'kh', 'ki', 'km', 'kn', 'kp', 'kr', 'kw', 'ky', 'kz', 'la', 'lb', 'lc', 'li', 'lk',
        'lr', 'ls', 'lt', 'lu', 'lv', 'ly', 'ma', 'mc', 'md', 'mg', 'mh', 'mk', 'ml', 'mm',
        'mn', 'mo', 'mp', 'mq', 'mr', 'ms', 'mt', 'mu', 'mv', 'mw', 'mx', 'my', 'mz', 'na',
        'nc', 'ne', 'nf', 'ng', 'ni', 'nl', 'no', 'np', 'nr', 'nt', 'nu', 'nz', 'om', 'pa',
        'pe', 'pf', 'pg', 'ph', 'pk', 'pl', 'pm', 'pn', 'pr', 'pt', 'pw', 'py', 'qa', 're',
        'ro', 'ru', 'rw', 'sa', 'sb', 'sc', 'sd', 'se', 'sg', 'sh', 'si', 'sj', 'sk', 'sl',
        'sm', 'sn', 'so', 'sr', 'st', 'su', 'sv', 'sy', 'sz', 'tc', 'td', 'tf', 'tg', 'th',
        'tj', 'tk', 'tm', 'tn', 'to', 'tp', 'tr', 'tt', 'tv', 'tw', 'tz', 'ua', 'ug', 'uk',
        'um', 'us', 'uy', 'uz', 'va', 'vc', 've', 'vg', 'vi', 'vn', 'vu', 'wf', 'ws', 'ye',
        'yt', 'yu', 'za', 'zm', 'zr', 'zw', 'coop', 'info', 'museum', 'name', 'pro'
    );

    // Rough email address validation using POSIX-style regular expressions
    if (!eregi('[EMAIL PROTECTED],}\.[a-z0-9\-\.]{2,}$', $address)) {
        return false;
    }
    else {
        $address = strtolower($address);
    }

    // Explode the address on name and domain parts
    $name_domain = explode('@', $address);

    // There can be only one ;-) I mean... the @ symbol
    if (count($name_domain) != 2)
        return false;

    // Check the domain parts
    $domain_parts = explode('.', $name_domain[1]);
    if (count($domain_parts) < 2)
        return false;

    // Check the TLD ($domain_parts[count($domain_parts) - 1])
    if (!in_array($domain_parts[count($domain_parts) - 1], $valid_tlds))
        return false;

    // Search DNS for MX records corresponding to the domain ($name_domain[1])
    if ($checkMX && !getmxrr($name_domain[1], $mxhosts))
        return false;

    return true;
}
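
For readers on a current PHP, here is a minimal sketch of the same two-stage idea (cheap syntax check first, DNS only if that passes).  It is not Ben's function: it swaps the eregi() pattern and TLD list for PHP's built-in filter_var(), which does not accept everything the RFC grammar allows.  The name isPlausibleEmail and the $hasMailHost hook are made up for illustration; the hook exists only so the DNS step can be stubbed out.

```php
<?php
declare(strict_types=1);

/**
 * Two-stage check: syntax first, optional DNS second.
 * $hasMailHost is a hypothetical hook (not part of the original function);
 * it receives the domain and returns true if mail could be routed to it.
 */
function isPlausibleEmail(string $address, ?callable $hasMailHost = null): bool
{
    // Stage 1: syntax only, no network traffic.
    if (filter_var($address, FILTER_VALIDATE_EMAIL) === false) {
        return false;
    }

    // Stage 2 is skipped unless the caller supplies a lookup.
    if ($hasMailHost === null) {
        return true;
    }

    // Everything after the last '@' is the domain.
    $domain = substr($address, strrpos($address, '@') + 1);
    return (bool) $hasMailHost($domain);
}

// One possible real lookup, using PHP's built-in checkdnsrr().
$viaDns = static fn (string $domain): bool => checkdnsrr($domain, 'MX');
```

Calling isPlausibleEmail($address, $viaDns) roughly corresponds to $checkMX = true above; leaving the second argument out keeps the check purely syntactic.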




RE: [PHP] Re: email validation (no regex)

2005-09-21 Thread Jim Moseby
 jim...
 
 validating email means different things to different people...

True, but for the most part people just want to know whether a user has
entered a real working email address into their forms.  What better test
than to try to send an email to it?  

 
 but there's no way you're going to be able to 'throw' 
 together something in
 2-3 days that others have taken years to create/refine...

I threw the example I posted together in about 10 minutes (and it shows :).
Even though I'm not at a place where I can test it right now, I think it
will work with some tweaking.  

 
 if you only want to determine if an email address is valid, 
 what does that
 mean to you? are you following the current/latest rfc 2822 (i think)
 standard? or are you just trying to get a quick halfway ok function...

Of course the SMTP standard would have to be followed, I typed what you see
from memory, just as a conceptual model.

 
 as an example, i was looking at a way of using a 
 regex/function for email
 validation for a user input form... i decided that it was 
 simply too tough
 to deal with the various nuances, and chickened out, using a 
 combination
 perl/php approach...

So what do you get from them that my function would not give you?

 
 but you could do what you want to do. however, it's going to 
 be painful if
 you want it to match the rfc spec...

Really?  Why does it need to be painful?  I just need to do a 'EHLO', 'Mail
From:' and 'RCPT to:' and 'QUIT'. It's not going to actually send an email.
Seems simple to me.  Maybe there's something else in the spec that I don't
see?
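
A rough sketch of the probe described above, with the network part left out.  The function names are invented for illustration and none of this is the code Jim posted.  It only shows the command sequence and how the three-digit reply to RCPT TO could be read: 2xx, 4xx and 5xx are the standard SMTP positive, transient-failure and permanent-failure classes.  Note that a 2xx only means the server accepted the recipient at that step, not that the mailbox exists.

```php
<?php
declare(strict_types=1);

/** Commands a probe would send, in order. It stops before DATA, so no mail is sent. */
function probeCommands(string $heloName, string $sender, string $recipient): array
{
    return [
        "EHLO {$heloName}",
        "MAIL FROM:<{$sender}>",
        "RCPT TO:<{$recipient}>",
        'QUIT',
    ];
}

/** Classify the server's reply line to RCPT TO by its status code class. */
function classifyRcptReply(string $replyLine): string
{
    $code = (int) substr($replyLine, 0, 3);
    if ($code >= 200 && $code < 300) {
        return 'accepted';   // accepted at this step; not proof of a real mailbox
    }
    if ($code >= 400 && $code < 500) {
        return 'deferred';   // temporary failure; the answer may differ later
    }
    if ($code >= 500 && $code < 600) {
        return 'rejected';   // permanent failure
    }
    return 'unknown';
}
```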

 
 good luck...
 

Thanks.  :o)

 ps. take a look at perl's Email::Valid module if you want to get a feel
 for how extensive this task can get...
 
My question is why does it have to be so complicated?  SMTP servers are
the best email validation devices known to man.  Why not let them do the
dirty work?

JM -- playing devil's advocate  :o)




RE: [PHP] Re: email validation (no regex)

2005-09-21 Thread Murray @ PlanetThoughtful
What you have is virtually impossible to determine if all legitimate
 possibilities are covered.
email validation using regex is a very heavily analyzed subject
Google regex email validate and you'll find loads of expressions.
 Look at the Zend article, it provides some insight.
 
 I fully understand about the almost limitless possibilities. Googling the
 subject returns results more mind boggling than the regex itself.  :o)  Do
 ANY of the regex examples you have found cover all those possibilities?  If
 so, why are there so many different approaches?  For most applications,
 where you will only be validating a small number of emails in a given day,
 why put yourself to all the regex pain, still to not have covered all the
 possibilities?
 
 In the end, with regards to email validation, all most people need is to
 know that a given email has a proper username, just 1 '@' in the middle, and
 a valid domain.  If it doesn't, it's a bogus email address.

As to that, why not validate the email address by sending an automated
message to the supplied account, requiring the person to click on a
validation link? Easy, simple, works better than either method currently
being discussed, purely for its simplicity, if nothing else.
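
One way the validation link could be built, as a sketch only.  It assumes a server-side secret and a token carried in the link's query string; the function names, the token layout and the choice of HMAC-SHA256 are all illustrative assumptions, not something proposed in this thread.

```php
<?php
declare(strict_types=1);

/** Build a token that ties the address to an expiry time (Unix timestamp). */
function makeConfirmationToken(string $email, int $expiresAt, string $secret): string
{
    $payload = strtolower($email) . '|' . $expiresAt;
    // Layout: "<expiry>.<hex HMAC>"; the expiry is visible but cannot be altered.
    return $expiresAt . '.' . hash_hmac('sha256', $payload, $secret);
}

/** Verify the token that came back when the user clicked the link. */
function checkConfirmationToken(string $email, string $token, string $secret, int $now): bool
{
    $parts = explode('.', $token, 2);
    if (count($parts) !== 2 || !ctype_digit($parts[0])) {
        return false;
    }
    $expiresAt = (int) $parts[0];
    if ($now > $expiresAt) {
        return false;   // link has expired
    }
    // Recompute and compare in constant time.
    $expected = makeConfirmationToken($email, $expiresAt, $secret);
    return hash_equals($expected, $token);
}
```

The address only counts as validated once checkConfirmationToken() returns true for a token that was mailed to it, which also shows that whoever clicked can read that mailbox.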

Much warmth,

Murray
---
Lost in thought...
http://www.planetthoughtful.org




RE: [PHP] Re: email validation (no regex)

2005-09-21 Thread bruce
because you should want/need to validate that the address is correct prior
to determining if the email server is up and running...

the regex function simply allows you to quickly determine if the address is
valid... doesn't mean that it's going to go to an actual live user...!!

btw simply checking for a single '@' with a domain doesn't do it... what if
the user has '[EMAIL PROTECTED]' or '[EMAIL PROTECTED]'. will your regex
accept/deny this???

welcome to the world of email validation

-bruce


-Original Message-
From: Murray @ PlanetThoughtful [mailto:[EMAIL PROTECTED]
Sent: Wednesday, September 21, 2005 12:01 PM
To: 'Jim Moseby'; php-general@lists.php.net
Subject: RE: [PHP] Re: email validation (no regex)


What you have is virtually impossible to determine if all legitimate
 possibilities are covered.
email validation using regex is a very heavily analyzed subject
Google regex email validate and you'll find loads of expressions.
 Look at the Zend article, it provides some insight.

 I fully understand about the almost limitless possibilities. Googling the
 subject returns results more mind boggling than the regex itself.  :o)  Do
 ANY of the regex examples you have found cover all those possibilities?  If
 so, why are there so many different approaches?  For most applications,
 where you will only be validating a small number of emails in a given day,
 why put yourself to all the regex pain, still to not have covered all the
 possibilities?

 In the end, with regards to email validation, all most people need is to
 know that a given email has a proper username, just 1 '@' in the middle, and
 a valid domain.  If it doesn't, it's a bogus email address.

As to that, why not validate the email address by sending an automated
message to the supplied account, requiring the person to click on a
validation link? Easy, simple, works better than either method currently
being discussed, purely for its simplicity, if nothing else.

Much warmth,

Murray
---
Lost in thought...
http://www.planetthoughtful.org




RE: [PHP] Re: email validation (no regex)

2005-09-21 Thread Murray @ PlanetThoughtful
 because you should want/need to validate that the address is correct prior
 to determining if the email server is up and running...
 
 the regex function simply allows you to quickly determine if the address is
 valid... doesn't mean that it's going to go to an actual live user...!!
 
 btw simply checking for a single '@' with a domain doesn't do it... what if
 the user has '[EMAIL PROTECTED]' or '[EMAIL PROTECTED]'. will your regex
 accept/deny this???
 
 welcome to the world of email validation
 
 -bruce
 
 As to that, why not validate the email address by sending an automated
 message to the supplied account, requiring the person to click on a
 validation link? Easy, simple, works better than either method currently
 being discussed, purely for its simplicity, if nothing else.

I agree, so basic validation is A Good Thing. However, the most desirable
form of validation would have to be, can I send a legitimate email to that
account and receive acknowledgement that it's working by having the user
click on a validation link.

Much warmth,

Murray
---
Lost in thought...
http://www.planetthoughtful.org




RE: [PHP] Re: email validation (no regex)

2005-09-21 Thread Jim Moseby
 
 btw simply checking for a single '@' with a domain doesn't do 
 it... what if
 the user has '[EMAIL PROTECTED]' or '[EMAIL PROTECTED]'. will your 
 regex accept/deny
 this???

My function will quickly deny those because the DNS lookup for them will
immediately fail. Will your regex deny '[EMAIL PROTECTED]'?  Should
it?

 welcome to the world of email validation

That's your world.  Mine is much simpler.  :o)  Seriously, I think Ben and
Manuel have it right.  A combination approach is probably most effective
(and complex).  I was hoping for a simple solution for the regex-challenged.
Of course the old tried and true validation email that requires the user to
validate himself is the most fool-proof method, but that's not an on-the-fly
solution.

JM




RE: [PHP] Re: email validation (no regex)

2005-09-21 Thread Murray @ PlanetThoughtful
  because you should want/need to validate that the address is correct prior
  to determining if the email server is up and running...
 
  the regex function simply allows you to quickly determine if the address is
  valid... doesn't mean that it's going to go to an actual live user...!!
 
  btw simply checking for a single '@' with a domain doesn't do it... what if
  the user has '[EMAIL PROTECTED]' or '[EMAIL PROTECTED]'. will your regex
  accept/deny this???
 
  welcome to the world of email validation
 
  -bruce
 
  As to that, why not validate the email address by sending an automated
  message to the supplied account, requiring the person to click on a
  validation link? Easy, simple, works better than either method currently
  being discussed, purely for its simplicity, if nothing else.
 
 I agree, so basic validation is A Good Thing. However, the most desirable
 form of validation would have to be, can I send a legitimate email to that
 account and receive acknowledgement that it's working by having the user
 click on a validation link.

After all, for all the regex / interrogation you perform, you still can't be
certain that the user entered an account *they own*. See? Sending a
validation email is *also* A Good Thing!

Much warmth,

Murray
---
Lost in thought...
http://www.planetthoughtful.org




RE: [PHP] Re: email validation (no regex)

2005-09-21 Thread Philip Hallstrom

but you could do what you want to do. however, it's going to
be painful if
you want it to match the rfc spec...


Really?  Why does it need to be painful?  I just need to do a 'EHLO', 'Mail
From:' and 'RCPT to:' and 'QUIT'. It's not going to actually send an email.
Seems simple to me.  Maybe there's something else in the spec that I don't
see?


Some mail servers can be configured to not reject the email until the end 
of DATA.  I know you can do this in postfix.


Although if the user is invalid, why you'd wait I don't know, but it is 
possible.


-philip




RE: [PHP] Re: email validation (no regex)

2005-09-21 Thread Michael Sims
Philip Hallstrom wrote:
 but you could do what you want to do. however, it's going to be
 painful if you want it to match the rfc spec...

 Really?  Why does it need to be painful?  I just need to do a
 'EHLO', 'Mail From:' and 'RCPT to:' and 'QUIT'. It's not going to
 actually send an email. Seems simple to me.  Maybe there's something
 else in the spec that I don't see?

 Some mail servers can be configured to not reject the email until the
 end of DATA.  I know you can do this in postfix.

 Although if the user is invalid, why you'd wait I don't know, but it
 is possible.

Additionally, some mail servers unconditionally accept mail addressed to ANY
username at their domain, whether that user actually exists or not.  This is
very bad practice, because it usually means the accepting MTA is a dumb host
that has to forward all incoming mail to an internal mail server which knows
which accounts exist, and if that server ends up rejecting the message, the
dumb MTA creates a DSN and sends it back to the envelope sender (which is quite
often forged).  This causes the so-called backscatter which results in innocent
people getting bounces for messages they didn't send.  Nevertheless, lots of
mail servers are configured this way, so you cannot simply assume that an
account is real just because you didn't get a 5xx on RCPT TO.




RE: [PHP] Re: email validation (no regex)

2005-09-21 Thread Jim Moseby

 -Original Message-
 From: Jim Moseby [mailto:[EMAIL PROTECTED]
 Sent: Wednesday, September 21, 2005 12:21 PM
 To: php-general@lists.php.net
 Subject: RE: [PHP] Re: email validation (no regex)
 
 
 
  btw simply checking for a single '@' with a domain doesn't do
  it... what if
  the user has '[EMAIL PROTECTED]' or '[EMAIL PROTECTED]'. will your
  regex accept/deny
  this???
 
 My function will quickly deny those because the DNS lookup 
 for them will
 immediately fail. Will your regex deny 
 '[EMAIL PROTECTED]'?  Should
 it?
 
  welcome to the world of email validation
 
 That's your world.  Mine is much simpler.  :o)  Seriously, I think Ben and
 Manuel have it right.  A combination approach is probably most effective
 (and complex).  I was hoping for a simple solution for the regex-challenged.
 Of course the old tried and true validation email that requires the user to
 validate himself is the most fool-proof method, but that's not an on-the-fly
 solution.

 
 jim...
 
 these are valid emails... as defined by the rfc..
 
 so your function would be in error..

This is where I think you and I are not connecting.  I don't care if they
are valid according to the RFC.  I want to know if they are likely to be
*WORKING* email addresses.  And so, from that perspective, my function would
not necessarily be in error, but working as designed.

Others have brought up truly valid points with regards to the reliability of
it though.  Different quirks of MTA configuration and function are difficult
to overcome.  I have learned you cannot rely on 'RCPT To:' responding with a
'250' as verification that it is a valid user.  I have learned that a domain
need not have an MX record at all, to receive mail.
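
The MX point can be sketched as follows.  Under the SMTP rules, a domain with no MX records is treated as if it had a single implicit MX pointing at the domain itself, so a check that stops at "no MX" rejects some deliverable addresses.  The function mailTargets and its two injected lookups are hypothetical names used only for this illustration; the closures at the end show one way to back them with PHP's getmxrr() and checkdnsrr().

```php
<?php
declare(strict_types=1);

/**
 * Return the hosts a sender would try for $domain, best first.
 * Both lookups are hypothetical injected callables:
 *   $lookupMx($domain)   -> array of host => preference
 *   $hasAddress($domain) -> bool, true if the domain has an A or AAAA record
 */
function mailTargets(string $domain, callable $lookupMx, callable $hasAddress): array
{
    $mx = $lookupMx($domain);
    if ($mx !== []) {
        asort($mx);               // lowest preference value is tried first
        return array_keys($mx);
    }
    // No MX at all: the domain itself is the implicit mail host.
    return $hasAddress($domain) ? [$domain] : [];
}

// One way to wire in real DNS using PHP's built-ins.
$lookupMx = static function (string $domain): array {
    $hosts = [];
    $weights = [];
    return getmxrr($domain, $hosts, $weights) ? array_combine($hosts, $weights) : [];
};
$hasAddress = static fn (string $domain): bool =>
    checkdnsrr($domain, 'A') || checkdnsrr($domain, 'AAAA');
```

An empty result from mailTargets() is a reasonable signal that mail cannot be routed to the domain at all; a non-empty one still says nothing about the individual mailbox.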

Learning is why I'm here, and why I posted this question.  Thank you for
your input.

JM




Re: [PHP] Re: email validation (no regex)

2005-09-21 Thread J B
On 9/21/05, Michael Sims [EMAIL PROTECTED] wrote:
 Additionally, some mail servers unconditionally accept mail addressed to ANY
 username at their domain, whether that user actually exists or not.  This is
 very bad practice, because it usually means the accepting MTA is a dumb host
 that has to forward all incoming mail to an internal mail server which knows
 which accounts exist, and if that server ends up rejecting the message, the
 dumb MTA creates a DSN and sends it back to the envelope sender (which is
 quite often forged).  This causes the so-called backscatter which results in
 innocent people getting bounces for messages they didn't send.  Nevertheless,
 lots of mail servers are configured this way, so you cannot simply assume
 that an account is real just because you didn't get a 5xx on RCPT TO.

  Just as a side note, and I do agree that this behaviour is bad
practice in principle, but I imagine they (the MTAs) do this for the
same reason that login prompts don't tell you when you enter a bogus
username and still prompt for the password and give a generic access
denied error...it prevents username fishing.
  Of course, I would think that a better solution would be to do
immediate rejection and then block the remote IP after X send attempts
with invalid usernames, but maybe there's a compelling reason not to
do that and I just haven't thought of it...




Re: [PHP] Re: email validation (no regex)

2005-09-21 Thread Jasper Bryant-Greene

J B wrote:

On 9/21/05, Michael Sims [EMAIL PROTECTED] wrote:


Additionally, some mail servers unconditionally accept mail addressed to ANY
username at their domain, whether that user actually exists or not.  This is
very bad practice, because it usually means the accepting MTA is a dumb host
that has to forward all incoming mail to an internal mail server which knows
which accounts exist, and if that server ends up rejecting the message, the
dumb MTA creates a DSN and sends it back to the envelope sender (which is quite
often forged).  This causes the so-called backscatter which results in innocent
people getting bounces for messages they didn't send.  Nevertheless, lots of
mail servers are configured this way, so you cannot simply assume that an
account is real just because you didn't get a 5xx on RCPT TO.



  Just as a side note, and I do agree that this behaviour is bad
practice in principle, but I imagine they (the MTAs) do this for the
same reason that login prompts don't tell you when you enter a bogus
username and still prompt for the password and give a generic access
denied error...it prevents username fishing.
  Of course, I would think that a better solution would be to do
immediate rejection and then block the remote IP after X send attempts
with invalid usernames, but maybe there's a compelling reason not to
do that and I just haven't thought of it...



If someone else on my ISP tries to username fish and gets my ISP's 
MTA's IP blocked by any other MTA, I'd sure be pissed off about it.


That's probably the reason why they don't block remote IPs after X 
invalid username send attempts -- MTAs are often shared by many, many users.


--
Jasper Bryant-Greene
Freelance web developer
http://jasper.bryant-greene.name/
