From: Dominic Mitchell <[EMAIL PROTECTED]>
Sent: Wednesday, May 30, 2001 4:45 PM


> On Wed, May 30, 2001 at 11:40:03AM -0400, Andy Williams wrote:
> > All the ones that claimed to be valid from E::V failed chaddr!
> > [EMAIL PROTECTED] had this result from chaddr:
> > user: andyw. is good
> > host: hillway.com is good
> > address `[EMAIL PROTECTED]' is bad: rfc822 failure
> >
> > So I guess [EMAIL PROTECTED] is invalid even though it works... weird!
>
> What is valid on the left hand side of an email address is extremely
> weird anyway.  Practically anything is allowed.  A pseudo grammar for
> them is in RFC822.  There's also much fun trying to parse them in
> Friedl's book on regular expressions (the owl book).  He ends up with a
> mammoth 5k regex to parse email addresses...
>
> -Dom
>

Having just had a look at E::V, it looks like the module is using the
'mammoth 5k regex'.  I prefer the regex given in CGI Programming with
Perl, which is designed to accept the more common address formats
rather than everything RFC822 permits.

RFC822 was designed to accept all the address formats in use in 1982,
so it allows all of the following (examples taken from CGI Programming
with Perl):

Alfred Neuman <Neuman@BBN-TENEXA>
":sysmail"@ Some-Group. Some-Org
Muhammed.(I am the greatest) Ali @(the)vegas.WBA

I have checked the following code against the original test cases that
originally came back as valid, and it considers none of them valid.

sub IsValidAddress {
    my $addr_to_check = shift;

    # Strip spaces and tabs, leaving double-quoted strings untouched.
    $addr_to_check =~ s/("(?:[^"\\]|\\.)*"|[^\t ]*)[ \t]*/$1/g;

    my $esc      = '\\\\';
    my $space    = '\040';
    my $ctrl     = '\000-\037';
    my $dot      = '\.';
    my $nonASCII = '\x80-\xff';
    my $CRlist   = '\012\015';
    my $letter   = 'a-zA-Z';
    my $digit    = '\d';

    my $atom_char  = qq{ [^$space<>\@,;:".\\[\\]$esc$ctrl$nonASCII] };
    my $atom    = qq{ $atom_char+ };
    my $byte    = qq{ (?: 1?$digit?$digit |
                          2[0-4]$digit    |
                          25[0-5]         ) };

    my $qtext   = qq{ [^$esc$nonASCII$CRlist"] };
    my $quoted_pair = qq{ $esc [^$nonASCII] };
    my $quoted_str  = qq{ " (?: $qtext | $quoted_pair )* " };

    my $word    = qq{ (?: $atom | $quoted_str ) };
    my $ip_address  = qq{ \\[ $byte (?: $dot $byte ){3} \\] };
    my $sub_domain  = qq{ [$letter$digit]
                          [$letter$digit-]{0,61}
                          [$letter$digit] };
    my $top_level  = qq{ (?: $atom_char ){2,4} };
    my $domain_name = qq{ (?: $sub_domain $dot )+ $top_level };
    my $domain   = qq{ (?: $domain_name | $ip_address ) };
    my $local_part  = qq{ $word (?: $dot $word )* };

    my $address    = qq{ $local_part \@ $domain };

    return $addr_to_check =~ /^$address$/ox ? $addr_to_check : "";
}
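
In case the first substitution in IsValidAddress looks opaque, here is a
small sketch of what it does on its own.  The helper name strip_ws is
mine, purely for illustration; it is not from the book or from E::V.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption, not from the book):
# applies the same substitution as the first line of IsValidAddress.
sub strip_ws {
    my $addr = shift;
    # Either match a whole double-quoted string (kept intact) or a run of
    # non-whitespace, then swallow any spaces/tabs that follow it.
    $addr =~ s/("(?:[^"\\]|\\.)*"|[^\t ]*)[ \t]*/$1/g;
    return $addr;
}

print strip_ws('":sysmail"@ Some-Group. Some-Org'), "\n";
print strip_ws('"a b"@example.com'), "\n";
```

The first call should print the address with the stray spaces removed,
while the second should come back unchanged because its only space sits
inside the quoted string.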

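The rest of the sub builds the pattern up from small string fragments
and only compiles it at the end with /x.  As a rough sketch of that
technique, here are just the $byte and $ip_address pieces on their own.
The helper is_ip_literal and the sample inputs are my own, for
illustration only.

```perl
use strict;
use warnings;

my $digit = '\d';
my $dot   = '\.';

# Same fragments as in IsValidAddress, held as plain strings.  The
# padding inside qq{} is harmless because the final match uses /x.
my $byte       = qq{ (?: 1?$digit?$digit | 2[0-4]$digit | 25[0-5] ) };
my $ip_address = qq{ \\[ $byte (?: $dot $byte ){3} \\] };

# Hypothetical helper (an assumption, not part of the original code):
# true if the string is a bracketed dotted-quad with octets 0-255.
sub is_ip_literal {
    my $s = shift;
    return $s =~ /^$ip_address$/x ? 1 : 0;
}

for my $candidate ('[192.168.0.1]', '[256.1.1.1]', '192.168.0.1') {
    printf "%-15s %s\n", $candidate,
        is_ip_literal($candidate) ? 'ok' : 'rejected';
}
```

Only the first candidate should be accepted: the second has an octet
above 255 and the third lacks the square brackets that the domain
literal form requires.
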

Hope this helps,

Matt
--
s&&!msfQ!&&s&$&utvK&&s&(Q)&\1!sfiupoB&&s&^&reverse Ibdlfs&e&s&^&#
&&s&$&#!uojsq&&s&(.)&chr(ord($1)-1)&ge&s&(.*)&reverse $1&see


