> -----Original Message-----
> From: Rasmus Lerdorf [mailto:ras...@lerdorf.com]
> Sent: 03 April 2010 02:44
> To: Jared Williams
> Cc: internals@lists.php.net
> Subject: Re: [PHP-DEV] Re: [PHP-CVS] svn: /php/php-src/
> branches/PHP_5_2/NEWS
> branches/PHP_5_2/ext/filter/logical_filters.c
> branches/PHP_5_3/NEWS
> branches/PHP_5_3/ext/filter/logical_filters.c
> trunk/ext/filter/logical_filters.c
>
> On 04/02/2010 06:25 PM, Jared Williams wrote:
> >
> >> -----Original Message-----
> >> From: Rasmus Lerdorf [mailto:ras...@lerdorf.com]
> >> Sent: 03 April 2010 01:20
> >> To: Jared Williams
> >> Cc: internals@lists.php.net
> >> Subject: Re: [PHP-DEV] Re: [PHP-CVS] svn: /php/php-src/
> >> branches/PHP_5_2/NEWS branches/PHP_5_2/ext/filter/logical_filters.c
> >> branches/PHP_5_3/NEWS
> >> branches/PHP_5_3/ext/filter/logical_filters.c
> >> trunk/ext/filter/logical_filters.c
> >>
> >> On 04/02/2010 04:47 PM, Jared Williams wrote:
> >>> Would make sense. Especially considering HTML5's current validation
> >>> rules of emails is something different again.
> >>>
> >>> http://www.whatwg.org/specs/web-apps/current-work/multipage/states-of-the-type-attribute.html#e-mail-state
> >>>
> >>> Having a mismatch in validation between client & server just a recipe
> >>> for user frustration.
> >>
> >> I actually think this regex is really close to the HTML5
> >> specification.
> >> The main thing it drops are comments and folded whitespace, both of
> >> which are not supported in this regex either.
> >> That means addresses like the following will all fail even though
> >> they are technically valid:
> >>
> >> test
> >> b...@example.com
> >>
> >> (with a carriage return after test there)
> >>
> >> (hey)rasmus(there)@(go)php.net(woo)
> >>
> >> rasmus(Hey
> >> guess what
> >> this is a "valid"
> >> email address)
> >> @php.net
> >>
> >> rasmus."ras...@php.net"@php.net
> >>
> >> As far as I am concerned I am perfectly ok with rejecting addresses
> >> like these and I think we should just stick to the
> >> HTML5 definition.
> >>
> >> The ABNF for an HTML5 valid email field is:
> >>
> >> 1*( atext / "." ) "@" ldh-str 1*( "." ldh-str )
> >>
> >> which means there must be a . in the domain part, so HTML5 doesn't
> >> think a...@b is valid either. The left-hand side looks wrong though.
> >> It seems to me it should be:
> >>
> >> 1*atext *("." 1*atext)
> >>
> >> You can't have a trailing . there. rasm...@php.net is not valid and
> >> if I am reading that HTML5 ABNF correctly it would seem to allow
> >> that.
> >>
> >
> > Interesting, missed the point of the . when initially looked at this,
> > Here's the regexp current in webkit
> >
> > static const char emailPattern[] =
> > "[a-z0-9!#$%&'*+/=?^_`{|}~.-]+" // local part
> > "@"
> > "[a-z0-9-]+(\\.[a-z0-9-]+)+"; // domain part
> >
> > (http://trac.webkit.org/browser/trunk/WebCore/html/ValidityState.cpp)
>
> I am all for disallowing esoteric otherwise valid addresses,
> but this trivial regex will allow all sorts of completely
> invalid addresses that will never actually route. Some
> examples of invalid addresses that passes that regex:
>
> ras...@php.123
> ras...@-php.net
> ras...@php-.net
> ras...@php.net-
> .ras...@php.net
> rasm...@php.net
> rasmus..lerd...@php.net
> ....@php.net
> ras...@128.128.128.128
>
> That last one needs to be ras...@[128.128.128.128] to be valid.
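To make the point about the WebKit pattern concrete, here is a minimal sketch in Python (chosen only because the regex is easy to exercise there; it is not the WebKit or PHP code). The addresses are hypothetical stand-ins for the examples in the list above, which the archive has obfuscated, and the pattern is applied exactly as quoted, lower-case only, with no case-insensitive flag assumed.

```python
import re

# The WebKit pattern quoted above, joined into one string.
# Assumption: used as-is, lower-case only, matched against the whole input.
WEBKIT_EMAIL = re.compile(
    r"[a-z0-9!#$%&'*+/=?^_`{|}~.-]+"   # local part
    r"@"
    r"[a-z0-9-]+(\.[a-z0-9-]+)+"       # domain part
)


def webkit_accepts(address: str) -> bool:
    """Return True if the whole address matches the quoted WebKit pattern."""
    return WEBKIT_EMAIL.fullmatch(address) is not None


# Hypothetical stand-ins for the obfuscated examples in the list above.
SHOULD_BE_INVALID = [
    "user@php.123",          # all-numeric last label
    "user@-php.net",         # label starts with a hyphen
    "user@php-.net",         # label ends with a hyphen
    "user@php.net-",         # trailing hyphen
    ".user@php.net",         # leading dot in the local part
    "user.@php.net",         # trailing dot in the local part
    "first..last@php.net",   # consecutive dots
    "....@php.net",          # nothing but dots
    "user@128.128.128.128",  # bare IP address without brackets
]

if __name__ == "__main__":
    for addr in SHOULD_BE_INVALID:
        print(f"{addr!r:28} accepted: {webkit_accepts(addr)}")
```

Every address in that list gets through, because the local-part character class includes "." and the domain labels allow "-" in any position.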
>

It appears the HTML5 spec still allows the pattern attribute on an
<input type="email">, and it seems to be implemented that way in WebKit:

http://trac.webkit.org/browser/trunk/WebCore/html/HTMLInputElement.cpp#L231

So a page can specify a more restrictive regexp for emails, e.g.

<input type="email" pattern="[-a-zA-Z0-9!#$%&\'*+/=?^_`{|}~]+(\.[-a-zA-Z0-9!#$%&\'*+/=?^_`{|}~]+)*...@[a-za-z0-9]+(\.[a-zA-Z0-9]+)+">

Or whatever the pattern ends up being.

I was just thinking it might be good to have the filter extension expose
an HTML5 case-insensitive regexp/pattern to PHP, so that it could be
inserted into forms and we would not have to worry about it.

Jared

PS: I still need to check how the multiple attribute interacts with
pattern validation;

<input type="email" name="list" multiple />

sends a comma-separated list of email addresses.

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
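For what it's worth, a rough sketch of what a stricter shared pattern could look like, again in Python purely for illustration. It follows the corrected left-hand side quoted earlier, 1*atext *("." 1*atext), rather than copying the pattern attribute above, since that one is garbled in the archive. The names ATEXT, LABEL, STRICT_EMAIL, is_valid and validate_list are all made up here. Treating a domain label as letters, digits and hyphens with no leading or trailing hyphen is my own assumption, as is splitting a "multiple" value on commas and trimming whitespace.

```python
import re
from typing import List, Tuple

# atext characters as they appear in the patterns quoted in this thread.
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
# Assumption: a label is letters, digits and hyphens, with no leading or
# trailing hyphen. This is stricter than either quoted pattern.
LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"

STRICT_EMAIL = re.compile(
    rf"{ATEXT}+(?:\.{ATEXT}+)*"   # 1*atext *("." 1*atext)
    rf"@"
    rf"{LABEL}(?:\.{LABEL})+"     # at least one "." in the domain part
)


def is_valid(address: str) -> bool:
    """Return True if the whole address matches the stricter sketch pattern."""
    return STRICT_EMAIL.fullmatch(address) is not None


def validate_list(value: str) -> List[Tuple[str, bool]]:
    """Split a comma-separated field value and check each address in turn."""
    parts = [part.strip() for part in value.split(",")]
    return [(part, is_valid(part)) for part in parts]


if __name__ == "__main__":
    for addr, ok in validate_list("a@example.com, b.@example.com"):
        print(f"{addr!r:20} valid: {ok}")
```

This still lets an all-numeric last label and a bare IP address through, so it is only a starting point for whatever pattern the filter extension ends up exposing.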