> -----Original Message-----
> From: Rasmus Lerdorf [mailto:ras...@lerdorf.com] 
> Sent: 03 April 2010 02:44
> To: Jared Williams
> Cc: internals@lists.php.net
> Subject: Re: [PHP-DEV] Re: [PHP-CVS] svn: /php/php-src/ 
> branches/PHP_5_2/NEWS 
> branches/PHP_5_2/ext/filter/logical_filters.c 
> branches/PHP_5_3/NEWS 
> branches/PHP_5_3/ext/filter/logical_filters.c 
> trunk/ext/filter/logical_filters.c
> 
> On 04/02/2010 06:25 PM, Jared Williams wrote:
> >  
> > 
> >> -----Original Message-----
> >> From: Rasmus Lerdorf [mailto:ras...@lerdorf.com]
> >> Sent: 03 April 2010 01:20
> >> To: Jared Williams
> >> Cc: internals@lists.php.net
> >> Subject: Re: [PHP-DEV] Re: [PHP-CVS] svn: /php/php-src/ 
> >> branches/PHP_5_2/NEWS
> >> branches/PHP_5_2/ext/filter/logical_filters.c
> >> branches/PHP_5_3/NEWS
> >> branches/PHP_5_3/ext/filter/logical_filters.c
> >> trunk/ext/filter/logical_filters.c
> >>
> >> On 04/02/2010 04:47 PM, Jared Williams wrote:
> >>> Would make sense. Especially considering HTML5's current validation
> >>> rules of emails is something different again.
> >>>
> >>> http://www.whatwg.org/specs/web-apps/current-work/multipage/states-of-the-type-attribute.html#e-mail-state
> >>>
> >>> Having a mismatch in validation between client & server is just a
> >>> recipe for user frustration.
> >>
> >> I actually think this regex is really close to the HTML5
> >> specification.  The main thing it drops are comments and folded
> >> whitespace, both of which are not supported in this regex either.
> >> That means addresses like the following will all fail even though
> >> they are technically valid:
> >>
> >> test
> >> b...@example.com
> >>
> >> (with a carriage return after test there)
> >>
> >> (hey)rasmus(there)@(go)php.net(woo)
> >>
> >> rasmus(Hey
> >> guess what
> >> this is a "valid"
> >> email address)
> >> @php.net
> >>
> >> rasmus."ras...@php.net"@php.net
> >>
> >> As far as I am concerned I am perfectly ok with rejecting addresses
> >> like these and I think we should just stick to the HTML5 definition.
> >>
> >> The ABNF for an HTML5 valid email field is:
> >>
> >>   1*( atext / "." ) "@" ldh-str 1*( "." ldh-str )
> >>
> >> which means there must be a . in the domain part, so HTML5 doesn't
> >> think a...@b is valid either.  The left-hand side looks wrong though.
> >> It seems to me it should be:
> >>
> >>   1*atext *("." 1*atext)
> >>
> >> You can't have a trailing . there.  rasm...@php.net is not valid and
> >> if I am reading that HTML5 ABNF correctly it would seem to allow
> >> that.
> >>
> > 
> > Interesting, I missed the point of the . when I initially looked at
> > this. Here's the regexp currently in webkit:
> > 
> > static const char emailPattern[] =
> >     "[a-z0-9!#$%&'*+/=?^_`{|}~.-]+" // local part
> >     "@"
> >     "[a-z0-9-]+(\\.[a-z0-9-]+)+"; // domain part
> > 
> > (http://trac.webkit.org/browser/trunk/WebCore/html/ValidityState.cpp)
> 
> I am all for disallowing esoteric otherwise valid addresses,
> but this trivial regex will allow all sorts of completely
> invalid addresses that will never actually route.  Some
> examples of invalid addresses that pass that regex:
> 
> ras...@php.123
> ras...@-php.net
> ras...@php-.net
> ras...@php.net-
> .ras...@php.net
> rasm...@php.net
> rasmus..lerd...@php.net
> ....@php.net
> ras...@128.128.128.128
> 
> That last one needs to be ras...@[128.128.128.128] to be valid.
> 
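
For anyone who wants to poke at that list, here is a rough sketch in Python (picked only for illustration, it is not what webkit or the filter extension runs). The pattern is the one quoted above with the C string escaping removed; the sample addresses and the webkit_accepts name are made up for this example, one sample per kind of problem, and are not the addresses from the list above. The samples are lowercase because the quoted pattern only lists a-z.

```python
import re

# The WebKit pattern as quoted above, with the C string escaping removed.
WEBKIT_EMAIL = re.compile(
    r"[a-z0-9!#$%&'*+/=?^_`{|}~.-]+"   # local part
    r"@"
    r"[a-z0-9-]+(\.[a-z0-9-]+)+"       # domain part
)

# Hypothetical stand-in addresses, one per kind of problem.
SAMPLES = [
    "user@example.123",         # all-numeric last label
    "user@-example.net",        # label starts with a hyphen
    "user@example-.net",        # label ends with a hyphen
    "user@example.net-",        # domain ends with a hyphen
    ".user@example.net",        # leading dot in the local part
    "user.@example.net",        # trailing dot in the local part
    "first..last@example.net",  # consecutive dots
    "....@example.net",         # local part made only of dots
    "user@128.128.128.128",     # bare IP address, no brackets
]


def webkit_accepts(address: str) -> bool:
    """Return True when the whole string matches the quoted pattern."""
    return WEBKIT_EMAIL.fullmatch(address) is not None


if __name__ == "__main__":
    for address in SAMPLES:
        print(address, webkit_accepts(address))
```

Every sample is accepted by the pattern. About the only structural thing it enforces is the dot in the domain part, so something like user@localhost is rejected.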

It appears the HTML5 spec still allows the pattern attribute on an
<input type="email">, and it seems to be implemented that way in webkit:

http://trac.webkit.org/browser/trunk/WebCore/html/HTMLInputElement.cpp#L231

So a form can specify a more restrictive regexp for emails.

E.g.:

<input type="email"
pattern="[-a-zA-Z0-9!#$%&\'*+/=?^_`{|}~]+(\.[-a-zA-Z0-9!#$%&\'*+/=?^_`
{|}~]+)*...@[a-za-z0-9]+(\.[a-zA-Z0-9]+)+"> 
Or whatever the pattern will be.
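
A rough way to try out a pattern of that shape, sketched in Python purely for illustration. This is my own reading of the idea, not the exact attribute value above: the local part follows the 1*atext *("." 1*atext) form suggested earlier in the thread, the domain labels are plain alphanumerics, and the names ATEXT, LABEL, STRICT_EMAIL and strict_accepts are invented for the example. re.fullmatch is used because the pattern attribute has to match the whole value.

```python
import re

# Assumed character classes for this sketch.
ATEXT = r"[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]"
LABEL = r"[a-zA-Z0-9]+"

STRICT_EMAIL = re.compile(
    rf"{ATEXT}+(\.{ATEXT}+)*"   # 1*atext *("." 1*atext)
    r"@"
    rf"{LABEL}(\.{LABEL})+"     # at least two dot-separated labels
)


def strict_accepts(address: str) -> bool:
    """Return True when the whole string matches the stricter pattern."""
    return STRICT_EMAIL.fullmatch(address) is not None


if __name__ == "__main__":
    for address in [
        "first.last@example.net",   # ordinary address
        "user.@example.net",        # trailing dot
        ".user@example.net",        # leading dot
        "first..last@example.net",  # consecutive dots
        "user@-example.net",        # hyphen in a label
        "user@128.128.128.128",     # bare IP address
    ]:
        print(address, strict_accepts(address))
```

With these classes the leading, trailing and doubled dots are rejected, and so is any hyphen in the domain (which also throws out legitimate hyphenated hostnames). A bare IP address or an all-numeric last label still gets through, so this only narrows the gap.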

Just thinking it might be good to have the filter extension expose an
HTML5 case-insensitive regexp/pattern to PHP, so we could insert it into
forms and not have to worry about it.
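
Something along these lines is what I have in mind, sketched in Python rather than PHP just to show the shape. Nothing here is an existing filter extension API: EMAIL_PATTERN and email_input are hypothetical names, and the pattern itself is a placeholder for whatever gets agreed on. The point is that one shared constant gets attribute-escaped and dropped into the form.

```python
from html import escape

# Placeholder pattern; the real one would be whatever the extension exposes.
EMAIL_PATTERN = (
    r"[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+"
    r"(\.[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+)*"
    r"@[a-zA-Z0-9]+(\.[a-zA-Z0-9]+)+"
)


def email_input(name: str, pattern: str = EMAIL_PATTERN) -> str:
    """Render an email input whose pattern attribute carries the shared regexp.

    html.escape(..., quote=True) turns &, <, >, " and ' into entities so the
    pattern is safe inside a double-quoted attribute value.
    """
    return '<input type="email" name="{}" pattern="{}">'.format(
        escape(name, quote=True), escape(pattern, quote=True)
    )


if __name__ == "__main__":
    print(email_input("email"))
```

The & and ' in the pattern come out as entities in the markup, and the browser decodes them back before compiling the regexp, so the escaping never has to be done by hand.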

Jared

PS: I still need to check how the multiple attribute interacts with
pattern validation; <input type="email" name="list" multiple /> sends
a comma-separated list of email addresses.
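
On the receiving side, the handling would presumably look something like this sketch (Python for illustration; split_and_check is a made-up name and the pattern is the webkit one quoted earlier). It says nothing about how a browser applies pattern to a multiple field, which is the part that still needs checking. It only shows that a single-address pattern cannot be run against the raw submitted value.

```python
import re

# Single-address pattern, taken from the WebKit snippet quoted earlier.
EMAIL = re.compile(r"[a-z0-9!#$%&'*+/=?^_`{|}~.-]+@[a-z0-9-]+(\.[a-z0-9-]+)+")


def split_and_check(value: str) -> dict:
    """Split a comma-separated field and validate each address on its own.

    Returns a mapping of address -> whether it matched the pattern.
    Empty entries (for example from a trailing comma) are skipped.
    """
    results = {}
    for part in value.split(","):
        address = part.strip()
        if address:
            results[address] = EMAIL.fullmatch(address) is not None
    return results


if __name__ == "__main__":
    print(split_and_check("a@example.net, b@example, c@example.org"))
```

Matching the whole value "a@example.net,b@example.net" against the single-address pattern fails because of the comma, so the list has to be split before validating.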

  


-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
