You've convinced me that staying close to the RFC is a "best choice", even though we lose the opportunity for users to correct addresses at the point of data entry.
nb the suggested regex in my last posting doesn't work well enough! e.g., [email protected] isn't matched

C

On Aug 7, 4:48 pm, Jonathan Lundell <[email protected]> wrote:
> On Aug 7, 2009, at 8:13 AM, Carl wrote:
>
> > This is an excellent article on the traps to beware of when regex'ing
> > email address formats
>
> > http://www.regular-expressions.info/email.html
>
> > This may ignite a debate though :)
>
> A discussion, maybe. In the abstract, I like the idea of verifying the
> RFC verbatim, but we *should* be clear on what we're trying to do.
> Guard against typos? Prevent some kind of attack? How much do we care
> about false positives?
>
> The article objects (to RFC-style checking) that [email protected],
> for example, will validate. I'm not too concerned about that, in that
> there are lots of ways that a user can enter a wrong but
> (syntactically) valid address. We deal with that through active
> validation, not a syntax check.
>
> Might there be a security concern? The quoted variation of the RFC
> checker is very permissive:
>
> "([^"\r\\]|\\["\r\\])*"
>
> Could that open the door to some kind of injection attack? Presumably
> we sanitize it for display; how about when we actually use it to send
> mail? Any consumer that doesn't understand quoted names could end up
> very confused.
>
> I take false positives as a v. bad thing: if a user enters a real and
> valid address, I do not want to reject it. So I don't much like the
> explicit list of TLDs (below), on the grounds that it's bound to
> expand, and at some point it'll break. From the Wikipedia TLD article:
>
> > During the 32nd International Public ICANN Meeting in Paris in 2008,
> > ICANN started a new process of TLD naming policy to take a
> > "significant step forward on the introduction of new generic
> > top-level domains." This program envisions the availability of many new
> > or already proposed domains, as well a new application and
> > implementation process. Observers believed that the new rules could
> > result in hundreds of new gTLDs to be registered. Proposed TLDs
> > include music, berlin and nyc.
>
> I think I'd favor the RFC-style pattern without the quoted-name
> alternation.
>
> One thing we could do is to give the developer an option:
> IS_EMAIL(something or other) that lets them select one of a small
> number of regexes. And of course the developer can always use IS_MATCH
> if they don't like our choice of email filters.
>
> If we permitted a choice, I'd suggest:
>
> 1. default to the RFC regex, but without quoted names
> 2. RFC including quoted names
> 3. something like the pattern below, including the TLD filter (maybe)
>
> > I favour this variation...
> > [a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+(?:[A-Z]{2}|com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum)\b
>
> > C
>
> > On Aug 7, 8:25 am, Jonathan Lundell <[email protected]> wrote:
> >> On Aug 7, 2009, at 12:22 AM, mdipierro wrote:
>
> >>> I will take a patch for this.
>
> >> If nobody else gets to it first, I'll work up a patch over the
> >> weekend.
>
> >>> Massimo
>
> >>> On Aug 7, 1:33 am, Jonathan Lundell <[email protected]> wrote:
> >>>> On Aug 6, 2009, at 9:32 PM, DenesL wrote:
>
> >>>>> IS_EMAIL does not follow the RFC specs for valid email addresses
> >>>>> (see http://en.wikipedia.org/wiki/E-mail_address)
>
> >>>>> even a simple [email protected] fails
>
> >>>>> it is kinda late to work on the regex now, maybe tomorrow.
>
> >>>> The RFC is fairly hard to validate. If that's what we really want, I
> >>>> found this one on the web that looks about right:
>
> >>>> ^(?!\.)("([^"\r\\]|\\["\r\\])*"|([-a-z0-9!#$%&'*+/=?^_`{|}~]|(?...@[a-z0-9][\w\.-]*[a-z0-9]\.[a-z][a-z\.]*[a-z]$
>
> >>>> It assumes the case-insensitive flag.
>
> >>>> http://haacked.com/archive/2007/08/21/i-knew-how-to-validate-an-email...
>
> >>>> Overkill? Or, what the heck?
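
For what it's worth, the trade-off discussed in the thread above (a fixed TLD list versus accepting any TLD, with quoted local parts left out in both cases) can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the proposed IS_EMAIL patch: the helper name is_plausible_email and the mode names "any_tld" and "tld_list" are made up for this sketch, and the "any_tld" pattern is a simplification in the spirit of the RFC-style check, not the RFC regex quoted above.

```python
import re

# Dot-separated runs of RFC "atext" characters. Quoted local parts are
# deliberately not supported, in the spirit of "RFC without quoted names".
LOCAL = r"[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*"

# A single DNS label: letters and digits, with hyphens only in the middle.
LABEL = r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?"

# The fixed TLD list from the quoted pattern: any two-letter code plus a
# hard-coded set of generic TLDs.
TLD_LIST = r"(?:[a-z]{2}|com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum)"

# Hypothetical mode names, chosen for this sketch only.
PATTERNS = {
    # Accept any syntactically valid label as the TLD.
    "any_tld": re.compile(LOCAL + "@(?:" + LABEL + r"\.)+" + LABEL, re.IGNORECASE),
    # Accept only TLDs from the fixed list.
    "tld_list": re.compile(LOCAL + "@(?:" + LABEL + r"\.)+" + TLD_LIST, re.IGNORECASE),
}


def is_plausible_email(address, mode="any_tld"):
    """Return True if the whole string matches the selected pattern."""
    return PATTERNS[mode].fullmatch(address) is not None


if __name__ == "__main__":
    samples = ("someone@example.com", "someone@example.berlin", "a..b@example.com")
    for addr in samples:
        print(addr, is_plausible_email(addr), is_plausible_email(addr, "tld_list"))
```

An address under one of the newer TLDs, such as someone@example.berlin, passes the "any_tld" pattern but is rejected by "tld_list", which is the breakage the fixed list invites as new TLDs appear. Neither pattern says anything about whether the mailbox exists; that still needs active validation.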

