https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6985

--- Comment #6 from Kevin A. McGrail <[email protected]> ---
(In reply to AXB from comment #5)
> (In reply to Kevin A. McGrail from comment #4)
> > (In reply to Kevin A. McGrail from comment #3)
> > > Mark, I've been thinking the same things especially where I want to treat a
> > > blog hoster, for example, as a TLD so that I can blacklist dave.blog.example
> > > but not steve.blog.example.
> > 
> > From looking at this internally, I believe the answer is the util_rb_2TLD
> > code.
> > 
> > So new TLDs can be added to any old cf file using this code and then added
> > to RegistrarBoundaries.pm
> > 
> > see 20_aux_tlds.cf
> > 
> > Thoughts?
> 
> imo, not ideal - it will be missing "regex optimization"
> 
> # %VALID_TLDS as Regexp::List optimized regexp, for use in Plugins etc
> # Paste above list to:
> #  perl -MRegexp::List -e '$/=undef; $_=<>; $r = Regexp::List->new; push @l, $_ for (split); print $r->list2re(@l)'
> # Verified up to date 20120401
> $VALID_TLDS_RE = qr/
>   (?=[abcdefghijklmnopqrstuvwxyz])
>   (?:a(?:e(?:ro)?|r(?:pa)?|s(?:ia)?|[cdfgilmnoqtuwxz])|b(?:iz?|[abdefghjmnorstwyz])
>   |c(?:at?|o(?:m|op)?|[cdfghiklmnruvxyz])|d[ejkmoz]|e(?:[cegrst]|d?u)|f[ijkmor]
>   |g(?:[adefghilmnpqrstuwy]|ov)|h[kmnrtu]|i(?:n(?:fo|t)?|[delmoqrst])|j(?:o(?:bs)?|[emp])
>   |k[eghimnprwyz]|l[abcikrstuvy]|m(?:o(?:bi)?|u(?:seum)?|[acdeghkmnpqrstvwxyz]|i?l)
>   |n(?:a(?:me)?|et?|[cfgilopruz])|o(?:m|rg)|p(?:ro?|[aefghklmnstwy])|r[eosuw]
>   |s[abcdeghiklmnortuvyz]|t(?:r(?:avel)?|[cdfghjkmnoptvwz]|e?l)|u[agksyz]
>   |v[aceginu]|w[fs]|y[et]|z[amw]|qa|xxx
>   )/ix;

That regex is painful... I don't know that we can maintain it and publish new
TLDs fast enough via cf file.  In my opinion, the entire RegistrarBoundaries.pm
should be moved to a CF file that we can maintain via nightly rules updates.
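
As a rough sketch only, a cf-maintained entry for the blog hoster case quoted
above might look like the fragment below. It assumes the existing util_rb_2tld
directive (the one used in 20_aux_tlds.cf) is the mechanism; the file name
local.cf and the domain blog.example are placeholders, not a tested
configuration.

```
# Hypothetical fragment for local.cf or a rules-update cf file.
# Treats blog.example as a two-level TLD, so dave.blog.example and
# steve.blog.example are handled as separate registered domains.
util_rb_2tld blog.example
```

An entry like this could ship through the nightly rules updates, but it would
not by itself refresh the precompiled $VALID_TLDS_RE regex quoted above.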

Thoughts?
