https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6985

--- Comment #12 from Henrik Krohns <[email protected]> ---
(In reply to Mark Martinec from comment #11)
>
> Although it would involve some more intrusive code changes, my choice
> would be to replace the regexp TLD lookup with an associative array (hash)
> lookup - for speed and simplicity of configuration.

my $urischemeless =
qr/[a-z\d][a-z\d._-]{0,251}\.${tldsRE}\.?(?::\d{1,5})?(?:\/[^$tbirdenddelim]{1,251})?/io;
..
my $tbirdurire = qr/(?:\b|(?<=$iso2022shift)|(?<=[$tbirdstartdelim]))
                    (?:(?:($uriknownscheme)(?=(?:[$tbirdenddelim]|\z))) |
                       (?:($urimailscheme)(?=(?:[$tbirdenddelimemail]|\z))) |
                       (?:\b($urischemeless)(?=(?:[$tbirdenddelim]|\z))))/xo;
..
while (/$tbirdurire/igo) {

If tldsRE is replaced with a generic alphanumeric match, the while loop is
going to match a bazillion strings, each of which still has to be sliced and
TLD-checked.
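
To make that concrete, here is a rough sketch with simplified stand-ins for
the patterns above (not the real SpamAssassin regexes; the TLD set, the sample
text and all variable names are made up for illustration). It counts how many
candidates each approach hands to the loop body:

```perl
use strict;
use warnings;

# Hypothetical, tiny TLD set for illustration only.
my %valid_tlds = map { $_ => 1 } qw(com org net);
my $tlds_alt   = join '|', sort keys %valid_tlds;

# Simplified stand-ins for $urischemeless (assumptions, not the real patterns).
my $with_tlds = qr/\b[a-z\d][a-z\d._-]{0,251}\.(?:$tlds_alt)\b/i;
my $generic   = qr/\b[a-z\d][a-z\d._-]{0,251}\.([a-z\d-]{2,})\b/i;

my $text = "See example.com, file.txt, foo.bar and version 1.23 of lib.so";

# Approach 1: the TLD list is part of the regex, so only real hits come back.
my @regex_hits = $text =~ /$with_tlds/g;

# Approach 2: generic match, then slice off the last label and look it up.
my (@candidates, @hash_hits);
while ($text =~ /($generic)/g) {
    my ($candidate, $tld) = ($1, $2);
    push @candidates, $candidate;
    push @hash_hits,  $candidate if $valid_tlds{lc $tld};
}

printf "regex with TLDs: %d match(es)\n", scalar @regex_hits;
printf "generic regex:   %d candidate(s), %d kept after hash lookup\n",
    scalar @candidates, scalar @hash_hits;
```

On this sample both approaches end up with the same single hit, but the
generic pattern runs the loop body for every dotted token first.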

I have a hard time seeing how that's more efficient than just regenerating
VALID_TLDS_RE with a few lines of code so it's usable where needed.
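
Something along these lines is what I mean by a few lines of code. This is
only a sketch under assumptions: %valid_tlds and build_tlds_re are
hypothetical names, and where the TLD list actually comes from is left open.

```perl
use strict;
use warnings;

# Hypothetical TLD table; in practice it would be filled from configuration.
my %valid_tlds = map { lc($_) => 1 } qw(com org net info);

# Rebuild the alternation from the hash keys. Longest entries go first so a
# short TLD does not shadow a longer one sharing its prefix; quotemeta keeps
# any regex metacharacters in an entry literal.
sub build_tlds_re {
    my ($tlds) = @_;
    my $alt = join '|',
        map  { quotemeta($_) }
        sort { length($b) <=> length($a) || $a cmp $b } keys %$tlds;
    return qr/(?:$alt)/i;
}

my $tlds_re = build_tlds_re(\%valid_tlds);

# A TLD gets added later: regenerate the regex from the same hash.
$valid_tlds{community} = 1;
$tlds_re = build_tlds_re(\%valid_tlds);

print "matched TLD: $1\n" if "foo.community" =~ /\.($tlds_re)/;
```

Note that patterns compiled with /o interpolate their variables only once, so
anything that embeds the TLD regex would have to be rebuilt after the
regeneration as well.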
