https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6985
--- Comment #12 from Henrik Krohns <[email protected]> ---
(In reply to Mark Martinec from comment #11)
>
> Although it would involve some more intrusive code changes, my choice
> would be to replace the regexp TLD lookup with an associative array (hash)
> lookup - for speed and simplicity of configuration.

my $urischemeless = qr/[a-z\d][a-z\d._-]{0,251}\.${tldsRE}\.?(?::\d{1,5})?(?:\/[^$tbirdenddelim]{1,251})?/io;
..
my $tbirdurire = qr/(?:\b|(?<=$iso2022shift)|(?<=[$tbirdstartdelim]))
                    (?:(?:($uriknownscheme)(?=(?:[$tbirdenddelim]|\z))) |
                    (?:($urimailscheme)(?=(?:[$tbirdenddelimemail]|\z))) |
                    (?:\b($urischemeless)(?=(?:[$tbirdenddelim]|\z))))/xo;
..
while (/$tbirdurire/igo) {

If tldsRE is replaced with a generic alphanumeric match, the while loop is
going to iterate over a bazillion candidate strings, each of which still has
to be sliced and TLD-checked. I have a hard time seeing how that's more
efficient than just re-generating VALID_TLDS_RE with a few lines of code so
it's usable where needed.
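As a rough illustration of the "re-generate the TLD regexp with a few lines
of code" idea, here is a minimal sketch. It is not the actual SpamAssassin
code: the TLD list, the variable names ($tlds_re, $schemeless) and the
simplified host pattern are all assumptions made up for this example.

```perl
use strict;
use warnings;

# Hypothetical TLD list; a real setup would load it from configuration.
my @tlds = qw(com net org info museum co.uk);

# Sort longest first so the alternation prefers the longest suffix,
# and escape each entry so a "." inside "co.uk" matches literally.
my $alt = join '|',
          map  { quotemeta($_) }
          sort { length($b) <=> length($a) || $a cmp $b } @tlds;
my $tlds_re = qr/(?:$alt)/i;

# Simplified stand-in for the schemeless-URI pattern: a host name whose
# last label(s) must be one of the listed TLDs.
my $schemeless = qr/\b[a-z\d][a-z\d._-]{0,251}\.$tlds_re\b/i;

my $text = "see example.com and foo.notatld or bar.museum";
my @hits = $text =~ /($schemeless)/g;
print join(",", @hits), "\n";
```

Because the TLD alternation is part of the pattern itself, the regexp engine
rejects strings such as foo.notatld during the scan, so nothing has to be
sliced and looked up afterwards. Rebuilding $tlds_re whenever the list
changes keeps the list configurable.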
