Sidney,

> News of an ICANN decision to allow international character
> sets in domain names was reported last week, for example

IDN and punycode have been around for a while below the TLD level, but so
far the few IDN TLDs were only for testing. We came across it in:

  http://marc.info/?t=123928717600002
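
For reference, a minimal sketch of what the punycode ("xn--") form of a
name looks like. This is Python rather than anything from SpamAssassin,
and it relies on Python's built-in "idna" codec, which follows the older
IDNA 2003 rules; the helper names to_ascii and to_unicode are made up for
illustration.

```python
# Sketch only: uses Python's built-in "idna" codec (IDNA 2003 rules).
# The function names are hypothetical, chosen for this illustration.

def to_ascii(domain: str) -> str:
    """Convert a Unicode domain name to its ASCII-compatible (punycode) form."""
    return domain.encode("idna").decode("ascii")


def to_unicode(domain: str) -> str:
    """Convert an ASCII-compatible (xn--) domain name back to Unicode."""
    return domain.encode("ascii").decode("idna")


# "b\u00fccher" is "bücher"; each label is encoded separately,
# and pure-ASCII labels such as "example" pass through unchanged.
ascii_form = to_ascii("b\u00fccher.example")
round_trip = to_unicode(ascii_form)
```

On the wire, and in most headers, only the ASCII form appears, so any code
that already accepts digits and hyphens in labels sees nothing unusual.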

> I'm concerned that it might have a big impact on SpamAssassin's parsing
> of headers and URLs.

It is quite possible that some overly strict regexps are still lying
around. I know I fixed some in a DKIM plugin.
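
To illustrate the kind of pattern meant here, a small Python sketch. Both
regexps are hypothetical and are not taken from SpamAssassin or the DKIM
plugin; they only show how a letters-only TLD check rejects a punycode TLD
such as xn--p1ai, and one possible way to relax it.

```python
import re

# Hypothetical patterns for illustration only; not SpamAssassin code.

# Too strict: assumes a TLD is 2 to 6 ASCII letters, so any "xn--" TLD fails.
STRICT_TLD = re.compile(r"\.[a-z]{2,6}$", re.IGNORECASE)

# Relaxed: also accepts an "xn--" label made of letters, digits and hyphens
# (a label is at most 63 characters, 4 of which are the "xn--" prefix).
RELAXED_TLD = re.compile(
    r"\.(?:[a-z]{2,63}|xn--[a-z0-9-]{1,59})$", re.IGNORECASE
)


def has_plausible_tld(host: str, pattern: re.Pattern) -> bool:
    """Return True if the host name ends in a TLD accepted by the pattern."""
    return pattern.search(host) is not None


strict_result = has_plausible_tld("example.xn--p1ai", STRICT_TLD)
relaxed_result = has_plausible_tld("example.xn--p1ai", RELAXED_TLD)
```

The relaxed pattern is still only a syntactic check; it does not verify
that the TLD actually exists.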

> However, what does this mean for detecting URLs in plain text messages
> in which a URL string can be in a non-ASCII charset and MUAs might
> (eventually) parse them as URLs?

Slippery road ahead...
Can't hurt to open a PR as a placeholder for concerns and ideas.

  Mark
