http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5780
------- Additional Comments From [EMAIL PROTECTED] 2008-01-19 15:58 -------
I'm using this comment as a convenient place to document something that I've
found: I'm concerned that the code in PerMsgStatus.pm that parses out email
addresses is some fancy stuff from Email::Find that, while elegant, is written to
cover the specs in the RFCs rather than what real-world MTAs and MUAs can
actually handle. For example, it recognizes -AE/[EMAIL PROTECTED] as
a mailto URI (although it leaves out the final ']', which was probably not
intended). The criteria used by Thunderbird and Outlook Express for what email
addresses to linkify are already quite liberal, and I think they provide a better
model for what SpamAssassin should recognize. Thunderbird's lexical scan
determines the start of a URI using the following characters:
><"'`,{[(|\ space, and for an email URI (any with a @ in it) any non-ASCII
character
The end of a URI is determined by the following characters:
><"`}]{[(|
and (' or non-ASCII in email URIs
and ) only if a ( was found after the start (not in email URIs)
Also, the following characters are allowed inside a URI but not as the last
character, because in that position they are considered part of the plain text
of a message:
.,;!?-'
This will be clearer when I code up test cases. I still need to determine what
the edge cases for Outlook Express are, but without the source code I have to do
that experimentally.
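
As a rough illustration of the end-of-URI rules listed above, here is a minimal
Python sketch. It is not Thunderbird's code and not what PerMsgStatus.pm does;
the function name uri_end and its parameters are hypothetical, treating
whitespace as a terminator is an assumption, and the conditional ')' rule is
left out for simplicity.

```python
# Characters that always end a URI, taken from the list in this comment.
END_CHARS = set('><"`}]{[(|')
# Characters allowed inside a URI but not as its last character.
TRAILING_PUNCT = ".,;!?-'"


def uri_end(text, start, is_email=False):
    """Return the index one past the last character of the URI at `start`.

    Hypothetical helper: the caller is assumed to have already found
    where the URI starts.
    """
    i = start
    while i < len(text):
        ch = text[i]
        # Assumption: whitespace also terminates a URI.
        if ch in END_CHARS or ch.isspace():
            break
        # Email URIs additionally stop at ' and at any non-ASCII character.
        if is_email and (ch == "'" or ord(ch) > 127):
            break
        i += 1
    # Trailing punctuation is treated as part of the surrounding plain text.
    while i > start and text[i - 1] in TRAILING_PUNCT:
        i -= 1
    return i


if __name__ == "__main__":
    msg = "write to bob@example.com, thanks"
    begin = msg.index("bob")
    print(msg[begin:uri_end(msg, begin, is_email=True)])
```

Real test cases would still need to cover the start-of-URI characters and the
')' handling, which this sketch does not attempt.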