https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8272
Bug ID: 8272
Summary: A HREF with UTF-8 host name invisible to SA
Product: Spamassassin
Version: 4.0.2
Hardware: PC
OS: Mac OS X
Status: NEW
Severity: normal
Priority: P2
Component: spamassassin
Assignee: [email protected]
Reporter: [email protected]
Target Milestone: Undefined
Created attachment 5961
--> https://bz.apache.org/SpamAssassin/attachment.cgi?id=5961&action=edit
Sample email not tagged by SA as having a URIBL match
We are seeing many emails that are not tagged with URIBL (DNSBL) hits even
though the domains they link to are listed.
The text/html part contains an "<A HREF=" element whose URI contains a
Basic Authentication (userinfo) string ending in '@', followed by a host name in
which the base domain is written in non-ASCII UTF-8 letters. These are Unicode
variants of A-Z and a-z, not the ASCII bytes 0x41-0x5a and 0x61-0x7a.
The basic authentication string up to '@' may also contain UTF-8 characters
such as E2 88 95 (U+2215 DIVISION SLASH, a slash look-alike) so that it looks
like the beginning of the host name.
Example:
00000000 68 72 65 66 3d 22 68 74 74 70 73 3a 2f 2f 73 62 |href="https://sb|
00000010 78 70 70 71 6d 67 6b 64 72 73 69 6d 67 6b 74 6d |xppqmgkdrsimgktm|
00000020 66 7a 6d 2e 63 6f 6d e2 88 95 73 62 78 70 70 71 |fzm.com...sbxppq|
00000030 6d 67 6b 64 72 73 69 6d 67 6b 74 6d 66 7a 6d e2 |mgkdrsimgktmfzm.|
00000040 88 95 73 62 78 70 70 71 6d 67 6b 64 72 73 69 6d |..sbxppqmgkdrsim|
00000050 67 6b 74 6d 66 7a 6d e2 88 95 73 62 78 70 70 71 |gktmfzm...sbxppq|
00000060 6d 67 6b 64 72 73 69 6d 67 6b 74 6d 66 7a 6d 40 |mgkdrsimgktmfzm@|
00000070 73 62 78 70 70 71 6d 67 6b 64 72 73 69 6d 67 6b |sbxppqmgkdrsimgk|
00000080 74 6d 66 7a 6d 2e f0 9d 95 9c f0 9d 95 96 f0 9d |tmfzm...........|
00000090 95 9b f0 9d 95 9a f0 9d 95 92 f0 9d 95 95 f0 9d |................|
000000a0 95 9e f0 9d 95 9a f0 9d 95 9f 2e 63 6e 2f 63 61 |...........cn/ca|
000000b0 6f 6e 69 6d 61 3d 73 62 78 70 70 71 6d 67 6b 64 |onima=sbxppqmgkd|
000000c0 72 73 69 6d 67 6b 74 6d 66 7a 6d 2e 63 6f 2e 6a |rsimgktmfzm.co.j|
000000d0 70 2f 22 0a |p/".|
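To illustrate how the host in the dump above can be folded back to ASCII, here is a minimal Python sketch. It is not SpamAssassin code; the function name normalized_host is hypothetical, and the sketch assumes that Unicode NFKC normalization is a close enough approximation of the mapping a browser applies to host names for these particular characters. The authority is split off before normalizing, so that a character which normalizes to a delimiter cannot move the host boundary.

```python
import unicodedata

def normalized_host(uri: str) -> str:
    """Return the host of `uri`, folded to lowercase after NFKC normalization.

    Hypothetical helper for illustration only; not part of SpamAssassin.
    """
    # Drop the scheme, if any.
    rest = uri.split("://", 1)[1] if "://" in uri else uri
    # The authority ends at the first ASCII '/', '?' or '#'.
    for sep in "/?#":
        rest = rest.split(sep, 1)[0]
    # Everything up to the last '@' is userinfo, not the host.
    host = rest.rpartition("@")[2]
    # Drop an optional port.
    host = host.partition(":")[0]
    # NFKC maps mathematical alphanumeric letters to their plain ASCII forms.
    return unicodedata.normalize("NFKC", host).lower()

# The URI from the hex dump, written with escapes for the non-ASCII characters.
SAMPLE = (
    "https://sbxppqmgkdrsimgktmfzm.com\u2215sbxppqmgkdrsimgktmfzm"
    "\u2215sbxppqmgkdrsimgktmfzm\u2215sbxppqmgkdrsimgktmfzm"
    "@sbxppqmgkdrsimgktmfzm."
    "\U0001D55C\U0001D556\U0001D55B\U0001D55A\U0001D552"
    "\U0001D555\U0001D55E\U0001D55A\U0001D55F"
    ".cn/caonima=sbxppqmgkdrsimgktmfzm.co.jp/"
)

print(normalized_host(SAMPLE))  # sbxppqmgkdrsimgktmfzm.kejiadmin.cn
```

A host normalized this way could then be passed to the usual domain lookup, which is presumably the step that currently never sees it.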
The listed domain name in this example is kejiadmin[.]cn, written as
"𝕜𝕖𝕛𝕚𝕒𝕕𝕞𝕚𝕟[.]cn", and this is the actual payload opened by a browser.