https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8272

            Bug ID: 8272
           Summary: A HREF with UTF-8 host name invisible to SA
           Product: Spamassassin
           Version: 4.0.2
          Hardware: PC
                OS: Mac OS X
            Status: NEW
          Severity: normal
          Priority: P2
         Component: spamassassin
          Assignee: dev@spamassassin.apache.org
          Reporter: joew...@surbl.org
  Target Milestone: Undefined

Created attachment 5961
  --> https://bz.apache.org/SpamAssassin/attachment.cgi?id=5961&action=edit
Sample email not tagged by SA as having a URIBL match

We are seeing many emails that are not tagged with DNSBL (URIBL) hits even
though the domains they link to are listed.

The text/html part contains an "<A HREF=" element whose URI contains a
Basic Authentication (userinfo) string ending in '@', followed by a host name
in which the base domain is written in non-ASCII UTF-8 letters. These are
Unicode variants of A-Z and a-z that lie outside the ASCII ranges 0x41-0x5a
and 0x61-0x7a.

The basic authentication string up to '@' may also contain UTF-8 characters
such as E2 88 95 (U+2215 DIVISION SLASH, a type of slash) so that it looks
like the beginning of the host name.
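
The userinfo trick described above can be illustrated with a minimal Python
sketch. It is not SpamAssassin code: the function name split_authority and the
hosts decoy.example and real-target.example are hypothetical, and the parsing
is a simplified reading of the RFC 3986 authority rules (ports and IPv6
literals are ignored). It only shows that everything before the last '@' is
userinfo, even when a slash look-alike makes it resemble a host and path.

```python
def split_authority(uri: str) -> tuple[str, str]:
    """Return (userinfo, host) for an absolute URI (simplified RFC 3986)."""
    _scheme, _sep, rest = uri.partition("://")
    # The authority ends at the first ASCII '/', '?' or '#'.
    # U+2215 DIVISION SLASH is not one of these delimiters.
    end = len(rest)
    for delim in "/?#":
        pos = rest.find(delim)
        if pos != -1:
            end = min(end, pos)
    authority = rest[:end]
    # Everything before the last '@' is userinfo; the rest is the host.
    userinfo, _at, host = authority.rpartition("@")
    return userinfo, host


# Hypothetical URI: the userinfo imitates "decoy.example/login".
uri = "https://decoy.example\u2215login@real-target.example/path"
userinfo, host = split_authority(uri)
print(repr(userinfo))  # the decoy text, including the U+2215 look-alike
print(repr(host))      # the host a client would actually contact
```

A scanner that stops reading the host at the look-alike slash, or that only
inspects the text before '@', would never see the real host.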

Example:

00000000  68 72 65 66 3d 22 68 74  74 70 73 3a 2f 2f 73 62  |href="https://sb|
00000010  78 70 70 71 6d 67 6b 64  72 73 69 6d 67 6b 74 6d  |xppqmgkdrsimgktm|
00000020  66 7a 6d 2e 63 6f 6d e2  88 95 73 62 78 70 70 71  |fzm.com...sbxppq|
00000030  6d 67 6b 64 72 73 69 6d  67 6b 74 6d 66 7a 6d e2  |mgkdrsimgktmfzm.|
00000040  88 95 73 62 78 70 70 71  6d 67 6b 64 72 73 69 6d  |..sbxppqmgkdrsim|
00000050  67 6b 74 6d 66 7a 6d e2  88 95 73 62 78 70 70 71  |gktmfzm...sbxppq|
00000060  6d 67 6b 64 72 73 69 6d  67 6b 74 6d 66 7a 6d 40  |mgkdrsimgktmfzm@|
00000070  73 62 78 70 70 71 6d 67  6b 64 72 73 69 6d 67 6b  |sbxppqmgkdrsimgk|
00000080  74 6d 66 7a 6d 2e f0 9d  95 9c f0 9d 95 96 f0 9d  |tmfzm...........|
00000090  95 9b f0 9d 95 9a f0 9d  95 92 f0 9d 95 95 f0 9d  |................|
000000a0  95 9e f0 9d 95 9a f0 9d  95 9f 2e 63 6e 2f 63 61  |...........cn/ca|
000000b0  6f 6e 69 6d 61 3d 73 62  78 70 70 71 6d 67 6b 64  |onima=sbxppqmgkd|
000000c0  72 73 69 6d 67 6b 74 6d  66 7a 6d 2e 63 6f 2e 6a  |rsimgktmfzm.co.j|
000000d0  70 2f 22 0a                                       |p/".|

The listed domain name in this example is kejiadmin[.]cn, written as
"𝕜𝕖𝕛𝕚𝕒𝕕𝕞𝕚𝕟[.]cn", and this is the actual payload opened by a browser.
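
The mapping from the obfuscated host to the listed domain can be reproduced
with a short Python sketch. The bytes are copied from the hex dump above.
Using Unicode NFKC normalization is an assumption made for illustration: it
is one way to fold these letters to ASCII, not a statement of how browsers
or SpamAssassin implement host handling.

```python
import unicodedata

# Base-domain bytes from the hex dump: nine 4-byte UTF-8 sequences + ".cn".
raw_host = bytes.fromhex(
    "f09d959c f09d9596 f09d959b f09d959a f09d9592 "
    "f09d9595 f09d959e f09d959a f09d959f "
    "2e 63 6e"
)

host = raw_host.decode("utf-8")
# The letters are MATHEMATICAL DOUBLE-STRUCK SMALL letters (U+1D552 onward),
# which carry a compatibility mapping to plain ASCII letters.
folded = unicodedata.normalize("NFKC", host)

print(host.isascii())    # False: the host as written in the email
print(folded)            # kejiadmin.cn
print(folded.isascii())  # True: suitable for a blocklist lookup
```

Folding the host this way before extracting the domain yields the ASCII name
that is actually listed.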

