On Thu, 25 Feb 2021, Rick Cooper wrote:

I was just working on some rules to catch the current crop of malformed
URLs used to evade detection by solutions that extract URLs from emails and
compare them to known bad URLs, and I am wondering whether SpamAssassin's
patterns for extraction take this into account.

For instance:

https:www.google.com/mail
https:\/www.google.com/mail
https:\\www.google.com/mail

These will all get you to Gmail, because URL parsing in browsers doesn't
actually require // after the colon.
Will SpamAssassin still extract and normalize the URLs above? I was hoping
to avoid digging through the source to find out.

Yes, all of those do get detected and normalized. For example, with these
in a test message:

http:fnord01.com/blah
http:\/fnord02.com/blah
http:/\fnord03.com/blah
http:\\fnord04.com/blah

Feb 25 13:24:03.445 [13854] dbg: rules: ran uri rule __ALL_URI ======> got hit: "http://fnord03.com/blah"
Feb 25 13:24:03.446 [13854] dbg: rules: ran uri rule __ALL_URI ======> got hit: "http://fnord02.com/blah"
Feb 25 13:24:03.447 [13854] dbg: rules: ran uri rule __ALL_URI ======> got hit: "http://fnord01.com/blah"
Feb 25 13:24:03.447 [13854] dbg: rules: ran uri rule __ALL_URI ======> got hit: "http://fnord04.com/blah"
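
For anyone who wants to reproduce the idea outside SpamAssassin, here is a
minimal sketch of that kind of normalization in Python. It is not
SpamAssassin's actual code (which is Perl and handles far more cases); the
name normalize_url and the regular expression are assumptions made up for
this illustration. It only rewrites whatever mix of "/" and "\" follows
"http:" or "https:" into "//".

```python
import re

# Hypothetical helper, NOT SpamAssassin's implementation: match "http:" or
# "https:" followed by any run (possibly empty) of "/" and "\" characters.
_SCHEME_SEP = re.compile(r'^(https?):[/\\]*', re.IGNORECASE)

def normalize_url(url):
    """Return url with a canonical lowercase 'scheme://' prefix."""
    return _SCHEME_SEP.sub(lambda m: m.group(1).lower() + '://', url, count=1)

samples = [
    r'http:fnord01.com/blah',
    r'http:\/fnord02.com/blah',
    r'http:/\fnord03.com/blah',
    r'http:\\fnord04.com/blah',
]
for s in samples:
    print(normalize_url(s))
```

Each of the four samples comes out in the same http://host/path form that the
debug lines above show. A URL that is already well formed passes through
unchanged.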


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org                         pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  Are you a mildly tech-literate politico horrified by the level of
  ignorance demonstrated by lawmakers gearing up to regulate online
  technology they don't even begin to grasp? Cool. Now you have a
  tiny glimpse into a day in the life of a gun owner.   -- Sean Davis
-----------------------------------------------------------------------
 271 days since the first private commercial manned orbital mission (SpaceX)
