Re: Multiple regex on same URL

2020-07-07 Thread John Hardin

On Tue, 7 Jul 2020, Martin Gregorie wrote:

> On Tue, 2020-07-07 at 22:07 +, Pedro David Marco wrote:
>> Thanks Martin, but the meta may be positive if one URL triggers
>> SUBRULE1 and another different URL triggers SUBRULE2...
>> how can you be sure both SUBRULES are positive in the "same" URL?
>
> I didn't spot the requirement that the URIs must match: I read your
> requirement as being that two matches from a group of URLs within a
> defined set or with the same second level domain would do. My mistake.
>
> Might it be easier to define and implement with a decent RDBMS and a
> clever SQL query?

Ugh, no.

The (?=...)(?!...) construct is a good way, but if you use * or + you need to
be careful to avoid the possibility of a backtracking DoS - use the
"non-greedy" versions (*? or +?). However, that weakness is smaller as we're
looking at URIs rather than the entire message body - there's less to
potentially backtrack over.

I suggest the positive match first, then the negative match, as the
positive match will probably occur in only a small percentage of URIs
scanned and will thus generally fail and short-circuit the evaluation of
the (much more likely to hit) negative lookahead match.
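
As a rough illustration of the ordering point, the sketch below checks that swapping the two lookaheads does not change which URIs hit. It uses Python's re module, which accepts the same lookahead syntax for this case. The keywords and URIs are placeholders made up for the example, and it says nothing about how much time either ordering takes.

```python
import re

# Placeholder keywords; both patterns are anchored so each lookahead
# starts scanning from the beginning of the URI.
positive_first = re.compile(r"^(?=.*?findthis)(?!.*?donotfind)")
negative_first = re.compile(r"^(?!.*?donotfind)(?=.*?findthis)")

# Hypothetical URIs, made up for this illustration.
uris = [
    "http://example.com/findthis",            # wanted text only
    "http://example.com/donotfind/findthis",  # both present
    "http://example.com/donotfind",           # unwanted text only
    "http://example.com/",                    # neither present
]

results_positive_first = [bool(positive_first.search(u)) for u in uris]
results_negative_first = [bool(negative_first.search(u)) for u in uris]

print(results_positive_first)
print(results_positive_first == results_negative_first)
```

Only the first URI hits under either ordering, so the choice of order is about the work done per URI, not about the result.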


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174    pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  We have to realize that people who run the government can and do
  change. Our society and laws must assume that bad people -
  criminals even - will run the government, at least part of the
  time.   -- John Gilmore
---
 Today: Robert Heinlein's 113th birthday


Re: Multiple regex on same URL

2020-07-07 Thread John Hardin

On Tue, 7 Jul 2020, Martin Gregorie wrote:

> On Tue, 2020-07-07 at 20:39 +, Pedro David Marco wrote:
>> On Tuesday, July 7, 2020, 03:16:34 PM GMT+2, Henrik K <h...@hege.li> wrote:
>>> Also newer SpamAssassin already has URIDetail plugin which can also
>>> do what you want:
>>>   uri_detail SYMBOLIC_TEST_NAME key1 =~ /value1/  key2 !~ /value2/ ...
>>
>> if it uses the same key more than once, then uri_detail joins them
>> with "OR", but we need an "AND"
>> -Pedro
>
> That should be easy enough to do with a metarule:
>
> uri   __SUBRULE1 /(URL alternateslist1)/
> uri   __SUBRULE2 /(URL alternateslist2)/
> meta  MYMETARULE (__SUBRULE1 && __SUBRULE2)
> score MYMETARULE 6.0


Unfortunately there's no way to enforce them being checked together on the
*same* URI: uri1 could hit SUBRULE1 and uri2 could hit SUBRULE2, and the meta
would fire even though no single URI matched both.


The (?=...)(?!...) construct is better.
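
A small sketch of the difference, in Python rather than SpamAssassin rules: the meta-style check lets each subrule be satisfied by a different URI, while a single pattern built from lookaheads has to be satisfied by one URI. The patterns "alpha" and "beta" and the two URIs are made-up placeholders, and the any() calls only model how a meta combines subrule hits. They are not SpamAssassin code.

```python
import re

# Placeholder patterns standing in for the two subrules.
subrule1 = re.compile(r"alpha")
subrule2 = re.compile(r"beta")
# One pattern that requires both on the same URI, via two lookaheads.
combined = re.compile(r"^(?=.*?alpha)(?=.*?beta)")

# Hypothetical message with two URIs, each matching only one subrule.
uris = ["http://alpha.example/", "http://beta.example/"]

# Meta-style check: each subrule may be satisfied by a different URI.
meta_fires = any(subrule1.search(u) for u in uris) and any(
    subrule2.search(u) for u in uris
)
# Per-URI check: both conditions must hold for one and the same URI.
same_uri_fires = any(combined.search(u) for u in uris)

print(meta_fires, same_uri_fires)  # True False
```

The same reasoning applies when the second lookahead is negative, as in (?=...)(?!...).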




Re: Multiple regex on same URL

2020-07-07 Thread Pedro David Marco
 

> On Wednesday, July 8, 2020, 12:28:37 AM GMT+2, Martin Gregorie wrote:
>
> I didn't spot the requirement that the URIs must match: I read your
> requirement as being that two matches from a group of URLs within a
> defined set or with the same second level domain would do. My mistake.

Probably my fault, Martin.. my "English" leaves much to be desired...

> Might it be easier to define and implement with a decent RDBMS and a
> clever SQL query?

The simplest way has been to patch the uri_detail plugin so it can combine
multiple equal keys with OR or AND on demand... :-)

Pedro

  

Re: Multiple regex on same URL

2020-07-07 Thread Martin Gregorie
On Tue, 2020-07-07 at 22:07 +, Pedro David Marco wrote:
> Thanks Martin, but the meta may be positive if one URL triggers
> SUBRULE1 and another different URL triggers SUBRULE2...
> how can you be sure both SUBRULES are positive in the "same" URL?
>
I didn't spot the requirement that the URIs must match: I read your
requirement as being that two matches from a group of URLs within a
defined set or with the same second level domain would do. My mistake.

Might it be easier to define and implement with a decent RDBMS and a
clever SQL query? 

Martin




Re: Multiple regex on same URL

2020-07-07 Thread Pedro David Marco
 

> On Tuesday, July 7, 2020, 11:56:22 PM GMT+2, Martin Gregorie wrote:
>
> That should be easy enough to do with a metarule:
>
> uri   __SUBRULE1 /(URL alternateslist1)/
> uri   __SUBRULE2 /(URL alternateslist2)/
> meta  MYMETARULE (__SUBRULE1 && __SUBRULE2)
> score MYMETARULE 6.0
>
> ...or something like that
>
> Martin

Thanks Martin, but the meta may be positive if one URL triggers SUBRULE1 and
another different URL triggers SUBRULE2...
how can you be sure both SUBRULES are positive in the "same" URL?
-Pedro






  

Re: Multiple regex on same URL

2020-07-07 Thread Martin Gregorie
On Tue, 2020-07-07 at 20:39 +, Pedro David Marco wrote:
> On Tuesday, July 7, 2020, 03:16:34 PM GMT+2, Henrik K <h...@hege.li> wrote:
>
> > Also newer SpamAssassin already has URIDetail plugin which can also
> > do what you want:
> >   uri_detail SYMBOLIC_TEST_NAME key1 =~ /value1/  key2 !~ /value2/
> > ...
> if it uses the same key more than once, then uri_detail joins them
> with "OR", but we need an "AND" 
> -Pedro
> 
That should be easy enough to do with a metarule:

uri   __SUBRULE1 /(URL alternateslist1)/
uri   __SUBRULE2 /(URL alternateslist2)/
meta  MYMETARULE (__SUBRULE1 && __SUBRULE2)
score MYMETARULE 6.0

...or something like that

Martin




Re: Multiple regex on same URL

2020-07-07 Thread Pedro David Marco
 

> On Tuesday, July 7, 2020, 03:16:34 PM GMT+2, Henrik K wrote:
>
> Also newer SpamAssassin already has URIDetail plugin which can also do what
> you want:
>
>   uri_detail SYMBOLIC_TEST_NAME key1 =~ /value1/  key2 !~ /value2/ ...

if it uses the same key more than once, then uri_detail joins them with "OR",
but we need an "AND"
-Pedro


  

Re: Multiple regex on same URL

2020-07-07 Thread @lbutlr
On 07 Jul 2020, at 07:16, Henrik K  wrote:
> On Tue, Jul 07, 2020 at 11:41:01AM +, Pedro David Marco wrote:
>> 
>>> On Tuesday, July 7, 2020, 01:05:36 PM GMT+2, Henrik K  wrote:
>> 
>> 
>>> What exactly do you mean by checking multiple regex on the "same" URL?  Give
>>> an example.  Most likely it's already possible without any changes.
>> 
>> 
>> for example..  checking if a URL matches Regex1 BUT does NOT match
>> Regex2 can be done with lookahead/lookbehind but is cpu-expensive and
>> may be too complex to maintain...
> 
> Why would lookahead be expensive?  It's normal regex.  It's probably more
> expensive to run two separate regexes.

Is the ReDoS attack relevant here?


"The Regular expression Denial of Service (ReDoS) is a Denial of Service 
attack, that exploits the fact that most Regular Expression implementations may 
reach extreme situations that cause them to work very slowly (exponentially 
related to input size). An attacker can then cause a program using a Regular 
Expression to enter these extreme situations and then hang for a very long 
time."



-- 
Once upon a time, a woman was picking up firewood. She came upon a
poisonous snake frozen in the snow. She took the snake home and
nurse it back to health. One day the snake bit her on the cheek.
As she lay dying, she asked the snake, "Why have you done this to
me?" And the snake answered, "Look, bitch, you knew I was a
snake."



Re: Best Possible Way To Block Phish/Malware URL

2020-07-07 Thread Raymond Dijkxhoorn

Hai!


That isn't only Phishtank data...


+1



and using that data in that particular way hardly scales to bigger setups


data could be stored in DB_File just like GeoIP2, that saves ram imho


Transferring the complete set over and over might not be the best way of
doing the distribution of datasets like that...


I agree with Alex, sets like that should be rbldnsd based to make it
scalable imho.



FTR: GoogleSafeBrowsing is not free for all, anymore



that explains low hitratio ? :=)


 :-)

Bye, Raymond


Re: Multiple regex on same URL

2020-07-07 Thread Henrik K
On Tue, Jul 07, 2020 at 11:41:01AM +, Pedro David Marco wrote:
> 
> >On Tuesday, July 7, 2020, 01:05:36 PM GMT+2, Henrik K  wrote:
> 
> 
> >What exactly do you mean by checking multiple regex on the "same" URL?  Give
> >an example.  Most likely it's already possible without any changes.
>
>
> for example..  checking if a URL matches Regex1 BUT does NOT match Regex2
> can be done with lookahead/lookbehind but is cpu-expensive and may be too
> complex to maintain...

Why would lookahead be expensive?  It's normal regex.  It's probably more
expensive to run two separate regexes.

uri FOO /^(?!.*?donotfind)(?=.*?findthis)/
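
To see what that pattern accepts, here is a minimal sketch in Python, whose re module takes the same lookahead syntax for this case. The three URIs are hypothetical, and "findthis" / "donotfind" are the placeholders from the rule above.

```python
import re

# Same pattern as in the uri rule above; the keywords are placeholders.
pattern = re.compile(r"^(?!.*?donotfind)(?=.*?findthis)")

# Hypothetical URIs, made up for this illustration.
uris = [
    "http://example.com/findthis",            # wanted text only
    "http://example.com/findthis/donotfind",  # both present
    "http://example.com/other",               # neither present
]

matches = [u for u in uris if pattern.search(u)]
print(matches)
```

Only the first URI is kept: the second fails the negative lookahead and the third fails the positive one.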

Also newer SpamAssassin already has URIDetail plugin which can also do what
you want:

  uri_detail SYMBOLIC_TEST_NAME key1 =~ /value1/  key2 !~ /value2/ ...



Re: Best Possible Way To Block Phish/Malware URL

2020-07-07 Thread Axb

On 7/7/20 2:57 PM, Benny Pedersen wrote:

Axb skrev den 2020-07-07 14:46:


That isn't only Phishtank data...


+1


and using that data in that particular way hardly scales to bigger setups


data could be stored in DB_File just like GeoIP2, that saves ram imho


rbldnsd is the way to go:
- you can control TTL
- it scales to millions of minions
- it's cheap in terms of RAM and cycles
- low maintenance
- does not add load to clients.


Re: Best Possible Way To Block Phish/Malware URL

2020-07-07 Thread Benny Pedersen

Axb skrev den 2020-07-07 14:46:


That isn't only Phishtank data...


+1

and using that data in that particular way hardly scales to bigger 
setups


data could be stored in DB_File just like GeoIP2, that saves ram imho


FTR: GoogleSafeBrowsing is not free for all, anymore


that explains low hitratio ? :=)


Re: Best Possible Way To Block Phish/Malware URL

2020-07-07 Thread Raymond Dijkxhoorn

Hai!


I Tried GoogleSafeBrowsing but not helping much as it has very low
detection ratio.



is another reporting problem

whatever that may mean


if all phishes are reported to google then safebrowsing would be more
useful



FTR: GoogleSafeBrowsing is not free for all, anymore


If I recall correctly, the ClamAV support for that was also stopped months
ago, due to exactly that.


bye, Raymond


Re: Best Possible Way To Block Phish/Malware URL

2020-07-07 Thread Axb

On 7/7/20 2:39 PM, Benny Pedersen wrote:

Axb skrev den 2020-07-07 13:23:


domains listed in Phishtank are picked up by SURBL


and rbldnsd support a fix of this 
https://www.isc.org/blogs/qname-minimization-and-privacy/


i have disabled it in bind9


Phishtank signatures in SpamAssassin?


https://spamassassin.apache.org/full/3.4.x/doc/Mail_SpamAssassin_Plugin_Phishing.txt 




you probably mean ClamAV


no


That isn't only Phishtank data...
and using that data in that particular way hardly scales to bigger setups




I Tried GoogleSafeBrowsing but not helping much as it has very low
detection ratio.

is another reporting problem

whatever that may mean


if all phishes are reported to google then safebrowsing would be more
useful


FTR: GoogleSafeBrowsing is not free for all, anymore


Re: Best Possible Way To Block Phish/Malware URL

2020-07-07 Thread Benny Pedersen

Axb skrev den 2020-07-07 13:23:


domains listed in Phishtank are picked up by SURBL


and rbldnsd support a fix of this 
https://www.isc.org/blogs/qname-minimization-and-privacy/


i have disabled it in bind9


Phishtank signatures in SpamAssassin?


https://spamassassin.apache.org/full/3.4.x/doc/Mail_SpamAssassin_Plugin_Phishing.txt


you probably mean ClamAV


no


I Tried GoogleSafeBrowsing but not helping much as it has very low
detection ratio.

is another reporting problem

whatever that may mean


if all phishes are reported to google then safebrowsing would be more
useful


Re: Multiple regex on same URL

2020-07-07 Thread Pedro David Marco
 

   >On Tuesday, July 7, 2020, 01:05:36 PM GMT+2, Henrik K  wrote: 
 
 
>What exactly do you mean by checking multiple regex on the "same" URL?  Give
>an example.  Most likely it's already possible without any changes.

for example..  checking if a URL matches Regex1 BUT does NOT match Regex2
can be done with lookahead/lookbehind but is cpu-expensive and may be too
complex to maintain...

Pedro 


  

Re: Best Possible Way To Block Phish/Malware URL

2020-07-07 Thread Axb

On 7/7/20 1:20 PM, Benny Pedersen wrote:

KADAM, SIDDHESH skrev den 2020-07-07 13:13:


Can anybody suggest me a best possible way to block phish/malware url
from body of an email using spamassassin.


report to https://phishtank.com/ 1 step :=)

next is to use https://sanesecurity.com/ with phishtank signatures

using phishtank signatures in spamassassin needs more ram


domains listed in Phishtank are picked up by SURBL

Phishtank signatures in SpamAssassin?  you probably mean ClamAV


I Tried GoogleSafeBrowsing but not helping much as it has very low
detection ratio.


is another reporting problem 

whatever that may mean




Re: Best Possible Way To Block Phish/Malware URL

2020-07-07 Thread Benny Pedersen

KADAM, SIDDHESH skrev den 2020-07-07 13:13:


Can anybody suggest me a best possible way to block phish/malware url
from body of an email using spamassassin.


report to https://phishtank.com/ 1 step :=)

next is to use https://sanesecurity.com/ with phishtank signatures

using phishtank signatures in spamassassin needs more ram


I Tried GoogleSafeBrowsing but not helping much as it has very low
detection ratio.


is another reporting problem


Re: Best Possible Way To Block Phish/Malware URL

2020-07-07 Thread Axb

On 7/7/20 1:13 PM, KADAM, SIDDHESH wrote:

Guys,

Can anybody suggest me a best possible way to block phish/malware url from body
of an email using spamassassin.

I Tried GoogleSafeBrowsing but not helping much as it has very low detection 
ratio.

Regards,
Siddhesh


iirc  "ramprasad at NETCORE.CO.IN"  should be able to help you.



Best Possible Way To Block Phish/Malware URL

2020-07-07 Thread KADAM, SIDDHESH

  
  
Guys,

Can anybody suggest the best possible way to block phish/malware URLs in
the body of an email using SpamAssassin?

I tried GoogleSafeBrowsing, but it is not helping much as it has a very low
detection ratio.
  
Regards,
Siddhesh
  


  



Re: Multiple regex on same URL

2020-07-07 Thread Henrik K
On Tue, Jul 07, 2020 at 10:18:30AM +, Pedro David Marco wrote:
> I have written a small simple patch (tested in SA 3.4.2 so far, sorry) to be
> able to check up to three regex expressions on the "same" URL. It seems to 
> work
> well
> but... any crazy (with all respects) volunteer for checks.. tests... etc?
> 
> Disclaimer: I am not a super Perl developer, so the code may be ugly for perl
> monks :-(  sorry..

What exactly do you mean by checking multiple regex on the "same" URL?  Give
an example.  Most likely it's already possible without any changes.



Re: Freshdesk (again)

2020-07-07 Thread Raymond Dijkxhoorn

Ha!


>> We report abuse to many organisations, including, but not limited to
>> companies like sendgrid.
>
> We are so tired of reporting abuse with no answer at all, that we
> stopped reporting problems some time ago :-( as Marc Roos has said...
> we are not paid for it !

Understand completely.

> Ironically... we have run into problems a couple of times for reporting
> abuses... probably someone considering you are "suggesting" they are
> not doing their job...

I know at least sendgrid is very much aware of what's going on.

> If Sendgrid reacts to the reports, bravo for them!

And again I can understand the sentiment. ... :-)

Bye, Raymond

Re: Freshdesk (again)

2020-07-07 Thread Pedro David Marco
 
> On Tuesday, July 7, 2020, 11:24:10 AM GMT+2, Raymond Dijkxhoorn wrote:
>
> Hello Marc,
> I hear you. And don't worry about that ;) I rather have a clean inbox and so
> do more people.
>
> We report abuse to many organisations, including, but not limited to
> companies like sendgrid.
>
> Raymond Dijkxhoorn - SURBL

We are so tired of reporting abuse with no answer at all that we stopped
reporting problems some time ago :-( As Marc Roos has said... we are not paid
for it!

Ironically... we have run into problems a couple of times for reporting
abuses... probably someone considering you are "suggesting" they are not doing
their job...

If Sendgrid reacts to the reports, bravo for them!


Pedro




  

Re: Multiple regex on same URL

2020-07-07 Thread Matus UHLAR - fantomas

On 07.07.20 10:18, Pedro David Marco wrote:

I have written a small simple patch (tested in SA 3.4.2 so far, sorry) to
be able to check up to three regex expressions on the "same" URL.  It
seems to work well but...  any crazy (with all respects) volunteer for
checks..  tests...  etc?



Disclaimer: I am not a super Perl developer, so the code may be ugly for perl 
monks :-(  sorry..
Regards,
---Pedro.


try posting the patch or a link to it. 


--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
"To Boot or not to Boot, that's the question." [WD1270 Caviar]


Multiple regex on same URL

2020-07-07 Thread Pedro David Marco
I have written a small simple patch (tested in SA 3.4.2 so far, sorry) to be
able to check up to three regex expressions on the "same" URL. It seems to work
well but... any crazy (with all respects) volunteer for checks.. tests... etc?

Disclaimer: I am not a super Perl developer, so the code may be ugly for perl
monks :-(  sorry..
Regards,
---Pedro.






RE: Freshdesk (again)

2020-07-07 Thread Raymond Dijkxhoorn

Hello Marc,


They definitely do. I report to them and they do take them down
pretty quickly.



Make sure you get paid for doing this every time. Because you are doing
the work that they should be doing.


I hear you. And don't worry about that ;)
I rather have a clean inbox and so do more people.

We report abuse to many organisations, including, but not limited to
companies like sendgrid.


Raymond Dijkxhoorn - SURBL


Wildcarded lookups on SURBL

2020-07-07 Thread Raymond Dijkxhoorn

Hi!

For a long time now (and I know there has been discussion about it in the
past) both SURBL and DBL have offered their data as a wildcarded list.


Yet SpamAssassin still strips the lookups down to the base level and does
the lookups for that, unless the domain is inside the 2tld and 3tld additions.


The community would really benefit a lot more if there were a version
that looked up with the wildcards.


Spammers move to services they can easily abuse.

A good example right now is them using page[.]link

SURBL is listing many abused subdomains there, but SA doesn't make use of
that, since page[.]link is not in the 2/3tld additions.
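
To make the difference concrete, here is a rough Python model of the behaviour described above. It is not SpamAssassin's actual code: the function lookup_name, its two_tld and three_tld parameters, and the "abused" label are made up for illustration, and real public-suffix handling (co.uk and the like) is ignored.

```python
def lookup_name(host, two_tld=frozenset(), three_tld=frozenset()):
    """Return the name that would be queried against the blocklist.

    Simplified sketch: keep the last two labels, unless the host falls
    under an entry in the 2tld or 3tld sets, in which case one more
    label is kept so the subdomain itself gets looked up.
    """
    labels = host.lower().rstrip(".").split(".")
    keep = 2
    if ".".join(labels[-2:]) in two_tld:
        keep = 3
    if ".".join(labels[-3:]) in three_tld:
        keep = 4
    return ".".join(labels[-keep:])


# "abused" is a made-up subdomain label.
print(lookup_name("abused.page.link"))
print(lookup_name("abused.page.link", two_tld={"page.link"}))
```

Without the 2tld entry the query is made for page.link only, so a listing for the abused subdomain is never consulted. With the entry, the full subdomain is looked up.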


This applies to many, many domains, and the list is simply too big to put it
all inside the 2/3tld files. It also changes daily.


Is there no way of changing the way these lookups are done for SURBL (and
likely DBL as well)? This would really improve the system a lot, I think.


We list new abused subdomains daily and there should be no interaction
needed on that from the users of the data IMHO.


How could we get something like this into action? File a bug?

Thanks! Raymond Dijkxhoorn - SURBL


RE: Freshdesk (again)

2020-07-07 Thread Marc Roos
 


>> They definitely do. I report to them and they do take them down
>> pretty quickly.

Make sure you get paid for doing this every time. Because you are doing 
the work that they should be doing.



Re: Freshdesk (again)

2020-07-07 Thread Raymond Dijkxhoorn

Hai!


it might help to add your complaint via ab...@sendgrid.com.


I very much doubt it. Sendgrid's business is sending mail and they do not 
care if that mail is spam or not. If enough servers block them they will go 
away.


They do, however, apparently care about phishing - they did disable the 
sendgrid redirect that some phisher has been spamming at me for the last 
three weeks.


They definitely do. I report to them and they do take them down pretty
quickly.


Inside SURBL we do list the abused CT links. Unfortunately SA doesn't make
use of the wildcarded list that SURBL has delivered for a long time now.


So if you want to use it add:

util_rb_3tld ct.sendgrid.net

Inside your local.cf

And while you are at it also add:

util_rb_2tld page.link

Bye, Raymond