Re: How do I search and capture text for use in a rule?

2021-05-07 Thread John Hardin

On Fri, 7 May 2021, Steve Dondley wrote:

> On 2021-05-07 10:33 AM, Henrik K wrote:
>> On Fri, May 07, 2021 at 10:19:49AM -0400, Steve Dondley wrote:
>>> I want to extract the first part of an email address from the "Delivered-To"
>>> header and use it within a custom rule.
>>>
>>> Example pseudo code:
>>>
>>> my ($first_part) = $email_file =~ /^Delivered-To: (.*)/;
>>>
>>> body __LOCAL_AWKWARD_INTRO /hi $first_part/i
>>>
>>> How can I do this in my .cf file?
>>
>> With a silly kludge, a full rule that matches the complete raw email with a
>> single regex.  Example in stock rules:
>>
>> full __FROM_NAME_IN_MSG /^From:\s+([^<]\S+\s\S+)\s(?=.{1,2048}^\1\r?$)/sm
>>
>> So something like (untested)
>>
>> full __LOCAL_AWKWARD_INTRO /^Delivered-To:\s+<([^@>]+)(?=.{1,2048}\bHi\s+\1\b)/sm
>
> Thanks. I don't quite understand the {1,2048} bit. That looks like a
> lookahead assertion up to 2048 characters? What is magical about 2048?

A limit there is to prevent runaway matching and excessive scan times.

> What if the "Delivered-To" header is more than 2048 characters away from
> the salutation, which doesn't seem unlikely?

That is indeed a shortcoming with this approach. As Henrik says, it's a
kludge.
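
The effect of the 2048-character window can be reproduced outside
SpamAssassin. What follows is only a minimal sketch in Python, assuming
Python's re module treats this pattern the same way Perl does (re.S and
re.M stand in for the /s and /m modifiers). The sample message, the
X-Filler header and the helper name message_with_padding are made up for
illustration.

```python
import re

# Python rendering of the untested rule quoted above (assumption: same
# semantics as the Perl regex with /sm).
RULE = re.compile(
    r"^Delivered-To:\s+<([^@>]+)(?=.{1,2048}\bHi\s+\1\b)", re.S | re.M
)

def message_with_padding(padding_chars: int) -> str:
    """Build a made-up raw message with filler between header and body."""
    filler = "X-Filler: " + "x" * padding_chars + "\r\n"
    return "Delivered-To: <steve@example.com>\r\n" + filler + "\r\nHi steve,\r\n"

# Salutation well inside the 2048-character window: the rule can match.
near = RULE.search(message_with_padding(100))

# Salutation more than 2048 characters after the header: no match.
far = RULE.search(message_with_padding(3000))

print("near:", bool(near), "far:", bool(far))
```

Raising the upper bound widens the window at the cost of more scanning
per message, which is the trade-off described above.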


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
 Tomorrow: the 76th anniversary of VE day


Re: How do I search and capture text for use in a rule?

2021-05-07 Thread Steve Dondley

On 2021-05-07 10:33 AM, Henrik K wrote:

> On Fri, May 07, 2021 at 10:19:49AM -0400, Steve Dondley wrote:
>> I want to extract the first part of an email address from the "Delivered-To"
>> header and use it within a custom rule.
>>
>> Example pseudo code:
>>
>> my ($first_part) = $email_file =~ /^Delivered-To: (.*)/;
>>
>> body __LOCAL_AWKWARD_INTRO /hi $first_part/i
>>
>> How can I do this in my .cf file?
>
> With a silly kludge, a full rule that matches the complete raw email with a
> single regex.  Example in stock rules:
>
> full __FROM_NAME_IN_MSG /^From:\s+([^<]\S+\s\S+)\s(?=.{1,2048}^\1\r?$)/sm
>
> So something like (untested)
>
> full __LOCAL_AWKWARD_INTRO /^Delivered-To:\s+<([^@>]+)(?=.{1,2048}\bHi\s+\1\b)/sm

Thanks. I don't quite understand the {1,2048} bit. That looks like a
lookahead assertion up to 2048 characters? What is magical about 2048?
What if the "Delivered-To" header is more than 2048 characters away from
the salutation, which doesn't seem unlikely?


Re: How do I search and capture text for use in a rule?

2021-05-07 Thread John Hardin

On Fri, 7 May 2021, Henrik K wrote:

> On Fri, May 07, 2021 at 10:19:49AM -0400, Steve Dondley wrote:
>> I want to extract the first part of an email address from the "Delivered-To"
>> header and use it within a custom rule.
>>
>> Example pseudo code:
>>
>> my ($first_part) = $email_file =~ /^Delivered-To: (.*)/;
>>
>> body __LOCAL_AWKWARD_INTRO /hi $first_part/i
>>
>> How can I do this in my .cf file?
>
> With a silly kludge, a full rule that matches the complete raw email with a
> single regex.

We're discussing neater ways to do that on the dev list; it's something
that's been desired for a long time.


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
 Tomorrow: the 76th anniversary of VE day


Re: How do I search and capture text for use in a rule?

2021-05-07 Thread Henrik K
On Fri, May 07, 2021 at 10:19:49AM -0400, Steve Dondley wrote:
> I want to extract the first part of an email address from the "Delivered-To"
> header and use it within a custom rule.
> 
> Example pseudo code:
> 
> my ($first_part) = $email_file =~ /^Delivered-To: (.*)/;
> 
> body __LOCAL_AWKWARD_INTRO /hi $first_part/i
> 
> 
> How can I do this in my .cf file?

With a silly kludge, a full rule that matches the complete raw email with a
single regex.  Example in stock rules:

full __FROM_NAME_IN_MSG /^From:\s+([^<]\S+\s\S+)\s(?=.{1,2048}^\1\r?$)/sm

So something like (untested)

full __LOCAL_AWKWARD_INTRO /^Delivered-To:\s+<([^@>]+)(?=.{1,2048}\bHi\s+\1\b)/sm

If the raw message is Base64 encoded or such, it will never match.
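
The capture-and-backreference trick, and the Base64 caveat, can be tried
outside SpamAssassin. This is only a rough sketch in Python, assuming
Python's re module handles the pattern the way Perl does (re.S and re.M
stand in for /s and /m). The addresses, the message text and the helper
name build_message are invented for illustration.

```python
import base64
import re

# Python rendering of the untested rule above: capture the local part of
# Delivered-To, then require "Hi <local part>" within 2048 characters.
AWKWARD_INTRO = re.compile(
    r"^Delivered-To:\s+<([^@>]+)(?=.{1,2048}\bHi\s+\1\b)", re.S | re.M
)

def build_message(body: str) -> str:
    """Build a tiny made-up raw message with a Delivered-To header."""
    return (
        "Delivered-To: <steve@example.com>\r\n"
        "From: Someone <someone@example.net>\r\n"
        "Subject: hello\r\n"
        "\r\n" + body
    )

greeting = "Hi steve, I have an offer for you.\r\n"

# Salutation repeats the local part in plain text: the rule can match.
plain_match = AWKWARD_INTRO.search(build_message(greeting))

# Salutation uses a different name: the backreference cannot match.
other_match = AWKWARD_INTRO.search(build_message("Hi friend, hello.\r\n"))

# Same greeting, Base64 encoded: the raw text no longer contains it.
encoded_body = base64.b64encode(greeting.encode("ascii")).decode("ascii")
encoded_match = AWKWARD_INTRO.search(build_message(encoded_body + "\r\n"))

print(bool(plain_match), bool(other_match), bool(encoded_match))
```

The pattern here is case sensitive, so a lower-case "hi" would not match
unless the rule also carried the /i modifier.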



How do I search and capture text for use in a rule?

2021-05-07 Thread Steve Dondley
I want to extract the first part of an email address from the 
"Delivered-To" header and use it within a custom rule.


Example pseudo code:

my ($first_part) = $email_file =~ /^Delivered-To: (.*)/;

body __LOCAL_AWKWARD_INTRO /hi $first_part/i


How can I do this in my .cf file?
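
To make the intent of the pseudo code concrete, here is the same two-step
idea as a small standalone Python sketch. It is not a SpamAssassin rule
and does not answer the .cf question; it only shows what "capture the
first part, then look for it in the body" means. The sample message is
made up.

```python
import email
import re
from email.utils import parseaddr

# Made-up sample message for illustration.
RAW = (
    "Delivered-To: <steve@example.com>\n"
    "Subject: hello\n"
    "\n"
    "hi steve, quick question for you.\n"
)

msg = email.message_from_string(RAW)

# Step 1: take the part of the Delivered-To address before the "@".
_, address = parseaddr(msg["Delivered-To"])
first_part = address.split("@", 1)[0]

# Step 2: look for "hi <first part>" in the body, case-insensitively.
intro = re.compile(r"\bhi\s+" + re.escape(first_part) + r"\b", re.I)
found = intro.search(msg.get_payload()) is not None

print(first_part, found)
```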


Re: ExtractText and docx

2021-05-07 Thread Benny Pedersen

On 2021-05-07 06:58, Henrik K wrote:

> Which is why I'm debating if the whole plugin is useful at all or
> just feeding Bayes crap.


oh dear :=)

Bayes can only be fooled by feeding it poison data through autolearn; if a 
message is manually trained as spam, then the poison data loses


maybe there is another problem, YMMV

ClamAV with the Foxhole third-party sigs can check for embedded JavaScript, 
a cleaner solution imho