On 2022-05-14 17:43, Henrik K wrote:
On Sat, May 14, 2022 at 04:54:01PM +0200, Michael Storz wrote:

After Henrik has presented his implementation, I guess I have to tell you what I have been working on lately. I am working on a general Tag.pm Plugin. I took the Tagmatch.pm plugin from Paul and rewrote and extended it. With Paul's plugin you can do all kinds of operations on tags (I use tag instead of tagmatch because this looks similar to the header and body keywords). I
extended it with a settag command that allows you to extract data from
header, body or other tags via regexp and assign it to a tag. These tags can
then be used as usual. Coming back to the Esp.pm plugin: for me the
definition for an ESP looks like this:

####################
#
# Mailchimp
#
####################

# header field X-MC-User has the customer-id
settag _LRZ_MCID_ X-MC-User =~ /^([0-9a-z]{25})$/
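
For readers who want to see the idea in code: a minimal Python sketch of what a settag-style extraction could do. This is not the Tag.pm implementation (which is Perl and not shown in this thread); the `settag` function, the `tags` dictionary and the sample header value are assumptions made up for illustration.

```python
import re

# Hypothetical global tag store; the real plugin's storage is not shown here.
tags = {}

def settag(name, headers, field, pattern):
    """Store the first capture group in tags[name] if the header field matches."""
    value = headers.get(field)
    if value is None:
        return False
    match = re.search(pattern, value)
    if match is None:
        return False
    tags[name] = match.group(1)
    return True

# Made-up 25-character customer id that satisfies the pattern from the rule above.
headers = {"X-MC-User": "0123456789abcdefghijklmno"}
settag("_LRZ_MCID_", headers, "X-MC-User", r"^([0-9a-z]{25})$")
print(tags)
```

Once stored, the value can be read by any later rule, which is what "used as usual" amounts to in this sketch.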

Maybe we can consider tags and regex captures the same in the future.. they
are simply global variables.  In that case, a separate "settag" command
wouldn't even be needed, since you could just do the "header FOO
/(?<LRZMCID>bar)/" stanza.
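
A rough Python sketch of that "captures are tags" idea, under the assumption that every named capture of a matching rule is simply copied into one shared store. The function name and the `tags` dictionary are hypothetical. Note that Python spells named groups `(?P<name>...)`, whereas the Perl form quoted above is `(?<name>...)`.

```python
import re

tags = {}  # hypothetical shared store for tags and captures alike

def match_and_capture(text, pattern):
    """Run a rule regex and copy every named capture that matched into tags."""
    match = re.search(pattern, text)
    if match is None:
        return False
    for name, value in match.groupdict().items():
        if value is not None:  # skip named groups that did not participate
            tags[name] = value
    return True

hit = match_and_capture("FOO: bar", r"(?P<LRZMCID>bar)")
print(hit, tags)
```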

Btw we already agreed somewhere that tags are not supposed to contain
underscores, since the underscore is the tag delimiter itself. It could be awkward to
parse and make sense of.

I know and I do not agree :-) The _ around a tag is only needed as an explicit representation to distinguish it from other stuff. For me tags will play a big role in SpamAssassin. Since tags should always be in uppercase, we need _ to make them more readable. I do not think there will be big problems in parsing with templates. But we'll see.
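
To illustrate both positions, here is a small Python sketch of a template scanner that allows single underscores inside tag names. The tag grammar in `TAG_RE` is an assumption, not SpamAssassin's actual template parser.

```python
import re

# Assumed tag syntax: uppercase letters and digits, with single underscores
# allowed between segments, delimited by one underscore on each side.
TAG_RE = re.compile(r"_([A-Z][A-Z0-9]*(?:_[A-Z0-9]+)*)_")

def find_tags(template):
    """Return the tag names found in a template string."""
    return TAG_RE.findall(template)

# Works as hoped when tags are separated by other text.
print(find_tags("id=_LRZ_MCID_ score=_SCORE_"))

# Ambiguous: this could also mean tag SCORE, a literal X, then tag REQD.
print(find_tags("_SCORE_X_REQD_"))
```

With this grammar the second template is read as the single tag SCORE_X_REQD, which is the kind of awkward case the delimiter objection is about.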


One thing which I have not done yet is using tags in regexps, like the example
above:

body MATCHER /My name is ${FROM_NAME:NAME}/

There should be consensus on a general form that will work well in the future for all these cases. If we consider that there is only one type of global
variable/tag, it would be simpler.

Yes.


I really dislike anything resembling $ { } because they are valid regexp
meta characters, that's asking for some trouble.

Just use a different sigil than $. Perl uses $, @, %, & and *. Looking at my keyboard, I see § and # which could be used.
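
A minimal Python sketch of what such an interpolation could look like with a sigil that is not a regex metacharacter. The `§{NAME}` placeholder syntax, the `expand` function and the sample tag value are all assumptions for illustration; nothing like this exists in SpamAssassin as far as this thread shows. The tag value is escaped before insertion so that characters such as `.` or `(` in the value cannot change the meaning of the rule.

```python
import re

# Hypothetical placeholder syntax: section sign, then the tag name in braces.
PLACEHOLDER = re.compile(r"§\{([A-Z0-9_]+)\}")

def expand(pattern, tags):
    """Replace each §{TAG} with the regex-escaped value of that tag.

    Raises KeyError if the pattern references an unknown tag.
    """
    def repl(m):
        return re.escape(tags[m.group(1)])
    return PLACEHOLDER.sub(repl, pattern)

tags = {"FROM_NAME": "J. Doe (Sales)"}  # made-up tag value
rule = expand(r"My name is §{FROM_NAME}", tags)
print(bool(re.search(rule, "Hello, My name is J. Doe (Sales)")))
```

Because the value is escaped, the dot in the tag only matches a literal dot in the message body.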


If we start adding a lot of "tag" stuff in the mix too, there will probably be a horrible web of dependencies all around. Not sure if there is anything we can do up front to ease it. The whole of SA, with its arcane priority system and dozens of plugins doing their thing in independent ways, would really need to be
rewritten from the ground up.  And then we can likely forget any backwards
compatibility for people that are still using years old versions.  Any
takers?  :-D

I think we are getting the asynchronous stuff working with 4.0. Therefore I do not see problems with this approach.


However, to fully create this design, I believe more time is needed and such functionality should not be incorporated into SpamAssassin until after the
4.0 release. First the handling of the tags must be improved, which is
currently totally broken. I am still writing up where the problems
with the tags are and how to fix them.

Good to see some enthusiasm. Personally I will be satisfied after 4.0.0 is released and will stay lurking and acting on any bugs, but that's probably it on my part.. there are hobbies and then there are hobbies that start to
feel like an unpaid job..

Michael
