Esp.pm

Loren Wilton Sun, 02 May 2021 01:37:17 -0700

John Hardin wrote:

An awful lot I think could be done simply by having rules that can capture to named per-message-global variables, and allowing those variables to be used in other (or the same) rules.
I've been wanting this for years.


Proposal for discussion:

Consider the following rules that could be in a user_prefs on a system that allows per-user rules:


   header __TO_ME   To:Addr    /<me\@myhost\.com>/
   meta    NOT_TO_ME    !__TO_ME

This could be useful for a single user, but obviously could not be site-wide, even if the site found such a thing useful, as all users have their own email addresses. The problem is obviously the hard-coded address string.

PMS has a number of per-message variables attached to it that can be used in the Perl code for various things. I'm proposing a way to add per-message constants and variables to this collection, and a way to access them in rule text.


Consider this variation on the above rules:

   variable __ME            /me\@myhost\.com/
   header    __TO_ME    To:Addr    /<$(__ME)>/
   meta    NOT_TO_ME    !__TO_ME

The format of the "variable" declaration is deliberately the simplified form of a rule declaration to simplify parsing and help file description considerations. Since it is actually defining a constant it could be called "constant". I called it "variable" because the thing it defines could vary from user to user and message to message.

Now we can put the __ME in each user_prefs and have a global rule in some site-global rules file. Each __ME instance would stick the string into a PMS variable for the duration of the message being parsed. The name of the PMS variable would be some variation on the "rule" name of the variable.

The text of the rules with a $(name) string in them, to be compiled, would have to have a way to reach into the relevant PMS variable to resolve that part of the string. Perhaps this means that rules containing variables could not be compiled. As the number of them is likely to be relatively small compared to the number of all rules, that is probably an acceptable tradeoff.
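To make the substitution step concrete, here is a minimal sketch in Python rather than SpamAssassin's Perl. It assumes the proposed $(name) syntax and uses a plain dict as a stand-in for the PMS variable store; expand_pattern and the variables dict are hypothetical names, not existing SpamAssassin code. The pattern is expanded per message and only then compiled, which is why such a rule could not be precompiled.

```python
import re

# Matches a $(NAME) reference inside rule pattern text (proposed syntax).
VAR_REF = re.compile(r"\$\((\w+)\)")

def expand_pattern(pattern_text, variables):
    """Replace each $(NAME) with the text stored for NAME.

    Raises KeyError for an undefined name, which a caller could
    treat as a rule syntax error.
    """
    def substitute(match):
        return variables[match.group(1)]
    return VAR_REF.sub(substitute, pattern_text)

# Hypothetical per-message store, standing in for a PMS variable.
# The value is what a user_prefs "variable __ME" line would supply.
variables = {"__ME": r"me\@myhost\.com"}

# Pattern text of the site-wide __TO_ME rule, expanded for this message.
expanded = expand_pattern(r"<$(__ME)>", variables)
to_me = re.compile(expanded)

print(expanded)                                # <me\@myhost\.com>
print(bool(to_me.search("<me@myhost.com>")))   # True
```

The variable text is inserted as regex source, not as a literal string, so the declaration keeps its own escaping.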

There are no execution ordering problems as long as 'variable' declarations are parsed before rules are run.


Now consider variable capture from the message:

header __SUB_CAP Subject:Capture /Your (\w+) Order/i$(__COMPANY)=\1

Here we can define a PMS variable and populate it on a rule match. The rule can be used as any normal rule, it just additionally captures one or more variables while it runs.

We can use this in a match against some other message part with a rule similar to the __TO_ME rule above. Obviously in this case we have rule ordering to consider, since we have to capture (or attempt the capture; the string may come up null) before we can run the rule depending on the string.
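A rough sketch of the capture-then-use sequence, again in Python as a stand-in for the Perl that would actually do this. The function names, the capture_map argument, and the sample headers are all made up for illustration. Two choices here are assumptions of the sketch and not part of the proposal: captured text is escaped so it matches literally, and a dependent rule whose variable was never captured simply does not hit.

```python
import re

VAR_REF = re.compile(r"\$\((\w+)\)")

def run_capture_rule(pattern, text, capture_map, variables):
    """Run a rule; on a hit, store capture groups under the given names.

    capture_map maps a variable name to a group number, mirroring the
    proposed capture suffix on the rule. Returns True if the rule hit.
    """
    match = re.search(pattern, text, re.IGNORECASE)
    if match is None:
        return False
    for name, group in capture_map.items():
        # A group that did not participate is stored as an empty string.
        variables[name] = match.group(group) or ""
    return True

def run_dependent_rule(pattern_text, text, variables):
    """Expand $(NAME) references with captured text, then match."""
    try:
        expanded = VAR_REF.sub(
            lambda m: re.escape(variables[m.group(1)]), pattern_text)
    except KeyError:
        return False  # assumption: nothing captured means no hit
    return re.search(expanded, text, re.IGNORECASE) is not None

variables = {}  # hypothetical per-message store

# The capture rule has to run first...
hit = run_capture_rule(r"Your (\w+) Order",
                       "Re: Your Acme order has shipped",
                       {"__COMPANY": 1}, variables)

# ...before any rule that depends on the captured string.
same_company = run_dependent_rule(r"\b$(__COMPANY)\b",
                                  "Acme Sales <sales@acme.example>",
                                  variables)
print(hit, variables, same_company)
```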


An alternative to the above capture syntax could be:

   header __COMPANY    Subject    /Your (\w+) Order/

The disadvantages I see to this are that you can only capture one string from the match, and you now have to wonder whether a rule name represents an integer or a string or both. I'm not in favor of the mess this could create.

Note that you could extend this fairly trivially (in a syntactic sense) to allow a match against multiple captured strings in a pattern:


   variable    __GROUP    /Order for $(__ME) from $(__ORDER)/
   body        MY_ORDER    /$(__GROUP)/i    # __GROUP exists in the body

This also gets into problems of rule dependencies, since now some 'variable' declarations could not be executed until other rules have run. But it might be worth considering as an extension. Likely the mechanisms to implement the constant declaration, capture, and match code would be most all that was necessary to implement this too.


I think that the above would do most of what people would like to do.

Errors:

A reference to an undefined variable would be a rule syntax error, invalidating that rule.

A poorly formatted capture would be a rule syntax error.

A poorly formatted variable would essentially be a rule syntax error.

Circular references would be a rule syntax error, invalidating the rule where it was detected (which could then invalidate other depending rules).

A dependency on text from a net rule would push other depending rule evaluation to after the net rules returned results. I assume this is already done for meta rules that depend on net rules. But I could see this potentially being a pain point.
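The undefined-reference and circular-reference checks, together with the ordering needed once declarations can depend on other rules, could be handled in one pass over the $(name) references. The following is only a Python sketch of that idea: evaluation_order and the definitions dict are hypothetical, and __LOOP_A, __LOOP_B, __BROKEN and __MISSING are invented names used to trigger the error cases.

```python
import re

VAR_REF = re.compile(r"\$\((\w+)\)")

def evaluation_order(definitions):
    """Order names so each comes after everything it references.

    definitions maps a rule or variable name to its pattern text.
    Returns (order, invalid). An entry is invalid if it references an
    undefined name, sits on a reference cycle, or depends on an
    invalid entry.
    """
    order, invalid, state = [], set(), {}

    def visit(name):
        if name not in definitions:
            return False                  # undefined reference
        if state.get(name) == "done":
            return name not in invalid
        if state.get(name) == "visiting":
            return False                  # circular reference
        state[name] = "visiting"
        ok = True
        for dep in VAR_REF.findall(definitions[name]):
            if not visit(dep):
                ok = False
        state[name] = "done"
        if ok:
            order.append(name)
        else:
            invalid.add(name)
        return ok

    for name in definitions:
        visit(name)
    return order, invalid

definitions = {
    "MY_ORDER": "$(__GROUP)",
    "__GROUP": "Order for $(__ME) from $(__ORDER)",
    "__ME": r"me\@myhost\.com",
    "__ORDER": r"\w+",
    "__LOOP_A": "$(__LOOP_B)",
    "__LOOP_B": "$(__LOOP_A)",
    "__BROKEN": "$(__MISSING)",
}

order, invalid = evaluation_order(definitions)
print(order)            # ['__ME', '__ORDER', '__GROUP', 'MY_ORDER']
print(sorted(invalid))  # ['__BROKEN', '__LOOP_A', '__LOOP_B']
```

Deferring entries that depend on net rules would need one more marker per entry, which is left out here.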

I'm not married to any of the above suggested syntax; it just seemed like a reasonable starting point, and simple to describe and to use. Discussion and suggestions on the formats are welcome.

I don't know what it would mean to put a 'variable' name in a meta. Likely it would be meaningless and probably disallowed. Likewise I don't see much point in assigning a score or a description to a variable, since they aren't really rules themselves.


Discussion is open!

       Loren

Re: svn commit: r1889364 - /spamassassin/trunk/lib/Mail/SpamAssassin/Plugin/Esp.pm
