Am 2022-05-03 17:08, schrieb Henrik K:
I really dislike these kinds of "band-aids" which really don't help the
main
cause: terribly convoluted plugin code. Why skip debug output if owner
is
missing? Is the a good reason for owner missing, what is the cause for
that
and can it be better fixed upstream? Why are there all kinds of regexs
without any sanity/error checks? Etc..
If I don't hear any volunteers, I guess I need to try to clean up this
too.
@Henrik: Please read the end of the text for the question I have about
your rewrite of FromNameSpoof.pm
After Henrik asked for volunteers, I took a quick look at the
FromNameSpoof.pm plugin and, like Henrik, immediately realized that the
DKIM signature part can't work. You can't call the same function in both
async and sync mode. The call to action_depends_on_tags2 is not a
declaration that makes every call to _check_fromnamespoof an
asynchronous call.
Why does the plugin seem to work anyway? If you were to examine the log
files closely, you would find that the plugin works most of the time but
not always. In one of my plugins I fell for a similar bug related to
DKIMDOMAIN. So I think it is interesting to present a detailed analysis
of the problem.
Apart from the 35 predefined common tags in the PerMsgStatus.pm plugin,
which are mainly for use in templates, there is another class of tags
defined via set_tag, which for me I divide into static and dynamic tags.
Static tags are those that are defined before the rules are executed.
These essentially come from the Metadata.pm and PerMsgStatus.pm modules.
Examples are SENDERDOMAIN and AUTHORDOMAIN.
Dynamic tags are defined by executing rules with an eval function.
DKIMDOMAIN belongs to the dynamic tags even if one is tempted to count
it among the static tags due to the similarity in name to SENDERDOMAIN
and AUTHORDOMAIN.
The tag DKIMDOMAIN is set together with the tags DKIMIDENTITY and
DKIMSELECTOR by the worker routine _check_dkim_signature of the plugin
DKIM.pm. The two plugins DKIM.pm and FromNameSpoof.pm have a similar
structure. Both have a worker routine that is activated by the first
eval function called and calculates and caches the results. The
respective eval functions then only use the result from the cache.
The plugin DKIM.pm has 9 eval functions that can call the worker routine
_check_dkim_signature:
check_dkim_signed
check_dkim_valid
check_dkim_valid_author_sig
check_dkim_valid_envelopefrom
check_dkim_dependable
check_dkim_adsp
check_dkim_signsome
check_dkim_signall
check_dkim_testing
These are in turn called by 17 rules:
full DKIM_SIGNED eval:check_dkim_signed()
full DKIM_VALID eval:check_dkim_valid()
full DKIM_VALID_AU eval:check_dkim_valid_author_sig()
full DKIM_VALID_EF eval:check_dkim_valid_envelopefrom()
full __DKIM_DEPENDABLE eval:check_dkim_dependable()
header DKIM_ADSP_NXDOMAIN eval:check_dkim_adsp('N')
header DKIM_ADSP_DISCARD eval:check_dkim_adsp('D')
header DKIM_ADSP_ALL eval:check_dkim_adsp('A')
header DKIM_ADSP_CUSTOM_LOW eval:check_dkim_adsp('1')
header DKIM_ADSP_CUSTOM_MED eval:check_dkim_adsp('2')
header DKIM_ADSP_CUSTOM_HIGH eval:check_dkim_adsp('3')
full __RESIGNER1 eval:check_dkim_valid('linkedin.com')
full __RESIGNER2
eval:check_dkim_valid('googlegroups.com','yahoogroups.com','yahoogroups.de')
full DKIM_VERIFIED eval:check_dkim_valid()
header DKIM_POLICY_TESTING eval:check_dkim_testing()
header DKIM_POLICY_SIGNSOME eval:check_dkim_signsome()
header DKIM_POLICY_SIGNALL eval:check_dkim_signall()
The plugin FromNameSpoof.pm has 7 eval functions that can call the
worker routine _check_fromnamespoof:
check_fromname_different
check_fromname_domain_differ
check_fromname_spoof
check_fromname_contains_email
check_fromname_equals_replyto
check_fromname_equals_to
check_fromname_owners_differ
These are called by (only) 2 and 3 rules, respectively:
header __FROM_EQ_REPLY
eval:check_fromname_equals_replyto() ifplugin
Mail::SpamAssassin::Plugin::FreeMail
header __PLUGIN_FROMNAME_EQUALS_TO eval:check_fromname_equals_to()
header __PLUGIN_FROMNAME_SPOOF eval:check_fromname_spoof()
The generation of the evaluation of the rules happens in some sense in
the plugin Check.pm in the subroutine run_generic_tests in
while (my($rulename, $test) = each %{$opts{testhash}->{$priority}})
{
$opts{loop_body}->($self, $pms, $conf, $rulename, $test, %nopts);
}
As you can see the hash testhash which holds the rules is accessed here.
This leads to the fact that each evaluation of the rules of a priority
level is carried out in a different order. This is how the random order
of the evaluation occurs.
Since no priorities were assigned for the rules of the two plugins, the
rules are all executed with the same priority 0.
This means that in
17 / (17 +2) * 100 = 89.5% resp. 17 / (17 +3) * 100 = 85.0%
of the cases an eval function of DKIM.pm is executed first. Only in the
rest of the cases an eval function of FromNameSpoof.pm is executed
first.
The subroutine parsed_metadata contains an asynchronous call to the
worker routine _check_fromnamespoof to wait for the tag DKIMDOMAIN
(action_depends_on_tags).
If now first an eval function of DKIM.pm is called, then the worker
routine _check_dkim_signature is called which defines the tag
DKIMDOMAIN. After that the waiting worker routine _check_fromnamespoof
is called immediately, which evaluates the tag DKIMDOMAIN and calculates
the actual data of the plugin. From this point on, when an eval function
is called from one of the two plugins, only the data from the caches is
read.
On the other hand, if one of the eval functions from FromNameSpoof.pm is
called first, the worker routine _check_fromnamespoof is called
immediately (synchronously). The value of the DKIMDOMAIN tag is now
undefined and cannot be used or is used incorrectly in the calculation
of the data. As we have seen in the above calculation, this is the case
in 10.5% or 15% of the evaluations.
What can be done about it now? The simplest and dirtiest solution is
what the plugins AWL.pm and TxRep.pm do:
Mail::SpamAssassin::Plugin::AWL
header AWL eval:check_from_in_auto_welcomelist()
priority AWL 1000
Mail::SpamAssassin::Plugin::TxRep
header TXREP eval:check_senders_reputation()
priority TXREP 1000
i.e. via the priority the order of the rules is controlled. For this it
would be enough to run one of the DKIM rules with a higher priority.
The clean solution is to call all eval functions asynchronously if the
DKIMDOMAIN tag has to be considered. This is the way Henrik has taken.
@Henrik: I was only surprised that you took the asynchronous method with
rule_pending/rule_ready + check_cleanup instead of the method for tags
with action_depends_on_tags. This separates the evaluation of the eval
functions of the two plugins in time, since check_cleanup is called at
the very end after the evaluation of all priorities. Since
_check_fromnamespoof in turn defines a set of tags, all functions that
depend on these tags are also evaluated only at this time and thus all
meta rules that depend on them, etc.
If, on the other hand, action_depends_on_tags is used, e.g.
sub _check_eval {
my ($self, $pms, $result) = @_;
if (exists $pms->{fromname_async_queue}) {
my $rulename = $pms->get_current_eval_rule_name();
$pms->action_depends_on_tags2('DKIMDOMAIN', sub {
$self->_check_fromnamespoof($pms);
return $result->() ? $pms->got_hit($rulename, '', ruletype =>
'header') : 0 ;
});
return;
}
$self->_check_fromnamespoof($pms);
return $result->();
}
then after the DKIMDOMAIN tag is defined, all eval functions that depend
on it are called immediately. Thus, all eval functions run at priority
level 0 and all dependent eval and meta rules can run at the same
priority level. This solution runs on 3.4.6. Therefore I would be
interested which reasons speak against this solution. Maybe there is
something in 4.0 which changes the game.
Regards,
Michael