Am 2022-05-03 17:08, schrieb Henrik K:
I really dislike these kinds of "band-aids" which really don't help the main cause: terribly convoluted plugin code. Why skip debug output if owner is missing? Is the a good reason for owner missing, what is the cause for that
and can it be better fixed upstream?  Why are there all kinds of regexs
without any sanity/error checks?  Etc..

If I don't hear any volunteers, I guess I need to try to clean up this too.



@Henrik: Please read the end of the text for the question I have about your rewrite of FromNameSpoof.pm

After Henrik asked for volunteers, I took a quick look at the FromNameSpoof.pm plugin and, like Henrik, immediately realized that the DKIM signature part can't work. You can't call the same function in both async and sync mode. The call to action_depends_on_tags2 is not a declaration that makes every call to _check_fromnamespoof an asynchronous call.

Why does the plugin seem to work anyway? If you were to examine the log files closely, you would find that the plugin works most of the time but not always. In one of my plugins I fell for a similar bug related to DKIMDOMAIN. So I think it is interesting to present a detailed analysis of the problem.

Apart from the 35 predefined common tags in the PerMsgStatus.pm plugin, which are mainly for use in templates, there is another class of tags defined via set_tag, which for me I divide into static and dynamic tags. Static tags are those that are defined before the rules are executed. These essentially come from the Metadata.pm and PerMsgStatus.pm modules. Examples are SENDERDOMAIN and AUTHORDOMAIN.

Dynamic tags are defined by executing rules with an eval function. DKIMDOMAIN belongs to the dynamic tags even if one is tempted to count it among the static tags due to the similarity in name to SENDERDOMAIN and AUTHORDOMAIN.

The tag DKIMDOMAIN is set together with the tags DKIMIDENTITY and DKIMSELECTOR by the worker routine _check_dkim_signature of the plugin DKIM.pm. The two plugins DKIM.pm and FromNameSpoof.pm have a similar structure. Both have a worker routine that is activated by the first eval function called and calculates and caches the results. The respective eval functions then only use the result from the cache.

The plugin DKIM.pm has 9 eval functions that can call the worker routine _check_dkim_signature:

check_dkim_signed
check_dkim_valid
check_dkim_valid_author_sig
check_dkim_valid_envelopefrom
check_dkim_dependable
check_dkim_adsp
check_dkim_signsome
check_dkim_signall
check_dkim_testing

These are in turn called by 17 rules:

full     DKIM_SIGNED            eval:check_dkim_signed()
full     DKIM_VALID             eval:check_dkim_valid()
full     DKIM_VALID_AU          eval:check_dkim_valid_author_sig()
full     DKIM_VALID_EF          eval:check_dkim_valid_envelopefrom()
full     __DKIM_DEPENDABLE      eval:check_dkim_dependable()
header   DKIM_ADSP_NXDOMAIN     eval:check_dkim_adsp('N')
header   DKIM_ADSP_DISCARD      eval:check_dkim_adsp('D')
header   DKIM_ADSP_ALL          eval:check_dkim_adsp('A')
header   DKIM_ADSP_CUSTOM_LOW   eval:check_dkim_adsp('1')
header   DKIM_ADSP_CUSTOM_MED   eval:check_dkim_adsp('2')
header   DKIM_ADSP_CUSTOM_HIGH  eval:check_dkim_adsp('3')
full     __RESIGNER1     eval:check_dkim_valid('linkedin.com')
full __RESIGNER2 eval:check_dkim_valid('googlegroups.com','yahoogroups.com','yahoogroups.de')
full   DKIM_VERIFIED            eval:check_dkim_valid()
header DKIM_POLICY_TESTING      eval:check_dkim_testing()
header DKIM_POLICY_SIGNSOME     eval:check_dkim_signsome()
header DKIM_POLICY_SIGNALL      eval:check_dkim_signall()

The plugin FromNameSpoof.pm has 7 eval functions that can call the worker routine _check_fromnamespoof:

check_fromname_different
check_fromname_domain_differ
check_fromname_spoof
check_fromname_contains_email
check_fromname_equals_replyto
check_fromname_equals_to
check_fromname_owners_differ

These are called by (only) 2 and 3 rules, respectively:

header __FROM_EQ_REPLY eval:check_fromname_equals_replyto() ifplugin Mail::SpamAssassin::Plugin::FreeMail
header    __PLUGIN_FROMNAME_EQUALS_TO eval:check_fromname_equals_to()
header    __PLUGIN_FROMNAME_SPOOF     eval:check_fromname_spoof()

The generation of the evaluation of the rules happens in some sense in the plugin Check.pm in the subroutine run_generic_tests in

while (my($rulename, $test) = each %{$opts{testhash}->{$priority}}) {
      $opts{loop_body}->($self, $pms, $conf, $rulename, $test, %nopts);
    }

As you can see the hash testhash which holds the rules is accessed here. This leads to the fact that each evaluation of the rules of a priority level is carried out in a different order. This is how the random order of the evaluation occurs.

Since no priorities were assigned for the rules of the two plugins, the rules are all executed with the same priority 0.

This means that in

17 / (17 +2) * 100 = 89.5% resp. 17 / (17 +3) * 100 = 85.0%

of the cases an eval function of DKIM.pm is executed first. Only in the rest of the cases an eval function of FromNameSpoof.pm is executed first.

The subroutine parsed_metadata contains an asynchronous call to the worker routine _check_fromnamespoof to wait for the tag DKIMDOMAIN (action_depends_on_tags).

If now first an eval function of DKIM.pm is called, then the worker routine _check_dkim_signature is called which defines the tag DKIMDOMAIN. After that the waiting worker routine _check_fromnamespoof is called immediately, which evaluates the tag DKIMDOMAIN and calculates the actual data of the plugin. From this point on, when an eval function is called from one of the two plugins, only the data from the caches is read.

On the other hand, if one of the eval functions from FromNameSpoof.pm is called first, the worker routine _check_fromnamespoof is called immediately (synchronously). The value of the DKIMDOMAIN tag is now undefined and cannot be used or is used incorrectly in the calculation of the data. As we have seen in the above calculation, this is the case in 10.5% or 15% of the evaluations.

What can be done about it now? The simplest and dirtiest solution is what the plugins AWL.pm and TxRep.pm do:

Mail::SpamAssassin::Plugin::AWL
header    AWL    eval:check_from_in_auto_welcomelist()
priority  AWL    1000

Mail::SpamAssassin::Plugin::TxRep
header    TXREP  eval:check_senders_reputation()
priority  TXREP  1000

i.e. via the priority the order of the rules is controlled. For this it would be enough to run one of the DKIM rules with a higher priority.

The clean solution is to call all eval functions asynchronously if the DKIMDOMAIN tag has to be considered. This is the way Henrik has taken.

@Henrik: I was only surprised that you took the asynchronous method with rule_pending/rule_ready + check_cleanup instead of the method for tags with action_depends_on_tags. This separates the evaluation of the eval functions of the two plugins in time, since check_cleanup is called at the very end after the evaluation of all priorities. Since _check_fromnamespoof in turn defines a set of tags, all functions that depend on these tags are also evaluated only at this time and thus all meta rules that depend on them, etc.

If, on the other hand, action_depends_on_tags is used, e.g.

sub _check_eval {
  my ($self, $pms, $result) = @_;

  if (exists $pms->{fromname_async_queue}) {
    my $rulename = $pms->get_current_eval_rule_name();
    $pms->action_depends_on_tags2('DKIMDOMAIN', sub {
      $self->_check_fromnamespoof($pms);
return $result->() ? $pms->got_hit($rulename, '', ruletype => 'header') : 0 ;
    });
    return;
  }

  $self->_check_fromnamespoof($pms);
  return $result->();
}

then after the DKIMDOMAIN tag is defined, all eval functions that depend on it are called immediately. Thus, all eval functions run at priority level 0 and all dependent eval and meta rules can run at the same priority level. This solution runs on 3.4.6. Therefore I would be interested which reasons speak against this solution. Maybe there is something in 4.0 which changes the game.

Regards,
Michael

Reply via email to