It would be really nice to have a setting like sa_debug_sampling (or 
sa_debug_ratio, sa_debug_sample, etc.) taking an integer value (e.g. 100) that 
would trigger debug logging for 1 email message every sa_debug_sampling 
messages (e.g. 1 message every 100 messages).
The sampling can be fuzzy: for a sa_debug_sampling value of 100, I would not 
mind getting debug output every 90 to 110 messages.
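
To make the fuzzy counting concrete, here is a minimal sketch in plain Perl. It is not Amavis code: make_fuzzy_sampler and its 10% jitter default are names and values assumed purely for illustration, and the counter would live in a single process.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a closure that is true for roughly one call
# out of every $every_nth, with the gap jittered by +/- $jitter (a fraction).
sub make_fuzzy_sampler {
  my ($every_nth, $jitter) = @_;
  $jitter = 0.1 unless defined $jitter;
  my $spread = int($every_nth * $jitter);

  # Pick the number of messages until the next sample, e.g. 90..110 for 100.
  my $pick_gap = sub {
    my $gap = $every_nth - $spread + int(rand(2 * $spread + 1));
    return $gap < 1 ? 1 : $gap;
  };

  my $countdown = $pick_gap->();
  return sub {
    return 0 if --$countdown > 0;
    $countdown = $pick_gap->();
    return 1;
  };
}

my $is_sample = make_fuzzy_sampler(100);           # as if sa_debug_sampling = 100
my @sampled = grep { $is_sample->() } 1 .. 1000;   # message numbers that were sampled
```

With a value of 100, consecutive entries in @sampled are always between 90 and 110 messages apart.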

Another way to address that could be time-based, with a setting like 
sa_debug_interval taking an integer value in seconds (e.g. 10) that would 
trigger debug logging for 1 email message every sa_debug_interval seconds (e.g. 
1 message every 10 seconds).
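
The time-based variant could look roughly like the sketch below. Again this is plain Perl, not Amavis code: make_interval_sampler is a hypothetical name, and the injectable clock is only there so the logic can be exercised without waiting.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a closure that is true at most once per
# $interval seconds. $clock is optional and defaults to the system time.
sub make_interval_sampler {
  my ($interval, $clock) = @_;
  $clock ||= sub { time };
  my $next_due = $clock->();        # the first message seen is sampled
  return sub {
    my $now = $clock->();
    return 0 if $now < $next_due;
    $next_due = $now + $interval;
    return 1;
  };
}

# Usage with a fake clock, as if sa_debug_interval = 10.
my $fake_now = 1000;
my $is_sample = make_interval_sampler(10, sub { $fake_now });
my @results;
for my $t (1000, 1003, 1009, 1010, 1015, 1021) {
  $fake_now = $t;
  push @results, $is_sample->();
}
```

Here the messages arriving at 1000, 1010 and 1021 are sampled and the others are not. Note that a quiet period simply yields no sample; the interval is a lower bound on the spacing, not a guarantee of one sample every 10 seconds.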

Implementing such a feature might have an impact on performance even when it is disabled. The $sa_debug level is used inside SpamAssassin to decide when to call back the registered logging methods. Without monkeypatching SA, Amavis would have to run SA with debugging enabled all the time and filter out the dbg log lines for non-sample messages afterwards.
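
The "filter afterwards" approach can be sketched as a small per-message buffer. This is only an illustration of the idea: DebugBuffer and its methods are made-up names, not part of Amavis or SpamAssassin. It also shows where the cost comes from: every debug line is still generated and stored, even for messages whose lines are thrown away.

```perl
use strict;
use warnings;

package DebugBuffer;   # hypothetical name, for illustration only

sub new {
  my ($class, $emit) = @_;
  return bless { emit => $emit, lines => [] }, $class;
}

# Called for every debug line of the message currently being processed.
sub add {
  my ($self, $line) = @_;
  push @{ $self->{lines} }, $line;
  return;
}

# Called once per message: emit the held lines only for sampled messages.
sub finish {
  my ($self, $is_sample) = @_;
  if ($is_sample) {
    $self->{emit}->($_) for @{ $self->{lines} };
  }
  $self->{lines} = [];
  return;
}

package main;

my @log;
my $buf = DebugBuffer->new(sub { push @log, $_[0] });

$buf->add("dbg: rules ran for message A");
$buf->finish(0);                      # not a sample: lines are dropped
$buf->add("dbg: rules ran for message B");
$buf->finish(1);                      # sample: lines reach the log
```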

With amavis 2.13 you can prototype it yourself. Place a modified version of Amavislog.pm inside a high-priority @INC directory, e.g.:

# diff -u0 /usr/share/perl5/Mail/SpamAssassin/Logger/Amavislog.pm /etc/perl/Mail/SpamAssassin/Logger/Amavislog.pm
--- /usr/share/perl5/Mail/SpamAssassin/Logger/Amavislog.pm  2023-05-12 00:52:23.000000000 +0200
+++ /etc/perl/Mail/SpamAssassin/Logger/Amavislog.pm  2023-06-26 18:45:41.114752331 +0200
@@ -24 +23,0 @@
-  if ($args{debug}) { for (keys %llmap) { $llmap{$_} = 1 if $llmap{$_} > 1 } }
@@ -32 +31 @@
-  my $ll = $self->{llmap}->{$level};
+  my $ll = $self->is_sample(4) ? 1 : $self->{llmap}{$level};
@@ -37,0 +37,8 @@
+# sampling per amavis child
+sub is_sample {
+  my ($self, $every_nth) = @_;
+  return unless $Amavis::MSGINFO;
+  my $log_id = $Amavis::MSGINFO->log_id;
+  $self->{samples}{$log_id} = 1;
+  return keys(%{$self->{samples}}) % $every_nth == 0;
+}
