Thomas,
> Mark Martinec schrieb:
> > I can prepare a small change to amavisd which would use
> > (partition_tag,mail_id) as a unique key instead of just the (mail_id).
> > ...
> > Do you think it would help?
>
> That would be great! (and probably also speed up amavisd's lookups
> without the need to add an additional key on mail_id)
Ok, here it goes. The patch below changes SQL clauses to always join
tables using both the partition_tag and the mail_id fields. This has
no particular effect on existing 2.6.0 databases where a primary key
is mail_id (it is just a redundant condition), but saves a day
when a primary key is a composite (partition_tag,mail_id).
Actually amavisd does not know and need not be aware of what is
a primary key. The logic goes like this: when a mail_id for a new
message being processed is generated, amavisd tries to INSERT
a record with this randomly generated mail_id into table msgs
(using $sql_clause{'ins_msg'} SQL clause). If the operation fails,
another mail_id is generated and attempt repeated, until it eventually
succeeds. Thus it depends entirely on SQL's decision whether a
particular record is allowed or would break some UNIQUE constraint.
So, by only changing a table declaration (PRIMARY KEY or adding
a CONSTRAINT), it changes what keys amavisd will be allowed to insert.
Changing a primary key to (partition_tag,mail_id) brings consqeuences
to quarantining, in particular to releasing from a SQL quarantine,
where it no longer suffices to specify mail_id=xxx in AM.PDP request,
but may be necessary to specify also a partition_tag=xx to distinquish
between SQL-quarantined messages which happen to have the same mail_id.
For compatibility I added code to amavisd (using a new 'sel_msg' entry
in %sql_clause) where in case of releasing from SQL quarantine and a
release request is missing a partition_tag=xx attribute, amavisd tries
to find a missing partition_tag value by querying msgs table (using
only mail_id as a key). If exactly one record matches, then everything
is fine, and releasing may proceed. If multiple records with the same
mail_id exists the release request is aborted, with a message asking
user to supply the disambiguating partition_tag=xx attribute.
Quarantine id for a SQL-quarantined message (as logged in main
log entry) is changed from:
quarantine: aX3C4f6btXgX
quarantine: aX3C4f6btXgX[21]
i.e. has a partition_tag in brackets appended to mail_id.
Correspondingly the amavisd-release is also changed to be
able to parse 'aX3C4f6btXgX[21]', splitting it into mail_id
and partition_tag and providing each as a separate attribute
in an AM.PDP request.
For a classical file-based quarantining a template %P may be
used when forming a file name, which will be replaced by
a current partition_tag value. This allows for a file-based
quarantine for example to be split into weekly subdirectories.
For example:
cd /var/virusmails
mkdir -p P21/spam P22/spam P23/spam
chown -R vscan:vscan P21 P22 P23
and having:
$spam_quarantine_method = 'local:P%P/spam/%m.gz';
$sql_partition_tag =
sub { my($msginfo)[EMAIL PROTECTED]; iso8601_week($msginfo->rx_time) };
would store spam into per- partition_tag (e.g. per-week) subdirectories.
There is one detail left to be resolved for releasing a message
from such a file-based schema (probably a change in amavisd-release).
Will think about it later on.
I hope this could contribute some progress regarding SQL partitioning
problems. I haven't tried it yet myself on a partitioned db, I'm just
running a normal non-partitioned database under PostgreSQL currently.
--- amavisd-release.orig 2008-03-17 15:43:33.000000000 +0100
+++ amavisd-release 2008-05-22 18:44:50.000000000 +0200
@@ -146,5 +146,6 @@
sub release_file($$$@) {
my($sock,$mail_file,$secret_id,@alt_recips) = @_;
- my($fn_path,$fn_prefix,$mail_id,$fn_suffix); local($1,$2,$3,$4);
+ my($fn_path,$fn_prefix,$mail_id,$fn_suffix,$part_tag); local($1,$2,$3,$4);
+ $part_tag = $1 if $mail_file =~ s/ \[ ( [^\]]* ) \] \z//xs;
if ($mail_file =~ m{^ ([^/].*/)? ([A-Z0-9][A-Z0-9._-]*[_-])?
([A-Z0-9][A-Z0-9_+-]{10}[A-Z0-9]) (\.gz)? \z}xsi) {
@@ -165,6 +166,7 @@
"mail_id=$mail_id",
);
- push(@query, "secret_id=$secret_id") if $secret_id ne '';
- push(@query, "mail_file=$mail_file") if $quar_type =~ /^[FZB]\z/;
+ push(@query, "secret_id=$secret_id") if $secret_id ne '';
+ push(@query, "mail_file=$mail_file") if $quar_type =~ /^[FZB]\z/;
+ push(@query, "partition_tag=$part_tag") if $part_tag ne '';
if (@alt_recips) { # override original recipients if list is nonempty
push(@query, map {"recipient=$_"} @alt_recips);
--- amavisd.orig 2008-04-23 20:50:05.000000000 +0200
+++ amavisd 2008-05-22 18:44:50.000000000 +0200
@@ -1089,5 +1089,5 @@
'UPDATE msgs SET content=?, quar_type=?, quar_loc=?, dsn_sent=?,'.
' spam_level=?, message_id=?, from_addr=?, subject=?, client_addr=?'.
- ' WHERE mail_id=?',
+ ' WHERE partition_tag=? AND mail_id=?',
'ins_rcp' =>
'INSERT INTO msgrcpt (partition_tag, mail_id, rid,'.
@@ -1096,14 +1096,17 @@
'INSERT INTO quarantine (partition_tag, mail_id, chunk_ind, mail_text)'.
' VALUES (?,?,?,?)',
+ 'sel_msg' => # obtains partition_tag if missing in a release request
+ 'SELECT partition_tag FROM msgs WHERE mail_id=?',
'sel_quar' =>
- 'SELECT mail_text FROM quarantine WHERE mail_id=? ORDER BY chunk_ind',
+ 'SELECT mail_text FROM quarantine WHERE partition_tag=? AND mail_id=?'.
+ ' ORDER BY chunk_ind',
'sel_penpals' => # no message-id references list
"SELECT msgs.time_num, msgs.mail_id, subject".
- " FROM msgs JOIN msgrcpt ON msgs.mail_id=msgrcpt.mail_id".
+ " FROM msgs JOIN msgrcpt USING (partition_tag,mail_id)".
" WHERE sid=? AND rid=? AND content!='V' AND ds='P'".
" ORDER BY msgs.time_num DESC LIMIT 1",
'sel_penpals_msgid' => # with a nonempty message-id references list
"SELECT msgs.time_num, msgs.mail_id, subject, message_id, rid".
- " FROM msgs JOIN msgrcpt ON msgs.mail_id=msgrcpt.mail_id".
+ " FROM msgs JOIN msgrcpt USING (partition_tag,mail_id)".
" WHERE sid=? AND content!='V' AND ds='P' AND message_id IN (%m)".
" AND rid!=sid".
@@ -9326,4 +9329,5 @@
my($partition_tag) = c('sql_partition_tag');
$partition_tag = &$partition_tag($msginfo) if ref $partition_tag eq 'CODE';
+ $partition_tag = 0 if !defined($partition_tag);
$msginfo->partition_tag($partition_tag);
@@ -14442,4 +14446,5 @@
# mail_id=xxxxxxxxxxxx
# secret_id=xxxxxxxxxxxx (authorizes a release/report)
+ # partition_tag=xx (only required if mail_id not unique)
# quar_type=x F/Z/B/Q/M (defaults to Q or F)
# file/zipfile/bsmtp/sql/mailbox
@@ -14514,4 +14519,5 @@
elsif ($attr_ref->{'request'} =~ /^(?:release|requeue|report)\z/i) {
exists $attr_ref->{'mail_id'} or die "Missing 'mail_id' field";
+ $msginfo->partition_tag($attr_ref->{'partition_tag'}); # may be undef
my($fn) = $attr_ref->{'mail_id'};
# amavisd almost-base64: 62 +, 63 -
@@ -14562,7 +14568,27 @@
or die "SQL quarantine code not enabled";
my($conn_h) = $obj->{conn_h}; my($sql_cl_r) = cr('sql_clause');
+ my($sel_msg) = $sql_cl_r->{'sel_msg'};
+ my($sel_quar) = $sql_cl_r->{'sel_quar'};
+ if (!defined($msginfo->partition_tag) &&
+ defined($sel_msg) && $sel_msg ne '') {
+ do_log(5, "preprocess_policy_query: missing partition_tag in request,".
+ " fetching msgs record for mail_id=%s", $msginfo->mail_id);
+ # find a corresponding partition_tag if missing from a release request
+ $conn_h->begin_work_nontransaction; #(re)connect if necessary
+ $conn_h->execute($sel_msg, untaint($msginfo->mail_id));
+ my($a_ref); my($cnt) = 0;
+ while ( defined($a_ref=$conn_h->fetchrow_arrayref($sel_msg)) ) {
+ $cnt++; $msginfo->partition_tag($a_ref->[0]) if $cnt==1;
+ ll(5) && do_log(5, "got msgs record for mail_id=%s: %s",
+ $msginfo->mail_id, join(', ',@$a_ref));
+ }
+ $conn_h->finish($sel_msg) if defined $a_ref; # only if not all read
+ $cnt <= 1 or die "Multiple ($cnt) records with same mail_id exists, ".
+ "specify a partition_tag in AM.PDP request";
+ }
$conn_h->begin_work_nontransaction; # (re)connect if not connected
$fh = Amavis::IO::SQL->new;
- $fh->open($conn_h,$sql_cl_r->{'sel_quar'},untaint($msginfo->mail_id))
+ $fh->open($conn_h, $sel_quar, untaint($msginfo->mail_id),
+ 'r', untaint($msginfo->partition_tag))
or die "Can't open sql obj for reading: $!";
} else {
@@ -17265,4 +17291,5 @@
$bsmtp_file_final =~ s{%(.)}
{ $1 eq 'b' ? $msginfo->body_digest
+ : $1 eq 'P' ? $msginfo->partition_tag
: $1 eq 'm' ? $msginfo->mail_id
: $1 eq 'n' ? $msginfo->log_id
@@ -17495,4 +17522,5 @@
$suggested_filename =~ s{%(.)}
{ $1 eq 'b' ? $msginfo->body_digest
+ : $1 eq 'P' ? $msginfo->partition_tag
: $1 eq 'm' ? $msginfo->mail_id
: $1 eq 'n' ? $msginfo->log_id
@@ -18425,6 +18453,6 @@
0+untaint($spam_level), $m_id, $from, $subj,
untaint($msginfo->client_addr), #maybe we have a better info now
- # $os_fp,
- $mail_id); # $rfc2822_sender, $rfc2822_from,
+ $msginfo->partition_tag, $mail_id);
+ # $os_fp, $rfc2822_sender, $rfc2822_from,
# SQL_CHAR, SQL_VARCHAR, SQL_VARBINARY, SQL_BLOB, SQL_INTEGER, SQL_FLOAT,
# SQL_TIMESTAMP, SQL_TYPE_TIMESTAMP_WITH_TIMEZONE, ...
@@ -18489,14 +18517,17 @@
or do { $@ = "errno=$!" if $@ eq ''; chomp $@ };
die $@ if $@ =~ /^timed out\b/; # resignal timeout
- if ($self->{mode} eq 'w') {
+ if ($self->{mode} eq 'w') { # open for write access
ll(4) && do_log(4,"Amavis::IO::SQL::open %s drv=%s (%s); key=%s, p_tag=%s",
$self->{mode}, $driver, $self->{clause},
$self->{dbkey}, $self->{partition_tag});
- } else {
- eval { $conn_h->execute($self->{clause}, $self->{dbkey}); 1 }
- or do { $@ = "errno=$!" if $@ eq '' };
+ } else { # open for read access
+ eval {
+ $conn_h->execute($self->{clause}, $self->{partition_tag},$self->{dbkey});
+ 1;
+ } or do { $@ = "errno=$!" if $@ eq '' };
my($ll) = $@ ne '' ? -1 : 4;
- ll($ll) && do_log($ll,"Amavis::IO::SQL::open %s drv=%s (%s); key=%s: %s",
- $self->{mode}, $driver, $self->{clause}, $self->{dbkey}, $@);
+ do_log($ll,"Amavis::IO::SQL::open %s drv=%s (%s); key=%s, p_tag=%s: %s",
+ $self->{mode}, $driver, $self->{clause},
+ $self->{dbkey}, $self->{partition_tag}, $@) if ll($ll);
if ($@ ne '') {
chomp $@;
@@ -18827,5 +18858,11 @@
next if $r->recip_done;
$r->recip_smtp_response($smtp_response); $r->recip_done(2);
- $r->recip_mbxname($mail_id) if $smtp_response =~ /^2/;
+ if ($smtp_response =~ /^2/) {
+ my($mbxname) = $mail_id;
+ my($p_tag) = $msginfo->partition_tag;
+ $mbxname .= '[' . $p_tag . ']'
+ if defined($p_tag) && $p_tag ne '' && $p_tag ne '0';
+ $r->recip_mbxname($mbxname);
+ }
}
section_time('fwd-sql');
Mark
-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
AMaViS-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/amavis-user
AMaViS-FAQ:http://www.amavis.org/amavis-faq.php3
AMaViS-HowTos:http://www.amavis.org/howto/