https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7218

            Bug ID: 7218
           Summary: SQLBasedAddrList.pm generates lots of duplicate key
                    errors in database logs
           Product: Spamassassin
           Version: SVN Trunk (Latest Devel Version)
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Learner
          Assignee: [email protected]
          Reporter: [email protected]

I'm using SQLBasedAddrList.pm for AWL with PostgreSQL backend like this:

loadplugin                      Mail::SpamAssassin::Plugin::AWL

use_auto_whitelist              1
auto_whitelist_factory          Mail::SpamAssassin::SQLBasedAddrList
user_awl_dsn                    DBI:Pg:
user_awl_sql_override_username  defang

header AWL            eval:check_from_in_auto_whitelist()
describe AWL            Adjusted score from AWL reputation of From: address
tflags AWL            userconf noautolearn
priority AWL                    1000

I have many errors in PostgreSQL logs:
ERROR:  duplicate key value violates unique constraint "awl_pkey"
DETAIL:  Key (username, email, signedby, ip)=(defang, [email protected], ,
208.192) already exists.
STATEMENT:  INSERT INTO awl (username,email,ip,count,totscore) VALUES
($1,$2,$3,$4,$5)

This is caused by SQLBasedAddrList.pm add_score function which works like this:

    # try inserting first, and if that fails we'll do the update; this way
    # we avoid to large extent a race condition between multiple processes
    $rc = INSERT INTO awl (username,email,ip,count,totscore) VALUES (?,?,?,?,?)
    if (!$rc) {
        # insert failed, assume primary key constraint, so try the update
        UPDATE awl SET count=?, totscore=totscore+? WHERE username=? AND
email=? AND signedby=?
    }

This is less than ideal way to do UPSERT/MERGE, as it generates these errors -
which fill logs and prevent using for example transactions.

Please change it to the following (pseudocode):

    # try updating first, and if that affects positive number of rows then
we're done
    $rc = UPDATE awl SET count=?, totscore=totscore+? WHERE username=? AND
email=? AND signedby=?
    return if $rc>0;

    # we didn't update anything, try inserting, if successful then we're done
    $rc = INSERT INTO awl (username,email,ip,count,totscore) VALUES (?,?,?,?,?)
    return if $rc

    # it looks like a concurrent process inserted the record between our update
and insert, try update once more
    $rc = UPDATE awl SET count=?, totscore=totscore+? WHERE username=? AND
email=? AND signedby=?
    return

This way an error on insert is still possible but it is unlikely.

-- 
You are receiving this mail because:
You are the assignee for the bug.

Reply via email to