https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7014

            Bug ID: 7014
           Summary: spamassassin SQL deadlock using bayes
           Product: Spamassassin
           Version: 3.3.2
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: major
          Priority: P2
         Component: spamassassin
          Assignee: [email protected]
          Reporter: [email protected]

Created attachment 5187
  --> https://issues.apache.org/SpamAssassin/attachment.cgi?id=5187&action=edit
/usr/share/perl5/Mail/SpamAssassin/BayesStore/PgSQL.pm diff fixing suboptimal
SQL

spamassassin bayes is almost unusable on a high traffic site with postgres,
bayes and multiple instances, since it deadlocks in a few seconds and breaking
them considerably slows down the db server and sa both. You see such things as:

2014-02-19 18:21:56 CET ERROR:  deadlock detected
2014-02-19 18:21:56 CET DETAIL:  Process 63882 waits for ShareLock on
transaction 858300; blocked by process 63883.
        Process 63883 waits for ShareLock on transaction 858419; blocked by
process 63965.
        Process 63965 waits for ShareLock on transaction 858365; blocked by
process 63931.
        Process 63931 waits for ShareLock on transaction 858390; blocked by
process 63942.
        Process 63942 waits for ShareLock on transaction 858316; blocked by
process 63899.
        Process 63899 waits for ShareLock on transaction 858301; blocked by
process 63882.
        Process 63882: select put_tokens(1,
E'{"\\\\000\\\\127\\\\346\\\\327\\\\314","\\\\000\\\\132\\\\372\\\\370\\\\325","\\\\000\\\\135\\\\013\\\\022\\\\111","\\\\000\\\\201\\\\
        Process 63883: select put_tokens(1,
E'{"\\\\001\\\\271\\\\217\\\\220\\\\362","\\\\001\\\\275\\\\163\\\\036\\\\167","\\\\002\\\\046\\\\236\\\\242\\\\071","\\\\002\\\\100\\\\
        Process 63965: UPDATE bayes_token SET atime = $1 WHERE id = $2 AND
token IN
($3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,$25,$26,$27,$2
        Process 63931: UPDATE bayes_token SET atime = $1 WHERE id = $2 AND
token IN
($3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,$25,$26,$27,$2
        Process 63942: UPDATE bayes_token SET atime = $1 WHERE id = $2 AND
token IN
($3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,$25,$26,$27,$2
        Process 63899: UPDATE bayes_token SET atime = $1 WHERE id = $2 AND
token IN
($3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,$25,$26,$27,$2
2014-02-19 18:21:56 CET HINT:  See server log for query details.
2014-02-19 18:21:56 CET CONTEXT:  SQL statement "UPDATE bayes_token
               SET spam_count = greatest_int(spam_count + inspam_count, 0),
                   ham_count = greatest_int(ham_count + inham_count, 0),
                   atime = greatest_int(atime, inatime)
             WHERE id = inuserid 
               AND token = _token"
        PL/pgSQL function "put_tokens" line 14 at SQL statement
2014-02-19 18:21:56 CET STATEMENT:  select put_tokens(1,
E'{"\\\\000\\\\127\\\\346\\\\327\\\\314","\\\\000\\\\132\\\\372\\\\370\\\\325","\\\\000\\\\135\\\\013\\\\022\\\\111","\\\\0

As this blog post patiently explains (in 2010) the SQL code in SA could use
lots of love and tenderness since it's written pretty SQL unfriendly:
http://linuxblog.kieser.net/2010/06/spamassassin-spamd-deadlocks-runnning.html

Obviously he's also completely right about the fix: kill INs. He also provide a
patch which I am attaching (against 3.3.2, probably doesn't matter since this
code seems to be untouched for ages), which completely fix the issue, makes the
db server and sa 10 times faster.

I hope you'll apply this because it makes a tremendous difference. (I might
even have moved out the commit of the loop but it'd block again.)

File is in /usr/share/perl5/Mail/SpamAssassin/BayesStore/

-- 
You are receiving this mail because:
You are the assignee for the bug.

Reply via email to