Bug Tracker item #2881853, was opened at 2009-10-19 11:28
Message generated for change (Comment added) made by ktmgantt
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=1126467&aid=2881853&group_id=250683

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: daemon
Group: v3.9.0
Status: Closed
Resolution: Fixed
Priority: 5
Private: No
Submitted By: Mark Diekhans (diekhans)
Assigned to: Stevan Bajic (sbajic)
Summary: postgres duplicate key errors

Initial Comment:
Trying to train on around 10,000 message via email to spam@, results in a large 
number of errors like:

postgres[74634]: [16-1] ERROR:  duplicate key value violates unique constraint 
"dspam_token_data_uid_key"
postgres[74634]: [16-2] STATEMENT:  PREPARE dspam_update_plan (bigint) AS 
UPDATE dspam_token_data SET last_hit=CURRENT_DATE,spam_hits=spam_hits+1 WHERE
postgres[74634]: [16-3]  uid=1003 AND token=$1;PREPARE dspam_insert_plan 
(bigint,int,int) AS INSERT INTO dspam_token_data
postgres[74634]: [16-4]  (uid,token,spam_hits,innocent_hits,last_hit) VALUES 
(1003,$1,$2,$3,CURRENT_DATE);

thanks.  I can test changes..


----------------------------------------------------------------------

Comment By: Ken (ktmgantt)
Date: 2009-12-23 06:26

Message:
The error is correct. It is a bit of a chicken and egg problem since if the
uid/token pair is not already in the database, you need to do an INSERT but
if it is you should do an UPDATE. The PostgreSQL documentation provides the
following pl/PgSQL function with error management to perform the
INSERT-UPDATE. Maybe using it would be worthwhile:

Example 38-2. Exceptions with UPDATE/INSERT

This example uses exception handling to perform either UPDATE or INSERT,
as appropriate:

CREATE TABLE db (a INT PRIMARY KEY, b TEXT);

CREATE FUNCTION merge_db(key INT, data TEXT) RETURNS VOID AS
$$
BEGIN
    LOOP
        -- first try to update the key
        UPDATE db SET b = data WHERE a = key;
        IF found THEN
            RETURN;
        END IF;
        -- not there, so try to insert the key
        -- if someone else inserts the same key concurrently,
        -- we could get a unique-key failure
        BEGIN
            INSERT INTO db(a,b) VALUES (key, data);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- do nothing, and loop to try the UPDATE again
        END;
    END LOOP;
END;
$$
LANGUAGE plpgsql;

SELECT merge_db(1, 'david');
SELECT merge_db(1, 'dennis');

Regards,
Ken

----------------------------------------------------------------------

Comment By: Stevan Bajic (sbajic)
Date: 2009-12-23 02:26

Message:
:)

----------------------------------------------------------------------

Comment By: Mark Diekhans (diekhans)
Date: 2009-12-22 20:44

Message:
Just tested the latest git repository and postgress is very well behaved.

Thank you!!!

----------------------------------------------------------------------

Comment By: Stevan Bajic (sbajic)
Date: 2009-12-18 18:25

Message:
Hallo Mark

I think one of the recent GIT submissions fixed that issue. Can you please
check and let us know if you still have that issue?

-- 
Kind Regards from Switzerland,

Stevan Bajić

----------------------------------------------------------------------

Comment By: Mark Diekhans (diekhans)
Date: 2009-10-19 18:07

Message:
hi Stevan, 

Thank you!  I was having a problem and assumed that the error message in
the postgres log indicated the source of the problem.  I will go back to
square one tonight and see and try again.  May I suggest that the
resolution to this bug is to document that this is expected behavior?

Cheers,
Mark




----------------------------------------------------------------------

Comment By: Stevan Bajic (sbajic)
Date: 2009-10-19 16:47

Message:
Hallo Mark

Looking at the code inside DSPAM I see that those errors are intentional
there. It looks to me that DSPAM is first trying to INSERT a token and if
that insert is failing then it is doing an UPDATE. And this mechanism is
then producing all the above log entries.

Look yourself in src/pgsql_drv.c at the function _ds_setall_spamrecords().
Starting at line 757 the code is preparing two SQL clauses. One for
updating and one for inserting. Then at line 872 it is first executing the
prepared insert plan (dspam_insert_plan) and if that is failing it updates
stat.status and later on line 879 the code is testing if stat.status has a
certain value (set above if the insert is failing) and if that is the case
then it executes the prepared update plan (dspam_update_plan).

Ugly, ugly, ugly, .... but working. Would be cool if there would exist
something like "ON DUPLICATE ..." like in MySQL. But unfortunately that
does not exist. I could however write a PLPGSQL function doing what ON
DUPLICATE does but I don't know if I should do that? I see that the current
code has other own made functions which are only used if PostgreSQL >= 8 is
used. But according to the PostgreSQL documentation I see that the CREATE
FUNCTION is supported since 7.1 (I can't find older documentation on the
PostgreSQL home page).

However.... Even if using a own made function I think that the PostgreSQL
log would write a log entry about duplicate key. So I don't really see a
big benefit in using a custom made function. But I could be wrong. I see
that inside a function I could trap exceptions and maybe if I trap them
then they are not reported?

I am not a big specialist of PostgreSQL, so I don't know how those
exceptions behave in a function. Maybe they don't log anything to the log?

Do you maybe know if exceptions inside functions are logged?
Do you maybe know if you could filter those "uplicate key value violates
unique
constraint" out from being logged?


Kind Regards from Switzerland

Stevan Bajic

----------------------------------------------------------------------

Comment By: Mark Diekhans (diekhans)
Date: 2009-10-19 13:09

Message:
Hello Stevan,
Postgres 8.3.7 
oddly, there is no sql.errors file being created
the database was reinitialized to the empty state, not upgraded.
More syslog messages (all pretty must the same) and dspam.conf attached.
Please suggest things to try.  Happy to reset database.
Mark

----------------------------------------------------------------------

Comment By: Stevan Bajic (sbajic)
Date: 2009-10-19 12:20

Message:
Hallo Mark

It would be very helpful if you could post more information (longer log
messages). Especially the last lines of the file sql.errors would be
interesting. If possible then attach the file here in this bug report.

Could you attach as well dspam.conf?

What version of PostgreSQL are you using?

Is the data inside PostgreSQL new or have you taken the data from an older
DSPAM release? If from an older DSPAM release: have you executed the stuff
mentioned in UPGRADING?


Kind Regards from Switzerland

Stevan Bajic

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=1126467&aid=2881853&group_id=250683

------------------------------------------------------------------------------
This SF.Net email is sponsored by the Verizon Developer Community
Take advantage of Verizon's best-in-class app development support
A streamlined, 14 day to market process makes app distribution fast and easy
Join now and get one step closer to millions of Verizon customers
http://p.sf.net/sfu/verizon-dev2dev 
_______________________________________________
Dspam-devel mailing list
Dspam-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspam-devel

Reply via email to