Bug Tracker item #2881853, was opened at 2009-10-19 11:28 Message generated for change (Comment added) made by ktmgantt You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1126467&aid=2881853&group_id=250683
Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: daemon Group: v3.9.0 Status: Closed Resolution: Fixed Priority: 5 Private: No Submitted By: Mark Diekhans (diekhans) Assigned to: Stevan Bajic (sbajic) Summary: postgres duplicate key errors Initial Comment: Trying to train on around 10,000 message via email to spam@, results in a large number of errors like: postgres[74634]: [16-1] ERROR: duplicate key value violates unique constraint "dspam_token_data_uid_key" postgres[74634]: [16-2] STATEMENT: PREPARE dspam_update_plan (bigint) AS UPDATE dspam_token_data SET last_hit=CURRENT_DATE,spam_hits=spam_hits+1 WHERE postgres[74634]: [16-3] uid=1003 AND token=$1;PREPARE dspam_insert_plan (bigint,int,int) AS INSERT INTO dspam_token_data postgres[74634]: [16-4] (uid,token,spam_hits,innocent_hits,last_hit) VALUES (1003,$1,$2,$3,CURRENT_DATE); thanks. I can test changes.. ---------------------------------------------------------------------- Comment By: Ken (ktmgantt) Date: 2009-12-23 06:26 Message: The error is correct. It is a bit of a chicken and egg problem since if the uid/token pair is not already in the database, you need to do an INSERT but if it is you should do an UPDATE. The PostgreSQL documentation provides the following pl/PgSQL function with error management to perform the INSERT-UPDATE. Maybe using it would be worthwhile: Example 38-2. Exceptions with UPDATE/INSERT This example uses exception handling to perform either UPDATE or INSERT, as appropriate: CREATE TABLE db (a INT PRIMARY KEY, b TEXT); CREATE FUNCTION merge_db(key INT, data TEXT) RETURNS VOID AS $$ BEGIN LOOP -- first try to update the key UPDATE db SET b = data WHERE a = key; IF found THEN RETURN; END IF; -- not there, so try to insert the key -- if someone else inserts the same key concurrently, -- we could get a unique-key failure BEGIN INSERT INTO db(a,b) VALUES (key, data); RETURN; EXCEPTION WHEN unique_violation THEN -- do nothing, and loop to try the UPDATE again END; END LOOP; END; $$ LANGUAGE plpgsql; SELECT merge_db(1, 'david'); SELECT merge_db(1, 'dennis'); Regards, Ken ---------------------------------------------------------------------- Comment By: Stevan Bajic (sbajic) Date: 2009-12-23 02:26 Message: :) ---------------------------------------------------------------------- Comment By: Mark Diekhans (diekhans) Date: 2009-12-22 20:44 Message: Just tested the latest git repository and postgress is very well behaved. Thank you!!! ---------------------------------------------------------------------- Comment By: Stevan Bajic (sbajic) Date: 2009-12-18 18:25 Message: Hallo Mark I think one of the recent GIT submissions fixed that issue. Can you please check and let us know if you still have that issue? -- Kind Regards from Switzerland, Stevan Bajić ---------------------------------------------------------------------- Comment By: Mark Diekhans (diekhans) Date: 2009-10-19 18:07 Message: hi Stevan, Thank you! I was having a problem and assumed that the error message in the postgres log indicated the source of the problem. I will go back to square one tonight and see and try again. May I suggest that the resolution to this bug is to document that this is expected behavior? Cheers, Mark ---------------------------------------------------------------------- Comment By: Stevan Bajic (sbajic) Date: 2009-10-19 16:47 Message: Hallo Mark Looking at the code inside DSPAM I see that those errors are intentional there. It looks to me that DSPAM is first trying to INSERT a token and if that insert is failing then it is doing an UPDATE. And this mechanism is then producing all the above log entries. Look yourself in src/pgsql_drv.c at the function _ds_setall_spamrecords(). Starting at line 757 the code is preparing two SQL clauses. One for updating and one for inserting. Then at line 872 it is first executing the prepared insert plan (dspam_insert_plan) and if that is failing it updates stat.status and later on line 879 the code is testing if stat.status has a certain value (set above if the insert is failing) and if that is the case then it executes the prepared update plan (dspam_update_plan). Ugly, ugly, ugly, .... but working. Would be cool if there would exist something like "ON DUPLICATE ..." like in MySQL. But unfortunately that does not exist. I could however write a PLPGSQL function doing what ON DUPLICATE does but I don't know if I should do that? I see that the current code has other own made functions which are only used if PostgreSQL >= 8 is used. But according to the PostgreSQL documentation I see that the CREATE FUNCTION is supported since 7.1 (I can't find older documentation on the PostgreSQL home page). However.... Even if using a own made function I think that the PostgreSQL log would write a log entry about duplicate key. So I don't really see a big benefit in using a custom made function. But I could be wrong. I see that inside a function I could trap exceptions and maybe if I trap them then they are not reported? I am not a big specialist of PostgreSQL, so I don't know how those exceptions behave in a function. Maybe they don't log anything to the log? Do you maybe know if exceptions inside functions are logged? Do you maybe know if you could filter those "uplicate key value violates unique constraint" out from being logged? Kind Regards from Switzerland Stevan Bajic ---------------------------------------------------------------------- Comment By: Mark Diekhans (diekhans) Date: 2009-10-19 13:09 Message: Hello Stevan, Postgres 8.3.7 oddly, there is no sql.errors file being created the database was reinitialized to the empty state, not upgraded. More syslog messages (all pretty must the same) and dspam.conf attached. Please suggest things to try. Happy to reset database. Mark ---------------------------------------------------------------------- Comment By: Stevan Bajic (sbajic) Date: 2009-10-19 12:20 Message: Hallo Mark It would be very helpful if you could post more information (longer log messages). Especially the last lines of the file sql.errors would be interesting. If possible then attach the file here in this bug report. Could you attach as well dspam.conf? What version of PostgreSQL are you using? Is the data inside PostgreSQL new or have you taken the data from an older DSPAM release? If from an older DSPAM release: have you executed the stuff mentioned in UPGRADING? Kind Regards from Switzerland Stevan Bajic ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1126467&aid=2881853&group_id=250683 ------------------------------------------------------------------------------ This SF.Net email is sponsored by the Verizon Developer Community Take advantage of Verizon's best-in-class app development support A streamlined, 14 day to market process makes app distribution fast and easy Join now and get one step closer to millions of Verizon customers http://p.sf.net/sfu/verizon-dev2dev _______________________________________________ Dspam-devel mailing list Dspam-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dspam-devel