We're experiencing some occasional ('rare' might be the better term) deadlocks, which appear to be caused by Bucardo conflicting with the monolithic application we run that is constantly hitting the db. By occasional/rare I mean perhaps a few days will pass, we'll have a deadlock and the master process will abort/crash and restart. Obviously we'd prefer not to have the minute or so of downtime as monit discovers the outage and restarts it.

Is this a case where just increasing the kid_deadlock_sleep would help? Is there an upper bounds on the sleep that would tend to make things wonky? I currently have it set at one second.

error as reported by our application (sorry for the obfuscation, customer privacy issues):

Exec: UPDATE units SET last_contact=now(), last_event='2014/12/15 19:18:51 xxxxx uuuuu id:42 evid:3 position:xx.50' ,latitude=xx.xxxxxxxx, longitude=-yyy.yyyyyyyy, location_timestamp='2014/12/15 19:18:51 GMT' WHERE unit_id=x'zzzzzzzz'::bigint
execDB: ERROR:  deadlock detected
DETAIL: Process 20196 waits for RowExclusiveLock on relation 17212 of database 16389; blocked by process 29795. Process 29795 waits for AccessExclusiveLock on relation 108623566 of database 16389; blocked by process 16616. Process 16616 waits for AccessExclusiveLock on relation 17212 of database 16389; blocked by process 20196.
HINT:  See server log for query details.
CONTEXT: SQL statement "INSERT INTO int_unit_moves SELECT NEW.unit_id, NEW.longitude,NEW.latitude, 1, NEW.xxxx_id"
PL/pgSQL function "on_units_change" line 5 at SQL statement

And as reported by postgresql:
2014-12-15 11:19:03 PST ERROR:  deadlock detected
2014-12-15 11:19:03 PST DETAIL: Process 20196 waits for RowExclusiveLock on relation 17212 of database 16389; blocked by process 29795. Process 29795 waits for AccessExclusiveLock on relation 108623566 of database 16389; blocked by process 16616. Process 16616 waits for AccessExclusiveLock on relation 17212 of database 16389; blocked by process 20196. Process 20196: UPDATE units SET last_contact=now(), last_event='2014/12/15 19:18:51 xxxxxxx uuuuu id:42 evid:3 position:xx.50' ,latitude=xx.xxxxxxxx, longitude=-yy.yyyyyyyyy, location_timestamp='2014/1
2/15 19:18:51 GMT' WHERE unit_id=x'zzzzzzzzz'::bigint
    Process 29795: TRUNCATE int_unit_moves ;
    Process 16616: TRUNCATE TABLE public.int_unit_moves
2014-12-15 11:19:03 PST HINT:  See server log for query details.
2014-12-15 11:19:03 PST CONTEXT: SQL statement "INSERT INTO int_unit_moves SELECT NEW.unit_id, NEW.longitude,NEW.latitude, 1, NEW.xxxxx_id"
    PL/pgSQL function "on_units_change" line 5 at SQL statement
2014-12-15 11:19:03 PST STATEMENT: UPDATE units SET last_contact=now(), last_event='2014/12/15 19:18:51 xxxxx uuuuu id:42 evid:3 position:xx.50' ,latitude=xx.xxxxxxx, longitude=-yyy.yyyyyyyy, location_time
stamp='2014/12/15 19:18:51 GMT' WHERE unit_id=x'zzzzzzzzzz'::bigint
2014-12-15 11:19:03 PST ERROR:  deadlock detected
2014-12-15 11:19:03 PST DETAIL: Process 16616 waits for AccessExclusiveLock on relation 17212 of database 16389; blocked by process 29795. Process 29795 waits for AccessExclusiveLock on relation 108623566 of database 16389; blocked by process 16616.
    Process 16616: TRUNCATE TABLE public.int_unit_moves
    Process 29795: TRUNCATE int_unit_moves ;
2014-12-15 11:19:03 PST HINT:  See server log for query details.
2014-12-15 11:19:03 PST STATEMENT:  TRUNCATE TABLE public.int_unit_moves

Thanks in advance!!

--
Paul Theodoropoulos
www.anastrophe.com

_______________________________________________
Bucardo-general mailing list
[email protected]
https://mail.endcrypt.com/mailman/listinfo/bucardo-general

Reply via email to