Hello slony1 community,

We have a head-scratcher here. It appears a DROP NODE command was not fully processed. The command was issued and confirmed on all our nodes at approximately 2018-02-21 19:19:50 UTC. When we went to restore the node over two hours later, all replication stopped on an sl_event constraint violation. Investigation showed a leftover SYNC event for the dropped node, timestamped just a few seconds before the drop.

I believe this is a first for us. The DROP NODE command is supposed to remove all state for the dropped node, is that right? Is there a potential race condition somewhere that could leave this state behind?

Thanks in advance,
---- master log replication freeze error ----
2018-02-21 21:38:52 UTC [5775] ERROR remoteWorkerThread_8: "insert into "_ams_cluster".sl_event (ev_origin, ev_seqno, ev_timestamp, ev_snapshot, ev_type) values ('8', '5002075962', '2018-02-21 19:19:41.958719+00', '87044110:87044110:', 'SYNC'); insert into "_ams_cluster".sl_confirm (con_origin, con_received, con_seqno, con_timestamp) values (8, 1, '5002075962', now()); select "_ams_cluster".logApplySaveStats('_ams_cluster', 8, '0.139 s'::interval); commit transaction;" PGRES_FATAL_ERROR ERROR: duplicate key value violates unique constraint "sl_event-pkey"
DETAIL: Key (ev_origin, ev_seqno)=(8, 5002075962) already exists.
2018-02-21 21:38:52 UTC [13649] CONFIG slon: child terminated signal: 9; pid: 5775, current worker pid: 5775
2018-02-21 21:38:52 UTC [13649] CONFIG slon: restart of worker in 10 seconds
---- master log replication freeze error ----

---- master DB leftover event ----
a...@ams6.cmb.netmgmt:~$ psql -U akamai -d ams
psql (9.1.24)
Type "help" for help.
ams=# select * from sl_event_bak;
 ev_origin |  ev_seqno  |         ev_timestamp          |    ev_snapshot     | ev_type | ev_data1 | ev_data2 | ev_data3 | ev_data4 | ev_data5 | ev_data6 | ev_data7 | ev_data8
-----------+------------+-------------------------------+--------------------+---------+----------+----------+----------+----------+----------+----------+----------+----------
         8 | 5002075962 | 2018-02-21 19:19:41.958719+00 | 87044110:87044110: | SYNC    |          |          |          |          |          |          |          |
(1 row)

ams=#
---- master DB leftover event ----

---- master log drop node record ----
2018-02-21 19:19:50 UTC [22582] CONFIG disableNode: no_id=8
2018-02-21 19:19:50 UTC [22582] CONFIG storeListen: li_origin=4 li_receiver=1 li_provider=4
2018-02-21 19:19:50 UTC [22582] CONFIG storeListen: li_origin=7 li_receiver=1 li_provider=7
2018-02-21 19:19:50 UTC [22582] CONFIG storeListen: li_origin=3 li_receiver=1 li_provider=3
2018-02-21 19:19:50 UTC [22582] CONFIG remoteWorkerThread_4: update provider configuration
2018-02-21 19:19:50 UTC [22582] CONFIG remoteWorkerThread_4: connection for provider 4 terminated
2018-02-21 19:19:50 UTC [22582] CONFIG remoteWorkerThread_4: disconnecting from data provider 4
2018-02-21 19:19:50 UTC [22582] CONFIG remoteWorkerThread_4: connection for provider 7 terminated
---- master log drop node record ----

---- replica log drop node record ----
2018-02-21 19:19:51 UTC [22650] WARN remoteWorkerThread_1: got DROP NODE for local node ID
NOTICE: Slony-I: Please drop schema "_ams_cluster"
2018-02-21 19:19:53 UTC [22650] INFO remoteWorkerThread_7: SYNC 5001868819 done in 2.153 seconds
NOTICE: drop cascades to 243 other objects
DETAIL: drop cascades to table _ams_cluster.sl_node
drop cascades to table _ams_cluster.sl_nodelock
drop cascades to table _ams_cluster.sl_set
drop cascades to table _ams_cluster.sl_setsync
drop cascades to table _ams_cluster.sl_table
drop cascades to table _ams_cluster.sl_sequence
---- replica log drop node record ----

Tom ☺
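In the meantime, to unfreeze replication we are considering purging the leftover rows for the dropped node by hand. This is only a sketch of my own devising, not anything from the Slony docs, and it assumes the stale state for node 8 is confined to sl_event and sl_confirm on the node where it lingers (run with the local slon stopped):

```sql
-- Sketch (assumption, not a documented procedure): remove leftover
-- state for dropped node 8 in cluster schema "_ams_cluster".
BEGIN;
-- the orphaned SYNC event(s) originating from the dropped node
DELETE FROM "_ams_cluster".sl_event
 WHERE ev_origin = 8;
-- any confirmations originated by or addressed to the dropped node
DELETE FROM "_ams_cluster".sl_confirm
 WHERE con_origin = 8 OR con_received = 8;
COMMIT;
```

If the duplicate can reappear from another node's queue, this would only be a stop-gap until the underlying race is understood.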
_______________________________________________
Slony1-general mailing list
Slony1-general@lists.slony.info
http://lists.slony.info/mailman/listinfo/slony1-general