On 05/06/2014 10:54 AM, Steve Singer wrote:


I see what is going on, based on the logs you sent (the ones the list
didn't like).


Node 4 is configured to use node 2 as the provider for the set.

Node 4 has the following in its event queue:

1,5000000111 SYNC
.
.
1,5000000118 FAILOVER_NODE

remoteWorker_1 on node 4 doesn't process the FAILOVER_NODE because it
can't get beyond the SYNC.  It can't get beyond the SYNC because its
provider for node 1 is node 2, which has gone offline.
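
To make the blocking concrete, here is a minimal sketch of an in-order event queue in C. It is a simplified model and not Slony's actual code: the names Event, process_event and drain_queue are hypothetical, and the only behaviour it assumes is what is described above (a SYNC needs the data provider, and later events wait behind it).

```c
#include <stdbool.h>

/* Hypothetical, simplified model of a remote worker's event queue.
 * None of these names come from the Slony source. */
typedef enum { EV_SYNC, EV_FAILOVER_NODE } EventType;

typedef struct {
    long long seqno;   /* event sequence number, e.g. 5000000111 */
    EventType type;
} Event;

/* Assumption: a SYNC can only be applied while the data provider is
 * reachable; a FAILOVER_NODE event does not need the provider. */
static bool process_event(const Event *ev, bool provider_online)
{
    if (ev->type == EV_SYNC && !provider_online)
        return false;              /* cannot connect to data provider */
    return true;
}

/* Process events strictly in order and stop at the first failure.
 * Returns how many events were processed. */
static int drain_queue(const Event *queue, int n, bool provider_online)
{
    int done = 0;
    while (done < n && process_event(&queue[done], provider_online))
        done++;
    return done;
}
```

With a queue holding SYNC 5000000111 followed by FAILOVER_NODE 5000000118, drain_queue returns 0 while the provider is offline, so the FAILOVER_NODE is never reached even though it does not itself need the provider.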


I *suspect* the attached patch might fix the issue, but I haven't yet done much testing with it.



2014-05-01_165630 BSTDEBUG2 remoteWorkerThread_1: SYNC 5000000111 processing
2014-05-01_165630 BSTERROR slon_connectdb: PQconnectdb("dbname=TEST host=localhost port=5433 user=slony") failed - could not connect to server: Connection refused
	Is the server running on host "localhost" (127.0.0.1) and accepting TCP/IP connections on port 5433?
2014-05-01_165630 BSTERROR remoteWorkerThread_1: cannot connect to data provider 2 on 'dbname=TEST host=localhost port=5433 user=slony'
2014-05-01_165630 BSTDEBUG2 remoteWorkerThread_1: rollback SYNC transaction
2014-05-01_165632 BSTERROR slon_connectdb: PQconnectdb("dbname=TEST host=localhost port=5434 user=slony") failed - could not connect to server: Connection refused
	Is the server running on host "localhost" (127.0.0.1) and accepting TCP/IP connections on port 5434?
2014-05-01_165632 BSTWARN remoteListenThread_3: DB connection failed - sleep 10 seconds



diff --git a/src/slonik/slonik.c b/src/slonik/slonik.c
new file mode 100644
index 176890d..84663e2
*** a/src/slonik/slonik.c
--- b/src/slonik/slonik.c
*************** slonik_failed_node(SlonikStmt_failed_nod
*** 2987,2993 ****
  					 "    on (sl_node.no_id=sl_failover_targets.backup_id "
  					 "        and set_origin=%d )"
  					 "    where no_id not in ( %s ) "
! 					 "    and backup_id not in ( %s ) "
  					 "    order by no_id; ",
  					 stmt->hdr.script->clustername,
  					 stmt->hdr.script->clustername,
--- 2987,2993 ----
  					 "    on (sl_node.no_id=sl_failover_targets.backup_id "
  					 "        and set_origin=%d )"
  					 "    where no_id not in ( %s ) "
! 					 "    and ( backup_id not in ( %s ) or backup_id is null) "
  					 "    order by no_id; ",
  					 stmt->hdr.script->clustername,
  					 stmt->hdr.script->clustername,
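
One reading of the change above: in SQL, "x not in ( ... )" evaluates to NULL rather than true when x is NULL, and a WHERE clause only keeps rows whose condition is true. So a row with a NULL backup_id would be dropped by the old condition and kept by the new one. The sketch below models that three-valued logic in C. It is an illustration only: the types and function names are hypothetical, and whether backup_id can actually be NULL here depends on the join, which the hunk does not show.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of SQL three-valued logic; not Slony code. */
typedef enum { SQL_FALSE, SQL_TRUE, SQL_NULL } SqlBool;

typedef struct {
    bool is_null;
    int  value;
} NullableInt;

/* x NOT IN (list): NULL if x is NULL, otherwise true/false.
 * The list itself is assumed to be non-empty and free of NULLs. */
static SqlBool sql_not_in(NullableInt x, const int *list, size_t n)
{
    if (x.is_null)
        return SQL_NULL;
    for (size_t i = 0; i < n; i++)
        if (list[i] == x.value)
            return SQL_FALSE;
    return SQL_TRUE;
}

static SqlBool sql_is_null(NullableInt x)
{
    return x.is_null ? SQL_TRUE : SQL_FALSE;
}

/* SQL OR: true wins, then NULL, then false. */
static SqlBool sql_or(SqlBool a, SqlBool b)
{
    if (a == SQL_TRUE || b == SQL_TRUE) return SQL_TRUE;
    if (a == SQL_NULL || b == SQL_NULL) return SQL_NULL;
    return SQL_FALSE;
}

/* Old condition: backup_id not in ( failed ) */
static bool old_filter_keeps(NullableInt backup_id, const int *failed, size_t n)
{
    return sql_not_in(backup_id, failed, n) == SQL_TRUE;
}

/* New condition: ( backup_id not in ( failed ) or backup_id is null ) */
static bool new_filter_keeps(NullableInt backup_id, const int *failed, size_t n)
{
    return sql_or(sql_not_in(backup_id, failed, n),
                  sql_is_null(backup_id)) == SQL_TRUE;
}
```

In this model, with a failed list of { 2 }, a row whose backup_id is NULL is rejected by old_filter_keeps and accepted by new_filter_keeps, while rows with a non-NULL backup_id are treated the same by both.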
