| Hi Alexis,

Do you think you can try with 3.3.2? If it is due to leader election, then we might have fixed it already.

-Flavio

On Jan 7, 2011, at 1:32 AM, Alexis Midon wrote:

Hi there,
I have a cluster of 3 machines running ZooKeeper 3.3.1. zk1 fails to join the quorum, while zk2 and zk3 interact correctly. zk1 is stuck in the election loop; see the log below. I checked the config files and the connectivity between the machines, and I can't find anything wrong.
Any ideas?
thanks in advance,
alexis
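For readers retracing the connectivity check mentioned above, here is a minimal Python sketch of one way to probe the election port of each peer from zk1. The hostnames `zk2` and `zk3`, the port 3888, and the function names are assumptions for illustration; the thread does not show the actual zoo.cfg, and a successful TCP connect only shows that the port is reachable, not that the election protocol is healthy.

```python
import socket

# Hypothetical peers: server id -> (host, election port).
# 3888 is the election port used in the ZooKeeper sample configuration;
# the real values must come from the server.N lines of zoo.cfg.
PEERS = {
    2: ("zk2", 3888),
    3: ("zk3", 3888),
}


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_election_ports(peers, timeout=2.0):
    """Map each server id to whether its election port accepted a connection."""
    return {
        sid: can_connect(host, port, timeout)
        for sid, (host, port) in peers.items()
    }
```

Running `check_election_ports(PEERS)` on zk1, and the equivalent on zk2 and zk3 with zk1 in the peer table, would show whether the ports are reachable in both directions.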
2011-01-07 00:14:23,156 - DEBUG [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorump...@551] - Initializing leader election protocol...
2011-01-07 00:14:23,157 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@649] - New election. My id = 1, Proposed zxid = 0
2011-01-07 00:14:23,158 - DEBUG [WorkerSender Thread:quorumcnxmana...@346] - Opening channel to server 2
2011-01-07 00:14:23,159 - DEBUG [WorkerReceiver Thread:fastleaderelection$messenger$workerrecei...@214] - Receive new notification message. My id = 1
2011-01-07 00:14:23,160 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@689] - Notification: 1, 0, 1, 1, LOOKING, LOOKING, 1
2011-01-07 00:14:23,160 - DEBUG [QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@495] - id: 1, proposed id: 1, zxid: 0, proposed zxid: 0
2011-01-07 00:14:23,161 - DEBUG [QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@717] - Adding vote: From = 1, Proposed leader = 1, Porposed zxid = 0, Proposed epoch = 1
2011-01-07 00:14:23,162 - INFO [WorkerSender Thread:quorumcnxmana...@162] - Have smaller server identifier, so dropping the connection: (2, 1)
2011-01-07 00:14:23,162 - DEBUG [WorkerSender Thread:quorumcnxmana...@346] - Opening channel to server 3
2011-01-07 00:14:23,172 - INFO [WorkerSender Thread:quorumcnxmana...@162] - Have smaller server identifier, so dropping the connection: (3, 1)
2011-01-07 00:14:23,365 - DEBUG [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorumcnxmana...@391] - Queue size: 1
2011-01-07 00:14:23,366 - DEBUG [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorumcnxmana...@391] - Queue size: 1
2011-01-07 00:14:23,366 - DEBUG [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorumcnxmana...@346] - Opening channel to server 2
2011-01-07 00:14:23,367 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorumcnxmana...@162] - Have smaller server identifier, so dropping the connection: (2, 1)
2011-01-07 00:14:23,367 - DEBUG [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorumcnxmana...@346] - Opening channel to server 3
2011-01-07 00:14:23,378 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorumcnxmana...@162] - Have smaller server identifier, so dropping the connection: (3, 1)
2011-01-07 00:14:23,378 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@683] - Notification time out: 400
2011-01-07 00:14:23,785 - DEBUG [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorumcnxmana...@391] - Queue size: 1
2011-01-07 00:14:23,785 - DEBUG [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorumcnxmana...@391] - Queue size: 1
2011-01-07 00:14:23,786 - DEBUG [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorumcnxmana...@346] - Opening channel to server 2
2011-01-07 00:14:26,786 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorumcnxmana...@162] - Have smaller server identifier, so dropping the connection: (2, 1)
...
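The log above repeats one pattern: zk1 opens a channel to a peer, then drops it with "Have smaller server identifier". As a rough aid for scanning a longer log, the Python sketch below counts those drops per peer and collects the notification timeouts. The function name `summarize` is hypothetical, and reading each pair as (peer id, my id) is an interpretation based on "My id = 1" in the log, not something the thread confirms. The sample lines are copied from the log above.

```python
import re

# Patterns for two message types that appear in the log excerpt.
DROP_RE = re.compile(
    r"Have smaller server identifier, so dropping the connection: "
    r"\((\d+), (\d+)\)"
)
TIMEOUT_RE = re.compile(r"Notification time out: (\d+)")


def summarize(log_text):
    """Return (drops, timeouts) extracted from ZooKeeper log text.

    drops maps each (first id, second id) pair from the message to the
    number of times it occurs; timeouts lists the reported values in order.
    """
    drops = {}
    for first, second in DROP_RE.findall(log_text):
        key = (int(first), int(second))
        drops[key] = drops.get(key, 0) + 1
    timeouts = [int(value) for value in TIMEOUT_RE.findall(log_text)]
    return drops, timeouts


SAMPLE = """\
2011-01-07 00:14:23,162 - INFO [WorkerSender Thread:quorumcnxmana...@162] - Have smaller server identifier, so dropping the connection: (2, 1)
2011-01-07 00:14:23,172 - INFO [WorkerSender Thread:quorumcnxmana...@162] - Have smaller server identifier, so dropping the connection: (3, 1)
2011-01-07 00:14:23,378 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@683] - Notification time out: 400
"""

drops, timeouts = summarize(SAMPLE)
print(drops)     # {(2, 1): 1, (3, 1): 1}
print(timeouts)  # [400]
```

A steadily growing drop count with no other notifications arriving would be consistent with zk1 never hearing back from zk2 and zk3, though the excerpt alone does not show what the other two servers logged.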
flavio junqueira
research scientist
direct +34 93-183-8828
avinguda diagonal 177, 8th floor, barcelona, 08018, es
phone (408) 349 3300 fax (408) 349 3301
|