RE: zookeeper node can't join the cluster

2018-04-05 Thread Rashwan, Abderahman
> Hi,
>
> Could it be something to do with Proxmox containers?

It could be, but I tried VMs as well and got the same error.

> Which ZooKeeper version are you running?

Zookeeper version: 3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT

> Looks like you restarted zk01 and it was trying to connect to itself.
> (zk001/172.31.254.56:3888)
>
> Would you please attach your ZooKeeper config files too?

dataDir=/var/lib/zookeeper
clientPort=2181
maxClientCnxns=0
tickTime=2000
initLimit=5
syncLimit=2
server.1=zk001:2888:3888
server.2=zk002:2888:3888
server.3=zk003:2888:3888


/etc/hosts file
172.31.254.57 zk002
172.31.254.56 zk001
172.31.254.10 zk003
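
To compare a node against the config above, the server.N entries can be parsed and each election port probed from the affected host. This is only a minimal Python sketch: the names parse_servers, port_open and check_peers are made up for illustration, and the host:port:port layout is taken from the three server lines shown above.

```python
import socket


def parse_servers(cfg_text):
    """Return {server_id: (host, quorum_port, election_port)} from server.N lines."""
    servers = {}
    for line in cfg_text.splitlines():
        line = line.strip()
        if not line.startswith("server.") or "=" not in line:
            continue  # skip dataDir, clientPort, tickTime, etc.
        key, value = line.split("=", 1)
        server_id = int(key.split(".", 1)[1])
        host, quorum_port, election_port = value.strip().split(":")[:3]
        servers[server_id] = (host, int(quorum_port), int(election_port))
    return servers


def port_open(host, port, timeout=2.0):
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused, unreachable, timeout, unresolved host
        return False


def check_peers(cfg_text, timeout=2.0):
    """Return {server_id: bool} saying whether each election port accepts connections."""
    return {
        sid: port_open(host, election_port, timeout)
        for sid, (host, _quorum_port, election_port) in parse_servers(cfg_text).items()
    }
```

Running check_peers on the config text from each of the three nodes would show which election ports (3888 here) are reachable from where.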

> Regards,
> Andor






Re: zookeeper node can't join the cluster

2018-04-05 Thread Andor Molnar
Hi,

Could it be something to do with Proxmox containers?

Which ZooKeeper version are you running?
It looks like you restarted zk01 and it was trying to connect to itself
(zk001/172.31.254.56:3888).

Would you please attach your ZooKeeper config files too?

Regards,
Andor






zookeeper node can't join the cluster

2018-04-04 Thread Rashwan, Abderahman
Hello,

I have 2 servers. I installed Proxmox on both and created a cluster containing
6 Kafka nodes and 3 ZooKeeper nodes:
Server1: kafka1, kafka2, kafka3, zk1
Server2: kafka4, kafka5, kafka6, zk2
VM: zk3

When I shut down one server, for example server1 (kafka1, kafka2, kafka3,
zk1), and then power it back up, zk01 logs an error and can't join the cluster.
This is the error:

[2018-04-03 10:22:04,370] WARN Cannot open channel to 1 at election address zk001/172.31.254.56:3888 (org.apache.zookeeper.server.quorum.QuorumCnxManager)
java.net.ConnectException: Connection refused (Connection refused)
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:562)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.handleConnection(QuorumCnxManager.java:479)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.receiveConnection(QuorumCnxManager.java:379)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager$Listener.run(QuorumCnxManager.java:757)
[2018-04-03 10:22:04,370] INFO Resolved hostname: zk001 to address: zk001/172.31.254.56 (org.apache.zookeeper.server.quorum.QuorumPeer)
[2018-04-03 10:22:17,171] INFO Received connection request /172.31.254.56:58322 (org.apache.zookeeper.server.quorum.QuorumCnxManager)
[2018-04-03 10:22:17,172] WARN Cannot open channel to 1 at election address zk001/172.31.254.56:3888 (org.apache.zookeeper.server.quorum.QuorumCnxManager)
java.net.ConnectException: Connection refused (Connection refused)
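
Each WARN line above names the target server id and its election address. A small Python sketch for pulling those fields out of a longer log follows. The regex is an assumption based only on the format seen in this excerpt, and failed_channels is a made-up name. Whitespace is matched loosely so that lines wrapped by a mail client still match.

```python
import re

# Pattern inferred from the WARN lines in this excerpt only (an assumption,
# not a documented ZooKeeper log format).
WARN_RE = re.compile(
    r"Cannot\s+open\s+channel\s+to\s+(\d+)\s+at\s+election\s+address\s+"
    r"([^/\s]+)/\s*([\d.]+):(\d+)"
)


def failed_channels(log_text):
    """Return a list of (server_id, hostname, ip, port) for each WARN match."""
    return [
        (int(sid), host, ip, int(port))
        for sid, host, ip, port in WARN_RE.findall(log_text)
    ]
```

On the excerpt above, both WARN lines point at server id 1 on zk001/172.31.254.56, port 3888.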



When I restart the ZooKeeper service, it joins the cluster.

Also, when I start the ZooKeeper service 10 seconds after boot, it works.

What could be the cause?
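
One way to test whether this is a boot-timing problem is to replace the fixed 10-second delay with an explicit wait until the node's own address is usable. The Python sketch below rests on the assumption that the address is simply not ready at boot, which is not confirmed anywhere in this thread. The names address_is_local and wait_for are made up for illustration.

```python
import socket
import time


def address_is_local(host):
    """True if host resolves and a TCP socket can be bound to that address."""
    try:
        ip = socket.gethostbyname(host)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.bind((ip, 0))  # port 0: let the OS pick any free port
        return True
    except OSError:  # unresolved name or address not assigned to this host
        return False


def wait_for(check, timeout=60.0, interval=1.0,
             sleep=time.sleep, clock=time.monotonic):
    """Poll check() until it returns True or the timeout expires."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A start-up wrapper could call wait_for(lambda: address_is_local("zk001")) and launch the service only once it returns True. If the node then joins reliably, that points toward a timing issue rather than a config error.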

Abderahman Rashwan
Bell Network | SOC
Network Security Engineering|Cyber Security Analyst
T: (514) 870-7001 M: (514) 443-5820
C: abderahman.rash...@bell.ca