zookeeper node can't join the cluster

2018-04-04 Thread Rashwan, Abderahman
Hello,

I have 2 servers. I installed Proxmox on both and created a cluster containing
6 Kafka nodes and 3 ZooKeeper nodes:
Server1: kafka1, kafka2, kafka3,zk1
Server2: kafka4, kafka5, kafka6,zk2
VM: zk3

When I shut down one server, for example server1 (kafka1, kafka2, kafka3,
zk1), and then power it back up, zk1 reports an error and can't join the
cluster. This is the error I get:

[2018-04-03 10:22:04,370] WARN Cannot open channel to 1 at election address zk001/172.31.254.56:3888 (org.apache.zookeeper.server.quorum.QuorumCnxManager)
java.net.ConnectException: Connection refused (Connection refused)
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:562)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.handleConnection(QuorumCnxManager.java:479)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.receiveConnection(QuorumCnxManager.java:379)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager$Listener.run(QuorumCnxManager.java:757)
[2018-04-03 10:22:04,370] INFO Resolved hostname: zk001 to address: zk001/172.31.254.56 (org.apache.zookeeper.server.quorum.QuorumPeer)
[2018-04-03 10:22:17,171] INFO Received connection request /172.31.254.56:58322 (org.apache.zookeeper.server.quorum.QuorumCnxManager)
[2018-04-03 10:22:17,172] WARN Cannot open channel to 1 at election address zk001/172.31.254.56:3888 (org.apache.zookeeper.server.quorum.QuorumCnxManager)
java.net.ConnectException: Connection refused (Connection refused)



When I restart the ZooKeeper service, it joins the cluster.

It also works when I start the ZooKeeper service about 10 seconds after boot.

What could be the cause?
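
To narrow down the timing, it may help to watch when the election port starts accepting connections after boot. The following is only a diagnostic sketch, not a fix: the class name ElectionPortProbe and the one-second polling loop are made up for illustration, and the host name zk001 and port 3888 are taken from the log above. It says nothing about why the port is refused, only when it becomes reachable.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: polls a TCP port and reports whether it accepts connections.
public class ElectionPortProbe {

    // Returns true if a TCP connection to host:port succeeds within timeoutMs.
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers "Connection refused", timeouts, and unresolved host names.
            return false;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Defaults are taken from the log excerpt; override on the command line.
        String host = args.length > 0 ? args[0] : "zk001";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 3888;

        // Probe once per second for 30 seconds and print the result each time.
        for (int i = 0; i < 30; i++) {
            System.out.printf("t=%ds %s:%d reachable=%b%n",
                    i, host, port, canConnect(host, port, 1000));
            Thread.sleep(1000);
        }
    }
}
```

Running it right after boot on the affected node, and again from one of the other ZooKeeper nodes, would show whether the port is closed only during the first seconds or stays closed until the service is restarted.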

Abderahman Rashwan
Bell Network | SOC
Network Security Engineering|Cyber Security Analyst
T: (514) 870-7001 M: (514) 443-5820
C: abderahman.rash...@bell.ca



RE: [3.4.6] Ephemeral node not deleted after session is gone

2018-04-04 Thread Daniel Chan
Filed a bug at https://issues.apache.org/jira/browse/ZOOKEEPER-3018

Thanks,
Daniel

-Original Message-
From: Daniel Chan 
Sent: Tuesday, April 3, 2018 11:49 AM
To: user@zookeeper.apache.org
Subject: RE: [3.4.6] Ephemeral node not deleted after session is gone

Hi Andor,

Please see my replies and requested information inline.

Thanks,
Daniel

-Original Message-
From: Andor Molnar [mailto:an...@cloudera.com]
Sent: Tuesday, April 3, 2018 2:26 AM
To: user@zookeeper.apache.org
Subject: Re: [3.4.6] Ephemeral node not deleted after session is gone

There are a few questions on the original thread which might be useful to
answer here as well:

1) Why was the session closed: did the client close it, or did the cluster
expire it?
[Daniel Chan] In this case, the client got killed and we expected the cluster
to expire the session.

2) Which server was the session attached to - the first (44sec max
lat) or one of the others? Which server was the leader?
[Daniel Chan] The sessions creating the ephemeral nodes were attached to 
Server1 (443 max latency) while Server2 is the leader.

3) The znode exists on all 4 servers, is that right?
[Daniel Chan] The cluster has 2 members, not 4, and the ephemeral nodes are 
present on both servers.

It would also be useful to attach server logs related to the session
expiration, as well as LogFormatter output of the txn log files for the nodes.
[Daniel Chan] I only found these logs from Server1 related to the sessions 
(0x162183ea9f70002 and 0x162183ea9f70003):
2018-03-12 03:28:35,127 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /10.196.18.60:26775
2018-03-12 03:28:35,131 [myid:1] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@822] - Connection request from old client /10.196.18.60:26775; will be dropped if server is in r-o mode
2018-03-12 03:28:35,131 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@868] - Client attempting to establish new session at /10.196.18.60:26775
2018-03-12 03:28:35,137 [myid:1] - INFO  [CommitProcessor:1:ZooKeeperServer@617] - Established session 0x162183ea9f70002 with negotiated timeout 9000 for client /10.196.18.60:26775

2018-03-12 03:30:36,415 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /10.247.114.70:39260
2018-03-12 03:30:36,422 [myid:1] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@822] - Connection request from old client /10.247.114.70:39260; will be dropped if server is in r-o mode
2018-03-12 03:30:36,423 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@868] - Client attempting to establish new session at /10.247.114.70:39260
2018-03-12 03:30:36,428 [myid:1] - INFO  [CommitProcessor:1:ZooKeeperServer@617] - Established session 0x162183ea9f70003 with negotiated timeout 9000 for client /10.247.114.70:39260

2018-03-31 01:29:58,865 [myid:1] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /10.247.114.70:39260 which had sessionid 0x162183ea9f70003

Txn logs on the two ephemeral nodes /brokers/ids/707577499 and 
/brokers/ids/822712429:
3/11/18 8:28:35 PM PDT session 0x162183ea9f70002 cxid 0x6 zxid 0x1001b 
create '/brokers/ids,,v{s{31,s{'world,'anyone}}},F,1
3/11/18 8:28:35 PM PDT session 0x162183ea9f70002 cxid 0x2c zxid 0x10028 
create 
'/brokers/ids/707577499,#7b226a6d785f706f7274223a31303130332c2274696d657374616d70223a2231353230383235333135363931222c22686f7374223a22736c6331336e79692e75732e6f7261636c652e636f6d222c2276657273696f6e223a312c22706f7274223a393039327d,v{s{31,s{'world,'anyone}}},T,1
3/11/18 8:30:36 PM PDT session 0x162183ea9f70003 cxid 0x14 zxid 0x10030 
create 
'/brokers/ids/822712429,#7b226a6d785f706f7274223a31303130332c2274696d657374616d70223a2231353230383235343336393139222c22686f7374223a22736c6331336e796a2e75732e6f7261636c652e636f6d222c2276657273696f6e223a312c22706f7274223a393039327d,v{s{31,s{'world,'anyone}}},T,2
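
In the txn log output above, the znode data appears as '#' followed by hex digits. A minimal sketch for turning such a payload back into readable text follows, assuming the bytes are UTF-8 (the payloads above start with 7b22, i.e. {" in ASCII, which suggests JSON). The class name TxnDataDecoder is made up for illustration and is not part of ZooKeeper.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: decodes the '#<hex>' data field printed in txn log output.
public class TxnDataDecoder {

    // Converts "#7b22..." (the leading '#' is optional) into a UTF-8 string.
    static String decodeHex(String field) {
        String hex = field.startsWith("#") ? field.substring(1) : field;
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits");
        }
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            // Each pair of hex digits is one byte.
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Pass a full payload as an argument; the default is only the first
        // 17 bytes of the /brokers/ids/707577499 payload.
        String field = args.length > 0 ? args[0] : "#7b226a6d785f706f7274223a3130313033";
        System.out.println(decodeHex(field)); // default prints {"jmx_port":10103
    }
}
```

Passing the complete payload from either create entry as the argument prints the full data the broker registered under that znode.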

Regards,
Andor


On Tue, Apr 3, 2018 at 10:34 AM, Andor Molnar wrote:

> Hi Daniel,
>
> Thanks for the bug report.
> Interestingly, this issue should have been fixed ages ago:
> https://issues.apache.org/jira/browse/ZOOKEEPER-1208
>
> Regards,
> Andor
>
>
> On Tue, Apr 3, 2018 at 3:22 AM, Daniel Chan wrote:
>
>> We have a live ZooKeeper environment (quorum size is 2) and observed 
>> a strange behavior:
>> Kafka created 2 ephemeral nodes /brokers/ids/822712429 and
>> /brokers/ids/707577499 on 2018-03-12 03:30:36.933. The Kafka clients 
>> were long gone but as of today, the two ephemeral 

Is the current max packet length available via the API?

2018-04-04 Thread Shawn Heisey
Is it possible to get the current max packet length from the API?
(version 3.4.x)

If not, I'm guessing that I need to look for the jute.maxbuffer system
property and fall back to ZKClientConfig.CLIENT_MAX_PACKET_LENGTH_DEFAULT
if it's not defined.

What I'm trying to do is log a useful error message in Solr if somebody
tries to upload a file that's too big for what's allowed.  The error
that they get currently is not helpful, and figuring out what went wrong
seems to require looking at the server log.
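
A rough sketch of that fallback approach, using only the JDK so it does not depend on any particular ZooKeeper client class. The class and method names are made up for illustration, and the hard-coded 4 MB default is an assumption based on what current client code appears to use; it should be checked against the ZooKeeper version actually in use.

```java
// Hypothetical helper: resolves the client-side max packet length and
// builds a readable error message for oversized uploads.
public class MaxPacketLength {

    // Assumed default of 4 MB; verify against the ZooKeeper version in use.
    static final int ASSUMED_DEFAULT = 4096 * 1024;

    // Reads the jute.maxbuffer system property, falling back to the assumed default.
    static int resolve() {
        return Integer.getInteger("jute.maxbuffer", ASSUMED_DEFAULT);
    }

    // Returns an error message if sizeBytes exceeds limit, or null if it fits.
    static String checkSize(String name, long sizeBytes, int limit) {
        if (sizeBytes <= limit) {
            return null;
        }
        return String.format(
                "File '%s' is %d bytes, which exceeds the ZooKeeper max packet "
                + "length of %d bytes (see the jute.maxbuffer system property)",
                name, sizeBytes, limit);
    }

    public static void main(String[] args) {
        int limit = resolve();
        // "bigfile.xml" and its size are placeholder values.
        String error = checkSize("bigfile.xml", 5_000_000L, limit);
        System.out.println(error != null ? error : "size OK");
    }
}
```

Note that the limit applies to the whole packet rather than just the file contents, so a file very close to the limit could still be rejected; a real check would probably want to leave some headroom.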

Side note: I can see in current code (and the 3.5.2 programmer's guide)
that the default max packet length is 4MB, but the administrator's guide
(even the 3.5.3 version) still says 1MB.

Thanks,
Shawn