[ANNOUNCE] Apache ZooKeeper 3.5.9

2021-01-15 Thread Norbert Kalmar
The Apache ZooKeeper team is proud to announce Apache ZooKeeper version 3.5.9

ZooKeeper is a high-performance coordination service for distributed
applications. It exposes common services - such as naming,
configuration management, synchronization, and group services - in a
simple interface so you don't have to write them from scratch. You can
use it off-the-shelf to implement consensus, group management, leader
election, and presence protocols. And you can build on it for your
own, specific needs.

For ZooKeeper release details and downloads, visit:
https://zookeeper.apache.org/releases.html

ZooKeeper 3.5.9 Release Notes are at:
https://zookeeper.apache.org/doc/r3.5.9/releasenotes.html

We would like to thank the contributors that made the release possible.

Regards,
The ZooKeeper Team


Re: [ANNOUNCE] New ZooKeeper committer: Mate Szalay-Beko

2020-04-03 Thread Norbert Kalmar
Congratulations Máté, well deserved! :)

- Norbert

On Fri, Apr 3, 2020 at 10:42 AM Andor Molnar  wrote:

> The Apache ZooKeeper PMC recently extended committer karma to Mate and he
> has accepted.
> Mate has made some great contributions (including C client!) and we are
> looking forward to even more. :)
>
> Congratulations and welcome aboard, Mate!
>
>
>


Re: upgrade from 3.4.5 to 3.5.6

2020-03-24 Thread Norbert Kalmar
Hi,

That guide covers upgrading to 3.5.0, which was an alpha version. A lot
changed before the first stable release, 3.5.5, and a few more fixes followed;
some rolling upgrade issues were fixed as late as 3.5.6.
This is a more up-to-date guide:
https://cwiki.apache.org/confluence/display/ZOOKEEPER/Upgrade+FAQ

If you have done your testing (with a prod snapshot!), you can skip the
upgrade to the latest 3.4, but keep in mind that we make our recommendations
for a reason. Issues were reported and/or found during testing. Some are fixed
in 3.5.6; some only happen under certain conditions (IOException: No
snapshot found - mentioned in the guide, fixed in 3.5.6).

So it is up to you. I would still recommend upgrading to the latest 3.4 first,
if that's feasible.

Regards,
Norbert

On Tue, Mar 24, 2020 at 11:45 AM kuldeep singh 
wrote:

> Hi,
>
> Current Zookeeper version :- 3.4.5
> Upgraded version:- 3.5.6
>
> We are not going with 3.5.7. Our final decision is ZooKeeper version
> 3.5.6.
> As per your reply, we first need to move to the latest 3.4.x version, like
> below:
>
> 3.4.5 -> 3.4.14 -> 3.5.6 (Correct me if I am wrong here)
>
> But we did not face any problem with the setup I shared: a 3 node cluster
> where 2 nodes are on 3.5.6 and 1 node is on 3.4.5. Everything is running
> fine and we didn't hit any issue. So what other problems can we face if we
> move directly to 3.5.6?
>
> Thanks,
> -
> Kuldeep Singh Budania
> Software Architect
>
>
> On Tue, Mar 24, 2020 at 3:58 PM Enrico Olivelli 
> wrote:
>
> > Hi
> > You have to upgrade to the latest 3.4.x ZooKeeper first; then you can
> > upgrade to 3.5.7.
> > All should run well without issues
> >
> >
> > Enrico
> >
> > On Tue, Mar 24, 2020, 10:18 kuldeep singh wrote:
> >
> > > Hi Team,
> > >
> > > We are upgrading ZooKeeper from 3.4.5 to 3.5.6. I have set up a 3 node
> > > cluster where 2 nodes are on version 3.5.6 and 1 node is on 3.4.5.
> > >
> > > Everything is running fine and I didn't hit any issue on my system.
> > >
> > > But I found something on the Apache site saying that we first need to
> > > upgrade to 3.4.6, and then we can upgrade to 3.5.6. So is it mandatory to
> > > go to 3.4.6 first?
> > >
> > > *Upgrading to 3.5.0*
> > >
> > > Upgrading a running ZooKeeper ensemble to 3.5.0 should be done only after
> > > upgrading your ensemble to the 3.4.6 release. Note that this is only
> > > necessary for rolling upgrades (if you're fine with shutting down the
> > > system completely, you don't have to go through 3.4.6). If you attempt a
> > > rolling upgrade without going through 3.4.6 (for example from 3.4.5), you
> > > may get the following error:
> > >
> > > 2013-01-30 11:32:10,663 [myid:2] - INFO [localhost/127.0.0.1:2784:QuorumCnxManager$Listener@498] - Received connection request /127.0.0.1:60876
> > >
> > > 2013-01-30 11:32:10,663 [myid:2] - WARN [localhost/127.0.0.1:2784:QuorumCnxManager@349] - Invalid server id: -65536
> > >
> > > During a rolling upgrade, each server is taken down in turn and rebooted
> > > with the new 3.5.0 binaries. Before starting the server with 3.5.0
> > > binaries, we highly recommend updating the configuration file so that all
> > > server statements "server.x=..." contain client ports (see the section
> > > Specifying the client port). As explained earlier you may leave the
> > > configuration in a single file, as well as leave the
> > > clientPort/clientPortAddress statements (although if you specify client
> > > ports in the new format, these statements are now redundant).
> > >
> > > Could you please let me know about this case? I would appreciate a quick
> > > response.
> > >
> > > Thanks,
> > > -
> > > Kuldeep Singh Budania
> > >
> >
>


Re: question on ZAB protocol

2020-02-15 Thread Norbert Kalmar
Hi,

In this case A would not have confirmed the write to the client. Sending an
ACK means the follower has written the transaction to disk. The leader (in
this case A) still needs to send a COMMIT message to the followers.
It goes like this:
- LEADER(A) receives a write, so it creates a transaction and sends it to
all FOLLOWERs.
- FOLLOWERs receive the transaction and write it to disk (txnlog). They do
NOT apply it to the datatree yet.
- After writing to disk, FOLLOWERs send an ACK to LEADER(A). (Nothing at this
point is acknowledged to the client.)
- After LEADER(A) receives a quorum of ACKs, then, and only then, will it
apply the transaction to the datatree and send a COMMIT message to all
FOLLOWERs to do the same. It also ACKs to the client that the write is
complete. At this point the data sent by the client is saved in the txnlogs
of the quorum.
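
To make the sequence above concrete, here is a minimal sketch in plain Java. It is a toy model under stated assumptions, not ZooKeeper's code: the class and method names (ZabWriteSketch, Follower, propose, commit, write) are made up for illustration, the "txnlog" and "datatree" are just in-memory lists, and the leader is assumed to count its own logged copy towards the quorum.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the propose / ACK / COMMIT flow; names are hypothetical. */
class ZabWriteSketch {

    /** A follower that logs proposals before applying them. */
    static class Follower {
        final List<Long> txnLog = new ArrayList<>();   // stands in for the on-disk txnlog
        final List<Long> dataTree = new ArrayList<>(); // stands in for the in-memory datatree

        /** Log the proposal and ACK it; the datatree is NOT touched yet. */
        boolean propose(long zxid) {
            txnLog.add(zxid);
            return true; // ACK
        }

        /** COMMIT from the leader: only now is the transaction applied. */
        void commit(long zxid) {
            dataTree.add(zxid);
        }
    }

    /** Majority quorum: strictly more than half of the ensemble. */
    static boolean isQuorum(int acks, int ensembleSize) {
        return acks > ensembleSize / 2;
    }

    /**
     * Leader side of one write. Returns true only when the client may be told
     * that the write is complete, i.e. after a quorum has logged it.
     */
    static boolean write(long zxid, List<Follower> reachable, int ensembleSize) {
        int acks = 1; // assumption: the leader's own logged copy counts
        for (Follower f : reachable) {
            if (f.propose(zxid)) {
                acks++;
            }
        }
        if (!isQuorum(acks, ensembleSize)) {
            return false; // no COMMIT is sent, nothing is acknowledged to the client
        }
        for (Follower f : reachable) {
            f.commit(zxid);
        }
        return true; // ACK to the client
    }

    public static void main(String[] args) {
        List<Follower> two = List.of(new Follower(), new Follower());
        System.out.println(write(1L, two, 5));                     // leader + 2 followers of 5
        System.out.println(write(2L, List.of(new Follower()), 5)); // leader + 1 follower of 5
    }
}
```

In the second call the lone follower still has the transaction in its log even though nothing was committed, which is the situation the original question is about.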

Hope this helps,

Regards,
Norbert

On Sat, Feb 15, 2020 at 5:20 AM  wrote:

> How do you know A has sent the ACK to the client before it dies?
>
> Sent from my iPhone
>
> > On Feb 15, 2020, at 09:15, jonefeewang wrote:
> >
> > I also have the same question like this below:
> >
> >
> > let's say we have nodes A B C D E, now A is the leader
> >
> > A broadcasts <1,1>,  it reaches B, then A, B die, C D E elect someone,
> > the new system is going to throw away <1,1> since it does not know its
> > existence, right?
> >
> > start from scratch,
> > A broadcasts<1,1> , it reaches all, all send ACK to A, but A dies
> > before receiving the ACK, then BCDE elects someone, and the new leader
> > sees <1,1> in log, so it broadcasts <1,1> to BCDE, which all commit
> > it.  now if we look back, when A dies, the client should get a "write
> > failure", but now after the BCDE re-election, the written value does get
> > into the system ??? the client and the cluster have an inconsistent view ??
> >
> >
> >
> >
> >
> > --
> > Sent from: http://zookeeper-user.578899.n2.nabble.com/
>
>


[ANNOUNCE] Apache ZooKeeper 3.5.7

2020-02-15 Thread Norbert Kalmar
The Apache ZooKeeper team is proud to announce Apache ZooKeeper version
3.5.7

ZooKeeper is a high-performance coordination service for distributed
applications. It exposes common services - such as naming,
configuration management, synchronization, and group services - in a
simple interface so you don't have to write them from scratch. You can
use it off-the-shelf to implement consensus, group management, leader
election, and presence protocols. And you can build on it for your
own, specific needs.

For ZooKeeper release details and downloads, visit:
https://zookeeper.apache.org/releases.html

ZooKeeper 3.5.7 Release Notes are at:
https://zookeeper.apache.org/doc/r3.5.7/releasenotes.html

We would like to thank the contributors that made the release possible.

Regards,

The ZooKeeper Team


Re: [ANNOUNCE] Enrico Olivelli new ZooKeeper PMC Member

2020-01-22 Thread Norbert Kalmar
Congratulations Enrico, well earned! :)

Regards,
Norbert

On Tue, Jan 21, 2020 at 11:15 PM rammohan ganapavarapu <
rammohanga...@gmail.com> wrote:

> Congratulations Enrico!!
>
> On Tue, Jan 21, 2020 at 1:41 PM Flavio Junqueira  wrote:
>
> > I'm pleased to announce that Enrico Olivelli recently became the newest
> > ZooKeeper PMC member. Enrico has contributed immensely to this community;
> > he became a ZooKeeper committer in May 2019 and now he joins the PMC.
> >
> > Join me in congratulating him on the achievement. Congrats, Enrico!
> >
> > -Flavio on behalf of the Apache ZooKeeper PMC
>


Re: ZK makes apache 2019 "top 5" projects

2019-12-12 Thread Norbert Kalmar
Kudos to everyone!
Also, nice to see so many contributions. And not just veterans, but plenty
of new community members! Thank you all!

Regards,
Norbert

On Thu, Dec 12, 2019 at 2:00 PM Jordan Zimmerman 
wrote:

> Fantastic
>
> 
> Jordan Zimmerman
>
> > On Dec 12, 2019, at 3:49 AM, Flavio Junqueira  wrote:
> >
> > +1, thank you all for the hard work.
> >
> > -Flavio
> >
> >> On 12 Dec 2019, at 08:36, Enrico Olivelli  wrote:
> >>
> >> Yes, great.
> >>
> >> Please also note that Kafka and Lucene/Solr, which are still listed in
> >> that list, are using ZooKeeper :)
> >>
> >>
> >> Enrico
> >>
> >> On Thu, Dec 12, 2019, 05:46 tison wrote:
> >>
> >>> Kudos!
> >>>
> >>> Best,
> >>> tison.
> >>>
> >>>
> >>> Patrick Hunt wrote on Thu, Dec 12, 2019 at 11:32 AM:
> >>>
> >>>> This is really awesome, check it out:
> >>>> https://twitter.com/phunt/status/1204966326118141952
> >>>>
> >>>> Kudos ZooKeeper community on all the hard work and efforts!
> >>>>
> >>>> Patrick
> >>>>
> >>>
> >
>


Re: Experimental status of readonlymode feature

2019-10-07 Thread Norbert Kalmar
Hi Lewis,

I don't think the two are connected. The feature is not JVM-flag-only because
it's experimental; reading this parameter from the config simply hasn't been
implemented yet, experimental feature or not.
If you create an upstream ticket and, if you want to contribute, a PR (or
just wait until someone implements it), it will be available in the next
release (3.5.x, not 3.4).
I'm not aware of any rule that experimental features should not be
available from config.
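
To illustrate what "reading this parameter from config" could look like, here is a rough sketch in plain Java. It is not the actual ZooKeeper change: the class and method names are hypothetical, the key "readonlymode.enabled" is taken from the question below, and using the same key for the JVM system property is an assumption.

```java
import java.util.Properties;

/** Hypothetical resolver: config file value first, JVM system property as fallback. */
class ReadOnlyModeSettingSketch {
    // Key name taken from the question; reusing it for the system property is assumed.
    static final String KEY = "readonlymode.enabled";

    /** cfg stands in for the properties loaded from a zoo.cfg-style file. */
    static boolean isEnabled(Properties cfg) {
        String fromCfg = cfg.getProperty(KEY);
        if (fromCfg != null) {
            return Boolean.parseBoolean(fromCfg.trim());
        }
        // Boolean.getBoolean reads the system property, e.g. set with -Dreadonlymode.enabled=true
        return Boolean.getBoolean(KEY);
    }

    public static void main(String[] args) {
        Properties cfg = new Properties();
        System.out.println("no setting anywhere: " + isEnabled(cfg));
        cfg.setProperty(KEY, "true");
        System.out.println("set in config:       " + isEnabled(cfg));
    }
}
```

Keeping the system property as a fallback would mean existing deployments that already pass the JVM flag keep working.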

As for why it is experimental: honestly, I don't know. jute.maxbuffer is
also experimental, but it is widely used in production. This might be worth
a question on the dev list.

Regards,
Norbert


On Fri, Oct 4, 2019 at 3:40 AM Lewis Gardner  wrote:

> Hi,
>
> The readonlymode feature was added 8 years ago but is still marked as
> experimental and requires setting a JVM system property to enable it.
>
> What steps are required to promote this feature to "fully supported"
> status
> and allow enablement via the "readonlymode.enabled" setting in zoo.cfg?
>
> thanks,
> Lewis
>


Re: One node crashing in 3.4.11 triggered a full ensemble restart

2019-10-03 Thread Norbert Kalmar
Hi,

Here are the issues we encountered so far upgrading to 3.5.5 from 3.4:
https://cwiki.apache.org/confluence/display/ZOOKEEPER/Upgrade+FAQ

As Enrico mentioned, nothing similar so far. One is that no snapshot has been
taken yet; the other is that four-letter words need to be whitelisted.

As for running a mixed version of 3.5 and 3.4 quorum - I'm afraid it will
not work. From 3.5 we have a check on PROTOCOL_VERSION. 3.4 did not have
this protocol version, so when the nodes try to communicate it will throw
an exception. Plus, it is not a goal to keep quorum protocol backward
compatible, so chances are even without the check it would not work.
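
As a rough illustration of why two peers that disagree about a version marker cannot talk to each other, here is a toy sketch in plain Java. It is not the real quorum connection code: the class and method names are hypothetical, and both the marker value -65536 and the message layout (marker first, then server id) are assumptions made for the example.

```java
import java.nio.ByteBuffer;

/** Toy handshake: a "new" peer sends a version marker, an "old" peer does not. */
class HandshakeSketch {
    // Assumed marker value, for illustration only.
    static final long PROTOCOL_VERSION = -65536L;

    /** Old-style hello: just the sender's server id. */
    static byte[] oldStyleHello(long sid) {
        return ByteBuffer.allocate(Long.BYTES).putLong(sid).array();
    }

    /** New-style hello: version marker first, then the server id. */
    static byte[] newStyleHello(long sid) {
        return ByteBuffer.allocate(2 * Long.BYTES)
                .putLong(PROTOCOL_VERSION)
                .putLong(sid)
                .array();
    }

    /** Old-style reader: takes the first long as the server id. */
    static long oldStyleReadSid(byte[] hello) {
        long sid = ByteBuffer.wrap(hello).getLong();
        if (sid < 0) {
            throw new IllegalArgumentException("Invalid server id: " + sid);
        }
        return sid;
    }

    /** Strict new-style reader: insists on the version marker. */
    static long newStyleReadSid(byte[] hello) {
        ByteBuffer in = ByteBuffer.wrap(hello);
        long first = in.getLong();
        if (first != PROTOCOL_VERSION) {
            throw new IllegalArgumentException("Unexpected protocol version: " + first);
        }
        return in.getLong();
    }

    public static void main(String[] args) {
        try {
            oldStyleReadSid(newStyleHello(2L));
        } catch (IllegalArgumentException e) {
            System.out.println("old reader, new hello: " + e.getMessage());
        }
        try {
            newStyleReadSid(oldStyleHello(2L));
        } catch (IllegalArgumentException e) {
            System.out.println("new reader, old hello: " + e.getMessage());
        }
    }
}
```

In this toy model both directions fail; whether a real server tolerates the older format is a separate question that the sketch does not answer.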

Regards,
Norbert

On Thu, Oct 3, 2019 at 12:09 AM Enrico Olivelli  wrote:

> On Wed, Oct 2, 2019, 22:52 Jerry Hebert wrote:
>
> > Hi Enrico,
> >
> > The nodes that restarted did not have any errors in their logs, they
> > seemed to simply restart successfully so I think your hunch about the
> > external system is probably correct.
> >
> > Could you comment on my second question above regarding cross-version
> > migration or should I make a new thread?
> >
>
>
> I am not aware of any issue about an upgrade from 3.4 to 3.5 similar to
> your case. It is expected to work.
>
> Enrico
>
>
> > > Are you saying that a 3.5.5 node can synchronize with a 3.4.11
> > > ensemble? I wasn't sure if that would work or not. e.g., maybe I could
> > > bring up the new 3.5.5 ensemble and temporarily form a 10-node ensemble
> > > (five 3.4.11 nodes, five 3.5.5 nodes), let them sync and then kill off
> > > the old 3.4.11 boxes?
> >
> >
> > Thanks!
> > Jerry
> >
> > On Wed, Oct 2, 2019 at 1:12 PM Enrico Olivelli 
> > wrote:
> >
> > > Any particular error/stacktrace in the logs?
> > > If it is ZooKeeper that is killing itself, it should log it; otherwise it
> > > is some other external system. I am sorry, I don't know Exhibitor.
> > >
> > > Hope that helps
> > > Enrico
> > >
> > > On Wed, Oct 2, 2019, 21:40 Jerry Hebert wrote:
> > >
> > > > Hi Jörn,
> > > >
> > > > No, this was a very intermittent issue. We've been running this
> > > > ensemble for about four years now and have never seen this problem so
> > > > it seems to be super heisenbuggy. Our upgrade process will be more
> > > > involved than what you described (we're switching networks, instance
> > > > types, underlying automation and removing Exhibitor) but I'm glad you
> > > > asked because I have a question about that too. :)
> > > >
> > > > Are you saying that a 3.5.5 node can synchronize with a 3.4.11
> > > > ensemble? I wasn't sure if that would work or not. e.g., maybe I could
> > > > bring up the new 3.5.5 ensemble and temporarily form a 10-node ensemble
> > > > (five 3.4.11 nodes, five 3.5.5 nodes), let them sync and then kill off
> > > > the old 3.4.11 boxes?
> > > >
> > > > Thanks,
> > > > Jerry
> > > >
> > > > On Wed, Oct 2, 2019 at 12:29 PM Jörn Franke 
> > > wrote:
> > > >
> > > > > > Have you tried to stop the node, delete the data and log directory,
> > > > > > upgrade to 3.5.5, start the node and wait until it is synchronized?
> > > > > >
> > > > > > > On 02.10.2019 at 20:14, Jerry Hebert <jerry.heb...@gmail.com> wrote:
> > > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > > My first post here! I'm hoping you all might be able to offer some
> > > > > > > guidance or redirect me to an existing ticket. We have a five node
> > > > > > > ensemble on 3.4.11 that we're currently in the process of upgrading
> > > > > > > to 3.5.5. We recently saw some bizarre behavior in our ensemble that
> > > > > > > I was hoping to find some sort of pre-existing ticket or discussion
> > > > > > > about but I was having difficulty finding hits for this in Jira.
> > > > > > >
> > > > > > > The behavior that we saw from our metrics is that one of our nodes
> > > > > > > (not sure if it was a follower or a leader) started to demonstrate
> > > > > > > instability (high CPU, high RAM) and it crashed. Not a big deal, but
> > > > > > > as soon as it crashed, all of the other four nodes immediately
> > > > > > > restarted, resulting in a short outage. One node crashing should
> > > > > > > never cause an ensemble restart of course, so I assumed that this
> > > > > > > must be a bug in ZK. The nodes that restarted had no indication of
> > > > > > > errors in their logs, they just simply restarted. Does this sound
> > > > > > > familiar to any of you?
> > > > > > >
> > > > > > > Also, we are using Exhibitor on that ensemble so it's also possible
> > > > > > > that the restart was caused by Exhibitor.
> > > > > > >
> > > > > > > My hope is that this issue will be behind us once the 3.5.5 upgrade
> > > > > > > is complete but I'd ideally like to find some concrete evidence of
> > > > > > > this.
> > > > > > Thanks!
> > > > > > Jerry
> > > > >
> > > >
> > >
> >
>


Re: zookeeper cannot be started after cfg changed

2019-09-18 Thread Norbert Kalmar
Hi,

zkServer.sh starts ZooKeeper by calling the main class, which it can't find
on the classpath: ZOOMAIN="org.apache.zookeeper.server.quorum.QuorumPeerMain"
It looks for the binaries relative to its own path in the bin folder. (So
make sure you are calling the original script in bin, or linking to it.)

Something is mixed up and missing from your classpath.
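
One quick way to check this kind of problem is a tiny probe that asks the JVM whether the main class is visible. The sketch below uses only standard Java calls; the class name ClasspathCheckSketch is made up, and it only tells you something useful if you run it with the same classpath the start script ends up using.

```java
/** Probe: is a given class visible on the current classpath? */
class ClasspathCheckSketch {

    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate and load the class, do not run static initializers
            Class.forName(className, false, ClasspathCheckSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String mainClass = "org.apache.zookeeper.server.quorum.QuorumPeerMain";
        System.out.println(mainClass + " found: " + isOnClasspath(mainClass));
        // Print what the JVM actually received, to compare with the expected jar locations.
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
    }
}
```

If the probe prints false, the printed java.class.path shows which directories and jars were really searched.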

Regards,
Norbert

On Wed, Sep 18, 2019 at 5:12 AM Raymond Xie  wrote:

> Hello,
>
> I am using zookeeper 3.5.5
>
> I installed it in /opt
>
> I modified the zoo.cfg to be:
> # The number of milliseconds of each tick
> tickTime=2000
> initLimit=5
> syncLimit=2
> dataDir=/var/lib/zookeeperdata/1
> clientPort=2181
>
> server.1=pocnnr1n1:2888:3888
> server.2=pocdnr1n1:2889:3889
> server.3=pocdnr2n1:2890:3890
>
> *I was able to start it with:*
> [rxie@pocdnr1n1 apache-zookeeper-3.5.5]$ sudo bin/zkServer.sh start
> conf/zoo.cfg
>
> And then I realized I need to change the port from 2181 to 2182 for this is
> the second zk node, so I made the change.
>
> And then I am not able to start zookeeper, even after I reboot the server.
> output is below:
> ZooKeeper JMX enabled by default
> Using config: conf/zoo.cfg
> Starting zookeeper ... FAILED TO START
>
> Log shows:
> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/zookeeper/server/quorum/QuorumPeerMain
> Caused by: java.lang.ClassNotFoundException: org.apache.zookeeper.server.quorum.QuorumPeerMain
> at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
> Could not find the main class: org.apache.zookeeper.server.quorum.QuorumPeerMain.  Program will exit.
>
>
> Can anyone enlighten me why it was ok for the first time but can't be
> started after the port is updated to a non-default one? how can I fix it?
>
> Thank you very much.
>
>
>
>
> *Sincerely yours,*
>
>
> *Raymond*
>


Re: a misunderstanding of ZAB

2019-09-03 Thread Norbert Kalmar
Hi,

That's a good question. So if I understand correctly, you are asking what
the "last seen zxid" is when there is a new Leader Election in ZooKeeper. I
checked the ZAB protocol and it is not entirely clear to me either, but my
understanding is that the last seen zxid is the last transaction, which is
read from the txnlogs in case of a recovery. Honestly, there's nothing else
it could be read from. So even if the transaction hasn't been committed to
the datatree (which exists only in memory anyway, at least until a snapshot
is taken), it is still the last txn logged by one of the followers, so that
follower will win the Leader Election, and the other followers will get this
txn as well.
Anyone agree/disagree? :)
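
Here is a small sketch of that reasoning in plain Java. It is a simplification and not the real election code: the names are hypothetical, votes are compared only by last logged zxid with the server id as tie-breaker, and the zxid layout (epoch in the high 32 bits, counter in the low 32 bits) is assumed.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Toy election: the voter with the highest last logged zxid wins. */
class LastZxidSketch {

    /** A vote: who I am and the last zxid found in my txnlog. */
    static final class Vote {
        final long sid;
        final long lastZxid;

        Vote(long sid, long lastZxid) {
            this.sid = sid;
            this.lastZxid = lastZxid;
        }
    }

    /** Assumed layout: epoch in the high 32 bits, counter in the low 32 bits. */
    static long zxid(long epoch, long counter) {
        return (epoch << 32) | (counter & 0xffffffffL);
    }

    /** Highest last zxid wins; ties are broken by the higher server id. */
    static Vote elect(List<Vote> voters) {
        Comparator<Vote> byZxidThenSid =
                Comparator.comparingLong((Vote v) -> v.lastZxid).thenComparingLong(v -> v.sid);
        return Collections.max(voters, byZxidThenSid);
    }

    public static void main(String[] args) {
        long before = zxid(1, 0); // everything up to here is logged everywhere
        long p1 = zxid(1, 1);     // proposal that only F2 managed to log

        // F2 takes part in the election: it has the newest log, so it wins and p1 survives.
        List<Vote> withF2 = List.of(new Vote(2, p1), new Vote(3, before), new Vote(4, before));
        System.out.println("leader sid with F2:    " + elect(withF2).sid);

        // F2 is unreachable: nobody in the electing quorum has seen p1, so it is skipped.
        List<Vote> withoutF2 = List.of(new Vote(3, before), new Vote(4, before), new Vote(5, before));
        System.out.println("leader sid without F2: " + elect(withoutF2).sid);
    }
}
```

So in this model the new leader never needs a global view: it only knows what is in the logs of the quorum that elected it.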

Regards,
Norbert

On Mon, Sep 2, 2019 at 4:50 AM 121476...@qq.com <121476...@qq.com> wrote:

> Hi, I'm new to ZooKeeper, and this problem has confused me for nearly two
> months...
> Papers tell me that ZAB must satisfy:
> A message delivered by one server must be delivered on a quorum.
> A message skipped must always be skipped.
> Then consider the two cases below. L is short for leader, F is short for
> follower, p is short for proposal.
> Case1:
> L1 sends p1 to F2 F3 F4 F5.
> F2 F3 ack p1, reaching a quorum.
> L1 is about to send commit but fails...
> L2 becomes the new leader; it should commit p1.
>
> Case2:
> L1 sends p1 to F2 F3 F4 F5.
> Only F2 acks p1, not reaching a quorum.
> Then L1 fails...
> L2 becomes the new leader; it should skip p1.
>
> I think L2 should handle these cases in the election phase, but how can L2
> know the global state and decide whether to commit p1 or skip p1?
> If anyone can help, I would much appreciate it.
>
>
>
> 121476...@qq.com
>


Re: Migrate from 3.4.x to 3.5.5

2019-09-03 Thread Norbert Kalmar
In my opinion we should not fall back to something else if the client wants
a CONTAINER node, which is a new feature in 3.5.
That would make ZooKeeper behave unreliably: the client wants a CONTAINER
node and gets a PERSISTENT one. How should the client handle this?

For me, throwing an error seems more reasonable. And this doesn't break
backward compatibility: if you upgrade your client, you are still able
to do the things you did with the previous version, just not the new
feature stuff.
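
If an application really wants a fallback, it can own that decision itself instead of the library downgrading silently. A minimal sketch in plain Java follows. It uses stand-in types rather than the ZooKeeper client API: Mode, Server and createContainerOrPersistent are hypothetical names, and UnsupportedOperationException stands in for the exception a 3.4 server would answer with.

```java
import java.util.EnumSet;
import java.util.Set;

/** Stand-in types only; this is not the ZooKeeper client API. */
class ContainerFallbackSketch {

    enum Mode { PERSISTENT, CONTAINER }

    /** A server that implements only some create modes. */
    static final class Server {
        final Set<Mode> supported;

        Server(Set<Mode> supported) {
            this.supported = supported;
        }

        /** Fails loudly for an unknown mode instead of picking a different one. */
        Mode create(String path, Mode mode) {
            if (!supported.contains(mode)) {
                throw new UnsupportedOperationException(mode + " is not implemented by this server");
            }
            return mode;
        }
    }

    /** Opt-in fallback written by the caller, so the downgrade is never a surprise. */
    static Mode createContainerOrPersistent(Server server, String path) {
        try {
            return server.create(path, Mode.CONTAINER);
        } catch (UnsupportedOperationException e) {
            return server.create(path, Mode.PERSISTENT);
        }
    }

    public static void main(String[] args) {
        Server oldServer = new Server(EnumSet.of(Mode.PERSISTENT));
        Server newServer = new Server(EnumSet.allOf(Mode.class));
        System.out.println("old server: " + createContainerOrPersistent(oldServer, "/locks"));
        System.out.println("new server: " + createContainerOrPersistent(newServer, "/locks"));
    }
}
```

The point of the design is that a caller who did not ask for a fallback gets the error, and a caller who did ask for one knows exactly which node type it ended up with.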

Regards,
Norbert

On Tue, Sep 3, 2019 at 1:22 PM Zili Chen  wrote:

> Well, we cannot make any reasonable fallback from a previous version,
> but isn't it regarded as a breaking change?
>
> Best,
> tison.
>
>
> Zili Chen wrote on Tue, Sep 3, 2019 at 7:20 PM:
>
> > Hi here,
> >
> > If I connect a zk 3.5 client to quorums running zk 3.4, when I
> > try to create a node in CONTAINER mode the quorums will complain
> > with KeeperException.UnimplementedException.
> >
> > That is to say, as a client application, if I upgrade zk to 3.5 and then
> > make use of this new feature, we don't have a fallback to PERSISTENT or
> > something like that when the zk server version is 3.4; it just fails with
> > an exception. Is this expected? If so, it seems upgrading the client side
> > forces the user to upgrade the server side too.
> >
> > Best,
> > tison.
> >
> >
> > Zili Chen wrote on Mon, Aug 26, 2019 at 8:47 PM:
> >
> >> Thanks for your insight Andor!
> >>
> >> I'll checkout the page :-)
> >>
> >> Best,
> >> tison.
> >>
> >>
> >> Andor Molnar wrote on Mon, Aug 26, 2019 at 8:41 PM:
> >>
> >>> Hi Zili,
> >>>
> >>> There’s no migration guide available for 3.5, because it shouldn’t break
> >>> any existing functionality and there is no need to upgrade the database
> >>> either.
> >>>
> >>> I’ve created a wiki page to collect upgrade experiences from users, which
> >>> could give you some hints if you’re facing problems:
> >>> https://cwiki.apache.org/confluence/display/ZOOKEEPER/Upgrade+FAQ
> >>>
> >>> You can always drop an email here too to get help.
> >>>
> >>> Andor
> >>>
> >>>
> >>>
> >>> > On 2019. Aug 26., at 14:12, Zili Chen  wrote:
> >>> >
> >>> > In detail, in the Flink community we are trying to bump the ZooKeeper
> >>> > version from 3.4.10 to 3.5.5, but without an accurate idea of how it
> >>> > would break existing systems.
> >>> > Mainly we make use of the "client" of ZooKeeper.
> >>> >
> >>> >
> >>> > Zili Chen wrote on Mon, Aug 26, 2019 at 8:02 PM:
> >>> >
> >>> >> Hi,
> >>> >>
> >>> >> Is there any migration guide for potentially breaking changes and how
> >>> >> to deal with them?
> >>> >>
> >>> >> Best,
> >>> >> tison.
> >>> >>
> >>>
> >>>
>


Re: An Apache Zookeeper Security Vulnerability

2019-08-10 Thread Norbert Kalmar
Hello Xiaoqin,

My understanding is that log guards are used for performance reasons. I
don't see how they can prevent information leakage.
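
A small sketch of what a log guard actually changes, in plain Java. It swaps in java.util.logging for the logging API used in the quoted code so that it is self-contained (FINE plays the role of debug), and the class name and the messagesBuilt counter are made up for the demonstration.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Shows that a guard saves the cost of building the message, nothing more. */
class LogGuardSketch {
    static final Logger LOG = Logger.getLogger("LogGuardSketch");
    static int messagesBuilt = 0; // how often the message string was assembled

    static String message(int type, int err, String path) {
        messagesBuilt++;
        return "Ignoring processTxn failure hdr:" + type + ", error: " + err + ", path: " + path;
    }

    /** The string is always built, then dropped by the logger if FINE is off. */
    static void unguarded(int type, int err, String path) {
        LOG.fine(message(type, err, path));
    }

    /** The string is built only if FINE is on. */
    static void guarded(int type, int err, String path) {
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine(message(type, err, path));
        }
    }

    public static void main(String[] args) {
        LOG.setLevel(Level.INFO); // debug-level output disabled
        unguarded(1, -101, "/a");
        guarded(1, -101, "/a");
        System.out.println("messages built with debug off: " + messagesBuilt);
    }
}
```

With debug off neither variant writes anything, and with debug on both write the same line, so the guard has no effect on what ends up in the log.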

I'd also like to add: please use the security mailing list first if
you think you have found a CVE - secur...@zookeeper.apache.org
More info here:
https://zookeeper.apache.org/security.html

Thank you!

Regards,
Norbert

On Sat, Aug 10, 2019 at 1:31 AM Patrick Hunt  wrote:

> On Fri, Aug 9, 2019 at 9:34 AM Enrico Olivelli 
> wrote:
>
> > Those points do not seem a security issue
> >
> >
> Agree. First off the data is not sensitive. Also it's debug level and
> logged on the server. See
> https://issues.apache.org/jira/browse/ZOOKEEPER-3488 - similar situation
> although in this case debug is not the default - user would actively have
> to turn this on.
>
> Patrick
>
>
> >
> > Enrico
> >
> >
> > On Fri, Aug 9, 2019, 17:52 Fu, Xiaoqin wrote:
> >
> > > Dear developers:
> > >  I am a Ph.D. student at Washington State University. I applied a
> > > dynamic taint analyzer (distTaint) to Apache ZooKeeper (version 3.4.11)
> > > and found a security vulnerability, present in 3.4.11-3.4.14
> > > and 3.5.5, from tainted paths.
> > >
> > > Possible information leakage from FileTxnSnapLog to log without LOG
> > > control LOG.isDebugEnabled():
> > > In org.apache.zookeeper.server.persistence.FileTxnSnapLog, the statement
> > > LOG.debug doesn't have LOG controls:
> > > public void processTransaction(TxnHeader hdr,DataTree dt,
> > >         Map sessions, Record txn)
> > >         throws KeeperException.NoNodeException {
> > >     ..
> > >     if (rc.err != Code.OK.intValue()) {
> > >         LOG.debug("Ignoring processTxn failure hdr:" + hdr.getType()
> > >                 + ", error: " + rc.err + ", path: " + rc.path);
> > >     }
> > >     ..
> > > }
> > >
> > > Sensitive information about hdr type or rc path may be leaked. The
> > > conditional statement LOG.isDebugEnabled() should be added:
> > > public void processTransaction(TxnHeader hdr,DataTree dt,
> > >         Map sessions, Record txn)
> > >         throws KeeperException.NoNodeException {
> > >     ..
> > >     if (rc.err != Code.OK.intValue()) {
> > >         if (LOG.isDebugEnabled())
> > >             LOG.debug("Ignoring processTxn failure hdr:" + hdr.getType()
> > >                     + ", error: " + rc.err + ", path: " + rc.path);
> > >     }
> > >     ..
> > > }
> > > Please help me confirm it and give it a CVE ID.
> > >
> > > Thank you very much!
> > > Yours sincerely
> > > Xiaoqin Fu
> > >
> > >
> >
>


Re: Can SSL capability be satisfied by a smaller dependency than netty-all?

2019-08-05 Thread Norbert Kalmar
Thanks for bringing this up Shawn.

I also checked on my fork, netty-transport-native-epoll is the one actually
needed. But yeah, netty-all is overkill.
I created a jira:
https://issues.apache.org/jira/browse/ZOOKEEPER-3494

I will upload my PR soon.

Regards,
Norbert

On Fri, Aug 2, 2019 at 2:07 AM Michael Han  wrote:

> >> SSL capability can be satisfied by one of the smaller netty jars, rather
> >> than netty-all
>
> A brief look at the imports indicates that we might only need the handler
> and transport jars from Netty. I'd suggest creating a JIRA to request this
> change.
>
> On Tue, Jul 30, 2019 at 1:11 PM Shawn Heisey  wrote:
>
> > We neglected to notice that netty is a required dependency for ZK SSL
> > when we upgraded to ZK 3.5.5 in Solr.  We have an issue to track this:
> >
> > https://issues.apache.org/jira/browse/SOLR-13665
> >
> > I was noticing that the netty-all jar included in ZK is nearly 4MB ...
> > and we will have to include it twice in the Solr download because it is
> > needed for the SolrJ client as well as the Solr server.  The Solr
> > download is already quite large ... increasing it by another 7MB is
> > painful.
> >
> > I'm hoping that ZK's SSL capability can be satisfied by one of the
> > smaller netty jars, rather than netty-all.  Is that a question that can
> > be answered here on the ZK list?  The specific class that is mentioned
> > by the error is included in netty-transport.
> >
> > Thanks,
> > Shawn
> >
>


Re: log files not being cleaned up despite purgeInterval

2019-07-24 Thread Norbert Kalmar
1.7.25.jar:/zookeeper-3.4.13/bin/../lib/slf4j-api-1.7.25.jar:/zookeeper-3.4.13/bin/../lib/netty-3.10.6.Final.jar:/zookeeper-3.4.13/bin/../lib/log4j-1.2.17.jar:/zookeeper-3.4.13/bin/../lib/jline-0.9.94.jar:/zookeeper-3.4.13/bin/../lib/audience-annotations-0.5.0.jar:/zookeeper-3.4.13/bin/../zookeeper-3.4.13.jar:/zookeeper-3.4.13/bin/../src/java/lib/*.jar:/conf:'
> org.apache.zookeeper.server.PurgeTxnLog /data /data -n 3
>
> Changing logger to TRACE offers no output either.
>
>
>
>
> On Mon, Jul 22, 2019 at 10:13 AM Koen De Groote <
> koen.degro...@limecraft.com>
> wrote:
>
> > Performing "bash -ex ./zkCleanup.sh /data/version-2 -n 3" as root results
> > in the creation of another version-2 folder(empty) in the existing
> > version-2 folder.
> >
> > As both root and zookeeper user I am able to create files in the
> > /data/version-2 directory inside the container.
> >
> > The zookeeper user is indeed not the owner of anything in the zk/bin
> > folder(/zookeeper-3.4.13/bin). Executing zkCli.sh works, but creating a
> > file in there doesn't.
> >
> > Permission level for the folder seems to be 0755 on all files and the
> > folder itself.
> >
> > Just ran into what I think is the problem: the relative path to the
> > zoo.cfg file isn't correct.
> >
> > I tried running just plain "./zkCleanup.sh" as the zookeeper user from
> > within the folder and it printed that it could not find the zoo.cfg file,
> > but the path it printed was basically "current_dir/../expected_cfg_dir",
> > which is one ".." too little.
> >
> > Will check if this is due to a setting of mine.
> >
> > On Fri, Jul 19, 2019 at 2:23 PM Norbert Kalmar
> >  wrote:
> >
> >> I would first check the permissions on zkCleanup.sh and the bin folder.
> >> It sounds like the zookeeper user has no access to the /zk/bin directory.
> >> That might also explain why the files are not getting deleted by the zk
> >> instance.
> >>
> >> I'm not sure about this one, but did you try giving the full path to the
> >> txn log files, like bash -ex ./zkCleanup.sh /data/version-2 -n 3 as root?
> >> I think this script might be expecting the full path, including the
> >> version-2 directory.
> >>
> >> Regards,
> >> Norbert
> >>
> >> On Fri, Jul 19, 2019 at 2:00 PM Koen De Groote <
> >> koen.degro...@limecraft.com>
> >> wrote:
> >>
> >> > Hello Norbert,
> >> >
> >> > I've set up a new environment which then reached at least 4 *.log files.
> >> > All snapshots and log files are kept in /data/version-2/ (default for
> >> > the image)
> >> >
> >> > I went into the zookeeper container and executed:
> >> >
> >> > bash -ex ./zkCleanup.sh /data -n 3
> >> >
> >> > As root, this changes nothing. There are still 4 *.log files
> >> >
> >> > Changing to the zookeeper user, I get the following output:
> >> >
> >> > Path '/zookeeper-3.4.13/bin' does not exist.
> >> > Usage:
> >> > PurgeTxnLog dataLogDir [snapDir] -n count
> >> > dataLogDir -- path to the txn log directory
> >> > snapDir -- path to the snapshot directory
> >> > count -- the number of old snaps/logs you want to keep, value should be
> >> > greater than or equal to 3
> >> >
> >> > And the 4 *.log files still exist.
> >> > Also printing the usage, indicating, to me at least, that something
> >> > about the input is wrong, even though it is identical to the one used as
> >> > root, which did not result in this output.
> >> >
> >> > No actual error messages seem to be printed or logged anywhere.
> >> >
> >> > Not sure what to do next.
> >> >
> >> >
> >> >
> >> > On Fri, Jul 19, 2019 at 11:01 AM Norbert Kalmar
> >> >  wrote:
> >> >
> >> > > Hi Koen,
> >> > >
> >> > > It should do just as you said. You can also set
> >> > > autopurge.snapRetainCount; by default it is set to 3, so if you didn't
> >> > > set anything, that is not a reason to keep old logs.
> >> > >
> >> > > As a plan B you could use zkCleanup.sh [snapshotDir] -n 3 to delete
> >> > > all except the last 3 log files. You can add this to a cron job.
> >> > >
> >> > > As for why the old log files are not getting deleted, it could be
> >> > > something related to the docker image, maybe a permission problem? Do
> >> > > you see any errors in the server log?
> >> > >
> >> > > Regards,
> >> > > Norbert
> >> > >
> >> > > On Thu, Jul 18, 2019 at 9:25 PM Koen De Groote <
> >> > > koen.degro...@limecraft.com>
> >> > > wrote:
> >> > >
> >> > > > Greetings,
> >> > > >
> >> > > > Working with Zookeeper version 3.4.13 in the official docker image.
> >> > > >
> >> > > > I was under the impression that the setting
> >> > > > "autopurge.purgeInterval=1" meant that log files would be cleaned up
> >> > > > every hour.
> >> > > >
> >> > > > Instead, I now find that months of these files are just sitting in
> >> > > > their directory, untouched.
> >> > > >
> >> > > > So perhaps I'm wrong about that, but I'm not sure.
> >> > > >
> >> > > > What I wish to achieve is that these log files stop accumulating and
> >> > > > keep only the most recent. Is there a way to achieve this? Or are
> >> > > > they merely historical and can they be deleted freely?
> >> > > >
> >> > > > Kind regards,
> >> > > > Koen De Groote
> >> > > >
> >> > >
> >> >
> >>
> >
>


Re: log files not being cleaned up despite purgeInterval

2019-07-19 Thread Norbert Kalmar
I would first check the permissions on zkCleanup.sh and the bin folder.
It sounds like the zookeeper user has no access to the /zk/bin directory.
That might also explain why the files are not getting deleted by the zk
instance.

I'm not sure about this one, but did you try giving the full path to the
txn log files, like bash -ex ./zkCleanup.sh /data/version-2 -n 3 as root?
I think this script might be expecting the full path, including the
version-2 directory.

Regards,
Norbert

On Fri, Jul 19, 2019 at 2:00 PM Koen De Groote 
wrote:

> Hello Norbert,
>
> I've set up a new environment which then reached at least 4 *.log files
> All snapshots and log files are kept in /data/version-2/(default for the
> image)
>
> I went into the zookeeper container and executed:
>
> bash -ex ./zkCleanup.sh /data -n 3
>
> As root, this changes nothing. There are still 4 *.log files
>
> Changing to the zookeeper user, I get the following output:
>
> Path '/zookeeper-3.4.13/bin' does not exist.
> Usage:
> PurgeTxnLog dataLogDir [snapDir] -n count
> dataLogDir -- path to the txn log directory
> snapDir -- path to the snapshot directory
> count -- the number of old snaps/logs you want to keep, value should be
> greater than or equal to 3
>
> And the 4 *.log files still exist.
> Also printing the usage, indicating, to me at least, that something about
> the input is wrong, even though it is identical to the one used as root,
> which did not result in this output.
>
> No actual error messages seem to be printed or logged anywhere.
>
> Not sure what to do next.
>
>
>
> On Fri, Jul 19, 2019 at 11:01 AM Norbert Kalmar
>  wrote:
>
> > Hi Koen,
> >
> > It should do just as you said. You can also set
> autopurge.snapRetainCount,
> > by default it is set to 3, so if you didn't set anything it is not a
> reason
> > to keep old logs.
> >
> > As a plan B you could use zkCleanup.sh [snapshotDir] -n 3 to delete all
> > except the last 3 log files. You can add this to a cron job.
> >
> > As for why the old log files not getting deleted, could be something
> > related to the docker image, maybe a permission problem? Do you see any
> > errors in the server log?
> >
> > Regards,
> > Norbert
> >
> > On Thu, Jul 18, 2019 at 9:25 PM Koen De Groote <
> > koen.degro...@limecraft.com>
> > wrote:
> >
> > > Greetings,
> > >
> > > Working with Zookeeper version 3.4.13 in the official docker image.
> > >
> > > I was under the impression that the setting "autopurge.purgeInterval=1"
> > > meant that log files would be cleaned up every hour.
> > >
> > > Instead, I now find that months of these files are just sitting in
> their
> > > directory, untouched.
> > >
> > > So perhaps I'm wrong about that, but I'm not sure.
> > >
> > > What I wish to achieve is that these log files stop accumulating and
> keep
> > > only the most recent. Is there a way to achieve this? Or are they
> merely
> > > historical and can they be deleted freely?
> > >
> > > Kind regards,
> > > Koen De Groote
> > >
> >
>


Re: log files not being cleaned up despite purgeInterval

2019-07-19 Thread Norbert Kalmar
Hi Koen,

It should do just as you said. You can also set autopurge.snapRetainCount;
by default it is set to 3, so if you didn't set anything, that is not a
reason to keep old logs.

As a plan B you could use zkCleanup.sh [snapshotDir] -n 3 to delete all
except the last 3 log files. You can add this to a cron job.
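
If you want to see what such a purge would keep before running it, here is
a minimal Python sketch. It is only a simplified model of the retention
rule, not the actual PurgeTxnLog code: it assumes the usual file names
snapshot.<hex zxid> and log.<hex zxid>, keeps the newest N snapshots, and
keeps every txn log starting from the one that could still contain
transactions for the oldest retained snapshot. plan_purge and zxid_of are
made-up helper names.

```python
def zxid_of(name):
    """Return the zxid encoded as the hex suffix of 'snapshot.<hex>' or 'log.<hex>'."""
    return int(name.rsplit(".", 1)[1], 16)


def plan_purge(filenames, keep=3):
    """Split file names into (retained, purgeable) under a simplified model.

    Assumption: the newest `keep` snapshots are retained, together with all
    txn logs starting at the newest log whose zxid is <= the zxid of the
    oldest retained snapshot. Unrelated files are always retained.
    """
    snaps = sorted((f for f in filenames if f.startswith("snapshot.")), key=zxid_of)
    logs = sorted((f for f in filenames if f.startswith("log.")), key=zxid_of)
    others = [f for f in filenames if f not in snaps and f not in logs]

    if not snaps:
        # Without any snapshot this sketch refuses to purge anything.
        return sorted(filenames), []

    kept_snaps = snaps[-keep:]
    oldest = zxid_of(kept_snaps[0])

    # The newest log that starts at or before the oldest retained snapshot
    # may still hold transactions that are needed on top of that snapshot.
    older_logs = [f for f in logs if zxid_of(f) <= oldest]
    start = zxid_of(older_logs[-1]) if older_logs else 0
    kept_logs = [f for f in logs if zxid_of(f) >= start]

    retained = kept_snaps + kept_logs + others
    purgeable = [f for f in snaps + logs if f not in retained]
    return retained, purgeable
```

For the files in /data/version-2 you could pass os.listdir("/data/version-2")
as the input. Nothing is deleted; the function only returns two lists.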

As for why the old log files are not getting deleted, it could be something
related to the docker image, maybe a permission problem? Do you see any
errors in the server log?

Regards,
Norbert

On Thu, Jul 18, 2019 at 9:25 PM Koen De Groote 
wrote:

> Greetings,
>
> Working with Zookeeper version 3.4.13 in the official docker image.
>
> I was under the impression that the setting "autopurge.purgeInterval=1"
> meant that log files would be cleaned up every hour.
>
> Instead, I now find that months of these files are just sitting in their
> directory, untouched.
>
> So perhaps I'm wrong about that, but I'm not sure.
>
> What I wish to achieve is that these log files stop accumulating and keep
> only the most recent. Is there a way to achieve this? Or are they merely
> historical and can they be deleted freely?
>
> Kind regards,
> Koen De Groote
>


Re: Zookeeper latency calculation

2019-07-17 Thread Norbert Kalmar
Hi Ram,

ZooKeeper is very fast if deployed according to recommendations (nodes on
the same network, directly connected). It's possible it gives 0 latency on
avg, although usually it's a bit higher.
I can recommend Patrick's smoke test if you want to test performance,
especially zk-smoketest and zk-latencies.py. It has a good readme:
https://github.com/phunt/zk-smoketest

But the "mntr" command is pretty much the easiest tool out of the box,
especially if you are on 3.4.x.
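
If you want to read the mntr values from a script, a small Python sketch
could look like the one below. It assumes the server answers four-letter
commands on the client port (2181 here, adjust to your setup) and that each
output line is a key followed by whitespace and a value. As far as I know
the latency values are in milliseconds, so a sub-millisecond average can
legitimately show up as 0. fetch_mntr and parse_mntr are made-up helper
names, not part of any ZooKeeper client library.

```python
import socket


def parse_mntr(text):
    """Turn the text returned by 'mntr' into a dict of key -> value strings."""
    stats = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # key, then the rest of the line
        if len(parts) == 2:
            stats[parts[0]] = parts[1].strip()
    return stats


def fetch_mntr(host="localhost", port=2181, timeout=5.0):
    """Send the 'mntr' four-letter command and return the parsed stats."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"mntr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when done
                break
            chunks.append(data)
    return parse_mntr(b"".join(chunks).decode("utf-8", "replace"))
```

Calling fetch_mntr()["zk_avg_latency"] every few seconds and plotting it
should show whether the average really stays at 0 under your load.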

Regards,
Norbert

On Tue, Jul 16, 2019 at 7:05 PM rammohan ganapavarapu <
rammohanga...@gmail.com> wrote:

> Hi,
>
> I am trying to understand how zookeeper latency calculated, mntr command
> always give avg_latency "0", can some one help how to calculate avg request
> latency in zookeeper?
>
>
> Thanks,
> Ram
>


Re: snapshots are creating in zookeeper

2019-06-26 Thread Norbert Kalmar
Hi Srikanth,

It should be in dataDir.
The default snapCount is 100,000. Did you have enough traffic to warrant a
snapshot being created?
Are the txn logs there in the dataLogDir?

Regards,
Norbert

On Wed, Jun 26, 2019 at 12:06 AM Srikanth Pippari
 wrote:

> Hello,
>
> I have installed zookeeper-3.4.14.tar.gz in my server but I don't see
> snapshots are created under data directory. Below is my configuration. Can
> someone help me to find out the issue..
>
> # The number of milliseconds of each tick
> tickTime=2000
> # The number of ticks that the initial
> # synchronization phase can take
> initLimit=10
> # The number of ticks that can pass between
> # sending a request and getting an acknowledgement
> syncLimit=5
> # the directory where the snapshot is stored.
> # do not use /tmp for storage, /tmp here is just
> # example sakes.
> dataDir=/home/ec2-user/zookeeper
> dataLogDir=/var/log/zookeeper
> # the port at which the clients will connect
> clientPort=2181
> # the maximum number of client connections.
> # increase this if you need to handle more clients
> maxClientCnxns=60
> #
> # Be sure to read the maintenance section of the
> # administrator guide before turning on autopurge.
> #
> #
> http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
> #
> # The number of snapshots to retain in dataDir
> #autopurge.snapRetainCount=3
> # Purge task interval in hours
> # Set to "0" to disable auto purge feature
> autopurge.purgeInterval=1
>
> Thanks & Regards
> --
> Srikanth Pippari  | V3OPS team.
> Email ID : spipp...@vitechinc.com
> ---
>
>
> This e-mail message and any files transmitted with it may contain
> confidential and proprietary information and are intended solely for the
> use of the individual or entity to which they are addressed. Any
> unauthorized review, use, disclosure or distribution is strictly
> prohibited. If you have received this e-mail in error please notify the
> sender by reply email and destroy all copies of the original message. Thank
> you for your cooperation.
>


Re: zookeeper services are not up when install using helm

2019-06-13 Thread Norbert Kalmar
Hi,

Looks like the port for JMX - 1099 - is already used by another process.
Just change the port, or disable remote JMX (which is a good thing to do
security wise BTW).
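
To confirm that something is already listening on 1099 before changing
anything, a quick check like the Python sketch below can help. It simply
tries to bind the port and treats a failure as "in use"; port_in_use is a
made-up helper name, and it has to run in the same network namespace (here:
the same pod) as the ZooKeeper process.

```python
import socket


def port_in_use(port, host="0.0.0.0"):
    """Return True if binding (host, port) fails, i.e. the port is taken.

    Note: for ports below 1024 a bind can also fail for lack of privileges;
    that does not apply to 1099.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return True
    return False
```

If port_in_use(1099) returns True, either pick a different JMX port or
turn remote JMX off.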

Regards,
Norbert

On Thu, Jun 13, 2019 at 6:53 PM Taranisen Mohanta 
wrote:

> I am trying to install the incubator/zookeeper using helm chart on the AWS
> EKS cluster.
> The zookeeper services are not starting on the pods
> Anyone can help me on this.
> $ kubectl get pods
> NAME             READY   STATUS    RESTARTS   AGE
> zm-zookeeper-0   1/1     Running   0          18m
> zm-zookeeper-1   1/1     Running   0          17m
> zm-zookeeper-2   1/1     Running   0          16m
> $ kubectl get service
> NAME                    TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
> kubernetes              ClusterIP   10.100.0.1                    443/TCP                      7d3h
> zm-zookeeper            ClusterIP   10.100.52.213                 2181/TCP                     18m
> zm-zookeeper-headless   ClusterIP   None                          2181/TCP,3888/TCP,2888/TCP   18m
> $ kubectl exec zm-zookeeper-2 -- /opt/zookeeper/bin/zkServer.sh
> start-foreground;
> ZooKeeper JMX enabled by default
> ZooKeeper remote JMX Port set to 1099
> ZooKeeper remote JMX authenticate set to false
> ZooKeeper remote JMX ssl set to false
> ZooKeeper remote JMX log4j set to true
> Using config: /opt/zookeeper/bin/../conf/zoo.cfg
> Error: Exception thrown by the agent : java.rmi.server.ExportException:
> Port already in use: 1099; nested exception is:
> java.net.BindException: Address already in use (Bind failed)
> command terminated with exit code 1
>


Re: [ANNOUNCE] Apache ZooKeeper 3.5.5

2019-05-21 Thread Norbert Kalmar
Congratulations to all of you!
Thanks for making this release Andor!


On Tue, May 21, 2019 at 4:18 AM Zili Chen  wrote:

> Congratulations!
>
> On Tue, May 21, 2019 at 7:25 AM rammohan ganapavarapu wrote:
>
> > Congratulations, finally it's out 
> >
> > On Mon, May 20, 2019, 11:59 AM Enrico Olivelli 
> > wrote:
> >
> > > Congratulations!
> > >
> > > Enrico
> > >
> > > On Mon, May 20, 2019, 19:28 Lars Francke wrote:
> > >
> > > > Congratulations on this release! It looks great and I'm looking
> forward
> > > to
> > > > using all those new features.
> > > >
> > > > Thank you, everyone, for your work on this.
> > > >
> > > > On Mon, May 20, 2019 at 7:06 PM Andor Molnar 
> wrote:
> > > >
> > > > > The Apache ZooKeeper team is proud to announce Apache ZooKeeper
> > version
> > > > > 3.5.5
> > > > >
> > > > > ZooKeeper is a high-performance coordination service for
> distributed
> > > > > applications. It exposes common services - such as naming,
> > > > > configuration management, synchronization, and group services - in
> a
> > > > > simple interface so you don't have to write them from scratch. You
> > can
> > > > > use it off-the-shelf to implement consensus, group management,
> leader
> > > > > election, and presence protocols. And you can build on it for your
> > > > > own, specific needs.
> > > > >
> > > > > For ZooKeeper release details and downloads, visit:
> > > > > https://zookeeper.apache.org/releases.html
> > > > >
> > > > > ZooKeeper 3.5.5 Release Notes are at:
> > > > > https://zookeeper.apache.org/doc/r3.5.5/releasenotes.html
> > > > >
> > > > > We would like to thank the contributors that made the release
> > possible.
> > > > >
> > > > > Regards,
> > > > >
> > > > > The ZooKeeper Team
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
>


Re: Unexpected delay between pings sent from the client to server

2019-04-06 Thread Norbert Kalmar
Hi Gelbana,

max_latency tells you the time elapsed between creating the request and
FinalRequestProcessor processing it. So the cause of it being that high
could be basically anything.
Turning on debug logging for the ZooKeeper server could help pinpoint at
what point the request gets stuck for so long.

Regards,
Norbert

On Wed, Apr 3, 2019 at 3:07 PM Muhammad Gelbana  wrote:

> Another couple of things I found:
>
> *A couple of Zookeeper client threads are stuck at these stacktraces for
> ~30 seconds*
> "pool-2-thread-1-EventThread" #1218 daemon prio=5 os_prio=0
> tid=0x7ff3f5e23800 nid=0x5cd8 waiting on condition [0x7ff3ef803000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x00018d6d8ed8> (a
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at
>
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:504)
>
> "pool-2-thread-1-SendThread(72.55.136.25:2181)" #1217 daemon prio=5
> os_prio=0 tid=0x7ff3f5e23000 nid=0x5cd7 runnable [0x7ff3ef904000]
>java.lang.Thread.State: RUNNABLE
> at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
> at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
> at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
> at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
> - locked <0x00018d68a730> (a sun.nio.ch.Util$3)
> - locked <0x00018d68a720> (a java.util.Collections$UnmodifiableSet)
> - locked <0x00018d68a258> (a sun.nio.ch.EPollSelectorImpl)
> at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
> at
>
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:349)
> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1145)
>
> *Running the mntr command returned the following stats*
> zk_version                       3.4.13-2d71af4dbe22557fda74f9a9b4309b15a7487f03, built on
> 06/29/2018 04:05 GMT
> zk_avg_latency                   0
> zk_max_latency                   *17657*
> zk_min_latency                   0
> zk_packets_received              1427134
> zk_packets_sent                  1596974
> zk_num_alive_connections         64
> zk_outstanding_requests          0
> zk_server_state                  follower
> zk_znode_count                   1394
> zk_watch_count                   592
> zk_ephemerals_count              192
> zk_approximate_data_size         181257
> zk_open_file_descriptor_count    94
> zk_max_file_descriptor_count     1048576
> zk_fsync_threshold_exceed_count  1
>
> I find the *zk_max_latency* extremely high. I'm wondering what kind of
> latency that is. How can I debug the reason for this value?
>
> Thanks,
> Gelbana
>
>
>
> On Wed, Apr 3, 2019 at 1:42 PM Muhammad Gelbana 
> wrote:
>
> > I'm trying to debug a problem where our client application suddenly loses
> > its Zookeeper session. I concluded that by looking at the Zookeeper
> server
> > logs.
> >
> > I increased the logging details for the client and found the following
> log
> > messages
> >
> >> DEBUG: [07:33:33] [demo | HA | Manager] Got ping response for sessionid:
> >> 0x3000da76fa904b6 after 0ms
> >> [org.apache.zookeeper.ClientCnxn$SendThread.readResponse]
> >> DEBUG: [07:34:07] [demo | HA | Manager] Got ping response for sessionid:
> >> 0x3000da76fa904b6 after 0ms
> >> [org.apache.zookeeper.ClientCnxn$SendThread.readResponse]
> >> DEBUG: [07:34:40] [demo | HA | Manager] Got ping response for sessionid:
> >> 0x3000da76fa904b6 after 0ms
> >> [org.apache.zookeeper.ClientCnxn$SendThread.readResponse]
> >> DEBUG: [07:35:13] [demo | HA | Manager] Got ping response for sessionid:
> >> 0x3000da76fa904b6 after 0ms
> >> [org.apache.zookeeper.ClientCnxn$SendThread.readResponse]
> >> DEBUG: [07:35:47] [demo | HA | Manager] Got ping response for sessionid:
> >> 0x3000da76fa904b6 after 0ms
> >> [org.apache.zookeeper.ClientCnxn$SendThread.readResponse]
> >> DEBUG: [07:36:20] [demo | HA | Manager] Got ping response for sessionid:
> >> 0x3000da76fa904b6 after 0ms
> >> [org.apache.zookeeper.ClientCnxn$SendThread.readResponse]
> >>
> >
> > I noticed that the duration between each log message is ~33 seconds while
> > on another environment (my laptop), the duration goes down to ~1 second.
> > What could be causing this huge difference? I doubt that whatever is
> > causing this effect causes the delay to increase significantly at some
> > point to the extent that makes my client lose its session.
> >
>


Re: Zookeeper monitoring

2019-04-06 Thread Norbert Kalmar
Hi Adrien,

There's no single answer to your question. Most, if not all, of these
parameters depend on your load and use case.
zk_open_file_descriptor_count is not a percentage, but you can easily turn
it into one by dividing by the max count (zk_max_file_descriptor_count).
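
As a rough illustration of that ratio, here is a minimal Python sketch. It
assumes you already have the mntr values in a dict of strings (how you
collect them is up to you), and the 50 % threshold is just the number from
your mail, not a recommendation. fd_usage_ratio and fd_alert are made-up
helper names.

```python
def fd_usage_ratio(stats):
    """Fraction of the file descriptor limit currently in use (0.0 to 1.0)."""
    open_fds = int(stats["zk_open_file_descriptor_count"])
    max_fds = int(stats["zk_max_file_descriptor_count"])
    return open_fds / max_fds


def fd_alert(stats, threshold=0.5):
    """Return True when the usage ratio is above the chosen threshold."""
    return fd_usage_ratio(stats) > threshold
```

With 94 open descriptors out of 1048576 the ratio is far below any sensible
threshold; the trend over time is usually more telling than a single value.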

There's hardly a best value to be used for all use cases. For example, for
zk_open_file_descriptor_count you could check whether it's starting to grow
rapidly, or goes over a certain threshold as you mentioned.
Or zk_pending_syncs - are there any bursts in your use case? How many
clients are there? How many requests on high load? What might be an alert
level for one use case can be perfectly normal load in another.

zk_fsync_threshold_exceed_count might not be the best monitoring parameter.
This is more to be used once you detect something is wrong, to check if
it's been going on for a while. But it is better to look for fsync warnings
in the log. At least I would look for that. Again, what's the use case?

Regards,
Norbert

On Fri, Apr 5, 2019 at 11:33 PM Muhammad Gelbana 
wrote:

> Yes. Surprisingly it isn't. I suppose and hope there is some other channel
> that is actively used.
>
> Thanks,
> Gelbana
>
>
> On Fri, Apr 5, 2019 at 10:06 PM adrien ruffie 
> wrote:
>
> > The community does not look too active, unfortunately ...
> > 
> > From: adrien ruffie
> > Sent: Wednesday, April 3, 2019 14:24
> > To: user@zookeeper.apache.org
> > Subject: Zookeeper monitoring
> >
> > Hello all,
> >
> > In order to set correct values for several monitoring parameters, I
> > would like to know the best value to monitor for the following parameters
> > (from what value would it be necessary to worry about each of
> > them):
> >
> > zk_outstanding_requests
> > zk_open_file_descriptor_count
> > zk_approximate_data_size
> > zk_fsync_threshold_exceed_count
> > zk_avg_latency
> > zk_pending_syncs
> >
> > Example for zk_open_file_descriptor_count tell me if I'm wrong, but I
> think
> > this should be a percentage value that should not be exceeded depending
> on
> > the parameter.
> >
> > throw an alert if zk_open_file_descriptor_count > 50 % of
> > zk_max_file_descriptor_count, right?
> >
> >
> > I am looking for reasonably correct values for triggering alerts.
> >
> > Thanks a lot and best regards
> >
> > Adrien
> >
>


Re: Recommended syncLimit for 3-node AWS cluster

2019-03-11 Thread Norbert Kalmar
Hi,

I'm not aware of any performance tests for AWS specifically, or of any
cloud-based performance tests relevant to your question, so I'm going to
speak about general ZK deployment.

tickTime - regulates how often a heartbeat is sent and when a connection
times out (it's the base unit, not the exact time). This has no direct
effect on sync time.
initLimit - initLimit * tickTime is the time allowed for followers to
connect and sync with the Leader. This will not speed up your sync times. If
you have a lot of data stored in ZK, you might want to increase this,
allowing more time to sync.
syncLimit - Pretty similar to initLimit, minus the connection time. This
applies to every sync operation, so a follower that falls too far behind may
be dropped. Again, not much to do with speeding up the sync time.
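
To make the arithmetic concrete, here is a tiny Python sketch that turns the
three settings into wall-clock timeouts. It only multiplies the values as
described above; quorum_timeouts_ms is a made-up helper name.

```python
def quorum_timeouts_ms(tick_time_ms, init_limit, sync_limit):
    """Return the timeouts implied by tickTime, initLimit and syncLimit."""
    return {
        # time a follower gets to connect to and sync with the Leader
        "init_timeout_ms": tick_time_ms * init_limit,
        # time a follower may lag behind before it is dropped
        "sync_timeout_ms": tick_time_ms * sync_limit,
    }
```

With your values (tickTime=4000, initLimit=30, syncLimit=15) that is 120
seconds for the initial sync and 60 seconds for later syncs.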

These settings are mainly for timeouts. I would touch them only if I started
seeing followers getting dropped due to timeouts.
There isn't really a way I'm aware of to speed up sync time, other than to
keep the data stored in ZK minimal and keep your jute.maxbuffer fairly
small; the default is 1MB I think, and you shouldn't go above a few MBs.

Regards,
Norbert

On Mon, Mar 11, 2019 at 1:49 PM Behroz Sikander  wrote:

> This seems to be a straight forward question :). Anyone?
>
> On Fri, Mar 8, 2019 at 10:32 AM Behroz Sikander 
> wrote:
>
> > Hello,
> > Currently, I have a Spark cluster which uses 3-node zookeeper underneath
> > for leader election. I want to reuse the zookeeper cluster for storing
> some
> > configuration information and traffic in zookeeper will increase. I want
> > the cluster to become synced as early as possible.
> >
> >
> > What are the recommended configuration settings for this clusters
> assuming
> > that I am running on AWS?
> >
> > The following values are the ones I am using now.
> >
> > tickTime=4000
> > initLimit=30
> > syncLimit=15
> >
> > Any reasoning on why specific values would work best would also be
> helpful.
> >
> > Regards,
> > Behroz Sikander
> >
> >
>


Re: what does it mean to have exit-code of 251

2019-03-04 Thread Norbert Kalmar
Hi Prashant,

Depending on your version, exit codes are static variables in ExitCode.java.
But we do not use 251; we only go as far as 14 with exit codes. This looks
like an app-specific exit code.
Are you using some kind of systemd script to start ZooKeeper?
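
One thing that may be worth checking, and this is only a guess: on POSIX
systems the parent process sees just the low 8 bits of the value passed to
exit(), so a negative return value ends up looking like a large positive
status. The Python sketch below shows the arithmetic; whether anything in
your app actually exits with -5 is purely an assumption, and
exit_status_seen_by_parent is a made-up helper name.

```python
def exit_status_seen_by_parent(code):
    """Status reported to the parent for exit(code): only the low 8 bits."""
    return code & 0xFF
```

For example, exit_status_seen_by_parent(-5) gives 251, which would match
what systemd logged.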

Regards,
Norbert

On Mon, Mar 4, 2019 at 12:49 PM prashantkumar dhotre <
prashantkumardho...@gmail.com> wrote:

> Hi,
>
> In my journal log, I see ;
>
> 1199473 Mar 01 15:46:03 evo-qfx-01 systemd[1]: ifmand.service: Main process
> exited, code=exited, status=251/n/a
>
> In my app, I don't explicitly call exit(251).
> I use zookeeper lib.
> From my logs, I see that after a call to the zookeeper lib API
> zookeeper_close(), my service exited with 251.
> Is 251 a std exit code or app/lib specific custom exit code ?
> I want to know what does 251 means.
>
> Can you  please  let me know where can I see the exit-code to meaning
> mapping
> ?
>
> Thanks
>
> Prashant
>


Re: Zookeeper crashes with EOF Exception

2019-02-27 Thread Norbert Kalmar
Sounds like your snapshot is corrupted. But you said ZK is running fine for
some amount of time then crashes?
Maybe it's an invalid PROPOSE message.
By the way, sounds a bit similar to this issue:
https://issues.apache.org/jira/browse/ZOOKEEPER-1955

If it is possible, delete the snapshot and txn logs from the data dir (you
will lose your data!) and restart the cluster.
Which version of ZK are you using?

On Wed, Feb 27, 2019 at 12:29 PM zoo_js 
wrote:

> There are 3 snapshot files totalling 1.01 GB, each file at around 330 MB.
> I have 56 GB of hard disk space available.
>
>
>
> --
> Sent from: http://zookeeper-user.578899.n2.nabble.com/
>


Re: Zookeeper crashes with EOF Exception

2019-02-27 Thread Norbert Kalmar
Hi JS,

Looks like there was a Leader election, and during sync phase
(syncWithLeader), the follower tried to deserialize the snapshot, but it is
an incomplete file, hence the EOF exception.
How big is your snapshot? Did you run out of disk space?
Also worth checking for fsync warnings / errors in the log.

Hope this helps.

Regards,
Norbert

On Wed, Feb 27, 2019 at 8:05 AM zoo_js 
wrote:

> Hi all,
>
> We have a 3 node zookeeper cluster used for Vault as HA.  Starting a few
> days ago, the entire cluster crashes a few times per day, all nodes at the
> exact same time. We are running some load test using vault for Data
> encryption. 1000 unique keys are generated per minute; the
> issue started at around 270,000 keys.
>
> The following exception is got from the syslog, not sure what's causing
> this
> crash. Please help to proceed..
>
> 2019-02-26 22:35:18,831 [myid:1] - WARN
> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@90] - Exception when
> following the leader
> java.io.EOFException
>at
> java.base/java.io.DataInputStream.readFully(DataInputStream.java:202)
>at
> java.base/java.io.DataInputStream.readFully(DataInputStream.java:170)
>at
> org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:94)
>at
> org.apache.zookeeper.server.DataNode.deserialize(DataNode.java:165)
>at
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:99)
>at
> org.apache.zookeeper.server.DataTree.deserialize(DataTree.java:1076)
>at
>
> org.apache.zookeeper.server.util.SerializeUtils.deserializeSnapshot(SerializeUtils.java:130)
>at
>
> org.apache.zookeeper.server.ZKDatabase.deserializeSnapshot(ZKDatabase.java:452)
>at
> org.apache.zookeeper.server.quorum.Learner.syncWithLeader(Learner.java:340)
>at
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:83)
>at
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:981)
> 2019-02-26 22:35:19,349 [myid:1] - INFO
> [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@169] - shutdown called
> java.lang.Exception: shutdown Follower
>at
> org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:169)
>
> thanks
> JS
>
>
>
>
>
>
> --
> Sent from: http://zookeeper-user.578899.n2.nabble.com/
>


Re: Configuring pseudo distributed zookeeper, followers and leader

2019-02-27 Thread Norbert Kalmar
Hi Weiqi,

How do you start the instances? My theory is that the 2 nodes on the VMs
establish a connection with each other faster than your local machine does
with the two VMs. So since they already have quorum, they will elect the
Leader among themselves.
You could start 1 node on a VM and the one on your localhost, and then just
wait until your localhost is the Leader. Then start the 3rd node on the
other VM.
There really isn't a way to define the Leader at startup, as far as I can
tell.

Hope this helps.

Regards,
Norbert

On Tue, Feb 26, 2019 at 4:46 PM 徐炜淇  wrote:

> Hi,
> I configured a pseudo distributed zookeeper, a master two virtual
> machines. The content of file A are as follows:
> tickTime=2000
> initLimit=10
> syncLimit=5
> dataDir=/home/v7/RyaInstall/zookeeper-3.4.12/data
> clientPort=2181
> server.1=192.168.122.1:2888:3888
> server.2=192.168.122.92:2888:3888
> server.3=192.168.122.152:2888:3888
>
>
>
> Server 1 is my local computer; servers 2 and 3 are my virtual machines.
> But when I start zookeeper, my localhost always becomes a follower and one
> of server 2 or server 3 becomes the leader.
> I do not know how to make my local computer become the leader.
> Can you help me?
>
>
> Best
> Weiqi
>
>
>
>
>
>
>


Re: Enable authentication per client

2019-02-09 Thread Norbert Kalmar
Hi Ram,

ZooKeeper only knows IP addresses. You either require authentication from
all clients, or turn it off completely.
At least I couldn't think up anything that would achieve what you want.

Regards,
Norbert

On Sat, Feb 9, 2019 at 12:52 AM rammohan ganapavarapu <
rammohanga...@gmail.com> wrote:

> Hi,
>
> Is it possible to enable authentication for specific clients in zookeeper?
>
> Thanks,
> Ram
>


Re: About the error zknode of replication

2019-01-28 Thread Norbert Kalmar
Hi Shen,

This looks like something you should ask on the HBase user list. It is HBase
related, as it failed to clean up the old znodes. There is nothing you can
really do in ZooKeeper. You can either rmr recursively (which will break
your replication) or delete one-by-one by date maybe? Anyway, I wouldn't
recommend that either.

Perhaps there is some way to do a cleanup from HBase command line. Maybe
even stop and re-enable replication so that the znodes can be removed, thus
the oldWALs cleaned.

I would write to HBase user list.

Regards,
Norbert

On Mon, Jan 28, 2019 at 6:05 PM 沈忆珠  wrote:

> Hi,
>
> I am using hbase replication in my clusters.
> Perhaps because of the region server crash previously, I find that
> there are hundreds of thousands of zknode in /hbase/replication/rs.
> And this makes files in /hbase/oldWALs cannot be cleaned.
> I want to remove these invalid zknodes in zookeeper. But it's
> unrealistic to delete them manually because the number is too large.
> Now I am confused about the method to do the deletion without
> affecting those valid zknode for current replication.
> Can you please help me.
>
> Regards,
> shen
>


Re: Update on stable 3.5 release estimate

2019-01-22 Thread Norbert Kalmar
Hi!

The filter doesn't seem to work for me. But basically what's left is the
fix on java11 build and maven migration. Both should be ready soon. There
is also ZOOKEEPER-3204; looking at it.

It should be released fairly soon, I think, and hope by the end of February?

The filter I used:
https://issues.apache.org/jira/browse/ZOOKEEPER-3204?jql=project%20%3D%20ZooKeeper%20AND%20priority%20in%20(Blocker)%20AND%20resolution%20%3D%20Unresolved%20ORDER%20BY%20priority%20DESC
If you are referring to this one, it doesn't filter on the 3.5 affected
version, so not all of these need to be fixed for a 3.5 stable.

Regards,
Norbert

On Tue, Jan 22, 2019 at 9:58 PM Felix Jancso-Szabo
 wrote:

> Hi!
>
> I looked at an old thread on this mailing list  (
>
> http://zookeeper-user.578899.n2.nabble.com/stable-3-5-release-td7583547.html
> )
> that seemed to suggest a stable 3.5 release was coming.  The blockers
> mentioned in that thread don't seem to have really gone anywhere as far as
> I can tell?
>
> Is there an estimate on when a stable release could be expected?  Even an
> order of magnitude estimate would be helpful - is it months away or will it
> be past the end of 2019?
>
> - Felix
>


Re: Zookeeper 3.5.X beta to stable versoin

2018-11-20 Thread Norbert Kalmar
Hi Gopi,

I (and pretty much everyone) can only give you estimates, unfortunately.
So, my guess: January.
But I am not the release manager, nor a "binding voter".
I'm just also closely monitoring 3.5 stable release as well, and working on
one of the blocking issues.

Regards,
Norbert

On Tue, Nov 20, 2018 at 2:14 PM Testing Ideas  wrote:

> Hi,
>
> Do you have a time of when we can expect the stable release of Zookeeper
> 3.5? I got from your community that you are working on to release the
> stable version ASAP, but a gross time would help us to communicate with my
> stakeholders on the release and plan our migration.
>
> Many thanks,
> Gopi
>


Re: ZooKeeper 3.6.0 Release (fix for a large number of watches)

2018-11-20 Thread Norbert Kalmar
Hi Kevin,

Unfortunately, 3.4 and 3.5 are in a feature freeze. You could ask the
community to backport this feature (vote on it), but personally I don't
think it will go through, sorry. It could be backported to the first 3.5
release after stable (3.5.5, I think). But we're still talking months here;
possibly 3.5.5 would be ready at the same time 3.6 is.
The plan is to push 3.5 stable out ASAP, and do a 3.6 release very quickly
after (6 months?).

Regards,
Norbert



On Fri, Nov 16, 2018 at 9:24 PM Kevin Verhoeven  wrote:

> Thanks Enrico,
>
> Would it be possible to backport the fix (PR # 590) into a previous
> version? For example, 3.4.x or 3.5.x? This could potentially release an
> important fix quicker.
>
> Kevin
>
> On Fri, Nov 16, 2018 at 12:52 PM Enrico Olivelli 
> wrote:
>
> > Kevin,
> > I guess there is no 3.6.0 very soon.
> > The community is working hard in order to release a 'stable' 3.5.x, then
> we
> > will go for 3.6.0.
> > I think any help is very appreciated to speed up the process.
> >
> > The others (like Facebook friends) on this list can talk about uses of
> > 3.6.0 in production.
> >
> > Just my 2c
> >
> > Enrico
> >
> > On Fri, Nov 16, 2018, 20:32 Kevin Verhoeven wrote:
> >
> > > I am running Accumulo with a large number of tables (> 10k). Each table
> > > sets up a number of watches in ZooKeeper, and we have many (> 50
> million
> > > watches). We are running 5 ZooKeeper servers and the watches are spread
> > > evenly across the servers. We see Accumulo performance degrade sharply
> > when
> > > we add more than 6k tables and it appears this is related to the number
> > of
> > > watches in ZooKeeper. However, there seems to be a performance fix for
> a
> > > large number of watches, in issue # 1177[1], that was resolved Aug 2018
> > and
> > > merged into master with PR # 590 in release 3.6.0.
> > >
> > > My question is how can I get the ZooKeeper 3.6.0 release? I'd like to
> > > download a copy of the 3.6.0 jar and test with this version. If all
> goes
> > > well, this will resolve my performance issue with Accumulo watches.
> > >
> > > A second question would be: is there a way to tune ZooKeeper for better
> > > performance with a large number of watches? This does not seem to be a
> > > optimal running state for ZooKeeper, but we are constrained by Accumulo
> > and
> > > their use of watches.
> > >
> > > [1] https://issues.apache.org/jira/browse/ZOOKEEPER-1177
> > > [2] https://github.com/apache/zookeeper/pull/590
> > >
> > > Regards,
> > > Kevin
> > >
> > --
> >
> >
> > -- Enrico Olivelli
> >
>


Re: Please Register: ZooKeeper Meetup @ Facebook, Nov 8th 2018

2018-11-06 Thread Norbert Kalmar
Yes, 2 days from now.

Regards,
Norbert

On Tue, Nov 6, 2018 at 9:07 PM Jeff Widman  wrote:

> This is happening this week, correct?
>
> On Fri, Sep 14, 2018 at 8:54 AM Ivan Serdyuk  >
> wrote:
>
> > Awesome.
> >
> > I wonder if you are expecting to record your talk.
> >
> > Ivan
> >
> > On Fri, Sep 14, 2018 at 2:46 AM Mohamed Jeelani  wrote:
> >
> > > Your ZooKeeper friends @ Facebook would like to invite you to share and
> > > learn what’s new with ZooKeeper.
> > >
> > > We will not only share what we at Facebook have been up to, but we have
> > > exciting talks from speakers from the ZooKeeper community lined up who
> > are
> > > eager to share what they've been working on as well. And of course,
> we've
> > > got some cool swag for you :-)
> > >
> > > When: November 8th 2018, 5pm – 8pm (Talks: 5pm - 7pm; Networking &
> Happy
> > > Hour: 7pm - 8pm)
> > > Where: Facebook HQ - MPK 16, 1 Hacker Way, Menlo Park, CA
> > > We will have remote viewing locations in our Facebook Seattle office,
> and
> > > the event will also be live streamed. You can indicate how you'd like
> to
> > > attend on the registration page.
> > >
> > > Please register here - https://zookeeperatfb.splashthat.com/
> > >
> > > We look forward to seeing you soon!
> > >
> > > ZooKeeper Friends @ Facebook
> > >
> >
>
>
> --
>
> *Jeff Widman*
> jeffwidman.com  | 740-WIDMAN-J (943-6265)
> <><
>


Re: ZooKeeperServer not running

2018-10-23 Thread Norbert Kalmar
Hi,

It looks to me like the ZooKeeper quorum simply hadn't completed leader election
yet. You can see the server state is "LOOKING", and after the errors from the
clients timing out, you get the server created and the FOLLOWING log message. After
that, I assume there are no more error messages.

So ZooKeeper should probably be started a bit earlier if you don't want
these error messages.
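
As a rough illustration (a sketch, not something taken from the ZooKeeper
scripts): a client start script could poll the server with the "srvr"
four-letter command and only proceed once a "Mode:" line is reported. The
function names are made up, and the sketch assumes nc is installed, accepts
-w, and that the srvr command is enabled on the server.

```sh
# Extract the server mode (leader, follower, standalone, ...) from the text
# printed by the "srvr" four-letter command, read from stdin.
# Prints nothing when the input has no "Mode:" line, which is what I would
# expect from a server that is not serving requests yet.
zk_mode_from_srvr() {
  sed -n 's/^Mode: *//p'
}

# Hypothetical helper: poll host/port once per second until a mode is
# reported or the attempts run out. Returns non-zero on timeout.
wait_for_zk() {
  host=$1
  port=$2
  attempts=${3:-30}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    mode=$(printf 'srvr' | nc -w 2 "$host" "$port" 2>/dev/null | zk_mode_from_srvr)
    if [ -n "$mode" ]; then
      echo "$mode"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

With the client port from the quoted log, this could be called as
wait_for_zk localhost 2182 before the Solr nodes are started.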

Regards,
Norbert

On Tue, Oct 23, 2018 at 3:35 AM Susheel Kumar  wrote:

> Hello,
>
> I am seeing "ZooKeeperServer not running" WARN messages in the ZooKeeper logs,
> which is causing the Solr client connections to time out...
>
> What could be the problem?
>
> ZK: 3.4.10
>
> Zookeeper.out
> ==
> 2018-10-22 06:04:51,071 [myid:2] - INFO
> [WorkerReceiver[myid=2]:FastLeaderElection@600] - Notification: 1 (message
> format version), 5 (n.leader), 0xf0461 (n.zxid), 0x10 (n.round),
> FOLLOWING (n.state), 4 (n.sid), 0xf (n.peerEpoch) LOOKING (my state)
> 2018-10-22 06:04:51,093 [myid:2] - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2182:NIOServerCnxnFactory@192] - Accepted socket
> connection
> from /192.72.25.177:39514
> 2018-10-22 06:04:51,094 [myid:2] - WARN  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@373] - Exception causing close of
> session 0x0 due to java.io.IOException: ZooKeeperServer not running
> 2018-10-22 06:04:51,094 [myid:2] - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1044] - Closed socket connection for
> client /192.72.25.177:39514 (no session established for client)
> 2018-10-22 06:04:51,138 [myid:2] - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2182:NIOServerCnxnFactory@192] - Accepted socket
> connection
> from /192.3.101.219:56298
> 2018-10-22 06:04:51,138 [myid:2] - WARN  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@373] - Exception causing close of
> session 0x0 due to java.io.IOException: ZooKeeperServer not running
> 2018-10-22 06:04:51,139 [myid:2] - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1044] - Closed socket connection for
> client /192.3.101.219:56298 (no session established for client)
> 2018-10-22 06:04:51,250 [myid:2] - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2182:NIOServerCnxnFactory@192] - Accepted socket
> connection
> from /192.72.27.181:46414
> 2018-10-22 06:04:51,250 [myid:2] - WARN  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@373] - Exception causing close of
> session 0x0 due to java.io.IOException: ZooKeeperServer not running
> 2018-10-22 06:04:51,250 [myid:2] - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1044] - Closed socket connection for
> client /192.72.27.181:46414 (no session established for client)
> 2018-10-22 06:04:51,275 [myid:2] - INFO
> [WorkerReceiver[myid=2]:FastLeaderElection@600] - Notification: 1 (message
> format version), 4 (n.leader), 0xf0461 (n.zxid), 0x192 (n.round),
> LOOKING (n.state), 4 (n.sid), 0xf (n.peerEpoch) LOOKING (my state)
> 2018-10-22 06:04:51,275 [myid:2] - INFO
> [WorkerReceiver[myid=2]:FastLeaderElection@600] - Notification: 1 (message
> format version), 4 (n.leader), 0xf0461 (n.zxid), 0x192 (n.round),
> LOOKING (n.state), 2 (n.sid), 0xf (n.peerEpoch) LOOKING (my state)
> 2018-10-22 06:04:51,275 [myid:2] - INFO
> [WorkerReceiver[myid=2]:FastLeaderElection@600] - Notification: 1 (message
> format version), 4 (n.leader), 0xf0461 (n.zxid), 0x192 (n.round),
> LOOKING (n.state), 1 (n.sid), 0xf (n.peerEpoch) LOOKING (my state)
> 2018-10-22 06:04:51,309 [myid:2] - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2182:NIOServerCnxnFactory@192] - Accepted socket
> connection
> from /192.72.5.212:38944
> 2018-10-22 06:04:51,309 [myid:2] - WARN  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@373] - Exception causing close of
> session 0x0 due to java.io.IOException: ZooKeeperServer not running
> 2018-10-22 06:04:51,309 [myid:2] - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1044] - Closed socket connection for
> client /192.72.5.212:38944 (no session established for client)
> 2018-10-22 06:04:51,356 [myid:2] - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2182:NIOServerCnxnFactory@192] - Accepted socket
> connection
> from /192.72.7.201:59310
> 2018-10-22 06:04:51,356 [myid:2] - WARN  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@373] - Exception causing close of
> session 0x0 due to java.io.IOException: ZooKeeperServer not running
> 2018-10-22 06:04:51,356 [myid:2] - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@1044] - Closed socket connection for
> client /192.72.7.201:59310 (no session established for client)
> 2018-10-22 06:04:51,402 [myid:2] - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2182:NIOServerCnxnFactory@192] - Accepted socket
> connection
> from /192.3.101.219:56302
> 2018-10-22 06:04:51,402 [myid:2] - WARN  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2182:NIOServerCnxn@373] - Exception causing close of
> session 0x0 due to java.io.IOException: ZooKeeperServer 

Re: zookeeper fails to start due to EOFException

2018-10-12 Thread Norbert Kalmar
Hi,

It looks like when you ran out of disk space, the transaction log file could
not be written completely, so it is truncated (the reader hits end-of-file
while reading the file header) and therefore corrupted.
Do all quorum members fail? If not, just start up the ones that can run,
delete the logs from the dataLogDir, and replace them from another node.

Or, if losing the information stored in ZooKeeper is no concern, you can
just delete the dataLogDir's content and start up again (that's a big IF!)
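
To see which files are affected before deleting anything, a small sketch
like the one below could list transaction log files that are too short to
even hold a file header. It assumes the header is 16 bytes (4-byte magic,
4-byte version, 8-byte dbid, as I understand FileHeader) and that the logs
are named log.<zxid> inside the version-2 directory. The function name and
the path in the usage note are made up.

```sh
# List transaction log files in the given directory that are shorter than
# the assumed 16-byte file header. Snapshot files are not inspected.
find_short_txn_logs() {
  dir=$1
  for f in "$dir"/log.*; do
    # Skip the unexpanded pattern when no log files exist.
    [ -f "$f" ] || continue
    # Arithmetic expansion strips any padding that wc adds around the number.
    size=$(( $(wc -c < "$f") ))
    if [ "$size" -lt 16 ]; then
      echo "$f"
    fi
  done
}
```

For example: find_short_txn_logs /var/lib/zookeeper/version-2 (the path is
only a placeholder for your own dataLogDir).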

You can also try the TxnLogToolkit, although it is not in 3.4.10 I'm afraid:
https://zookeeper.apache.org/doc/r3.4.13/zookeeperAdmin.html#Recovery+-+TxnLogToolkit

Regards,
Norbert


On Fri, Oct 12, 2018 at 1:32 PM Pushkar Deole  wrote:

> Using zookeeper version 3.4.10
> Zookeeper seems to have crashed due to no space available on the device
> and after that zookeeper is failing to start.
> Cleared the unwanted data so the disk space is now sufficient. Rebooted the
> system, zookeeper still fails to start with below error:
>
> Any help will be appreciated...
>
> [2018-10-12 08:17:38,828] ERROR Unexpected exception, exiting abnormally
> (org.apache.zookeeper.server.ZooKeeperServerMain)
> java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:392)
> at
> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> at
>
> org.apache.zookeeper.server.persistence.FileHeader.deserialize(FileHeader.java:64)
> at
>
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:585)
> at
>
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:604)
> at
>
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:570)
> at
>
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:652)
> at
>
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:166)
> at
> org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
> at
>
> org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:283)
> at
>
> org.apache.zookeeper.server.ZooKeeperServer.startdata(ZooKeeperServer.java:410)
> at
>
> org.apache.zookeeper.server.NIOServerCnxnFactory.startup(NIOServerCnxnFactory.java:118)
> at
>
> org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:119)
> at
>
> org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:87)
> at
>
> org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:53)
> at
>
> org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)
> at
>
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
>


Re: Which is the best option to secure the solr specific data in zookeeper?

2018-10-12 Thread Norbert Kalmar
Hi,

I don't know how Solr authentication works from the viewpoint of ZooKeeper.
An ACL makes sense to restrict access to the content Solr creates in ZooKeeper.

I'm not sure what you mean by enabling authentication for the whole
ZooKeeper.
ZK has server-to-server (quorum auth) and client-to-server mutual
authentication.
It looks like Solr has org.apache.solr.common.cloud.SaslZkACLProvider; that's
basically the client authentication part in ZooKeeper, and you can use ACLs with it.

Quorum authentication won't help you protect your data from fraudulent
clients. It only protects against fraudulent servers.

So my short answer is that ACLs are the way to go when securing client data
is the question.

Regards,
Norbert


Re: Maven migration - main src dir moved

2018-10-05 Thread Norbert Kalmar
Hopefully it will be ready around the end of next week. I'm currently working on the 3.4
branch to move the remaining files there as well. It is a little trickier, as 3.4
differs from master more than 3.5 did.

Regards,
Norbert

On Fri, Oct 5, 2018 at 3:07 PM Enrico Olivelli  wrote:

> Great news!
>
> Waiting now for a root level pom.xml!
>
> Thank you Norbert
> Enrico
>
> Il ven 5 ott 2018, 14:32 Andor Molnar  ha scritto:
>
> > Hi,
> >
> > Please be aware that the patch which moved ZooKeeper server’s src folder
> > to the new location has been merged. You probably need to rebase your PRs
> > and resolve conflicts to get them merged.
> >
> > Sorry for the inconvenience.
> >
> > Regards,
> > Andor
> >
> >
> > --
>
>
> -- Enrico Olivelli
>


Re: Observer properties for SASL authentication in 3.4.13 version

2018-09-24 Thread Norbert Kalmar
Unfortunately I'm not entirely sure on this one, and I can't test it
right now, but an observer shouldn't be any different than a normal follower. So you
should configure SASL the same way. The only difference, basically, is that
observers are non-voters. Everything else works the same: clients connect and
can send read / write commands. So it would be a huge security hole if an
observer were not configured as well.
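
One way to verify that every node, observers included, carries the same
settings is a quick check of each zoo.cfg. This is only a sketch: the three
property names are the ones from the question below, while the function
name and the idea of checking for the literal value "true" are my own
assumptions.

```sh
# Print each quorum SASL property that is not set to "true" in the given
# config file. Prints nothing when all three are enabled.
missing_quorum_sasl_keys() {
  cfg=$1
  for key in quorum.auth.enableSasl \
             quorum.auth.learnerRequireSasl \
             quorum.auth.serverRequireSasl; do
    # Drop spaces and tabs so "key = true" and "key=true" both match,
    # then look for an exact, whole-line, fixed-string match.
    if ! tr -d ' \t' < "$cfg" | grep -Fxq "$key=true"; then
      echo "$key"
    fi
  done
}
```

Running it against the zoo.cfg of each follower and each observer should
give empty output everywhere if the nodes are configured consistently.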

Regards,
Norbert

On Mon, Sep 24, 2018 at 10:59 AM rammohan ganapavarapu <
rammohanga...@gmail.com> wrote:

> Any thoughts?
>
> On Sun, Sep 23, 2018 at 8:00 PM rammohan ganapavarapu <
> rammohanga...@gmail.com> wrote:
>
> > Hi,
> >
> > Do we need to configure any thing on observer nodes for SASL
> > authentication?
> >
> > tcpKeepAlive=true ( this is not for sasl but just asking )
> >
> > quorum.auth.enableSasl=true
> > quorum.auth.learnerRequireSasl=true
> > quorum.auth.serverRequireSasl=true
> >
> > What will happen if i set these properties on observers nodes as well ?
> >
> > Thanks,
> > Ram
> >
>


Re: Kerberos based authentication

2018-09-24 Thread Norbert Kalmar
Hi Ram,

Yes, you will need a Kerberos instance; ZooKeeper doesn't bundle one (I
don't think it would even be possible, especially for security reasons).
Then you will have to configure SASL in ZooKeeper, as an additional layer
over Kerberos.

I'd say Kerberos is more secure since, for example, your
password is not stored. But it is more complex to set up, and you need a
third-party Kerberos instance. There are lots of tutorials on it, though. For example:
https://blog.bluesoftglobal.com/3-steps-to-apache-zookeeper-authentication/
https://github.com/ekoontz/zookeeper/wiki

In the end, I think it comes down to preference.

Regards,
Norbert

On Sat, Sep 22, 2018 at 1:40 PM rammohan ganapavarapu <
rammohanga...@gmail.com> wrote:

> Hi,
>
> To configure "Kerberos based authentication" on zookeeper server-server or
> client-server do we need to install any additional packages? do we need to
> setup kerberos server? or zookeeper embedded the kerberos server? and what
> is the recommended authentication mechanism, kerberos or digest-md5?
>
> Thanks,
> Ram
>


Re: Java 11 OpenJDK/Oracle Java Release Cadence Questions

2018-09-17 Thread Norbert Kalmar
There's already a Jira and a PR available to fix the JDK 11 related errors on 3.4:
https://issues.apache.org/jira/browse/ZOOKEEPER-3148
https://github.com/apache/zookeeper/pull/626

But this won't be released by September 25th, that's for sure. I think
it will be released before the end of support for the older JDKs, so by January 2019. This is
not a promise though, as I don't know the release plans for 3.4.

Regards,
Norbert

On Fri, Sep 14, 2018 at 5:01 PM Jeremiah Adams 
wrote:

>
> Java 11 is out on Sept. 25 and makes Java 8, 9 and 10 obsolete. Java 8,
> 9, and 10 will no longer receive security patches/updates beginning in
> January 2019. Organizations have the option to pay Oracle for patches in
> order to stay on Java 8.
>
> Oracle is changing its release cycle and licensing for Java SE (which is
> no more as of 11). Java SE is rebranded to Oracle Java and is an LTS
> release requiring paid licenses per processor running Java 11. The fee is $25
> per processor. LTS means fewer upgrades, but it does receive security patches.
>
> Oracle will continue to contribute to OpenJDK. However, the OpenJDK
> releases will come every six months and are called “Major” releases. Each
> release makes the previous one obsolete, with zero security patches/updates
> applied to previous versions.
>
> All of this sounds really bad, however, the “major” versions that will
> come from OpenJDK are more akin to going from 8->8u20->8u40 than from Java
> 7 to Java 8. So the biggest hurdle is simply aligning project releases with
> OpenJDK release cadences. This implies building to release candidates prior
> to bi-annual JDK release testing and releasing.
>
> If your organization doesn’t want  to pay for Oracle Java your org will
> have to upgrade Java versions every six months. While we can easily manage
> our own code base, dependencies such as Zookeeper are key to our business
> and difficult to control.
>
> Below are links related to this topic I found while doing research. If
> anyone comes to different conclusions, please share.
>
> I do have testing Zookeeper on Java 11 in our backlog but do not know how
> soon I can get to that.
>
>
> http://www.oracle.com/us/corporate/pricing/price-lists/java-se-subscription-pricelist-5028356.pdf
>
> http://www.oracle.com/technetwork/java/javaseproducts/overview/javasesubscriptionfaq-4891443.html
> http://www.oracle.com/technetwork/java/eol-135779.html
>
> https://blogs.oracle.com/java-platform-group/update-and-faq-on-the-java-se-release-cadence
>
> https://blogs.oracle.com/java-platform-group/a-quick-summary-on-the-new-java-se-subscription
>
> http://www.oracle.com/technetwork/java/javaseproducts/javasesubscription-data-sheet-4891969.pdf
> https://www.infoq.com/presentations/java-10-11
>
>
>
> Jeremiah Adams
> Software Engineer
> www.helixeducation.com
> Blog | Twitter | Facebook | LinkedIn
>
> 
> From: Jeremiah Adams 
> Sent: Wednesday, September 12, 2018 8:28 AM
> To: user@zookeeper.apache.org
> Subject: [POSSIBLE PHISHING] Re: Java 11 OpenJDK/Oracle Java Release
> Cadence Questions
>
>   "But if it's too late for you, you can still revert to OpenJDK8 which is
>supported until June 2023 (if that's an option to you at all)."
>
> Please correct me if I am misinformed but this is not consistent with the
> research I have been doing. 8 will no longer receive security updates from
> Oracle beginning Jan 2019. To continue receiving patches from Oracle,
> processors and end users must be licensed. Redhat has an OpenJDK but it
> also requires paid support. Other vendors build OpenJDK8 but this is
> getting too far off the beaten path for me to consider production worthy.
>
>
>
> I am currently evaluating each of our core java dependencies to determine
> if paying for this support is our best path forward until Java11 is widely
> adopted.
>
> Thanks
>
>
> Jeremiah Adams
> Software Engineer
>
> https://url.emailprotection.link/?ahfhEufaAWbezBrUFPG98ZJcterGfIerU3ZwsA3Gv_C0~
> Blog | Twitter | Facebook | LinkedIn
>
> 
> From: Norbert Kalmar 
> Sent: Wednesday, September 12, 2018 1:48 AM
> To: user@zookeeper.apache.org
> Subject: Re: Java 11 OpenJDK/Oracle Java Release Cadence Questions
>
> Hi Jeremiah,
>
> I don't know what will happen with the Oracle support, I'm sure oracleJDK11
> will be sorted out for ZK sooner or later.
>
> But if it's too late for you, you can still revert to OpenJDK8 which is
> supported until June 2023 (if that's an option to you at all).
>
> But seeing people concerned with a topic here on this list sure can speed
> things up. So thanks for writing, this is definitely on the table, and we
> will k

Re: Upgrade from 3.4.5 to 3.4.13

2018-09-12 Thread Norbert Kalmar
A rolling restart is enough. Just stop one server, replace the jar, restart it,
and it will rejoin the quorum. Then do the same for the rest of the servers.

The update requires restarting each server, but this way you avoid restarting
the whole ensemble at once, so you will have no downtime.
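
The procedure could be scripted roughly as below. Treat it as a sketch
under assumptions: the install path in ZK_HOME, passwordless ssh/scp to
each host, and the fixed 10 second pause are all placeholders I made up.
In practice you would confirm that each server has rejoined the quorum
before moving on to the next one.

```sh
# Hypothetical install location on every server; override as needed.
ZK_HOME=${ZK_HOME:-/opt/zookeeper}

# Run a command, or only print it when DRY_RUN is set to a non-empty value.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage: rolling_upgrade OLD_JAR NEW_JAR HOST...
# Upgrades one host at a time so the other servers keep the quorum alive.
rolling_upgrade() {
  old_jar=$1
  new_jar=$2
  shift 2
  for host in "$@"; do
    run ssh "$host" "$ZK_HOME/bin/zkServer.sh" stop
    run ssh "$host" rm -f "$ZK_HOME/$old_jar"
    run scp "$new_jar" "$host:$ZK_HOME/"
    run ssh "$host" "$ZK_HOME/bin/zkServer.sh" start
    # Crude placeholder for "wait until this server has rejoined".
    run sleep 10
  done
}
```

Setting DRY_RUN=1 prints the commands instead of running them, which is a
cheap way to review the plan first, for example:
DRY_RUN=1 rolling_upgrade zookeeper-3.4.5.jar zookeeper-3.4.13.jar zk1 zk2 zk3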

Regards,
Norbert

On Wed, Sep 12, 2018 at 3:05 PM rammohan ganapavarapu <
rammohanga...@gmail.com> wrote:

> Norbert,
>
> Thank you, do we need to restart process after jar replacement?
>
> Ram
>
> On Wed, Sep 12, 2018, 12:56 AM Norbert Kalmar  >
> wrote:
>
> > Hi,
> >
> > We had a discussion about this not long ago, it was 3.4.8 to 3.4.13, but
> > same things have to be done in this case.
> >
> >
> http://zookeeper-user.578899.n2.nabble.com/How-to-upgrade-zookeeper-from-3-4-8-to-3-4-13-td7583831.html
> >
> > If you just replace the jar, you don't even have to copy anything else.
> >
> >
> > Regards,
> > Norbert
> >
> > On Wed, Sep 12, 2018 at 12:29 AM rammohan ganapavarapu <
> > rammohanga...@gmail.com> wrote:
> >
> > > I have zk cluster with 3.4.5 version and planing to upgrade to 3.4.13
> > > version, can some one point me to a upgrade steps? When i copy snapshot
> > > from 3.4.5 node to 3.4.13 node it seems to be working but not sure what
> > is
> > > the proper way of upgrading existing cluster.
> > >
> > > Thanks,
> > > Ram
> > >
> >
>


Re: Upgrade from 3.4.5 to 3.4.13

2018-09-12 Thread Norbert Kalmar
Hi,

We had a discussion about this not long ago. It was about 3.4.8 to 3.4.13, but
the same steps apply in this case.
http://zookeeper-user.578899.n2.nabble.com/How-to-upgrade-zookeeper-from-3-4-8-to-3-4-13-td7583831.html

If you just replace the jar, you don't even have to copy anything else.


Regards,
Norbert

On Wed, Sep 12, 2018 at 12:29 AM rammohan ganapavarapu <
rammohanga...@gmail.com> wrote:

> I have a zk cluster on version 3.4.5 and am planning to upgrade to version
> 3.4.13. Can someone point me to the upgrade steps? When I copy a snapshot
> from a 3.4.5 node to a 3.4.13 node it seems to be working, but I'm not sure what
> the proper way of upgrading an existing cluster is.
>
> Thanks,
> Ram
>


Re: Java 11 OpenJDK/Oracle Java Release Cadence Questions

2018-09-12 Thread Norbert Kalmar
Hi Jeremiah,

I don't know what will happen with the Oracle support, but I'm sure Oracle JDK 11
will be sorted out for ZK sooner or later.

But if it's too late for you, you can still revert to OpenJDK8 which is
supported until June 2023 (if that's an option to you at all).

But seeing people on this list concerned with a topic sure can speed
things up. So thanks for writing; this is definitely on the table, and we
will keep you guys updated!

Regards,
Norbert

On Tue, Sep 11, 2018 at 7:36 PM Jeremiah Adams 
wrote:

> My primary concern is the drop of support and security patches for Java8
> in January by Oracle. It will require paying for support if not upgraded to
> Java 11. We are trying to get ahead of this because of budgets.
>
>
> Jeremiah Adams
> Software Engineer
> www.helixeducation.com
> Blog | Twitter | Facebook | LinkedIn
>
> 
> From: Patrick Hunt 
> Sent: Tuesday, September 11, 2018 11:20 AM
> To: UserZooKeeper; DevZooKeeper
> Subject: Re: Java 11 OpenJDK/Oracle Java Release Cadence Questions
>
> Hi Jeremiah. It's failing consistently on Jenkins, unlikely to officially
> support until someone addresses those:
>
> https://url.emailprotection.link/?aiZTcxUx4h_dcZxn8EbCGprch71qelj0ZbtLyOKh81lGDtcTT9nJTlYs664Fxxbo6NyeapY3ycsrKiReKk2X-NndjsOGpxSNaC9H6hYX2gFkoRiL_OQRSJbsZy3GW7njr
>
> We've been testing with openjdk for quite some time, those are supported.
> The docs are ambiguous in that regard:
>
> https://url.emailprotection.link/?axDFV0VGEqoZ5mAP7-eAuEvymmcOqMy-AQd1z3ZmjioJpcV45jei0TZPPFqveKFbWyaiBdaBjhuqlsUc3Fgt2qgYG4nxsnGPwk1t0yi0fCeA3yFq1ZN2mMQZYzVoeCqOa
> however I don't see why Oracle and OpenJDK wouldn't be supported. EOD it's
> up to the community.
>
> Patrick
>
> On Tue, Sep 11, 2018 at 10:02 AM Jeremiah Adams  >
> wrote:
>
> > Hello,
> >
> >
> > Are there any documents available concerning Zookeeper's support for Java
> > 11 and documents regarding models supporting Oracle's new licensing and
> > release cadences?
> >
> >
> > Thanks.
> >
> >
> > Jeremiah Adams
> > Software Engineer
> >
> https://url.emailprotection.link/?ahfhEufaAWbezBrUFPG98ZJcterGfIerU3ZwsA3Gv_C0~
> <
> https://url.emailprotection.link/?a49H2rNGIIBtQOw6md8OcHp-qKE3Xn2gNiZ3dlqAeSDA~
> >
> > Blog<
> https://url.emailprotection.link/?a49H2rNGIIBtQOw6md8OcHgFEZu-KYuiu8doY66NWwmmyWxz7kC-27Yfnbdgd2wyh5gjXUa6LMT_NRXsj1g1VVg~~>
> | Twitter<
> >
> https://url.emailprotection.link/?a0Q7ct5_6cOdbJ86kpWB0zx6RbtgugTVC7lU_W7za50jLdZQGpLgVlR1V06zckSaM5oOKb6QBo46Qp9xt0Tt7Aw~~>
> | Facebook<
> >
> https://url.emailprotection.link/?aAmyAO_nS_C1aDgBLeKyGTu0tksTt1_mn2PcS8KJXNJPM04iRHKgX96qGgENV-dMSER5wl8zDVRr3RsS0OmcF9A~~>
> | LinkedIn<
> >
> https://url.emailprotection.link/?aanlcNI-cN74Gdz-TD332xAl6lHu7TRNICWoHUFjYf-KlBjrCGHoYR65b3rl-OyW10nWFv6hwYvUSoVHL4b3vGA~~
> >
> >
>


Re: can not know the process name from zk log

2018-09-11 Thread Norbert Kalmar
Hi,

What do you mean by process name?
ZK doesn't know the client process, only the IP address.
If you mean which process led to the connection being closed (like an error in
establishing the session), the preceding log lines should be useful.

Which version of ZK is this? On master I see this as a debug-level log.

Regards,
Norbert

On Tue, Sep 11, 2018 at 1:29 PM wangyongqiang0...@163.com <
wangyongqiang0...@163.com> wrote:

>
> in ZK log, there are some close socket logs as follows:
>
>  [Thread-77061] INFO
> org.apache.zookeeper.server.NIOServerCnxn.closeSock(1007) -Closed socket
> connection for client /x.x.x.x:54312 (no session established for client)
>
>
> From this log info, I cannot determine the process name, because port
> 54312 will already have been released by the process.
> Is there any useful method?
>
>
> wangyongqiang0...@163.com
>


Re: How to upgrade zookeeper from 3.4.8 to 3.4.13?

2018-08-30 Thread Norbert Kalmar
It's not because of versions. It's because there are multiple
zookeeper jars in the tarball, like zookeeper-XY-bin.jar, zookeeper-XY.jar,
etc. (test, sources, javadoc).

I'm not sure why there are so many and why some overlap, but this is how it is
currently. So the script adds all of them to the classpath. But there
shouldn't be jar files from multiple versions of ZooKeeper present in the
directory.

Well, at least this is my understanding.
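
To make the effect visible, here is a small sketch that mirrors the loop
quoted from zkEnv.sh. The function name is made up, and unlike the real
script it starts from an empty classpath instead of the existing CLASSPATH
variable.

```sh
# Build a classpath the same way the quoted zkEnv.sh loop does: every
# zookeeper-*.jar in the directory is prepended, so the jar that the glob
# returns last ends up first in the result.
build_classpath() {
  dir=$1
  cp=""
  for i in "$dir"/zookeeper-*.jar; do
    # Skip the unexpanded pattern when no jar matches.
    [ -e "$i" ] || continue
    cp="$i:$cp"
  done
  echo "$cp"
}
```

With both zookeeper-3.4.8.jar and zookeeper-3.4.13.jar in the directory,
the glob lists 3.4.13 before 3.4.8 (character-wise "1" sorts before "8"),
so after prepending, the 3.4.8 jar comes first on the classpath. Since the
JVM takes a class from the first classpath entry that contains it, the old
version keeps being loaded until the old jar is removed.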



On Thu, Aug 30, 2018 at 12:01 PM Debraj Manna 
wrote:

> Thanks Norbert for replying.
>
> What is the intention of this check *for i in "$ZOOBINDIR"/../zookeeper-*.*
> *jar* ? Do we want to add the latest version first in the CLASSPATH? If
> that is the case then I think this will fail for versions like 3.4.10
> onwards if we also have 3.4.8, 3.4.5 etc in that directory?
>
> On Thu, Aug 30, 2018 at 3:13 PM Norbert Kalmar
> 
> wrote:
>
> > Hi,
> >
> > zkEnv.sh intentionally has the wild card, it's not a bug. As the
> zookeeper
> > jar has the version in its name due to the maven release, somehow we have
> > to point to the jar, but hard coding the version would be prone to error.
> >
> > Anyway, to answer your question, first of all I would just try to delete
> > the old .jar, and see if that solves the problem.
> >
> > Regards,
> > Norbert
> >
> > On Thu, Aug 30, 2018 at 9:01 AM Debraj Manna 
> > wrote:
> >
> > > Cross-posting from stackoverflow
> > > <
> > >
> >
> https://stackoverflow.com/questions/52090357/how-to-upgrade-zookeeper-from-from-3-8-to-3-13
> > > >
> > >
> > > I am trying to upgrade zookeeper from 3.4.8 to 3.4.13.
> > >
> > > Before upgrade the content of /usr/lib/zookeeper
> > >
> > > drwxr-xr-x 5 root root 4.0K Aug 23 08:39 . drwxr-xr-x 77 root root 12K
> > Aug
> > > 23 08:50 .. drwxr-xr-x 2 root root 4.0K Aug 23 08:39 bin lrwxrwxrwx 1
> > root
> > > root 19 May 24 11:25 conf -> /etc/zookeeper/conf drwxr-xr-x 2 root root
> > > 4.0K Aug 23 08:39 lib -rw-r--r-- 1 root root 12K May 24 11:25
> LICENSE.txt
> > > -rw-r--r-- 1 root root 170 May 24 11:25 NOTICE.txt -rw-r--r-- 1 root
> root
> > > 1.3M Aug 23 08:39 zookeeper-3.4.8.jar lrwxrwxrwx 1 root root 38 Aug 23
> > > 08:39 zookeeper.jar -> /usr/lib/zookeeper/zookeeper-3.4.8.jar
> > >
> > > As mentioned in answer <https://serverfault.com/a/758671/300869> I
> have
> > > downloaded the zookeeper from this link
> > > <
> > >
> >
> http://mirrors.fibergrid.in/apache/zookeeper/zookeeper-3.4.13/zookeeper-3.4.13.tar.gz
> > > >
> > > and
> > > placed the zookeeper-3.4.13.jar in /usr/lib/zookeeper and pointed the
> > > symbolic link like below
> > >
> > > lrwxrwxrwx  1 root root   39 Aug 30 03:19 zookeeper.jar ->
> > > /usr/lib/zookeeper/zookeeper-3.4.13.jar
> > >
> > > But on checking the status after restarting zookeeper it is still
> pointing
> > > to 3.4.8
> > >
> > > ubuntu@vrni-platform:/etc/zookeeper/conf$ telnet localhost 2181
> > > Trying 127.0.0.1...
> > > Connected to localhost.
> > > Escape character is '^]'.
> > > status
> > > Zookeeper version: 3.4.8--1, built on 02/06/2016 03:18 GMT
> > >
> > > It appears this is because of the way the jars are loaded from
> > > /usr/lib/zookeeper/bin/zkEnv.sh
> > >
> > > #release tarball format
> > > for i in "$ZOOBINDIR"/../zookeeper-*.jar
> > > do
> > >   CLASSPATH="$i:$CLASSPATH"
> > > done
> > >
> > > Can someone let me know if this is a known issue in zkEnv.sh? Is this
> > > expected?
> > >
> >
>


Re: How to upgrade zookeeper from 3.4.8 to 3.4.13?

2018-08-30 Thread Norbert Kalmar
Hi,

zkEnv.sh has the wildcard intentionally; it's not a bug. Since the zookeeper
jar has the version in its name due to the Maven release, we have to point
to the jar somehow, and hard-coding the version would be error-prone.

Anyway, to answer your question, first of all I would just try deleting
the old .jar and see if that solves the problem.

Regards,
Norbert

On Thu, Aug 30, 2018 at 9:01 AM Debraj Manna 
wrote:

> Cross-posting from stackoverflow
> <
> https://stackoverflow.com/questions/52090357/how-to-upgrade-zookeeper-from-from-3-8-to-3-13
> >
>
> I am trying to upgrade zookeeper from 3.4.8 to 3.4.13.
>
> Before upgrade the content of /usr/lib/zookeeper
>
> drwxr-xr-x 5 root root 4.0K Aug 23 08:39 . drwxr-xr-x 77 root root 12K Aug
> 23 08:50 .. drwxr-xr-x 2 root root 4.0K Aug 23 08:39 bin lrwxrwxrwx 1 root
> root 19 May 24 11:25 conf -> /etc/zookeeper/conf drwxr-xr-x 2 root root
> 4.0K Aug 23 08:39 lib -rw-r--r-- 1 root root 12K May 24 11:25 LICENSE.txt
> -rw-r--r-- 1 root root 170 May 24 11:25 NOTICE.txt -rw-r--r-- 1 root root
> 1.3M Aug 23 08:39 zookeeper-3.4.8.jar lrwxrwxrwx 1 root root 38 Aug 23
> 08:39 zookeeper.jar -> /usr/lib/zookeeper/zookeeper-3.4.8.jar
>
> As mentioned in answer  I have
> downloaded the zookeeper from this link
> <
> http://mirrors.fibergrid.in/apache/zookeeper/zookeeper-3.4.13/zookeeper-3.4.13.tar.gz
> >
> and
> placed the zookeeper-3.4.13.jar in /usr/lib/zookeeper and pointed the
> symbolic link like below
>
> lrwxrwxrwx  1 root root   39 Aug 30 03:19 zookeeper.jar ->
> /usr/lib/zookeeper/zookeeper-3.4.13.jar
>
> But on checking the status after restarting zookeeper it is still pointing
> to 3.4.8
>
> ubuntu@vrni-platform:/etc/zookeeper/conf$ telnet localhost 2181
> Trying 127.0.0.1...
> Connected to localhost.
> Escape character is '^]'.
> status
> Zookeeper version: 3.4.8--1, built on 02/06/2016 03:18 GMT
>
> It appears this is because of the way the jars are loaded from
> /usr/lib/zookeeper/bin/zkEnv.sh
>
> #release tarball format
> for i in "$ZOOBINDIR"/../zookeeper-*.jar
> do
>   CLASSPATH="$i:$CLASSPATH"
> done
>
> Can someone let me know if this is a known issue in zkEnv.sh? Is this
> expected?
>


Re: Port 3888 closed on Leader

2018-08-23 Thread Norbert Kalmar
Hi,

I don't fully understand the sequence of events here, but as I see it:
- The leader election port should not just disappear; maybe there is some config issue?
- If a follower is unable to rejoin the quorum, but there are still
n/2 + 1 nodes functioning (rounding n/2 down; for 3 nodes that is 2 functioning nodes), the
quorum will work without a problem, and the leader will not drop out of the
cluster.
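
The arithmetic behind the second point, as a tiny sketch (the function
names are made up; the integer division is what rounds n/2 down):

```sh
# Smallest number of servers that forms a majority of an n-server ensemble.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# How many servers may be down while a majority is still available.
tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}
```

For a 3-node ensemble quorum_size 3 gives 2 and tolerated_failures 3 gives
1, so a single follower that cannot reconnect does not stop the remaining
two servers from serving.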

Regards,
Norbert

On Wed, Aug 15, 2018 at 3:46 PM harish lohar  wrote:

> Hi,
>
> In a deployment of a 3-node ZK cluster we have seen that sometimes port 3888
> is absent after the cluster is formed. This causes follower nodes to be
> unable to connect to the leader if they restart.
>
> Shouldn't the leader itself drop out of the cluster if this happens?
>
> Thanks
> Harish
>


Re: Unable to stop zookeeper-server service

2018-08-14 Thread Norbert Kalmar
What does the ZK log say when you try to stop it?

( As a side note, I'm not a big expert on systemd )

On Tue, Aug 14, 2018 at 1:26 PM Suraj Bora  wrote:

> Yes.. its 3.4.6 to 3.4.10, sorry it my bad.
>
> Whenever i am trying to restart zookeeper service, I am getting below
> error.
>
> [root@localhost]# /etc/init.d/zookeeper-server stop
> Stopping zookeeper-server (via systemctl):  Job for
> zookeeper-server.service canceled.
>[FAILED]
> [root@installer localhost]#
>
> On Tue, 14 Aug 2018 at 15:40 Norbert Kalmar 
> wrote:
>
> > Hi,
> >
> > There should be no problem upgrading between minor versions. But there is
> > no release of 3.6 yet, so I think you meant 3.4.x ?
> > Also, what is the error you mentioned?
> >
> > There are some new configuration introduced with Quorum peer mutual
> > authentication, for more info, see:
> >
> >
> https://cwiki.apache.org/confluence/display/ZOOKEEPER/Server-Server+mutual+authentication
> >
> > Mainly look out for:
> > quorum.auth.enableSasl
> > quorum.auth.learnerRequireSasl
> > quorum.auth.serverRequireSasl
> > quorum.auth.kerberos.servicePrincipal
> >
> > Regards,
> > Norbert
> >
> > On Mon, Aug 13, 2018 at 7:48 PM Suraj Bora 
> wrote:
> >
> > > Hi Team,
> > >
> > > I am trying to upgrade zookeeper-server from 3.6.4 to 3.6.10., but
> > getting
> > > below error during stop zookeeper-server stage.
> > >
> > >
> > > Step followed:
> > >
> > > 1. Upgraded zookeeper to 3.6.10
> > >
> > > 2. systemctl daemon-reload
> > >
> > > 3. restart zookeeper-server
> > >
> > > [root@localhost]# /usr/bin/systemctl stop zookeeper-server
> > >
> > > Job for zookeeper-server.service canceled. [root@localhost]#
> > >
> > > Also please let me know configuration required to support Missing
> > > Authentication Remote Quorum Joining functionality.
> > >
> > > Thanks in advance.
> > >
> > > Regards,
> > >
> > > Suraj Bora
> > >
> > > --
> > >
> > > Thanks and Regards,
> > > Suraj Bora
> > > M. No.: 7745082011 <077450%2082011>
> > >
> >
> --
>
> Thanks and Regards,
> Suraj Bora
> M. No.: 7745082011
>


Re: Unable to stop zookeeper-server service

2018-08-14 Thread Norbert Kalmar
Hi,

There should be no problem upgrading between minor versions. But there is
no release of 3.6 yet, so I think you meant 3.4.x?
Also, what is the error you mentioned?

There are some new configuration options introduced with quorum peer mutual
authentication. For more info, see:
https://cwiki.apache.org/confluence/display/ZOOKEEPER/Server-Server+mutual+authentication

Mainly look out for:
quorum.auth.enableSasl
quorum.auth.learnerRequireSasl
quorum.auth.serverRequireSasl
quorum.auth.kerberos.servicePrincipal
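
To see quickly which of these settings a given zoo.cfg already has, something like the following could be used. It is only a sketch: the key names are the four listed above, while the helper functions are hypothetical and just parse plain key=value lines.

```python
# Sketch: report which quorum SASL settings are absent from zoo.cfg-style text.
# Only the key names come from the list above; the helpers are hypothetical.
QUORUM_AUTH_KEYS = (
    "quorum.auth.enableSasl",
    "quorum.auth.learnerRequireSasl",
    "quorum.auth.serverRequireSasl",
    "quorum.auth.kerberos.servicePrincipal",
)


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def missing_quorum_auth_keys(text):
    """Return the quorum.auth.* keys from the list that the config does not set."""
    props = parse_properties(text)
    return [key for key in QUORUM_AUTH_KEYS if key not in props]
```

Calling missing_quorum_auth_keys(open("conf/zoo.cfg").read()) would list the keys still to be decided on; which values to use depends on the setup described on the wiki page.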

Regards,
Norbert

On Mon, Aug 13, 2018 at 7:48 PM Suraj Bora  wrote:

> Hi Team,
>
> I am trying to upgrade zookeeper-server from 3.6.4 to 3.6.10., but getting
> below error during stop zookeeper-server stage.
>
>
> Step followed:
>
> 1. Upgraded zookeeper to 3.6.10
>
> 2. systemctl daemon-reload
>
> 3. restart zookeeper-server
>
> [root@localhost]# /usr/bin/systemctl stop zookeeper-server
>
> Job for zookeeper-server.service canceled. [root@localhost]#
>
> Also please let me know configuration required to support Missing
> Authentication Remote Quorum Joining functionality.
>
> Thanks in advance.
>
> Regards,
>
> Suraj Bora
>
> --
>
> Thanks and Regards,
> Suraj Bora
> M. No.: 7745082011
>


Re: Problem with ZK log files and snapshots

2018-07-18 Thread Norbert Kalmar
Hi,

You should use PurgeTxnLog as per
https://zookeeper.apache.org/doc/r3.4.8/zookeeperAdmin.html#sc_maintenance
It needs 3 parameters: the txn log dir, the snapshot dir, and the number of
snapshots/logs to keep (3 minimum!).

Example:
java -cp zookeeper.jar:lib/slf4j-api-1.6.1.jar:lib/slf4j-log4j12-1.6.1.jar:lib/log4j-1.2.15.jar:conf \
  org.apache.zookeeper.server.PurgeTxnLog /my/txn/log /my/snapshot/dir -n 3

A cronjob can handle the cleanup.

Or, from 3.4.0 on, you can enable automatic purging with
autopurge.snapRetainCount and autopurge.purgeInterval.
See
https://zookeeper.apache.org/doc/r3.4.8/zookeeperAdmin.html#sc_advancedConfiguration
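
To illustrate the retention idea (keep the newest N snapshots, drop what is older), here is a dry-run sketch. It is not the PurgeTxnLog implementation. It assumes files are named snapshot.<hex zxid> and log.<hex zxid>, and, as a conservative choice of mine, it keeps the newest log that starts before the oldest retained snapshot, since that file may still hold later transactions. It only returns names and deletes nothing.

```python
def zxid_of(name):
    """Return the zxid encoded in 'snapshot.<hex>' or 'log.<hex>', else None."""
    prefix, _, suffix = name.partition(".")
    if prefix not in ("snapshot", "log") or not suffix:
        return None
    try:
        return int(suffix, 16)
    except ValueError:
        return None


def plan_purge(snapshots, logs, retain=3):
    """Return (snapshots_to_delete, logs_to_delete) without touching any file."""
    if retain < 3:
        raise ValueError("retain must be at least 3")
    snaps = sorted((n for n in snapshots if zxid_of(n) is not None), key=zxid_of)
    txns = sorted((n for n in logs if zxid_of(n) is not None), key=zxid_of)
    keep_snaps = snaps[-retain:]
    if not keep_snaps:
        return [], []
    oldest_kept = zxid_of(keep_snaps[0])
    older_logs = [n for n in txns if zxid_of(n) < oldest_kept]
    # Keep the newest of the older logs: it may contain transactions that are
    # newer than the oldest retained snapshot.
    return snaps[:-retain], older_logs[:-1]
```

For real cleanups PurgeTxnLog or autopurge remains the tool to use; the sketch is only meant to show what "keep N" means.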

Regards,
Norbert

On Wed, Jul 18, 2018 at 11:49 AM Jostein Elvaker Haande 
wrote:

> Another thing I can't make heads or tails of, is whether or not it's safe
> to delete these transaction logs and snapshots manually.
>
> If yes, should this be done while ZooKeeper is stopped? Some of my
> instances are consuming quite a bit of disk space because of this issue,
> and to keep increasing disk space to "fix" the issue just isn't an option.
>
> --
> Yours sincerely Jostein Elvaker Haande
> "A free society is a society where it is safe to be unpopular"
> - Adlai Stevenson
>
> http://tolecnal.net -- tolecnal at tolecnal dot net
>


Re: ZooKeeper Cluster Health Checking

2018-07-17 Thread Norbert Kalmar
Hi Adrien,

Take a look at monitoring in src/contrib/monitoring - it does what you
would like to achieve, in python. Read the README for more information:
https://github.com/apache/zookeeper/tree/master/src/contrib/monitoring

If this one is not good for you, you can use JMX to query MBeans.

A heads-up: at some point, the four-letter words will be deprecated and
possibly removed due to security issues.
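
If a small script is enough for now, the mntr output quoted below is plain key/value text and is easy to parse. A minimal sketch, assuming the four-letter words are still enabled on your servers; the function names are mine, and fetch_four_letter does the same as "echo mntr | nc host port":

```python
import socket


def parse_mntr(text):
    """Turn 'key<whitespace>value' lines into a dict, converting integer values."""
    metrics = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        key, value = parts[0], parts[1].strip()
        metrics[key] = int(value) if value.lstrip("-").isdigit() else value
    return metrics


def fetch_four_letter(host, port, command="mntr", timeout=5.0):
    """Send a four-letter word to a server and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

For example, parse_mntr(fetch_four_letter("localhost", 2181)) gives a dict on which an alert could check zk_server_state or zk_outstanding_requests. As the documentation says, the set of keys may change, so scripts should not assume a key is present.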

Regards,
Norbert

On Tue, Jul 17, 2018 at 8:00 AM adrien ruffie 
wrote:

> Hi Harish,
>
>
> thank you very much for this advise and explanation !
>
> Do you think with just a simple script shell for checking all this metrics
> is enough ? Or would better to do it in a Java with a simple monitoring
> application?
>
>
> Thank again,
>
>
> Best regards,
>
>
> Adrien
>
> 
> From: harish lohar 
> Sent: Tuesday, July 17, 2018 04:13:51
> To: user@zookeeper.apache.org
> Subject: Re: ZooKeeper Cluster Health Checking
>
> Hi Adrian,
> Below zookeeper commands are generally used to get health of zookeeper
> cluster
> stat
>
> Lists brief details for the server and connected clients.
>
> usage echo stat | nc server port
>
> This gives whether cluster is up /down. If down this will give that
>
> Zookeeper instance is currently not serving any request -  which means
> either the leader election is failing or <= 50% of zookeeper node in
> cluster are down.
>
>
> mntr
>
> *New in 3.4.0:* Outputs a list of variables that could be used for
> monitoring the health of the cluster.
>
> $ echo mntr | nc localhost 2185
>
> zk_version  3.4.0
> zk_avg_latency  0
> zk_max_latency  0
> zk_min_latency  0
> zk_packets_received 70
> zk_packets_sent 69
> zk_outstanding_requests 0
> zk_server_state leader
> zk_znode_count   4
> zk_watch_count  0
> zk_ephemerals_count 0
> zk_approximate_data_size 27
> zk_followers 4   - only exposed by the Leader
> zk_synced_followers 4   - only exposed by the Leader
> zk_pending_syncs 0   - only exposed by the Leader
> zk_open_file_descriptor_count 23   - only available on Unix platforms
> zk_max_file_descriptor_count 1024   - only available on Unix platforms
>
> The output is compatible with java properties format and the content may
> change over time (new keys added). Your scripts should expect changes.
>
> ATTENTION: Some of the keys are platform specific and some of the keys are
> only exported by the Leader.
>
> The output contains multiple lines with the following format:
>
>
> On Mon, Jul 16, 2018 at 10:13 AM adrien ruffie 
> wrote:
>
> > Hello all,
> >
> >
> > In my company we have a Zookeeper production cluster.
> >
> >
> > But we don't really know how can we check the health of our cluster...
> >
> >
> > Can we advise us about this topic ?
> >
> >
> > I know this topic may has been cropping up for a while, but I don't
> really
> > found any concrete solution.
> >
> >
> > Do you use a monitoring tools ? Which can launch alert ?
> >
> > What metrics/properties/any thing which can indicate that our cluster
> > isn't in good health.
> >
> >
> > Thank you very much and best regards
> >
> >
> > Adrien
> >
>


Re: zookeeper as systemd

2018-07-16 Thread Norbert Kalmar
Some prefer to use start-foreground to basically "bypass" the bash script
(and the background management), and make systemd care about it. Also to
ignore any nohup. So while it is also my understanding that the recommended
way is to start ZK with simple start, some use start-foreground to start as
a systemd service.

Type=simple - drawback is that dependent services (if any) will not know if it
is truly ready (sockets as well)
Type=forking - drawback is there might be some issues with the PID (file).
So you should also define where the PID file
is: PIDFile=/var/run/zookeeper/zookeeper_server.pid
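
If the readiness gap with Type=simple matters, a dependent service (or possibly an ExecStartPost step, which I have not tried) could wait until the client port accepts connections. A rough sketch; wait_for_port is a hypothetical helper, and an open port only shows that the socket is listening, not that the server is already serving requests:

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Poll until a TCP connection to host:port succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Something like wait_for_port("localhost", 2181, timeout=60) would then gate whatever depends on ZooKeeper; 2181 is just the usual client port and has to match clientPort in zoo.cfg.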

Any combination should work. It really comes down to preference, I think.

Disclaimer: I'm kind of trying to make educated guesses here, so a
Linux-slash-zookeeper guru could probably be more of a help here :)

This is how I would create the systemd service file (this time using start,
not start-foreground) - or something like that:
Put the systemd service file in: /etc/systemd/system/zookeeper.service

[Unit]
Description=Zookeeper
After=network.target syslog.target

[Service]
SyslogIdentifier=zookeeper
TimeoutStartSec=10min
Type=forking
User=zookeeper
Group=zookeeper
PIDFile=/var/run/zookeeper/zookeeper_server.pid
ExecStart=/usr/lib/zookeeper/bin/zkServer.sh start
ExecStop=/usr/lib/zookeeper/bin/zkServer.sh stop
WorkingDirectory=/var/lib/zookeeper

[Install]
WantedBy=multi-user.target


Of course not everything is required (User, Group, WorkingDirectory
possibly optional)


But I would dig deeper before saying anything for sure.


Regards,

Norbert

On Mon, Jul 16, 2018 at 3:01 PM adrien ruffie 
wrote:

> Thank Norbert for this good explanation.
>
> Yes I'm also a really lost here a bit ...
>
> But it's a zookeeper production's cluster of 5 nodes.
> It should be start instead start-foreground isn't it ? (with simple type)
> ____
> From: Norbert Kalmar 
> Sent: Monday, July 16, 2018 13:49:16
> To: user@zookeeper.apache.org
> Subject: Re: zookeeper as systemd
>
> The type is Linux config, and forking is used when we want the process to
> call fork() during startup. This should guarantee that when startup is
> finished, all ports are open. In my understanding, using FORKING will tell
> systemd the startup is complete and ports are open. See
> https://www.freedesktop.org/software/systemd/man/systemd.service.html for
> more information.
> TYPE=simple is mainly for daemons that don't use network channels, or they
> have their own socket activation.
>
> So in my opinion, TYPE=forking is better for ZK. But! Forking should
> require a PIDfile in order for it to work properly... but since systemd is
> used, it should know the PID as it is a supervisor. I'm also lost here a
> bit...
> Looking at ZK PID file should be
> available: ZOOPIDFILE="$ZOO_DATADIR/zookeeper_server.pid"
>
>
> About the simple start or start-foreground... I think I was quick to say
> only use simple start in prod environment. This is much more complicated,
> for example start-foreground would just make it ignore nohup.
>
> Doing some search, these 2 questions might help:
>
> https://stackoverflow.com/questions/40620544/systemd-zookeeper-service-failed
>
> https://askubuntu.com/questions/979498/how-to-start-a-zookeeper-daemon-after-booting-under-specific-user-in-ubuntu-serv
>
> Putting this all together, I think if you don't really care when socket is
> ready, just use TYPE=simple and call with start-foreground.
>
> Regards,
> Norbert
>
>
>
> On Mon, Jul 16, 2018 at 11:48 AM adrien ruffie 
> wrote:
>
> > Thank Nobert !
> >
> >
> > It really help me,
> >
> >
> > according to you, what would you recommend?
> >
> >
> > To launch with "Type=simple" and zkServer.sh start ?
> >
> > Or "Type=forking" and zkServer.sh start ?
> >
> >
> > Because start command launch Zookeeper as a Daemon,
> >
> > but if I use "Type=simple" the system already daemonize the process ...
> >
> > Do you think that can be daemonize a daemon ... ? Strange
> >
> >
> > I really ready to use "Type=forking" option but, according to this
> > following post
> >
> >
> > https://bbs.archlinux.org/viewtopic.php?id=191669
> >
> >
> > the "Type=forking" is not really recommended ...
> >
> > what do you think ?
> >
> >
> > Adrien
> >
> >
> > 
> > From: Norbert Kalmar 
> > Sent: Monday, July 16, 2018 10:15
> > To: user@zookeeper.apache.org
> > Subject: Re: zookeeper as systemd
> >
> > Hi Adrien,
> >
> > zkServer.sh start-foreg

Re: zookeeper as systemd

2018-07-16 Thread Norbert Kalmar
Type is a systemd setting, and forking is used when we want the process to
call fork() during startup. This should guarantee that when startup is
finished, all ports are open. In my understanding, using Type=forking will tell
systemd the startup is complete and ports are open. See
https://www.freedesktop.org/software/systemd/man/systemd.service.html for
more information.
Type=simple is mainly for daemons that don't use network channels, or that
have their own socket activation.

So in my opinion, Type=forking is better for ZK. But! Forking should
require a PID file in order for it to work properly... but since systemd is
used, it should know the PID as it is a supervisor. I'm also lost here a
bit...
Looking at ZK, a PID file should be
available: ZOOPIDFILE="$ZOO_DATADIR/zookeeper_server.pid"
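
To check by hand whether that PID file points at a live process, a sketch like the one below could help. It is POSIX-only (do not run it on Windows, where os.kill behaves differently), and both helper names are made up for illustration:

```python
import os


def read_pid(path):
    """Return the integer PID stored in a PID file, or None if unreadable."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


def pid_is_running(pid):
    """POSIX only: signal 0 checks that a process exists without signalling it."""
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

With the default script settings the path would be zookeeper_server.pid under the dataDir from zoo.cfg, so pid_is_running(read_pid(...)) on that file tells whether the recorded process is still there.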


About the simple start or start-foreground... I think I was too quick to say
only use simple start in a prod environment. This is much more complicated;
for example, start-foreground would just make it ignore nohup.

After some searching, these 2 questions might help:
https://stackoverflow.com/questions/40620544/systemd-zookeeper-service-failed
https://askubuntu.com/questions/979498/how-to-start-a-zookeeper-daemon-after-booting-under-specific-user-in-ubuntu-serv

Putting this all together, I think if you don't really care when the socket is
ready, just use Type=simple and call with start-foreground.

Regards,
Norbert



On Mon, Jul 16, 2018 at 11:48 AM adrien ruffie 
wrote:

> Thank Nobert !
>
>
> It really help me,
>
>
> according to you, what would you recommend?
>
>
> To launch with "Type=simple" and zkServer.sh start ?
>
> Or "Type=forking" and zkServer.sh start ?
>
>
> Because start command launch Zookeeper as a Daemon,
>
> but if I use "Type=simple" the system already daemonize the process ...
>
> Do you think that can be daemonize a daemon ... ? Strange
>
>
> I really ready to use "Type=forking" option but, according to this
> following post
>
>
> https://bbs.archlinux.org/viewtopic.php?id=191669
>
>
> the "Type=forking" is not really recommended ...
>
> what do you think ?
>
>
> Adrien
>
>
> 
> From: Norbert Kalmar 
> Sent: Monday, July 16, 2018 10:15
> To: user@zookeeper.apache.org
> Subject: Re: zookeeper as systemd
>
> Hi Adrien,
>
> zkServer.sh start-foreground - starts the ZooKeeper process in the
> foreground. Good for debugging (that's what I use it for), or to check
> something, as you will have the logs printed to standard output (console
> most probably).
> The "start" is what you want to use in a production environment. The process
> will run in the background.
>
> ' What is "After=network.target" ? ' - ZooKeeper should only start after
> the network is... available? I think this should be something like
> After=network-online.target
>
> But looking at the others, I'm not entirely sure either what they really
> do. But check out this jira -
> https://issues.apache.org/jira/browse/ZOOKEEPER-2095
>
> There was a patch about this, and they added systemd startup/conf files.
>
> Sorry, this is all I could come up with, as I'm not familiar with this part
> either. Hope it helps.
>
> Regards,
> Norbert
>
>
> On Fri, Jul 13, 2018 at 5:30 PM adrien ruffie 
> wrote:
>
> > Hello Zookeeper's users,
> >
> >
> > I have 2 questions for you.
> >
> >
> > what is the real difference between these 2 following commands ? (I don't
> > find any documentation)
> >
> >
> > zkServer.sh start-foreground
> >
> > and
> >
> > zkServer.sh start
> >
> >
> >
> > My second question is, how I can correctly start my zookeeper as a
> > systemclt service ?
> >
> > What is the common best template to write into
> > /etc/systemd/system/zookeeper.service ?
> >
> > Do you use Restart=always ? RestartSec=0s ?
> >
> > What is "After=network.target" ?
> >
> > If my Zookeeper does not really start in 300 sec, the process will be
> > shutdown ?
> >
> >
> > Do you have any example of zookeeper service file ?
> >
> >
> > Because our zookeeper.service is right now:
> >
> >
> > [Unit]
> > Description=ZooKeeper
> >
> > [Service]
> > Type=simple
> > User=zookeeper
> > Group=zookeeper
> > ExecStart=/usr/local/zookeeper-3.4.9/bin/zkServer.sh start-foreground
> >
> > TimeoutSec=300
> >
> > [Install]
> > WantedBy=multi-user.target
> >
> > --- But I found this following on a blog:
> >
> >
> > [Unit]
> > Description=Apache Zookeeper
> > After=network.target
> >
> > [Service]
> > Type=forking
> > User=zookeeper
> > Group=zookeeper
> > SyslogIdentifier=zookeeper
> > Restart=always
> > RestartSec=0s
> > ExecStart=/usr/bin/zookeeper-server start
> > ExecStop=/usr/bin/zookeeper-server stop
> > ExecReload=/usr/bin/zookeeper-server restart
> >
> > [Install]
> > WantedBy=multi-user.target
> >
> >
> > Thank you very much and best regards
> >
> > Adrien
> >
>


Re: zookeeper as systemd

2018-07-16 Thread Norbert Kalmar
Hi Adrien,

zkServer.sh start-foreground - starts the ZooKeeper process in the
foreground. Good for debugging (that's what I use it for), or to check
something, as you will have the logs printed to standard output (console
most probably).
The "start" is what you want to use in a production environment. The process
will run in the background.

' What is "After=network.target" ? ' - ZooKeeper should only start after
the network is... available? I think this should be something like
After=network-online.target

But looking at the others, I'm not entirely sure either what they really
do. But check out this jira -
https://issues.apache.org/jira/browse/ZOOKEEPER-2095

There was a patch about this, and they added systemd startup/conf files.

Sorry, this is all I could come up with, as I'm not familiar with this part
either. Hope it helps.

Regards,
Norbert


On Fri, Jul 13, 2018 at 5:30 PM adrien ruffie 
wrote:

> Hello Zookeeper's users,
>
>
> I have 2 questions for you.
>
>
> what is the real difference between these 2 following commands ? (I don't
> find any documentation)
>
>
> zkServer.sh start-foreground
>
> and
>
> zkServer.sh start
>
>
>
> My second question is, how I can correctly start my zookeeper as a
> systemclt service ?
>
> What is the common best template to write into
> /etc/systemd/system/zookeeper.service ?
>
> Do you use Restart=always ? RestartSec=0s ?
>
> What is "After=network.target" ?
>
> If my Zookeeper does not really start in 300 sec, the process will be
> shutdown ?
>
>
> Do you have any example of zookeeper service file ?
>
>
> Because our zookeeper.service is right now:
>
>
> [Unit]
> Description=ZooKeeper
>
> [Service]
> Type=simple
> User=zookeeper
> Group=zookeeper
> ExecStart=/usr/local/zookeeper-3.4.9/bin/zkServer.sh start-foreground
>
> TimeoutSec=300
>
> [Install]
> WantedBy=multi-user.target
>
> --- But I found this following on a blog:
>
>
> [Unit]
> Description=Apache Zookeeper
> After=network.target
>
> [Service]
> Type=forking
> User=zookeeper
> Group=zookeeper
> SyslogIdentifier=zookeeper
> Restart=always
> RestartSec=0s
> ExecStart=/usr/bin/zookeeper-server start
> ExecStop=/usr/bin/zookeeper-server stop
> ExecReload=/usr/bin/zookeeper-server restart
>
> [Install]
> WantedBy=multi-user.target
>
>
> Thank you very much and best regards
>
> Adrien
>


Re: Observer went down with Read timed out exception

2018-07-03 Thread Norbert Kalmar
Hi Ram,

Are you sure there was no network error? To me, this looks like it could
be due to failed heartbeats (as shutdown was called after the timeout).

It is also possible the leader was busy (maybe a garbage collection
pause?) - especially if you store big(ish) chunks of data in ZooKeeper.
(There is a plan to integrate JVMPauseMonitor into ZooKeeper for this reason,
actually.)
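
The idea behind such a pause monitor is simple: sleep for a fixed interval and see how much later than expected the thread wakes up. The sketch below only illustrates that idea in Python. It is not ZooKeeper's or Hadoop's JVMPauseMonitor, and the names and thresholds are my own:

```python
import time


def overshoots(wakeups, interval, threshold):
    """Return (index, extra_delay) for each gap between successive wake-up
    timestamps that exceeds the sleep interval by more than threshold."""
    found = []
    for i in range(1, len(wakeups)):
        extra = (wakeups[i] - wakeups[i - 1]) - interval
        if extra > threshold:
            found.append((i, extra))
    return found


def sample_wakeups(count, interval):
    """Sleep `count` times for `interval` seconds and record the wake-up times."""
    stamps = [time.monotonic()]
    for _ in range(count):
        time.sleep(interval)
        stamps.append(time.monotonic())
    return stamps
```

In the JVM the same pattern shows GC or other stop-the-world pauses, because the monitoring thread is stalled together with everything else. Until something like that is built in, the GC logs of the leader are the place to look.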

Regards,
Norbert

On Mon, Jul 2, 2018 at 9:13 PM rammohan ganapavarapu <
rammohanga...@gmail.com> wrote:

> All,
>
> I have multi data-center ldap cluster setup with other data-center with all
> observers all of sudden all the observer threads went down with the
> following message, any idea why they went down? We don't see any network
> related issues between data-centers.
>
>
> 2018-06-29 05:32:59,036 [myid:222] - WARN
> [QuorumPeer[myid=222]/0:0:0:0:0:0:0:0:2181:Observer@79] - Exception when
> observing the leader
> java.net.SocketTimeoutException: Read timed out
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
> at java.net.SocketInputStream.read(SocketInputStream.java:170)
> at java.net.SocketInputStream.read(SocketInputStream.java:141)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
> at java.io.DataInputStream.readInt(DataInputStream.java:387)
> at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> at
>
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
> at
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
> at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:152)
> at
> org.apache.zookeeper.server.quorum.Observer.observeLeader(Observer.java:75)
> at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:727)
> 2018-06-29 05:32:59,244 [myid:222] - INFO
> [QuorumPeer[myid=222]/0:0:0:0:0:0:0:0:2181:Observer@137] - shutdown called
> java.lang.Exception: shutdown Observer
> at org.apache.zookeeper.server.quorum.Observer.shutdown(Observer.java:137)
> at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:731)
>
>
> Thanks,
> Ram
>


Re: How to new quorum leader in ZK Cluster ( except from stat command)

2018-06-18 Thread Norbert Kalmar
Sorry, Enrico is right, the two methods I mentioned only give information
about the server itself, just checked.
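
So in practice you ask every server for its own mode and pick the one that reports leader. A small sketch, assuming you have already collected the text of the srvr (or stat) four-letter word from each node, e.g. with "echo srvr | nc host port"; the function names are hypothetical:

```python
def parse_mode(srvr_output):
    """Extract the value of the 'Mode:' line from srvr/stat output, or None."""
    for line in srvr_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def find_leaders(outputs_by_server):
    """Return the servers whose own output reports 'Mode: leader'."""
    return sorted(
        server
        for server, output in outputs_by_server.items()
        if parse_mode(output) == "leader"
    )
```

A healthy ensemble should give exactly one entry; an empty list usually means an election is in progress or no quorum is up.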

Regards,
Norbert

On Mon, Jun 18, 2018 at 3:37 PM Norbert Kalmar  wrote:

> Hi,
>
> You can also use AdminServer - jetty (unless you are using 3.4 - see
> https://zookeeper.apache.org/doc/r3.5.0-alpha/zookeeperAdmin.html#sc_adminserver
> ) or JMX (
> http://zookeeper.apache.org/doc/r3.5.4-beta/zookeeperJMX.html#ch_reference
> ).
>
> (It can also be found in the logs of each instance)
>
> Regards,
> Norbert
>
> On Mon, Jun 18, 2018 at 2:57 PM harish lohar  wrote:
>
>> Hi,
>>
>> Is there a way to query on any follower node and find out about the leader
>> of the ZK cluster.
>>
>>
>> Thanks
>> Harish
>>
>


Re: How to new quorum leader in ZK Cluster ( except from stat command)

2018-06-18 Thread Norbert Kalmar
Hi,

You can also use AdminServer - jetty (unless you are using 3.4 - see
https://zookeeper.apache.org/doc/r3.5.0-alpha/zookeeperAdmin.html#sc_adminserver
) or JMX (
http://zookeeper.apache.org/doc/r3.5.4-beta/zookeeperJMX.html#ch_reference
).

(It can also be found in the logs of each instance)

Regards,
Norbert

On Mon, Jun 18, 2018 at 2:57 PM harish lohar  wrote:

> Hi,
>
> Is there a way to query on any follower node and find out about the leader
> of the ZK cluster.
>
>
> Thanks
> Harish
>


Re: Error starting zookeeper using bash on Windows 10

2018-04-30 Thread Norbert Kalmar
Hi,

zkServer.sh sets the classpath itself (with -cp, but including the
CLASSPATH variable). You don't need to set it manually.
But if that doesn't work: for a jar file, you have to include the .jar file
itself. It is not enough to set the directory where the .jar file is; you have
to specify the .jar file, e.g.:
C:\DevelopmentTools\zookeeper\zookeeper-3.4.10\zookeeper-3.4.10.jar

Let me know if it is still not working. You can try to print the classpath
with echo to see if the .jar file is included (or the root package).
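
As a quick check for this kind of mistake, the sketch below lists classpath entries that are plain directories containing .jar files. Java does not pick up jars from a bare directory entry (only class files), so such entries are the usual suspects. The helper name is made up, and the separator defaults to that of the platform where the script runs (pass ";" explicitly for a Windows-style value):

```python
import os


def directories_hiding_jars(classpath, sep=os.pathsep):
    """Map each directory entry on the classpath to the .jar files inside it;
    those jars are NOT loaded unless listed themselves or via a dir/* wildcard."""
    suspects = {}
    for entry in classpath.split(sep):
        entry = entry.strip()
        if not entry or not os.path.isdir(entry):
            continue
        jars = sorted(n for n in os.listdir(entry) if n.lower().endswith(".jar"))
        if jars:
            suspects[entry] = jars
    return suspects
```

Running it on the value printed by echo $CLASSPATH should flag the zookeeper-3.4.10 directory in the case described.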

By the way, if you are using Linux bash, I would also try forward
slashes, as backslashes tend to be problematic sometimes.

Hope this helps.

Regards,
Norbert





On Sun, Apr 29, 2018 at 7:41 PM, THADC 
wrote:

> Hello,
>
> I am not sure whether my problem has anything to do with bash exactly, but
> I
> am trying to run zookeeper server (3.4.10) for the first time. By the way,
> I
> did search for this in archive but could not find a match that worked for
> me.
>
> When I started it in foreground mode, I get the following:
>
> *$ ./bin/zkServer.sh start-foreground
> ZooKeeper JMX enabled by default
> Using config:
> /c/DevelopmentTools/zookeeper/zookeeper-3.4.10/bin/../conf/zoo.cfg
> Error: Could not find or load main class
> org.apache.zookeeper.server.quorum.QuorumPeerMain*
>
> , that class is in the zookeeper-3.4.10.jar, which is directly in the
> zookeeper-3.4.10 root directory. I added that directory to my classpath:
>
> *$ echo $CLASSPATH
> C:\Program Files\Java\jdk1.8.0_112\bin;C:\Program
> Files\Java\jre1.8.0_112\bin;C:\DevelopmentTools\zookeeper\
> zookeeper-3.4.10*
>
> , so as you can see above, the directory is in the classpath, but I am
> still
> getting the error. Any ideas what my problem is?
>
> Thank you!
>
>
>
>
> --
> Sent from: http://zookeeper-user.578899.n2.nabble.com/
>


Re: Zookepeper compatability with RHEL 7.4

2018-04-19 Thread Norbert Kalmar
Hi Apoorva,

It should, as the only things needed are Java and working .sh scripts.
I just tested on a new CentOS 7.4 (should be the same as RHEL 7.4
library-wise); I only installed java and git with yum, and ant with wget.
master (~3.5) and 3.4 both run standalone and in a cluster of 3 without
an error.

Maybe someone with RHEL 7.4 can also confirm, but again, binary-wise, as far
as I know CentOS should be the same as RHEL 7.4.

Regards,
Norbert

On Wed, Apr 18, 2018 at 6:41 AM, Apoorva Maheshwari <
apoorva.maheshw...@ericsson.com> wrote:

> Hello,
>
>
> Request you to please confirm RHEL 7.4 compatibility with Apache Zookeeper
> 3.x
>
>
>
>
> Thanks
>
> [Ericsson]
>
> APOORVA MAHESHWARI
> Configuration Engineer
> BDGS, R
> 2nd Floor, ASF Insignia - Block B Kings Canyon,
> Gwal Pahari, Gurgaon, Haryana 122003, India
> Phone: 8860498817
> apoorva.maheshw...@ericsson.com
> www.ericsson.com
>
>


Re: Need help installing Zookeeper service in Ubuntu 16.04

2018-04-11 Thread Norbert Kalmar
Hi Greg,

I second Shawn: can you try installing the currently supported package, ZK
3.4.8, on Ubuntu 16.04?
At the end of April the new LTS Ubuntu should be released (see
https://wiki.ubuntu.com/BionicBeaver/ReleaseSchedule), which has the 3.4.10
zookeeper version (https://packages.ubuntu.com/bionic/zookeeper).
You could upgrade then.

Best regards,
Norbert

On Wed, Apr 11, 2018 at 4:09 AM, Shawn Heisey  wrote:

> On 4/10/2018 7:43 PM, Gregorius Soedharmo wrote:
>
>> Thank you for your help, but unfortunately, it all sounds gibberish to me.
>> As stated in the stack exchange question, I'm a complete Linux newbie that
>> couldn't even properly install a piece of software in Ubuntu. I did
>> include
>> all of my efforts so far in the question.
>>
>> Do you think it is best to scrap it and try a different installation
>> approach instead?
>>
>
> Unless you're absolutely certain that you need a new feature only
> available in 3.4.9 or newer, I would just run "apt-get install zookeeper"
> and use the 3.4.8 version provided by Ubuntu.  I do not know where that
> package will install its configuration, but it probably won't be all that
> hard to find.
>
> It is likely that some of the bug fixes from later versions have been
> incorporated into the debian/ubuntu package by the people who maintain that
> package.  Usually new functionality is not backported, but bug fixes often
> are.
>
> Thanks,
> Shawn
>
>