Re: ZooKeeper Cluster Health Checking

2020-09-23 Thread Szalay-Bekő Máté
Hi Adrien,

I noticed you are setting "dataLogDir" to /var/log/zookeeper. Please
note that ZooKeeper stores its transaction logs in the dataLogDir, and
these are real data that ZooKeeper needs for recovery. They are not the
regular application log text files that you would usually put into
/var/log.
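
A quick way to spot this kind of setup is to read the value back from the
config file. The following is only a sketch: the function names are made up
for illustration, and it assumes a plain key=value zoo.cfg like the one
posted in this thread.

```shell
# Print the value of a key from a zoo.cfg-style file (last occurrence wins).
cfg_get() {
    # $1 = config file, $2 = key
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}

# Warn when the transaction log directory is configured under /var/log.
check_datalogdir() {
    # $1 = config file
    dir=$(cfg_get "$1" dataLogDir)
    case "$dir" in
        /var/log|/var/log/*) echo "WARN: dataLogDir=$dir is under /var/log" ;;
        "")                  echo "INFO: dataLogDir not set, transaction logs go to dataDir" ;;
        *)                   echo "OK: dataLogDir=$dir" ;;
    esac
}
```

Usage would be something like: check_datalogdir /path/to/zoo.cfg (the path
is a placeholder). The point is only to make sure that log rotation or
cleanup jobs that target /var/log never touch the transaction logs.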

Otherwise, as far as I can tell, your config seems to be OK. ZooKeeper
should trigger the autopurge job every 48 hours, keeping only the 3
most recent snapshots (plus some transaction logs from the same time
period). However, this ZooKeeper version (3.4.10) is an old one and is
no longer officially supported by the community. You should consider
upgrading your ZooKeeper cluster regardless of the autopurge
problem. There might also be some autopurge fixes in more recent
versions.
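
To see whether autopurge is doing anything, one option is to count the
snapshot files over time. This is a rough sketch with hypothetical function
names; it relies on ZooKeeper naming its files snapshot.<zxid> and
log.<zxid> inside the version-2 subdirectory of dataDir and dataLogDir.
Note that between two purge runs there can legitimately be more than 3
snapshots, so only a count that keeps growing across several 48-hour
intervals points to a problem.

```shell
# Count regular files in a directory whose names start with a prefix.
count_files() {
    # $1 = directory, $2 = file name prefix ("snapshot." or "log.")
    n=0
    for f in "$1"/"$2"*; do
        if [ -f "$f" ]; then n=$((n + 1)); fi
    done
    echo "$n"
}

# Compare the number of snapshots with autopurge.snapRetainCount.
check_snapshots() {
    # $1 = snapshot directory (e.g. <dataDir>/version-2), $2 = retain count
    n=$(count_files "$1" "snapshot.")
    if [ "$n" -gt "$2" ]; then
        echo "snapshots=$n retain=$2: above the retain count"
    else
        echo "snapshots=$n retain=$2: ok"
    fi
}
```

For the config in this thread that would be:
check_snapshots /var/lib/zookeeper/version-2 3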

You can also try to kick off the purge job manually (and look for
errors in the log). I have never done this myself, but there is an
example command in the documentation:
java -cp zookeeper.jar:lib/slf4j-api-1.7.5.jar:lib/slf4j-log4j12-1.7.5.jar:lib/log4j-1.2.17.jar:conf \
  org.apache.zookeeper.server.PurgeTxnLog <dataDir> <snapDir> -n <count>

see: https://zookeeper.apache.org/doc/r3.4.14/zookeeperAdmin.html
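
If it helps, the arguments can be filled in from zoo.cfg. The sketch below
only prints the command so it can be reviewed before anything is deleted.
The function names are hypothetical and the classpath is left to the
caller, since the jar names depend on the installation. It assumes that
PurgeTxnLog takes the transaction log directory first and the snapshot
directory second, and it falls back to a retain count of 3 when
autopurge.snapRetainCount is not set.

```shell
# Print the value of a key from a zoo.cfg-style file (last occurrence wins).
cfg_get() {
    # $1 = config file, $2 = key
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}

# Print (do not run) a PurgeTxnLog command line built from zoo.cfg.
purge_command() {
    # $1 = zoo.cfg path, $2 = Java classpath supplied by the caller
    snap_dir=$(cfg_get "$1" dataDir)
    log_dir=$(cfg_get "$1" dataLogDir)
    count=$(cfg_get "$1" autopurge.snapRetainCount)
    # Without dataLogDir, ZooKeeper keeps its transaction logs in dataDir.
    [ -n "$log_dir" ] || log_dir=$snap_dir
    echo "java -cp $2 org.apache.zookeeper.server.PurgeTxnLog $log_dir $snap_dir -n ${count:-3}"
}
```

Running purge_command against the config from this thread should print a
command ending in "/var/log/zookeeper /var/lib/zookeeper -n 3".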

Best regards,
Mate



Re: ZooKeeper Cluster Health Checking

2020-09-23 Thread Enrico Olivelli
Adrien

On Wed, Sep 23, 2020 at 10:59 AM adrien ruffie <adriennolar...@hotmail.fr> wrote:

> Hello all,
>
> I have a problem in production ...
>
> We have the following zoo configuration file:
>
> tickTime=4000
> dataDir=/var/lib/zookeeper
>
> dataLogDir=/var/log/zookeeper
>
> initLimit=30
> syncLimit=15
>
> autopurge.snapRetainCount=3
> autopurge.purgeInterval=48
>
> clientPort=2181
> maxClientCnxns=60
>
> server.1=ZOO1:2888:3888
> server.2=ZOO2:2888:3888
> server.3=ZOO3:2888:3888
> server.4=ZOO4:2888:3888
> server.5=ZOO5:2888:3888
>
> We are on zookeeper-3.4.10, but we recently noticed that logs and
> snapshots aren't being purged ...
> Do you know this issue? Is it a bug or a bad configuration?
>

Do you see any errors in the logs?

Are you using a standard Apache distribution?

Enrico




RE: ZooKeeper Cluster Health Checking

2020-09-23 Thread adrien ruffie
Hello all,

I have a problem in production ...

We have the following zoo configuration file:

tickTime=4000
dataDir=/var/lib/zookeeper

dataLogDir=/var/log/zookeeper

initLimit=30
syncLimit=15

autopurge.snapRetainCount=3
autopurge.purgeInterval=48

clientPort=2181
maxClientCnxns=60

server.1=ZOO1:2888:3888
server.2=ZOO2:2888:3888
server.3=ZOO3:2888:3888
server.4=ZOO4:2888:3888
server.5=ZOO5:2888:3888

We are on zookeeper-3.4.10, but we recently noticed that logs and snapshots
aren't being purged ...
Do you know this issue? Is it a bug or a bad configuration?

Thank you very much and best regards

Adrien Ruffié

From: adrien ruffie
Sent: Wednesday, July 18, 2018 09:01
To: user@zookeeper.apache.org
Subject: RE: ZooKeeper Cluster Health Checking

OK, thanks Harish,

I'll keep that idea in mind!


Best regards,


Adrien


From: harish lohar
Sent: Tuesday, July 17, 2018 23:13:28
To: user@zookeeper.apache.org
Subject: Re: ZooKeeper Cluster Health Checking

We did it with a Java monitoring app that uses the ZooKeeper Java API to
send the four-letter-word (4lw) commands to ZooKeeper and return the output.
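
The same idea can be sketched in shell rather than Java. This is not the
monitoring app described above, just a minimal illustration with made-up
function names: it sends "ruok" to each server through nc and expects the
reply "imok". Depending on the nc variant, a timeout option may be needed
so that an unreachable host does not block the loop.

```shell
# Send one four-letter-word command to a server and print the reply.
four_letter() {
    # $1 = host, $2 = client port, $3 = command (ruok, stat, mntr, ...)
    printf '%s' "$3" | nc "$1" "$2"
}

# Print one line per server: "<host> ok" if it answers imok, else "<host> FAILED".
check_ensemble() {
    # $1 = client port, remaining arguments = host names
    port=$1
    shift
    for host in "$@"; do
        if [ "$(four_letter "$host" "$port" ruok 2>/dev/null)" = "imok" ]; then
            echo "$host ok"
        else
            echo "$host FAILED"
        fi
    done
}
```

For the ensemble in this thread: check_ensemble 2181 ZOO1 ZOO2 ZOO3 ZOO4 ZOO5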


Thanks
Harish

On Tue, Jul 17, 2018 at 2:00 AM adrien ruffie 
wrote:

> Hi Harish,
>
>
> Thank you very much for this advice and explanation!
>
> Do you think a simple shell script that checks all these metrics
> is enough? Or would it be better to do it in Java with a simple
> monitoring application?
>
>
> Thanks again,
>
>
> Best regards,
>
>
> Adrien
>
> 
> From: harish lohar
> Sent: Tuesday, July 17, 2018 04:13:51
> To: user@zookeeper.apache.org
> Subject: Re: ZooKeeper Cluster Health Checking
>
> Hi Adrien,
> The ZooKeeper commands below are generally used to check the health of a
> ZooKeeper cluster.
>
> stat
>
> Lists brief details for the server and connected clients.
>
> Usage: echo stat | nc server port
>
> This tells you whether the server is up or down. If it is down, the reply
> says that the
>
> ZooKeeper instance is currently not serving any requests, which means
> either that leader election is failing or that half or more of the
> ZooKeeper nodes in the cluster are down, so there is no quorum.
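
A script can make the same distinction from the stat reply. The sketch
below uses hypothetical function names and assumes the "not serving"
message contains the phrase "not currently serving requests" and that a
healthy reply has a "Mode:" line; both should be checked against the
ZooKeeper version in use. The second function is just the majority
arithmetic: with 5 servers the quorum is 3, so at most 2 may be down.

```shell
# Classify a "stat" reply read on stdin: prints "down", the server's Mode
# (for example leader or follower), or "unknown".
classify_stat() {
    reply=$(cat)
    case "$reply" in
        *"not currently serving requests"*) echo "down" ;;
        *Mode:*) printf '%s\n' "$reply" | sed -n 's/^Mode:[[:space:]]*//p' ;;
        *) echo "unknown" ;;
    esac
}

# Smallest number of servers that forms a majority of an ensemble.
quorum_size() {
    # $1 = number of servers in the ensemble
    echo $(( $1 / 2 + 1 ))
}
```

Usage: echo stat | nc server port | classify_stat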
>
>
> mntr
>
> *New in 3.4.0:* Outputs a list of variables that could be used for
> monitoring the health of the cluster.
>
> $ echo mntr | nc localhost 2185
>
> zk_version                     3.4.0
> zk_avg_latency                 0
> zk_max_latency                 0
> zk_min_latency                 0
> zk_packets_received            70
> zk_packets_sent                69
> zk_outstanding_requests        0
> zk_server_state                leader
> zk_znode_count                 4
> zk_watch_count                 0
> zk_ephemerals_count            0
> zk_approximate_data_size       27
> zk_followers                   4     - only exposed by the Leader
> zk_synced_followers            4     - only exposed by the Leader
> zk_pending_syncs               0     - only exposed by the Leader
> zk_open_file_descriptor_count  23    - only available on Unix platforms
> zk_max_file_descriptor_count   1024  - only available on Unix platforms
>
> The output is compatible with the Java properties format, and the content
> may change over time (new keys added). Your scripts should expect changes.
>
> ATTENTION: Some of the keys are platform specific and some of the keys are
> only exported by the Leader.
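
A script that follows this advice should look keys up by name and cope with
missing ones. The following is a minimal sketch with hypothetical function
names; it assumes each line is a key followed by whitespace and a value,
and it only compares the follower counts when they are present, since they
are reported by the leader only.

```shell
# Print the value of one key from mntr output read on stdin.
mntr_get() {
    # $1 = key, e.g. zk_server_state
    awk -v key="$1" '$1 == key { print $2; exit }'
}

# Summarise mntr output passed as a single argument.
mntr_summary() {
    # $1 = full mntr output
    state=$(printf '%s\n' "$1" | mntr_get zk_server_state)
    pending=$(printf '%s\n' "$1" | mntr_get zk_outstanding_requests)
    echo "state=$state outstanding=$pending"
    # Follower counts exist only on the leader, so check before comparing.
    followers=$(printf '%s\n' "$1" | mntr_get zk_followers)
    synced=$(printf '%s\n' "$1" | mntr_get zk_synced_followers)
    if [ -n "$followers" ] && [ "$synced" != "$followers" ]; then
        echo "WARN: only $synced of $followers followers are synced"
    fi
}
```

Usage: mntr_summary "$(echo mntr | nc localhost 2181)"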
>
> The output contains multiple lines with the following format:
>
>
> On Mon, Jul 16, 2018 at 10:13 AM adrien ruffie 
> wrote:
>
> > Hello all,
> >
> >
> > At my company we have a ZooKeeper production cluster.
> >
> >
> > But we don't really know how to check the health of our cluster...
> >
> >
> > Can you advise us on this topic?
> >
> >
> > I know this topic may have come up before, but I haven't really
> > found any concrete solution.
> >
> >
> > Do you use a monitoring tool? One that can raise alerts?
> >
> > Which metrics/properties/anything else can indicate that our cluster
> > isn't in good health?
> >
> >
> > Thank you very much and best regards
> >
> >
> > Adrien
> >
>


Re: Issues building and running Zookeeper Inspector

2020-09-23 Thread Enrico Olivelli - Diennea
Brent
Please go ahead and send a PR

You can also subscribe to d...@zookeeper.apache.org for discussions related to 
patches.

Thank you very much
Enrico

On 22/09/20, 19:52, "Brent" wrote:

Hi everyone,

I just filed a Jira related to the Zookeeper Inspector contrib project here:

https://issues.apache.org/jira/browse/ZOOKEEPER-3943

I just wanted to reach out and make sure I'm going about attempting to use
it correctly.  I just ran "mvn clean install -DskipTests" at all levels of
the code tree and then attempted to use both the "zooInspector.sh" script
and invoke the Java directly (with all the CLASSPATH set up properly).

It seems like the core of my issue was that the icons for the UI couldn't
be found (they don't seem to get built into the JAR by default) and
resulted in a bunch of NullPointerExceptions.  I put a proposal to fix this
in the Jira, but wanted to make sure it seems like an acceptable approach
and double-check that I'm not just doing something incorrectly.

If this seems OK and nobody is actively working on this already, I'd be
happy to submit a PR if it would help.

Thanks!



