Re: ZooKeeper Cluster Health Checking
Hi Adrien,

I noticed you are setting "dataLogDir" to /var/log/zookeeper. Please note that ZooKeeper stores its transaction logs in dataLogDir, and these are real data needed for ZooKeeper recovery. They are not the regular application log text files that you would usually put into /var/log.

Otherwise, as far as I can tell, your config seems to be OK. ZooKeeper should trigger the autopurge job every 48 hours, keeping only the 3 most recent snapshots (plus some transaction logs from the same time period).

That said, this ZooKeeper version (3.4.10) is an old one and is no longer officially supported by the community. You should consider upgrading your ZooKeeper cluster regardless of the autopurge problem. There might also be some autopurge fixes in more recent versions.

You could also try to kick off the purge job manually (and look for errors in the log). I have never done this, but there is an example command in the documentation:

java -cp zookeeper.jar:lib/slf4j-api-1.7.5.jar:lib/slf4j-log4j12-1.7.5.jar:lib/log4j-1.2.17.jar:conf org.apache.zookeeper.server.PurgeTxnLog -n

see: https://zookeeper.apache.org/doc/r3.4.14/zookeeperAdmin.html

Best regards,
Mate

On Wed, Sep 23, 2020 at 11:04 AM Enrico Olivelli wrote:

> Adrien
>
> On Wed, Sep 23, 2020 at 10:59, adrien ruffie <adriennolar...@hotmail.fr> wrote:
>
> > Hello all,
> >
> > I have a problem in production ...
> >
> > We have the following zoo configuration file:
> >
> > tickTime=4000
> > dataDir=/var/lib/zookeeper
> >
> > dataLogDir=/var/log/zookeeper
> >
> > initLimit=30
> > syncLimit=15
> >
> > autopurge.snapRetainCount=3
> > autopurge.purgeInterval=48
> >
> > clientPort=2181
> > maxClientCnxns=60
> >
> > server.1=ZOO1:2888:3888
> > server.2=ZOO2:2888:3888
> > server.3=ZOO3:2888:3888
> > server.4=ZOO4:2888:3888
> > server.5=ZOO5:2888:3888
> >
> > We are on zookeeper-3.4.10, but we recently noticed that logs and snapshots
> > aren't being purged. Do you know this issue? Is it a bug or a bad configuration?
>
> Do you see errors in the logs?
> Are you using a standard Apache distribution?
>
> Enrico
>
> > Thank you very much and best regards
> >
> > Adrien Ruffié
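One way to check whether autopurge is actually running is to count the snapshot files left on disk. The sketch below is only an illustration and is not ZooKeeper's own purge logic. It assumes snapshot files follow the snapshot.<zxid in hex> naming that ZooKeeper uses under the version-2 subdirectory; the function name excess_snapshots and the sample file names are hypothetical.

```python
def excess_snapshots(filenames, retain_count=3):
    """Return snapshot files beyond the `retain_count` most recent ones.

    Assumes names of the form 'snapshot.<zxid in hex>'. Other files
    (for example 'log.<zxid>' transaction logs) are ignored.
    """
    snapshots = []
    for name in filenames:
        prefix, _, suffix = name.partition(".")
        if prefix != "snapshot":
            continue
        try:
            zxid = int(suffix, 16)
        except ValueError:
            continue  # suffix is not a hex zxid, skip the file
        snapshots.append((zxid, name))
    snapshots.sort(reverse=True)  # highest zxid (newest) first
    return [name for _, name in snapshots[retain_count:]]


# Hypothetical directory listing, e.g. from os.listdir() on the snapshot directory.
listing = ["snapshot.2", "snapshot.a", "snapshot.10", "snapshot.1f", "log.1"]
leftover = excess_snapshots(listing, retain_count=3)
```

If this list keeps growing well past the 48-hour purge interval, that suggests the purge task is not running or is failing, and the server log is the next place to look.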
Re: ZooKeeper Cluster Health Checking
Adrien

On Wed, Sep 23, 2020 at 10:59, adrien ruffie <adriennolar...@hotmail.fr> wrote:

> Hello all,
>
> I have a problem in production ...
>
> We have the following zoo configuration file:
>
> tickTime=4000
> dataDir=/var/lib/zookeeper
>
> dataLogDir=/var/log/zookeeper
>
> initLimit=30
> syncLimit=15
>
> autopurge.snapRetainCount=3
> autopurge.purgeInterval=48
>
> clientPort=2181
> maxClientCnxns=60
>
> server.1=ZOO1:2888:3888
> server.2=ZOO2:2888:3888
> server.3=ZOO3:2888:3888
> server.4=ZOO4:2888:3888
> server.5=ZOO5:2888:3888
>
> We are on zookeeper-3.4.10, but we recently noticed that logs and snapshots
> aren't being purged. Do you know this issue? Is it a bug or a bad configuration?

Do you see errors in the logs?
Are you using a standard Apache distribution?

Enrico

> Thank you very much and best regards
>
> Adrien Ruffié
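As a worked example of what the quoted configuration means in practice: initLimit and syncLimit are expressed in ticks, so their wall-clock values depend on tickTime, and the number of server.N entries determines the quorum size. The sketch below is a minimal illustration. The helper name summarize is hypothetical, the parsing handles only simple key=value lines like those in the thread, and the quorum arithmetic assumes every listed server is a voting member (no observers, groups or weights).

```python
def summarize(cfg_text):
    """Derive a few values from simple key=value zoo.cfg lines."""
    cfg = {}
    for line in cfg_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        cfg[key.strip()] = value.strip()

    tick_ms = int(cfg["tickTime"])
    servers = [key for key in cfg if key.startswith("server.")]
    return {
        # initLimit and syncLimit are counted in ticks of tickTime milliseconds
        "init_limit_s": tick_ms * int(cfg["initLimit"]) / 1000,
        "sync_limit_s": tick_ms * int(cfg["syncLimit"]) / 1000,
        "ensemble_size": len(servers),
        # a strict majority of voting servers is needed to serve requests
        "quorum": len(servers) // 2 + 1,
        "tolerated_failures": (len(servers) - 1) // 2,
    }


config = """
tickTime=4000
initLimit=30
syncLimit=15
server.1=ZOO1:2888:3888
server.2=ZOO2:2888:3888
server.3=ZOO3:2888:3888
server.4=ZOO4:2888:3888
server.5=ZOO5:2888:3888
"""
summary = summarize(config)
```

For the configuration in this thread this gives 120 s for initLimit, 60 s for syncLimit, and a quorum of 3 out of 5 servers, so the ensemble keeps serving with up to 2 servers down.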
RE: ZooKeeper Cluster Health Checking
Hello all,

I have a problem in production ...

We have the following zoo configuration file:

tickTime=4000
dataDir=/var/lib/zookeeper

dataLogDir=/var/log/zookeeper

initLimit=30
syncLimit=15

autopurge.snapRetainCount=3
autopurge.purgeInterval=48

clientPort=2181
maxClientCnxns=60

server.1=ZOO1:2888:3888
server.2=ZOO2:2888:3888
server.3=ZOO3:2888:3888
server.4=ZOO4:2888:3888
server.5=ZOO5:2888:3888

We are on zookeeper-3.4.10, but we recently noticed that logs and snapshots
aren't being purged. Do you know this issue? Is it a bug or a bad configuration?

Thank you very much and best regards

Adrien Ruffié

From: adrien ruffie
Sent: Wednesday, July 18, 2018 09:01
To: user@zookeeper.apache.org
Subject: RE: ZooKeeper Cluster Health Checking

OK, thanks Harish,

I'll keep the idea!

Best regards,

Adrien

From: harish lohar
Sent: Tuesday, July 17, 2018 23:13:28
To: user@zookeeper.apache.org
Subject: Re: ZooKeeper Cluster Health Checking

We did it via a Java monitoring app, using the ZooKeeper Java API, which sends
4lw (four-letter-word) commands to ZooKeeper and returns the output.

Thanks
Harish

On Tue, Jul 17, 2018 at 2:00 AM adrien ruffie wrote:

> Hi Harish,
>
> thank you very much for this advice and explanation!
>
> Do you think a simple shell script that checks all these metrics is
> enough? Or would it be better to do it in Java with a simple monitoring
> application?
>
> Thanks again,
>
> Best regards,
>
> Adrien
>
> From: harish lohar
> Sent: Tuesday, July 17, 2018 04:13:51
> To: user@zookeeper.apache.org
> Subject: Re: ZooKeeper Cluster Health Checking
>
> Hi Adrian,
> The ZooKeeper commands below are generally used to check the health of a
> ZooKeeper cluster.
>
> stat
>
> Lists brief details for the server and connected clients.
>
> Usage: echo stat | nc server port
>
> This shows whether the cluster is up or down. If it is down, the command
> reports that the ZooKeeper instance is currently not serving requests,
> which means that either leader election is failing or quorum is lost
> (half or more of the ZooKeeper nodes in the cluster are down).
>
> mntr
>
> *New in 3.4.0:* Outputs a list of variables that could be used for
> monitoring the health of the cluster.
>
> $ echo mntr | nc localhost 2185
>
> zk_version 3.4.0
> zk_avg_latency 0
> zk_max_latency 0
> zk_min_latency 0
> zk_packets_received 70
> zk_packets_sent 69
> zk_outstanding_requests 0
> zk_server_state leader
> zk_znode_count 4
> zk_watch_count 0
> zk_ephemerals_count 0
> zk_approximate_data_size 27
> zk_followers 4 - only exposed by the Leader
> zk_synced_followers 4 - only exposed by the Leader
> zk_pending_syncs 0 - only exposed by the Leader
> zk_open_file_descriptor_count 23 - only available on Unix platforms
> zk_max_file_descriptor_count 1024 - only available on Unix platforms
>
> The output is compatible with the Java properties format and the content may
> change over time (new keys added). Your scripts should expect changes.
>
> ATTENTION: Some of the keys are platform specific and some of the keys are
> only exported by the Leader.
>
> The output contains multiple lines with the following format:
>
> On Mon, Jul 16, 2018 at 10:13 AM adrien ruffie wrote:
>
> > Hello all,
> >
> > In my company we have a ZooKeeper production cluster.
> >
> > But we don't really know how to check the health of our cluster...
> >
> > Can you advise us on this topic?
> >
> > I know this topic may have come up before, but I haven't really found
> > any concrete solution.
> >
> > Do you use monitoring tools? Ones that can raise alerts?
> >
> > What metrics/properties/anything can indicate that our cluster
> > isn't in good health?
> >
> > Thank you very much and best regards
> >
> > Adrien
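The stat/mntr approach described in this thread can also be scripted without nc. The sketch below is a minimal illustration in Python rather than the Java app Harish mentions. It sends a four-letter-word command over a plain TCP socket and parses the mntr reply into a dictionary. The function names, the list of accepted server states and the threshold of 10 outstanding requests are assumptions chosen for the example, not values from the thread or from ZooKeeper. Depending on the ZooKeeper version, four-letter-word commands may first need to be enabled through the 4lw.commands.whitelist setting.

```python
import socket


def four_letter_word(host, port, command, timeout=5.0):
    """Send a four-letter-word command (e.g. 'mntr') and return the reply text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break  # the server closes the connection after replying
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mntr(text):
    """Parse 'key<whitespace>value' lines into a dict of strings."""
    metrics = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            metrics[parts[0]] = parts[1].strip()
    return metrics


def health_problems(metrics):
    """Return a list of problems found; an empty list means nothing was flagged."""
    problems = []
    state = metrics.get("zk_server_state")
    if state not in ("leader", "follower", "standalone", "observer"):
        problems.append("unexpected server state: %r" % state)
    if state == "leader":
        # these two keys are only exposed by the leader
        followers = int(metrics.get("zk_followers", 0))
        synced = int(metrics.get("zk_synced_followers", 0))
        if synced < followers:
            problems.append("%d of %d followers not synced" % (followers - synced, followers))
    if int(metrics.get("zk_outstanding_requests", 0)) > 10:  # hypothetical threshold
        problems.append("many outstanding requests")
    return problems
```

A check against a live server would look like health_problems(parse_mntr(four_letter_word("localhost", 2181, "mntr"))), run against each server in the ensemble, since the leader-only keys are missing on followers.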
Re: Issues building and running Zookeeper Inspector
Brent,

Please go ahead and send a PR.

You can also subscribe to d...@zookeeper.apache.org for discussions related to patches.

Thank you very much
Enrico

On 22/09/20, 19:52, "Brent" wrote:

> Hi everyone,
>
> I just filed a Jira related to the Zookeeper Inspector contrib project here:
> https://issues.apache.org/jira/browse/ZOOKEEPER-3943
>
> I just wanted to reach out and make sure I'm going about attempting to use
> it correctly. I ran "mvn clean install -DskipTests" at all levels of the
> code tree and then tried both the "zooInspector.sh" script and invoking
> Java directly (with the CLASSPATH set up properly).
>
> It seems like the core of my issue was that the icons for the UI couldn't be
> found (they don't seem to get built into the JAR by default), which resulted
> in a bunch of NullPointerExceptions.
>
> I put a proposal to fix this in the Jira, but wanted to make sure it seems
> like an acceptable approach and double-check that I'm not just doing
> something incorrectly.
>
> If this seems OK and nobody is actively working on this already, I'd be
> happy to submit a PR if it would help.
>
> Thanks!