Hi Flavio,

Thank you for getting back to me so quickly. I will try out my patch with newer versions of ZK before submitting it.

As to your question, I do not have full triage data available to me. I can only guess that on an undersized system ZK can be pushed beyond the heap size with which the JVM was spun up. To get some visibility into it, there was a request to make ZK memory stats available in Icinga via NRPE. This is where I actually started. If you like, I can dig up more information about the exact scenario, but I think the OOM condition is what drove it.

Regards,
/Sergey

On Mon, May 2, 2016 at 8:45 AM, Flavio Junqueira <[email protected]> wrote:

> Sounds like a good addition, Sergey. Please go ahead and create a jira for
> this and attach a patch whenever you're ready. Note that this will be
> for the 3.5 and 3.6 branches. For the 3.4 branch, we pretty much accept
> only bug fixes at this point, unless the community speaks up and pushes for
> the change. Since this is for monitoring, it doesn't sound bad to have it
> in the 3.4 branch too, but I'll let others chime in and say what they
> prefer.
>
> Also, you mention that you're using this to determine when an instance is
> about to fail. Could you be more specific?
>
> -Flavio
>
> > On 02 May 2016, at 14:40, Sergey Maslyakov <[email protected]> wrote:
> >
> > I have a patch for the 3.4.x line that adds some simple heap stats into
> > the "mntr" 4lw response. I use it to monitor the memory utilization by
> > the service to have an early warning that the instance is about to fail.
> > Would the community/maintainers be interested in this contribution? If
> > so, I can open a JIRA issue and then submit a patch with it. Please let
> > me know if there is interest in this patch.
> >
> > This is what the proposed change looks like (see the yellow highlight
> > below):
> >
> > [evolvah@vp-backup-1 zookeeper-3.4.8]$ echo mntr | nc localhost 2181
> > zk_version 3.4.8-1740158, built on 04/20/2016 16:14 GMT
> > zk_avg_latency 0
> > zk_max_latency 0
> > zk_min_latency 0
> > zk_packets_received 1
> > zk_packets_sent 0
> > zk_num_alive_connections 1
> > zk_outstanding_requests 0
> > zk_server_state standalone
> > zk_znode_count 4
> > zk_watch_count 0
> > zk_ephemerals_count 0
> > zk_approximate_data_size 27
> > zk_max_memory 894959616
> > zk_total_memory 60293120
> > zk_free_memory 50368208
> > zk_open_file_descriptor_count 25
> > zk_max_file_descriptor_count 1048576
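
For readers following the thread: the three memory figures in the quoted output (zk_max_memory, zk_total_memory, zk_free_memory) look like the standard JVM heap counters. The sketch below is only an illustration under that assumption. It supposes the values correspond to Runtime.maxMemory(), Runtime.totalMemory() and Runtime.freeMemory(); the patch itself is not shown in this thread and may differ. The class name HeapStatsSketch and the helper usedFraction are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; not taken from the actual patch.
public class HeapStatsSketch {

    // Collects the JVM heap counters under the key names seen in the quoted
    // "mntr" output. ASSUMPTION: the keys map to these Runtime calls.
    static Map<String, Long> heapStats() {
        Runtime rt = Runtime.getRuntime();
        Map<String, Long> stats = new LinkedHashMap<>();
        stats.put("zk_max_memory", rt.maxMemory());     // upper bound the heap may grow to
        stats.put("zk_total_memory", rt.totalMemory()); // heap currently reserved by the JVM
        stats.put("zk_free_memory", rt.freeMemory());   // unused part of the reserved heap
        return stats;
    }

    // Fraction of the maximum heap currently in use: (total - free) / max.
    // A monitoring check could compare this against a warning threshold.
    static double usedFraction(long max, long total, long free) {
        return (double) (total - free) / (double) max;
    }

    public static void main(String[] args) {
        // Print one "key<TAB>value" pair per line.
        for (Map.Entry<String, Long> e : heapStats().entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

With the sample values quoted above, usedFraction(894959616, 60293120, 50368208) comes out at roughly 1% of the maximum heap, which is what one would expect from an idle standalone instance. An NRPE check could alert when that fraction approaches 1.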
