[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mate Szalay-Beko resolved ZOOKEEPER-4566.
-----------------------------------------
    Fix Version/s: 3.9.0
       Resolution: Fixed

Issue resolved by pull request 1902
[https://github.com/apache/zookeeper/pull/1902]

> Create tool for recursive snapshot analysis
> -------------------------------------------
>
>                 Key: ZOOKEEPER-4566
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4566
>             Project: ZooKeeper
>          Issue Type: Improvement
>            Reporter: Szabolcs Bukros
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.9.0
>
>          Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> I needed to analyze snapshots to determine which application caused a massive 
> snapshot size increase by recursively checking child node count and data size 
> for nodes, but could not find a tool for the job. Loading the snapshots one 
> by one and using a ZooKeeper client proved too slow and SnapshotFormatter was 
> very fast but processing the output to get the relevant data for my usecase 
> proved more work than writing a tool that has the output I need. So I wrote 
> SnapshotSumFormatter based on SnapshotFormatter:
> {code:java}
> USAGE: SnapshotSumFormatter snapshot_file starting_node max_depth
> {code}
> The tool recursively travels the child nodes under "starting_node" and 
> collects both node count and summarizes the data stored in every node under 
> the current one. This helps to identify problematic jobs/applications that 
> either store too much data or does not properly clean up. "max_depth" defines 
> the depth where the tool still writes to the output. 0 means there is no 
> depth limit, every node's stats will be displayed, 1 means it will only 
> contain the starting node's and it's children's stats, 2 ads another level 
> and so on. This ONLY affects the level of details displayed, NOT the 
> calculation.
> An example output looks like this (with "SnapshotSumFormatter <snapshot_file> 
> / 2"):
> {code:java}
> /
>    children: 1250511
>    data: 1952186580
> -- /zookeeper
> --   children: 1
> --   data: 0
> -- /solr
> --   children: 1773
> --   data: 8419162
> ---- /solr/configs
> ----   children: 1640
> ----   data: 8407643
> ---- /solr/overseer
> ----   children: 6
> ----   data: 0
> ---- /solr/live_nodes
> ----   children: 3
> ----   data: 0
> {code}
> I think this might prove useful for others too and would like to share it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to