Andrew Wong created KUDU-3165:
---------------------------------
Summary: Add statistics to data directories that can be useful for
performance debugging
Key: KUDU-3165
URL: https://issues.apache.org/jira/browse/KUDU-3165
Project: Kudu
Issue Type: Improvement
Components: fs
Affects Versions: 1.12.0
Reporter: Andrew Wong
We sometimes find it useful to drill into which data directories are being used
as a means of determining why flushes or compactions are slow. Currently the
easiest way to discover this is to run {{kudu remote_replica list}}, which
shows the data directory paths, and to look through tserver logs for any
associated slowness warnings.
We should put this information front and center in some tooling or web UI
page. Off the top of my head, it'd be really nice to understand:
* How many tablets have data in each data directory.
* How many data blocks are in each data directory.
* How full each data directory is.
* How many maintenance ops have recently written into each data directory (and
conversely, which data directories each maintenance op is writing into).
* Average write/read latency, realtime - usertime, etc. per byte written in
each data directory.
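
To make the list above concrete, here is a rough sketch of the kind of per-directory record such tooling might keep. It is written in Python purely for illustration and is not Kudu code: the {{DataDirStats}} class, its fields, and {{record_write}} are all hypothetical names, and the op names in the usage note are just example labels. Fullness is taken from the filesystem via {{shutil.disk_usage}}, which is only one possible way to measure it.

```python
import shutil
from collections import deque
from dataclasses import dataclass, field


@dataclass
class DataDirStats:
    """Hypothetical per-data-directory counters (not an existing Kudu type)."""
    path: str
    tablet_ids: set = field(default_factory=set)   # tablets with data here
    num_blocks: int = 0                             # data blocks in this directory
    bytes_written: int = 0
    write_seconds: float = 0.0
    # Names of the most recent maintenance ops that wrote here (bounded).
    recent_ops: deque = field(default_factory=lambda: deque(maxlen=100))

    def record_write(self, tablet_id, op_name, num_bytes, seconds, new_blocks=1):
        """Account for one write performed by a maintenance op."""
        self.tablet_ids.add(tablet_id)
        self.num_blocks += new_blocks
        self.bytes_written += num_bytes
        self.write_seconds += seconds
        self.recent_ops.append(op_name)

    def write_seconds_per_byte(self):
        """Average write latency per byte, or None if nothing was written."""
        if self.bytes_written == 0:
            return None
        return self.write_seconds / self.bytes_written

    def fraction_full(self):
        """Fraction of the underlying filesystem in use, per the OS."""
        usage = shutil.disk_usage(self.path)
        return usage.used / usage.total if usage.total else 0.0
```

A caller would create one record per data directory, e.g. {{stats = DataDirStats(path="/data/1")}}, call {{stats.record_write("tablet-a", "FlushMRSOp", 1000, 0.5)}} after each write, and render {{len(stats.tablet_ids)}}, {{stats.num_blocks}}, {{stats.fraction_full()}}, and {{stats.write_seconds_per_byte()}} on a web UI page or in tool output. The reverse mapping (which directories each op writes into) could be built by indexing the same events by op name.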
--
This message was sent by Atlassian Jira
(v8.3.4#803005)