[
https://issues.apache.org/jira/browse/KAFKA-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376103#comment-15376103
]
ASF GitHub Bot commented on KAFKA-3857:
---------------------------------------
GitHub user kiranptivo reopened a pull request:
https://github.com/apache/kafka/pull/1593
KAFKA-3857 Additional log cleaner metrics
Fixes KAFKA-3857
Changes proposed in this pull request:
The following additional log cleaner metrics have been added:
1. num-runs: Cumulative number of successful log cleaner runs since the last
broker restart.
2. last-run-time: Time of the last log cleaner run.
3. num-filthy-logs: Number of filthy logs. A nonzero value for an extended
period of time indicates that the cleaner has not been successful in cleaning
the logs.
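As a rough illustration of how the "nonzero for an extended period" rule
above might be checked by a monitoring script, here is a minimal sketch. It
is not part of this pull request: the Sample container, the stuck_filthy
function, and the polling scheme are all hypothetical, and it assumes the
metric values have already been read from JMX by some other means.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Sample:
    """One poll of the cleaner metrics (hypothetical container, not a Kafka type)."""
    polled_at: float       # seconds since epoch when the metric was read
    num_filthy_logs: int   # value of num-filthy-logs at that moment


def stuck_filthy(samples: Sequence[Sample], min_duration_s: float) -> bool:
    """Return True if num-filthy-logs has been nonzero in every sample of the
    trailing run ending at the most recent sample, and that run spans at
    least min_duration_s seconds. Samples must be ordered oldest first."""
    if not samples:
        return False
    nonzero_since = None
    # Walk backwards from the newest sample until a zero reading is found.
    for sample in reversed(samples):
        if sample.num_filthy_logs == 0:
            break
        nonzero_since = sample.polled_at
    if nonzero_since is None:
        return False  # the newest reading was zero
    return samples[-1].polled_at - nonzero_since >= min_duration_s
```

The threshold (min_duration_s) is a judgment call for the operator; a single
nonzero reading only means a clean is in progress or recently failed.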
A note on num-filthy-logs: It is incremented whenever a filthy topic
partition is added to the inProgress HashMap, and it is decremented once the
cleaning succeeds or is aborted. Note that the existing LogCleaner code does
not provide a metric to check whether the clean operation succeeded. There is
an inProgress HashMap with topicPartition => LogCleaningInProgress entries in
it, but the entries are removed from the HashMap even when the clean operation
throws an exception. So I added an additional metric, num-filthy-logs, to
differentiate a successful log clean from an exception case.
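The bookkeeping described in the note above can be modelled roughly as
follows. This is a toy sketch in Python, not the Scala code in the pull
request: the class and method names are hypothetical, and only the
increment/decrement rules are taken from the description.

```python
class CleanerBookkeeping:
    """Toy model of the inProgress map plus the num-filthy-logs gauge.
    All names here are illustrative, not Kafka's actual API."""

    def __init__(self):
        self.in_progress = {}     # topic-partition -> state label
        self.num_filthy_logs = 0  # value exposed as the gauge

    def start_cleaning(self, topic_partition):
        # A filthy partition is grabbed for cleaning: track it and count it.
        self.in_progress[topic_partition] = "LogCleaningInProgress"
        self.num_filthy_logs += 1

    def finish_cleaning(self, topic_partition):
        # Successful clean: drop the entry and decrement the gauge.
        self.in_progress.pop(topic_partition)
        self.num_filthy_logs -= 1

    def abort_cleaning(self, topic_partition):
        # Aborted clean: also decrements, per the description above.
        self.in_progress.pop(topic_partition)
        self.num_filthy_logs -= 1

    def fail_cleaning(self, topic_partition):
        # Clean threw an exception: the entry is removed from the map,
        # but the gauge is left untouched, so it stays nonzero.
        self.in_progress.pop(topic_partition)
```

After a failure the map is empty in both the success and exception cases, so
only the gauge tells them apart.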
The code is ready. I have tested and verified the JMX metrics. There is one
case I couldn't test, though: the case where numFilthyLogs is decremented
in 'resumeCleaning(...)' in LogCleanerManager.scala, line 188. It seems to be
part of the workflow that aborts the cleaning of a particular partition. Any
ideas on how to test this scenario?
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/TiVo/kafka log_cleaner_jmx_metrics
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/kafka/pull/1593.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1593
----
commit f00de412f6b1f6568adef479687ae0df789f9c96
Author: Kiran Pillarisetty <[email protected]>
Date: 2016-06-14T17:40:26Z
Create a couple of additional Log Cleaner JMX metrics
log-clean-last-run: Log cleaner's last run time
log-clean-runs: Number of log cleaner runs.
commit 7dc7511ee2b6d3cdf9df0c366fe23bf34d062a54
Author: Kiran Pillarisetty <[email protected]>
Date: 2016-06-14T20:24:00Z
Created a couple of additional Log Cleaner JMX metrics
log-clean-last-run: a metric to track last log cleaner run (unix timestamp)
log-clean-runs: a metric to track number of log cleaner runs
Committer: Kiran Pillarisetty <[email protected]>
commit 7f1214ff1118103dd639df717e988a22bad8033d
Author: Kiran Pillarisetty <[email protected]>
Date: 2016-07-01T22:14:57Z
Add additional JMX metric to track successful cleaning of a log segment
commit 1ac346bb37008312e41035167dbfd75803595cd6
Author: Kiran Pillarisetty <[email protected]>
Date: 2016-07-01T22:17:25Z
Add additional JMX metric to track successful cleaning of a log segment
commit 4f08d875e05c35bd7d7c849584b8b029031f884b
Author: Kiran Pillarisetty <[email protected]>
Date: 2016-07-05T22:23:20Z
Metric name updated to num-filthy-logs. Metric incremented as it is grabbed
for cleaning, and decremented once the cleaning is done, or if the cleaning is
aborted
commit cd887c05bf1d56b7566c5b72b3ddf3bcdfb70898
Author: Kiran Pillarisetty <[email protected]>
Date: 2016-07-05T23:31:32Z
Changed a metric name (number-of-runs to num-runs). Removed an extra \n
around line 164. It is not present in the trunk
----
> Additional log cleaner metrics
> ------------------------------
>
> Key: KAFKA-3857
> URL: https://issues.apache.org/jira/browse/KAFKA-3857
> Project: Kafka
> Issue Type: Improvement
> Reporter: Kiran Pillarisetty
>
> The proposal would be to add a couple of additional log cleaner metrics:
> 1. Time of last log cleaner run
> 2. Cumulative number of successful log cleaner runs since last broker restart.
> Existing log cleaner metrics (max-buffer-utilization-percent,
> cleaner-recopy-percent, max-clean-time-secs, max-dirty-percent) do not
> differentiate an idle log cleaner from a dead log cleaner. It would be useful
> to have the above two metrics added, to indicate whether log cleaner is alive
> (and successfully cleaning) or not.