[jira] [Commented] (KAFKA-3857) Additional log cleaner metrics

2017-01-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15822647#comment-15822647
 ] 

ASF GitHub Bot commented on KAFKA-3857:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2378


> Additional log cleaner metrics
> --
>
> Key: KAFKA-3857
> URL: https://issues.apache.org/jira/browse/KAFKA-3857
> Project: Kafka
> Issue Type: Improvement
> Reporter: Kiran Pillarisetty
> Fix For: 0.10.2.0
>
>
> The proposal is to add two log cleaner metrics:
> 1. Time of the last log cleaner run
> 2. Cumulative number of successful log cleaner runs since the last broker restart
> The existing log cleaner metrics (max-buffer-utilization-percent, 
> cleaner-recopy-percent, max-clean-time-secs, max-dirty-percent) do not 
> differentiate an idle log cleaner from a dead one. Adding these two metrics 
> would indicate whether the log cleaner is alive (and successfully cleaning) 
> or not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3857) Additional log cleaner metrics

2017-01-13 Thread Kiran Pillarisetty (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15822579#comment-15822579
 ] 

Kiran Pillarisetty commented on KAFKA-3857:
---

I just created a new branch off trunk, applied my changes there, 
and opened a new PR. 

[~junrao], [~ijuma] Could you please take a look? Would it be possible to 
include it in 0.10.2.0? (I believe the feature freeze date is today.)
https://github.com/apache/kafka/pull/2378




[jira] [Commented] (KAFKA-3857) Additional log cleaner metrics

2017-01-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15822576#comment-15822576
 ] 

ASF GitHub Bot commented on KAFKA-3857:
---

Github user kiranptivo closed the pull request at:

https://github.com/apache/kafka/pull/1593




[jira] [Commented] (KAFKA-3857) Additional log cleaner metrics

2017-01-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15822571#comment-15822571
 ] 

ASF GitHub Bot commented on KAFKA-3857:
---

GitHub user kiranptivo opened a pull request:

https://github.com/apache/kafka/pull/2378

KAFKA-3857 Additional log cleaner metrics

Fixes KAFKA-3857

Changes proposed in this pull request:

An additional log cleaner metric has been added:
time-since-last-run-ms: Time since the last log cleaner run, in 
milliseconds. This metric is reset to 0 every time the log cleaner thread 
runs. If it keeps increasing without being reset, the log cleaner thread 
is not alive.
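
A minimal sketch of how such a gauge could behave, written in Java for illustration. The class `LastRunGauge`, its methods, and the injectable clock are hypothetical names and are not taken from the pull request or from Kafka's LogCleaner code:

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical helper for illustration only; not Kafka's implementation. */
final class LastRunGauge {
    // Injectable clock so the behaviour can be checked without sleeping,
    // e.g. System::currentTimeMillis in real use.
    private final LongSupplier clockMs;
    private final AtomicLong lastRunMs;

    LastRunGauge(LongSupplier clockMs) {
        this.clockMs = clockMs;
        this.lastRunMs = new AtomicLong(clockMs.getAsLong());
    }

    /** Called by the cleaner thread on each run; the gauge drops back to 0. */
    void markRun() {
        lastRunMs.set(clockMs.getAsLong());
    }

    /** Gauge value: milliseconds elapsed since the last recorded run. */
    long timeSinceLastRunMs() {
        return clockMs.getAsLong() - lastRunMs.get();
    }
}
```

With a gauge of this shape, a thread that has died simply stops calling `markRun()`, so the reported value grows with wall-clock time.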

If you are creating alerts around the log cleaner, you could monitor this 
metric. A high "time-since-last-run-ms" value (e.g. 600000) indicates that the 
log cleaner has not run in the last 10 minutes.
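
As a rough illustration of such an alert, the check below compares a sampled gauge value against a threshold. The class `CleanerAlert` and the method `isStale` are hypothetical, and the 10-minute threshold (600000 ms) is only an example value; how the gauge is read from JMX is left out:

```java
import java.time.Duration;

/** Hypothetical alert check for illustration only. */
final class CleanerAlert {
    /** True when the sampled gauge value exceeds the alert threshold. */
    static boolean isStale(long timeSinceLastRunMs, Duration threshold) {
        return timeSinceLastRunMs > threshold.toMillis();
    }
}
```

For example, `CleanerAlert.isStale(value, Duration.ofMinutes(10))` fires once the sampled value is above 600000 ms.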

The code has been tested, and the JMX metric has been verified.

Note: This pull request is a continuation of the following pull request.  
PR#1593 was quite old and I had some trouble rebasing it, so I decided to 
start a fresh PR.


https://github.com/apache/kafka/pull/1593/files/927b28cf41275874945beb7377f7f36c462f27c8#diff-ca1c127eee4b3c748ae73028f6abeab8

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kiranptivo/kafka log_cleaner_jmx_metric

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2378.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2378


commit a8635ff4a13e66b3f142ad97fff0ab082ecaf466
Author: Kiran Pillarisetty 
Date:   2017-01-14T00:23:45Z

Added a new metric time-since-last-run-ms, to track the time since the last 
log cleaner run, in milliseconds






[jira] [Commented] (KAFKA-3857) Additional log cleaner metrics

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376102#comment-15376102
 ] 

ASF GitHub Bot commented on KAFKA-3857:
---

Github user kiranptivo closed the pull request at:

https://github.com/apache/kafka/pull/1593




[jira] [Commented] (KAFKA-3857) Additional log cleaner metrics

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376103#comment-15376103
 ] 

ASF GitHub Bot commented on KAFKA-3857:
---

GitHub user kiranptivo reopened a pull request:

https://github.com/apache/kafka/pull/1593

KAFKA-3857 Additional log cleaner metrics

Fixes KAFKA-3857

Changes proposed in this pull request:

The following additional log cleaner metrics have been added.
1. num-runs: Cumulative number of successful log cleaner runs since last 
broker restart.
2. last-run-time: Time of last log cleaner run.
3. num-filthy-logs: Number of filthy logs. A nonzero value for an extended 
period of time indicates that the cleaner has not succeeded in cleaning 
the logs.
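
The first two metrics in the list above can be sketched as a pair of counters, shown here in Java for illustration. The class `CleanerRunStats` and its method names are hypothetical and do not come from the pull request:

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical bookkeeping for num-runs and last-run-time; illustration only. */
final class CleanerRunStats {
    // Held in memory only, so the count starts again from 0 after a restart.
    private final AtomicLong runs = new AtomicLong();
    // Unix timestamp in milliseconds; stays 0 until the first recorded run.
    private final AtomicLong lastRunMs = new AtomicLong();

    /** Called once per successful cleaner run. */
    void recordSuccessfulRun(long nowMs) {
        runs.incrementAndGet();
        lastRunMs.set(nowMs);
    }

    long numRuns() { return runs.get(); }

    long lastRunTimeMs() { return lastRunMs.get(); }
}
```

A monitor can then compare two samples: if neither value has changed over a long interval, no run completed in that time.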

A note on num-filthy-logs: it is incremented whenever a filthy topic 
partition is added to the inProgress HashMap, and decremented once the 
cleaning succeeds or is aborted. The existing LogCleaner code does not 
provide a metric to check whether a clean operation was successful. There is 
an inProgress HashMap with topicPartition => LogCleaningInProgress entries, 
but the entries are removed from the HashMap even when the clean operation 
throws an exception. So I added the num-filthy-logs metric to differentiate 
a successful log clean from one that ended in an exception.
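
The bookkeeping described in this note can be sketched as follows, in Java for illustration. This is a simplified model under stated assumptions: `FilthyLogTracker` and its methods are hypothetical, a `Set<String>` stands in for the inProgress HashMap, and a boolean flag stands in for the success, abort, and exception paths:

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of the num-filthy-logs counter; not the PR's code. */
final class FilthyLogTracker {
    private final Set<String> inProgress = new HashSet<>();
    private int numFilthyLogs = 0;

    /** A filthy partition is grabbed for cleaning: track it and count it. */
    synchronized void startCleaning(String topicPartition) {
        if (inProgress.add(topicPartition)) {
            numFilthyLogs++;
        }
    }

    /**
     * The in-progress entry is always removed, but the counter is only
     * decremented when cleaning succeeded or was aborted, not on an exception.
     */
    synchronized void doneCleaning(String topicPartition, boolean succeededOrAborted) {
        if (inProgress.remove(topicPartition) && succeededOrAborted) {
            numFilthyLogs--;
        }
    }

    synchronized int numFilthyLogs() { return numFilthyLogs; }

    synchronized int inProgressCount() { return inProgress.size(); }
}
```

In this model a failed clean leaves the in-progress set empty while the counter stays above zero, which is the distinction the note is after.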

The code is ready. I have tested it and verified the JMX metrics. There is 
one case I couldn't test, though: the case where numFilthyLogs is decremented 
in 'resumeCleaning(...)' at LogCleanerManager.scala line 188. It seems to be 
part of the workflow that aborts the cleaning of a particular partition. Any 
ideas on how to test this scenario?

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/TiVo/kafka log_cleaner_jmx_metrics

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1593.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1593


commit f00de412f6b1f6568adef479687ae0df789f9c96
Author: Kiran Pillarisetty 
Date:   2016-06-14T17:40:26Z

Create a couple of additional Log Cleaner JMX metrics
log-clean-last-run: Log cleaner's last run time
log-clean-runs: Number of log cleaner runs.

commit 7dc7511ee2b6d3cdf9df0c366fe23bf34d062a54
Author: Kiran Pillarisetty 
Date:   2016-06-14T20:24:00Z

Created a couple of additional Log Cleaner JMX metrics
log-clean-last-run: a metric to track last log cleaner run (unix timestamp)
log-clean-runs: a metric to track number of log cleaner runs

Committer: Kiran Pillarisetty 

commit 7f1214ff1118103dd639df717e988a22bad8033d
Author: Kiran Pillarisetty 
Date:   2016-07-01T22:14:57Z

Add additional JMX metric to track successful cleaning of a log segment

commit 1ac346bb37008312e41035167dbfd75803595cd6
Author: Kiran Pillarisetty 
Date:   2016-07-01T22:17:25Z

Add additional JMX metric to track successful cleaning of a log segment

commit 4f08d875e05c35bd7d7c849584b8b029031f884b
Author: Kiran Pillarisetty 
Date:   2016-07-05T22:23:20Z

Metric name updated to num-filthy-logs. Metric incremented as it is grabbed 
for cleaning, and decremented once the cleaning is done, or if the cleaning is 
aborted

commit cd887c05bf1d56b7566c5b72b3ddf3bcdfb70898
Author: Kiran Pillarisetty 
Date:   2016-07-05T23:31:32Z

Changed a metric name (number-of-runs to num-runs). Removed an extra \n 
around line 164. It is not present in the trunk






[jira] [Commented] (KAFKA-3857) Additional log cleaner metrics

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376100#comment-15376100
 ] 

ASF GitHub Bot commented on KAFKA-3857:
---

Github user kiranptivo closed the pull request at:

https://github.com/apache/kafka/pull/1593




[jira] [Commented] (KAFKA-3857) Additional log cleaner metrics

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376101#comment-15376101
 ] 

ASF GitHub Bot commented on KAFKA-3857:
---

GitHub user kiranptivo reopened a pull request:

https://github.com/apache/kafka/pull/1593



[jira] [Commented] (KAFKA-3857) Additional log cleaner metrics

2016-07-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15365013#comment-15365013
 ] 

ASF GitHub Bot commented on KAFKA-3857:
---

GitHub user kiranptivo reopened a pull request:

https://github.com/apache/kafka/pull/1593



[jira] [Commented] (KAFKA-3857) Additional log cleaner metrics

2016-07-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15365010#comment-15365010
 ] 

ASF GitHub Bot commented on KAFKA-3857:
---

Github user kiranptivo closed the pull request at:

https://github.com/apache/kafka/pull/1593




[jira] [Commented] (KAFKA-3857) Additional log cleaner metrics

2016-07-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15364847#comment-15364847
 ] 

ASF GitHub Bot commented on KAFKA-3857:
---

GitHub user kiranptivo opened a pull request:

https://github.com/apache/kafka/pull/1593



[jira] [Commented] (KAFKA-3857) Additional log cleaner metrics

2016-07-02 Thread Peter Davis (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15360064#comment-15360064
 ] 

Peter Davis commented on KAFKA-3857:


Related to KAFKA-3894 - dup?
