[
https://issues.apache.org/jira/browse/KAFKA-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Guozhang Wang updated KAFKA-1616:
---------------------------------
Attachment: KAFKA-1616_2014-09-04_13:26:02.patch
> Purgatory Size and Num.Delayed.Request metrics are incorrect
> ------------------------------------------------------------
>
> Key: KAFKA-1616
> URL: https://issues.apache.org/jira/browse/KAFKA-1616
> Project: Kafka
> Issue Type: Bug
> Reporter: Guozhang Wang
> Assignee: Guozhang Wang
> Fix For: 0.9.0
>
> Attachments: KAFKA-1616.patch, KAFKA-1616_2014-08-28_10:12:17.patch,
> KAFKA-1616_2014-09-01_14:41:56.patch, KAFKA-1616_2014-09-02_12:58:07.patch,
> KAFKA-1616_2014-09-02_13:23:13.patch, KAFKA-1616_2014-09-03_12:53:09.patch,
> KAFKA-1616_2014-09-04_13:26:02.patch
>
>
> The request purgatory used two atomic integers, "watched" and "unsatisfied", to
> record the purgatory size ( = watched + unsatisfied) and the number of delayed
> requests ( = unsatisfied). But due to some race conditions these two atomic
> integers are not updated correctly, resulting in incorrect metrics.
> Proposed solution: to have cleaner semantics, we can define the "purgatory
> size" to be just the number of elements in the watched lists, and the "number
> of delayed requests" to be just the length of the expiry queue. And instead
> of using two atomic integers we just compute the size of the lists / queue
> on the fly each time the metrics are pulled. This may use some more CPU
> cycles for these two metrics, but the overhead should be minor, and the
> correctness is guaranteed.
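
For illustration only, the proposal above could be sketched roughly as
follows. This is not the code in the attached patches: the class
PurgatoryMetricsSketch, its methods, and the use of plain strings as request
ids and keys are all hypothetical, chosen just to show both metrics being
derived from the collection sizes at read time instead of from separately
maintained counters.

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch; names and types are assumptions, not Kafka's actual purgatory code.
class PurgatoryMetricsSketch {
    // One watcher list per key; a request watching several keys appears in several lists.
    private final Map<String, Queue<String>> watchersForKey = new ConcurrentHashMap<>();
    // Each delayed request is enqueued exactly once until it is satisfied or expires.
    private final Queue<String> expiryQueue = new ConcurrentLinkedQueue<>();

    // Register a delayed request under every key it watches.
    void delay(String requestId, String... keys) {
        expiryQueue.add(requestId);
        for (String key : keys) {
            watchersForKey.computeIfAbsent(key, k -> new ConcurrentLinkedQueue<>()).add(requestId);
        }
    }

    // Remove a satisfied or expired request from the expiry queue and all watcher lists.
    void complete(String requestId) {
        expiryQueue.remove(requestId);
        for (Queue<String> watchers : watchersForKey.values()) {
            watchers.remove(requestId);
        }
    }

    // "Purgatory size": total number of elements across the watcher lists, computed on read.
    int purgatorySize() {
        int total = 0;
        for (Queue<String> watchers : watchersForKey.values()) {
            total += watchers.size();
        }
        return total;
    }

    // "Number of delayed requests": length of the expiry queue, computed on read.
    int numDelayedRequests() {
        return expiryQueue.size();
    }

    public static void main(String[] args) {
        PurgatoryMetricsSketch purgatory = new PurgatoryMetricsSketch();
        purgatory.delay("r1", "topicA-0", "topicA-1");
        purgatory.delay("r2", "topicA-0");
        System.out.println(purgatory.purgatorySize());      // prints 3
        System.out.println(purgatory.numDelayedRequests()); // prints 2
        purgatory.complete("r1");
        System.out.println(purgatory.purgatorySize());      // prints 1
        System.out.println(purgatory.numDelayedRequests()); // prints 1
    }
}
```

Because there is no separate counter to keep in sync, the reported values
cannot drift away from the contents of the collections. The trade-off in this
sketch is that size() on a ConcurrentLinkedQueue walks the queue, so each
metric read costs time proportional to the number of elements, and a read
taken during concurrent updates is only a point-in-time approximation.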