[ 
https://issues.apache.org/jira/browse/KAFKA-3811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15324838#comment-15324838
 ] 

Greg Fodor commented on KAFKA-3811:
-----------------------------------

Hey [~aartigupta], I attached a YourKit profiler to one of our jobs 
running dark against production data. The job has 200-300 topic-partition pairs 
and generally discards most messages early in the pipeline, and was processing 
a few thousand tps from the top-level topics. Unfortunately, since this issue 
came up we implemented changes to reduce the amount of data running through the 
system (discarding it earlier) so we didn't have to worry about this 
performance problem. In my tests a majority of the CPU time of the job was 
spent inside of the code walking and emitting to the Sensors for the 
per-message process metrics and the per-k/v read/write latency metrics. I also 
found 6-7% of the time was spent in the fetcher metrics which was addressed 
here: https://github.com/apache/kafka/pull/1464. 

Good news: I managed to find the snapshot data :) I will attach it here. The 
majority of the time is *not* the milliseconds() call but the actual 
(synchronized?) walk of Sensors in Sensor.record.
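
To illustrate the recording-level idea proposed in this issue, here is a 
minimal sketch of a sensor that decides once, at construction, whether it is 
enabled, so a disabled per-record sensor costs only a boolean check and skips 
the clock read as well. Every name here (RecordingLevelSketch, RecordingLevel, 
LeveledSensor, shouldRecord) is hypothetical and not part of the Kafka metrics 
API; the two levels simply mirror level0/level1 from the proposal.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch only: none of these types exist in Kafka.
class RecordingLevelSketch {

    // LEVEL0 = stream-thread global metrics, LEVEL1 = per-node / per-store metrics.
    enum RecordingLevel { LEVEL0, LEVEL1 }

    static final class LeveledSensor {
        private final boolean enabled;              // computed once, never re-checked against config
        private final LongAdder count = new LongAdder();
        private final LongAdder total = new LongAdder();

        LeveledSensor(RecordingLevel sensorLevel, RecordingLevel configuredLevel) {
            // A sensor is active only if its level is at or below the configured level.
            this.enabled = sensorLevel.compareTo(configuredLevel) <= 0;
        }

        // Lets callers skip expensive measurement (e.g. clock reads) entirely.
        boolean shouldRecord() {
            return enabled;
        }

        void record(long value) {
            if (!enabled) {
                return;                             // cheap early exit on the hot path
            }
            count.increment();
            total.add(value);
        }

        long count() { return count.sum(); }
        long total() { return total.sum(); }
    }

    public static void main(String[] args) {
        RecordingLevel configured = RecordingLevel.LEVEL0;   // proposed default
        LeveledSensor commitLatency = new LeveledSensor(RecordingLevel.LEVEL0, configured);
        LeveledSensor storePutLatency = new LeveledSensor(RecordingLevel.LEVEL1, configured);

        for (int i = 0; i < 1000; i++) {
            commitLatency.record(5);
            if (storePutLatency.shouldRecord()) {            // guards the timing calls too
                long start = System.nanoTime();
                storePutLatency.record(System.nanoTime() - start);
            }
        }
        System.out.println("commit samples=" + commitLatency.count()
                + ", store samples=" + storePutLatency.count());
    }
}
```

With the configured level left at LEVEL0, the per-store sensor never records 
and never reads the clock. Raising the configured level to LEVEL1 enables both 
sensors without any change at the call sites.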

> Introduce Kafka Streams metrics recording levels
> ------------------------------------------------
>
>                 Key: KAFKA-3811
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3811
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Greg Fodor
>            Assignee: aarti gupta
>
> Follow-up from the discussions here:
> https://github.com/apache/kafka/pull/1447
> https://issues.apache.org/jira/browse/KAFKA-3769
> The proposal is to introduce configuration to control the granularity/volumes 
> of metrics emitted by Kafka Streams jobs, since the per-record level metrics 
> introduce non-trivial overhead and are possibly less useful once a job has 
> been optimized. 
> Proposal from guozhangwang:
> level0 (stream thread global): per-record process / punctuate latency, commit 
> latency, poll latency, etc
> level1 (per processor node, and per state store): IO latency, per-record .. 
> latency, forward throughput, etc.
> And by default we only turn on level0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
