[ https://issues.apache.org/jira/browse/KAFKA-3811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15324838#comment-15324838 ]
Greg Fodor commented on KAFKA-3811:
-----------------------------------

Hey [~aartigupta], I attached a YourKit profiler to one of our jobs running dark against production data. The job has 200-300 topic-partition pairs, generally discards most messages early in the pipeline, and was processing a few thousand tps from the top-level topics. Unfortunately, since this issue came up we have implemented changes to reduce the amount of data running through the system (discarding it earlier), so that we didn't have to worry about this performance problem.

In my tests, a majority of the job's CPU time was spent inside the code walking and emitting to the Sensors for the per-message process metrics and the per-k/v read/write latency metrics. I also found that 6-7% of the time was spent in the fetcher metrics, which was addressed here: https://github.com/apache/kafka/pull/1464.

Good news: I managed to find the snapshot data :) I will attach it here. The majority of the time is *not* the milliseconds() call but the actual (synchronized?) walk of Sensors in Sensor.record.

> Introduce Kafka Streams metrics recording levels
> ------------------------------------------------
>
>                 Key: KAFKA-3811
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3811
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Greg Fodor
>            Assignee: aarti gupta
>
> Follow-up from the discussions here:
> https://github.com/apache/kafka/pull/1447
> https://issues.apache.org/jira/browse/KAFKA-3769
> The proposal is to introduce configuration to control the granularity/volumes
> of metrics emitted by Kafka Streams jobs, since the per-record level metrics
> introduce non-trivial overhead and are possibly less useful once a job has
> been optimized.
> Proposal from guozhangwang:
> level0 (stream thread global): per-record process / punctuate latency, commit
> latency, poll latency, etc
> level1 (per processor node, and per state store): IO latency, per-record ..
> latency, forward throughput, etc.
> And by default we only turn on level0.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
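
For illustration only, here is a minimal Java sketch of how the level0/level1 gating in the quoted proposal might look. Everything in it is hypothetical: `Level`, `GatedSensor`, and `RecordingLevelSketch` are made-up names and not Kafka's actual classes or configuration. The idea is simply that a sensor above the configured level returns before doing any synchronized work, so per-record metrics that are switched off cost only a cheap comparison.

```java
import java.util.ArrayList;
import java.util.List;

public class RecordingLevelSketch {

    /** Hypothetical levels, named after the level0/level1 split in the proposal. */
    enum Level {
        LEVEL0, // stream-thread-global metrics (commit latency, poll latency, ...)
        LEVEL1; // per-processor-node and per-state-store metrics

        /** A sensor records only if its level is at or below the configured level. */
        boolean enabledAt(Level configured) {
            return this.ordinal() <= configured.ordinal();
        }
    }

    /** Stand-in for a metrics sensor. This is NOT Kafka's Sensor class. */
    static final class GatedSensor {
        private final Level level;
        private final Level configured;
        private final List<Double> values = new ArrayList<>();

        GatedSensor(Level level, Level configured) {
            this.level = level;
            this.configured = configured;
        }

        void record(double value) {
            // Cheap, lock-free check first: disabled sensors skip the synchronized path.
            if (!level.enabledAt(configured)) {
                return;
            }
            synchronized (this) {
                values.add(value);
            }
        }

        synchronized int count() {
            return values.size();
        }
    }

    public static void main(String[] args) {
        Level configured = Level.LEVEL0; // the proposed default: only level0 is on

        GatedSensor commitLatency = new GatedSensor(Level.LEVEL0, configured);
        GatedSensor storeGetLatency = new GatedSensor(Level.LEVEL1, configured);

        for (int i = 0; i < 1000; i++) {
            commitLatency.record(1.0);   // recorded
            storeGetLatency.record(0.1); // skipped: level1 is above the configured level
        }

        System.out.println("level0 recorded: " + commitLatency.count());   // 1000
        System.out.println("level1 recorded: " + storeGetLatency.count()); // 0
    }
}
```

With the configured level set to `LEVEL1`, both sensors would record. Whether the check lives inside the sensor or at the call site is a design choice this sketch does not settle.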