siying commented on PR #41525: URL: https://github.com/apache/spark/pull/41525#issuecomment-1587669644
@HeartSaVioR In my understanding, we are essentially timing poll() right now, which is pretty much what InternalKafkaConsumer.fetch() does and what we measure. I guess your concern is that when we report the metric per microbatch, some microbatches might show higher latency and some lower, and it is misleading to look at individual ones? I think that is acceptable, as people understand buffering and know they need to add up multiple microbatches to see the real cost.

The alternative approaches are not less confusing either. If we, for example, trace inside InternalKafkaConsumer and report the metric periodically, outside microbatch boundaries, the reporting period might include tasks from other jobs, making it hard to match the metric with the tasks it serves. Another alternative is to log the timing every time poll() is called, but wouldn't that potentially be spammy?
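For context, here is a minimal sketch of the per-fetch timing approach being discussed: each poll() call is timed and the total is accumulated until it is reported at the microbatch boundary. The class and method names (`TimedConsumer`, `reportAndReset`) are hypothetical illustrations, not the actual code or metric names in this PR.

```scala
import java.time.Duration
import org.apache.kafka.clients.consumer.{ConsumerRecords, KafkaConsumer}

// Hypothetical wrapper illustrating the approach: time each poll() and
// accumulate the total, then report it once per microbatch.
class TimedConsumer[K, V](consumer: KafkaConsumer[K, V]) {
  private var pollTimeNanos: Long = 0L

  def poll(timeout: Duration): ConsumerRecords[K, V] = {
    val start = System.nanoTime()
    try consumer.poll(timeout)
    finally pollTimeNanos += System.nanoTime() - start
  }

  // Called at the microbatch boundary: return the accumulated poll time
  // for this batch and reset the counter for the next one.
  def reportAndReset(): Long = {
    val total = pollTimeNanos
    pollTimeNanos = 0L
    total
  }
}
```

Because buffering can shift work across batch boundaries, a single batch's value may look high or low in isolation; summing the reported values across microbatches gives the total poll cost, which is the point made above.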
