seglo edited a comment on issue #24613: [SPARK-27549][SS] Add support for 
committing kafka offsets per batch for supporting external tooling
URL: https://github.com/apache/spark/pull/24613#issuecomment-494009293
 
 
   Hey everyone, just weighing in with my 2 cents.
   
   I can't speak for how all Kafka consumer group monitoring software works, but [`kafka-consumer-groups.sh`](https://kafka.apache.org/documentation/#basic_ops_consumer_lag) and [`kafka-lag-exporter`](https://github.com/lightbend/kafka-lag-exporter) both use the `AdminClient` to obtain consumer group metadata and offsets.  It's true you could consume the `__consumer_offsets` topic and parse this information yourself, but it's an internal topic and presumably not meant to be consumed by external tooling.  The `AdminClient` is a public-facing API that exposes committed offsets as well as other information, such as group member metadata.
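
   For illustration, here is a minimal sketch of what such tooling does with the `AdminClient`. The broker address and the group id `my-spark-app` are hypothetical placeholders, and this assumes a reachable Kafka cluster:

```scala
import java.util.Properties
import scala.jdk.CollectionConverters._
import org.apache.kafka.clients.admin.{AdminClient, AdminClientConfig}

object GroupOffsets {
  def main(args: Array[String]): Unit = {
    // Hypothetical broker address; adjust for your cluster.
    val props = new Properties()
    props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")
    val admin = AdminClient.create(props)
    try {
      // Fetch the committed offsets for a consumer group through the public
      // API, rather than parsing the internal __consumer_offsets topic.
      val offsets = admin
        .listConsumerGroupOffsets("my-spark-app") // hypothetical group id
        .partitionsToOffsetAndMetadata()
        .get()
        .asScala
      offsets.foreach { case (tp, om) =>
        println(s"${tp.topic}-${tp.partition}: committed offset ${om.offset}")
      }
    } finally admin.close()
  }
}
```

   A lag monitor would then compare these committed offsets against the log-end offsets to compute lag per partition.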
   
   Ideally, if the user wants to enable this feature, they would have full control of the `group.id` used.  That would make Spark apps consistent with any other Kafka consumer app and, more importantly, consistent when monitoring consumer group lag.  If Spark only allows an automatically generated ID, and that ID is regenerated for each lifetime of the app, then this Spark/Kafka `group.id` generation concern leaks out and becomes a problem that must be handled in a Spark-only way in the monitoring tool as well.  If the `group.id` were stable and user-chosen, the monitoring tool wouldn't need to give Spark apps any special consideration.  Is there a way for the user to optionally provide a full `group.id` per Spark query?
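
   To make the request concrete, this is a sketch of what a user-chosen, stable group id could look like on the Kafka source. The option name `kafka.group.id` is illustrative of the capability being asked for, not a confirmed existing option, and the topic and broker names are placeholders:

```scala
import org.apache.spark.sql.SparkSession

object StableGroupIdQuery {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("stable-group-id-sketch")
      .getOrCreate()

    // Sketch of the requested capability: the user supplies a stable group id
    // per query, so lag monitoring tools can track it like any other consumer.
    val df = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092") // placeholder
      .option("subscribe", "events")                       // placeholder topic
      .option("kafka.group.id", "my-spark-app")            // illustrative option
      .load()

    df.printSchema()
  }
}
```

   With a stable id like `my-spark-app`, the same `AdminClient`-based tooling described above would work for Spark queries with no special casing.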

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
