skonto edited a comment on issue #24613: [SPARK-27549][SS] Add support for 
committing kafka offsets per batch for supporting external tooling
URL: https://github.com/apache/spark/pull/24613#issuecomment-494092510
 
 
   @gaborgsomogyi @HeartSaVioR what if I introduce `queryOptions` at the `query.start()` call as optional parameters, so that I can pass a query-specific unique group id (gId) [here](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala#L101)?
   For example:
   ```scala
   query1.start(id1)
   query2.start(id2)
   ```
   This way nothing is shared unless we want it to be, and the user has to pass the gId explicitly per query if they want to integrate with external monitoring. Thoughts? This approach would also make the per-query config more flexible in the future, in case we want to add more query-specific options.
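
   A minimal sketch of what the proposed overload might look like. This is a hypothetical illustration only: the `queryOptions` parameter, the `QueryStarter` stand-in class, and the `groupId` key do not exist in Spark's API today.

   ```scala
   // Hypothetical sketch of the proposed per-query options, not existing Spark API.
   // QueryStarter stands in for DataStreamWriter; in the real proposal,
   // MicroBatchExecution would read the query-specific group id from these options.
   class QueryStarter {
     def start(queryOptions: Map[String, String] = Map.empty): Map[String, String] =
       queryOptions
   }

   object Example {
     def main(args: Array[String]): Unit = {
       val writer = new QueryStarter
       // Each query passes its own unique group id explicitly, so nothing is shared:
       val q1 = writer.start(Map("groupId" -> "monitoring-q1"))
       val q2 = writer.start(Map("groupId" -> "monitoring-q2"))
       println(q1("groupId"))
     }
   }
   ```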
   
   > query ID might be considered as unique group id since it can provide both 
unique and continuous, but it should consider the case where multiple Kafka 
sources are being used in same query.
   
   In that case we could assign an increasing id to sources as they are registered, assuming the code is not modified so that registration order stays the same on restart.
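
   The second alternative could be as simple as deriving the group id from the query id plus a per-source index assigned in registration order. The helper name and the `spark-kafka-source` prefix below are made up for illustration, not Spark's actual scheme:

   ```scala
   object KafkaGroupIds {
     // Hypothetical helper: derives a stable, unique group id per Kafka source.
     // queryId keeps it unique across queries; sourceIndex (the order in which
     // sources are registered) disambiguates multiple Kafka sources in one query.
     def groupIdFor(queryId: String, sourceIndex: Int): String =
       s"spark-kafka-source-$queryId-$sourceIndex"
   }
   ```

   As noted above, this stays stable across restarts only as long as the code is unchanged and sources register in the same order.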
   
   I can do both, so let me know which option is viable here. I prefer the first one, since it is what monitoring tools expect, though it may be too intrusive for Spark; the second one is also feasible.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
