[ 
https://issues.apache.org/jira/browse/SPARK-27549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16834603#comment-16834603
 ] 

Tarush Grover commented on SPARK-27549:
---------------------------------------

Anyone working on this?

> Commit Kafka Source offsets to facilitate external tooling
> ----------------------------------------------------------
>
>                 Key: SPARK-27549
>                 URL: https://issues.apache.org/jira/browse/SPARK-27549
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 3.0.0
>            Reporter: Stavros Kontopoulos
>            Priority: Major
>
> Tools that monitor consumer lag could benefit from an option to save the 
> source offsets. Sources implement 
> org.apache.spark.sql.sources.v2.reader.streaming.SparkDataStream. 
> KafkaMicroBatchStream currently [does not 
> commit|https://github.com/apache/spark/blob/5bf5d9d854db53541956dedb03e2de8eecf65b81/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchStream.scala#L170]
>  anything, as expected, so we could extend that.
> Other streaming engines like 
> [Flink|https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/kafka.html#kafka-consumers-offset-committing-behaviour-configuration]
>  allow you to enable `auto.commit` at the expense of not having checkpointing.
> The proposal here is to allow committing the source offsets once progress 
> has been made.
> I am aware that another option is to register a StreamingQueryListener, 
> intercept batch completion, and write the offsets wherever you need them. 
> Still, it would be great if the Kafka integration with Structured 
> Streaming could do some of this work itself.
> [~c...@koeninger.org]  [~marmbrus] what do you think?
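
For reference, the listener-based workaround mentioned in the description could look roughly like the sketch below. It is not part of the proposal and not Spark code: it only shows how the JSON form of a query progress event (what StreamingQueryProgress exposes as `json`) might be turned into per-partition offsets that an external tool could store or commit. The function name `kafka_offsets_from_progress` is hypothetical, and the sketch assumes that Kafka sources report a description starting with "Kafka" and an `endOffset` of the form `{topic: {partition: offset}}`.

```python
import json
from typing import Dict, Tuple


def kafka_offsets_from_progress(progress_json: str) -> Dict[Tuple[str, int], int]:
    """Return {(topic, partition): end_offset} for the Kafka sources in a progress JSON.

    Hypothetical helper: intended to be called from a listener callback with the
    JSON of the latest progress event.
    """
    progress = json.loads(progress_json)
    offsets: Dict[Tuple[str, int], int] = {}
    for source in progress.get("sources", []):
        # Assumption: Kafka sources have a description that starts with "Kafka".
        if not str(source.get("description", "")).startswith("Kafka"):
            continue
        end_offset = source.get("endOffset")
        # endOffset may arrive as a nested object or as a JSON-encoded string.
        if isinstance(end_offset, str):
            end_offset = json.loads(end_offset)
        if not isinstance(end_offset, dict):
            continue
        # Assumption: the layout is {topic: {partition: offset}}.
        for topic, partitions in end_offset.items():
            for partition, offset in partitions.items():
                offsets[(topic, int(partition))] = int(offset)
    return offsets


if __name__ == "__main__":
    # Made-up progress event, for illustration only.
    sample = json.dumps({
        "batchId": 7,
        "sources": [{
            "description": "KafkaV2[Subscribe[events]]",
            "endOffset": {"events": {"0": 120, "1": 95}},
        }],
    })
    for (topic, partition), offset in sorted(kafka_offsets_from_progress(sample).items()):
        print(topic, partition, offset)
```

The resulting map would still have to be written somewhere (for example committed under a consumer group with a Kafka client), which is exactly the extra work the issue suggests the Kafka source could take over.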



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
