[ 
https://issues.apache.org/jira/browse/SPARK-39591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17629252#comment-17629252
 ] 

Apache Spark commented on SPARK-39591:
--------------------------------------

User 'jerrypeng' has created a pull request for this issue:
https://github.com/apache/spark/pull/38517

> Offset Management Improvements in Structured Streaming
> ------------------------------------------------------
>
>                 Key: SPARK-39591
>                 URL: https://issues.apache.org/jira/browse/SPARK-39591
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 3.3.0
>            Reporter: Boyang Jerry Peng
>            Priority: Major
>
> Currently in Structured Streaming, at the beginning of every micro-batch the
> offset to process up to for the current batch is persisted to durable
> storage. At the end of every micro-batch, a marker indicating the
> completion of that micro-batch is persisted to durable storage. For
> pipelines such as one that reads from Kafka and writes to Kafka, where
> end-to-end exactly-once is not supported and latency is a concern, we can
> allow users to configure offset commits to be written asynchronously, so
> that the commit operation does not contribute to the batch duration,
> effectively lowering the overall latency of the pipeline.
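
The idea in the description can be illustrated with a small, self-contained sketch. This is a simplified model and not Spark's implementation: the names `durable_write`, `run_batches`, and `COMMIT_LATENCY_S` are hypothetical, and the durable store is simulated with a sleep. The offset for each batch is still written synchronously at the start, while the completion marker is either written inline or handed to a single background worker.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Assumed cost of one write to durable storage (hypothetical value).
COMMIT_LATENCY_S = 0.05


def durable_write(log, batch_id):
    """Simulate a slow durable write by sleeping, then record the batch id."""
    time.sleep(COMMIT_LATENCY_S)
    log.append(batch_id)


def run_batches(num_batches, async_commits):
    """Run a toy micro-batch loop and return (offset_log, commit_log, durations)."""
    offset_log, commit_log, durations = [], [], []
    # A single worker keeps completion markers in batch order.
    with ThreadPoolExecutor(max_workers=1) as committer:
        for batch_id in range(num_batches):
            start = time.perf_counter()
            # Start of batch: persist the offset to process up to.
            durable_write(offset_log, batch_id)
            # (Batch processing would happen here.)
            if async_commits:
                # End of batch: hand the marker to the background worker.
                committer.submit(durable_write, commit_log, batch_id)
            else:
                # End of batch: block until the marker is persisted.
                durable_write(commit_log, batch_id)
            durations.append(time.perf_counter() - start)
    # Leaving the `with` block waits for any pending background writes.
    return offset_log, commit_log, durations
```

Calling `run_batches(3, async_commits=True)` should report shorter per-batch durations than `run_batches(3, async_commits=False)`, because the marker write no longer sits on the batch's critical path. The trade-off in this model is that a failure may occur before a marker is persisted, so a completed batch can be re-run after restart, which is why the description limits the option to pipelines without end-to-end exactly-once guarantees.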



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
