wenxuanguan commented on a change in pull request #24631: [MINOR][CORE][DOC] Avoid hardcoded configs and fix kafka sink write semantics in document
URL: https://github.com/apache/spark/pull/24631#discussion_r284976140
 
 

 ##########
 File path: docs/structured-streaming-kafka-integration.md
 ##########
 @@ -441,7 +441,7 @@ Apache Kafka only supports at least once write semantics. Consequently, when wri
 or Batch Queries---to Kafka, some records may be duplicated; this can happen, for example, if Kafka needs
 to retry a message that was not acknowledged by a Broker, even though that Broker received and wrote the message record.
 Structured Streaming cannot prevent such duplicates from occurring due to these Kafka write semantics. However,
-if writing the query is successful, then you can assume that the query output was written at least once. A possible
+if writing the query is successful, then you can assume that the query output was written exactly once. A possible
 
 Review comment:
   Thanks for your reply, @dongjoon-hyun.
   I thought this sentence described Spark's exactly-once write semantics when the effect of Kafka is ignored. How about changing it to `So if writing the query is successful, then you can assume that the query output was written at least once`? That removes the confusion caused by the preceding `However`.
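   For context, here is a minimal sketch of a streaming query writing to the Kafka sink. The broker address, topic name, and checkpoint path are placeholders, and the rate source is only a stand-in input. The point is that even when `start()` succeeds and the query completes cleanly, Kafka's at-least-once semantics mean the topic may still contain duplicate records:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.appName("KafkaSinkSketch").getOrCreate()

// Hypothetical source: a rate stream reshaped into the key/value
// columns that the Kafka sink expects.
val df = spark.readStream
  .format("rate")
  .load()
  .selectExpr("CAST(value AS STRING) AS key", "CAST(timestamp AS STRING) AS value")

// A successful write means the output was written at least once;
// broker-side retries can still leave duplicates in the topic.
val query = df.writeStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "host1:9092")             // placeholder
  .option("topic", "events")                                   // placeholder
  .option("checkpointLocation", "/tmp/checkpoints/kafka-sink") // placeholder
  .start()

query.awaitTermination()
```

   If effectively-once results are needed, one common approach is to carry a unique key column like the one above and de-duplicate on it when the topic is read back.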

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
