jerrypeng commented on PR #56314:
URL: https://github.com/apache/spark/pull/56314#issuecomment-4633774793

   @viirya thank you for the review.  Addressing your comments inline.
   
   > Comparison table: Continuous Processing should be "At-least-once", not 
"Exactly-once"
   
   I want to make a distinction between exactly-once processing guarantees and 
at-least-once delivery in sinks.  An exactly-once processing guarantee means 
that changes to state managed by the engine as a result of processing rows are 
applied **effectively** once.  At-least-once delivery means output will be 
written to the external system at least once, i.e. duplicates are possible.  I 
think it is an important distinction to make.  Real-time Mode offers 
exactly-once processing semantics just like the existing engine.  The 
difference is in the sinks it supports.  The only sink that supports 
exactly-once delivery is the Delta sink (through idempotent writes).  The Kafka 
sink supports at-least-once delivery semantics regardless of whether Real-time 
Mode is used or not.  I want to call this out in the documentation.  In theory 
you can write an exactly-once sink for RTM; there is just no implementation of 
it yet.
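
To make the delivery half of this concrete, here is a toy model in plain 
Python.  It is only a sketch: `AppendOnlySink`, `IdempotentSink` and 
`run_with_replay` are hypothetical names made up for illustration, and this is 
not how the Kafka or Delta sinks are actually implemented.  It assumes the 
engine replays the last batch after a failure, which is the case where the two 
delivery semantics differ.

```python
class AppendOnlySink:
    """Hypothetical sink that appends every write it receives."""

    def __init__(self):
        self.records = []

    def write(self, batch_id, rows):
        # No memory of earlier batches, so a replayed batch is written again.
        self.records.extend(rows)


class IdempotentSink:
    """Hypothetical sink that ignores batches it has already committed."""

    def __init__(self):
        self.records = []
        self.committed = set()

    def write(self, batch_id, rows):
        if batch_id in self.committed:
            return  # replay of a committed batch: skip it
        self.records.extend(rows)
        self.committed.add(batch_id)


def run_with_replay(sink):
    """Deliver batch 0, then batch 1, then replay batch 1 (simulated retry)."""
    batches = {0: ["a", "b"], 1: ["c"]}
    for batch_id in [0, 1, 1]:
        sink.write(batch_id, batches[batch_id])
    return sink.records


print(run_with_replay(AppendOnlySink()))  # duplicate "c": at-least-once
print(run_with_replay(IdempotentSink()))  # each row once: exactly-once
```

In both runs the engine side is identical; only the sink decides whether the 
replayed batch shows up twice in the external system.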
   
   With regard to continuous mode, it does not support state, so the argument 
is moot here.  Let's do this:
   1. clearly define the terms
   2. clarify what is supported in continuous mode.
   
   > The page should state that Real-time Mode is experimental
   
   I think this is a mistake.  The real-time mode APIs are stable.  Let me 
create a PR to remove the experimental annotation.
   
   I will address the minor nits as well.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

