HeartSaVioR edited a comment on issue #23859: [SPARK-26956][SQL] remove 
streaming output mode from data source v2 APIs
URL: https://github.com/apache/spark/pull/23859#issuecomment-466273566
 
 
   I'm sorry, but I can't find any mention of complete mode in either the doc 
or the PR you linked.
   
   Btw, here's the explanation of `Complete Mode` in the Structured Streaming 
doc (http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html):
   
   > Complete Mode - The entire updated Result Table will be written to the 
external storage. It is up to the storage connector to decide how to handle 
writing of the entire table.
   
   I understand this to mean that the sink always receives the entire result 
table in every batch, whereas your explanation of complete mode sounds like it 
works differently. I guess this is because we can't expect upsert to work well 
with every sink and so have to truncate everything, but then shouldn't we 
provide the updated entire result table, including old data, instead of only 
the new data? Maybe there's confusion about the definition of "new data".
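   
   To make the concern concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark's data source v2 API: `TruncatingSink`, `UpsertSink`, and the running word count are hypothetical names made up for illustration. It assumes a sink that truncates on every batch, and compares handing it the entire result table against handing it only the rows changed by the new data.

```python
from collections import Counter


class TruncatingSink:
    """Hypothetical sink: every batch replaces all previously stored rows."""

    def __init__(self):
        self.rows = {}

    def write_batch(self, rows):
        # Truncate, then write whatever this batch delivered.
        self.rows = dict(rows)


class UpsertSink:
    """Hypothetical sink: every batch inserts or overwrites rows by key."""

    def __init__(self):
        self.rows = {}

    def write_batch(self, rows):
        self.rows.update(rows)


def run(batches, sink, send_entire_table):
    """Run a running word count over the batches and feed the sink."""
    result_table = Counter()
    for batch in batches:
        result_table.update(batch)  # aggregate the new input
        if send_entire_table:
            # Sink gets every row of the updated result table.
            sink.write_batch(result_table)
        else:
            # Sink gets only the rows touched by this batch.
            sink.write_batch({key: result_table[key] for key in set(batch)})
    return sink.rows


batches = [["a", "b"], ["a", "c"]]

entire_then_truncate = run(batches, TruncatingSink(), send_entire_table=True)
changed_then_truncate = run(batches, TruncatingSink(), send_entire_table=False)
changed_then_upsert = run(batches, UpsertSink(), send_entire_table=False)
```

   In this sketch, truncating and then writing the entire table keeps the row for `b`, and so does upserting only the changed rows, but truncating and then writing only the changed rows drops `b` after the second batch. That last case is the one I'm worried about.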
   
   Previously I assumed the final result always contains every row the query 
outputs. I'd like to be clear on whether we keep this as it is, or let the 
sink decide.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

