HeartSaVioR edited a comment on issue #23859: [SPARK-26956][SQL] remove streaming output mode from data source v2 APIs
URL: https://github.com/apache/spark/pull/23859#issuecomment-466273566

I'm sorry, but I can't find any mention of complete mode in either the doc or the PR you linked.

Btw, here's the explanation of `Complete Mode` in the Structured Streaming doc (http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html):

> Complete Mode - The entire updated Result Table will be written to the external storage. It is up to the storage connector to decide how to handle writing of the entire table.

I understand this to mean that the sink always receives the entire result table at every batch, whereas your explanation of complete mode sounds like it works differently. I guess this is because we can't expect upsert to work well with sinks, so we have to truncate everything. But then shouldn't we provide the updated entire result table, including old data, instead of providing only new data?

Maybe there's confusion regarding the definition of "new data". Until now I assumed the final result always contains every row the query outputs. I would like to clarify whether we keep this behavior as it is or let the sink decide.
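
To make the two readings of complete mode discussed above concrete, here is a minimal sketch in plain Python. It does not use Spark or the data source v2 API; the function `run_batches`, the word-count query, and the dict standing in for the sink are all hypothetical, and the sink is assumed to truncate before every write because upsert is not available.

```python
from collections import Counter


def run_batches(batches, sink_receives_full_table):
    """Simulate a truncate-then-write sink for a running word count.

    Hypothetical illustration only; not Spark's actual implementation.
    """
    state = Counter()  # running aggregation held by the engine
    sink = {}          # stands in for the external storage
    for batch in batches:
        delta = Counter(batch)
        state.update(delta)
        sink.clear()   # truncate, since upsert is assumed unavailable
        if sink_receives_full_table:
            # Reading 1: the entire updated result table is written.
            sink.update(state)
        else:
            # Reading 2: only rows touched by this batch are written.
            sink.update({key: state[key] for key in delta})
    return sink


batches = [["a", "b"], ["a"], ["c"]]
full = run_batches(batches, sink_receives_full_table=True)
partial = run_batches(batches, sink_receives_full_table=False)
print(full)     # {'a': 2, 'b': 1, 'c': 1}
print(partial)  # {'c': 1}
```

Under these assumptions, truncating and then writing only the rows from the latest batch loses the older rows, which is why the question of what the sink receives matters.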
