[GitHub] HeartSaVioR commented on issue #23859: [SPARK-26956][SQL] remove streaming output mode from data source v2 APIs

GitBox Thu, 21 Feb 2019 15:35:42 -0800

HeartSaVioR commented on issue #23859: [SPARK-26956][SQL] remove streaming 
output mode from data source v2 APIs
URL: https://github.com/apache/spark/pull/23859#issuecomment-466212629
 
 
   Do we have any docs describing the background/discussion around this 
change? It doesn't sound like a small change, and it directly impacts Structured 
Streaming, so I would like to fully understand it.
   
   > 2. complete mode: call `SupportsTruncate#truncate`. Complete mode means 
truncating all the old data and appending new data, and `SupportsTruncate` has 
exactly the same semantic.
   
   It looks like the sink will then contain only the new data. Is that correct, 
or should the description be fixed to say "appending all data" instead of 
"appending new data"?
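
   To make the question concrete, here is a toy sketch in plain Python. It is 
not Spark's API: `ToySink`, `truncate`, `append`, and `write_complete_epoch` are 
hypothetical names. It assumes that "new data" in the quoted description means 
the full result table of the current epoch, which is what complete mode emits 
on every trigger. Under that assumption, truncate-then-append leaves the whole 
current result in the sink, not only the rows that changed.

```python
class ToySink:
    """Hypothetical in-memory sink used only for illustration (not a Spark class)."""

    def __init__(self):
        self.rows = []

    def truncate(self):
        # Drop everything written by earlier epochs.
        self.rows.clear()

    def append(self, rows):
        # Add the rows handed over for the current epoch.
        self.rows.extend(rows)


def write_complete_epoch(sink, result_table):
    # Assumption: in complete mode the engine hands over the entire
    # result table each epoch, so truncate + append replaces the old
    # snapshot with the new one.
    sink.truncate()
    sink.append(result_table)


sink = ToySink()
write_complete_epoch(sink, [("a", 1), ("b", 1)])            # epoch 0
write_complete_epoch(sink, [("a", 2), ("b", 1), ("c", 1)])  # epoch 1
print(sink.rows)
```

   If instead only the changed rows were passed in epoch 1, the same 
truncate-then-append would lose the unchanged row for `b`, which is the case 
the question above is about.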
   
   > 3. update mode: fail. The current streaming framework can't propagate the 
update keys, so v2 sinks are not able to implement update mode. In the future 
we can introduce a `SupportsUpdate` trait.
   
   I guess this means we don't separate keys and values when passing rows to 
the sink, so the sink cannot perform an upsert (though the target system can 
still upsert if it knows about the keys and values itself). So `SupportsUpdate` 
would receive keys and values separately. Do I understand correctly?
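
   A minimal sketch of why the keys matter, again in plain Python and not tied 
to any Spark interface (`upsert`, `store`, and `key_columns` are hypothetical 
names, and the word-count rows are made up for illustration). An upsert needs 
to know which columns identify a row; with only flat rows and no key 
information, the sink cannot tell an update from a new row.

```python
def upsert(store, rows, key_columns):
    """Insert or overwrite rows in `store`, keyed by the given columns.

    `store` is a dict acting as a stand-in for the target system.
    """
    for row in rows:
        # The key columns are exactly the information the quoted
        # description says is not propagated to v2 sinks today.
        key = tuple(row[c] for c in key_columns)
        store[key] = row
    return store


store = {}
# Epoch 0: two new keys.
upsert(store, [{"word": "a", "count": 1}, {"word": "b", "count": 1}], ["word"])
# Epoch 1: only the changed row is emitted (update mode).
upsert(store, [{"word": "a", "count": 2}], ["word"])
print(sorted(store.values(), key=lambda r: r["word"]))
```

   Without `key_columns`, the epoch 1 row could only be appended, leaving both 
the old and the new count for `a` in the target.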

