HeartSaVioR edited a comment on pull request #30521:
URL: https://github.com/apache/spark/pull/30521#issuecomment-736957997


   > You can perform complete mode writes, which overwrites the entire data 
every time.
   
   Sorry, I probably wasn't clear. This isn't true for the DSv1 Sink interface 
unless the data source resorts to the hack of requiring the output mode to be 
passed to the Sink directly. You have no idea of the output mode in DSv1, and 
that's what I have been concerned about. The output mode is effectively a no-op, 
at least for a DSv1 sink. For backward compatibility we allow update/complete to 
be handled as append, but that's only to avoid breaking old data sources, and we 
shouldn't continue doing this.
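
   To make the point concrete, here is a minimal sketch of what that hack 
could look like, assuming Spark 3.x on the classpath. `ModeAwareSink` and 
`ModeAwareSinkProvider` are hypothetical names; only `Sink`, 
`StreamSinkProvider` and `OutputMode` come from Spark. As I read the DSv1 
interfaces, `addBatch` receives no output mode, so a sink can only see the mode 
if its provider captures it at creation time and hands it over:

```scala
import org.apache.spark.sql.{DataFrame, SQLContext}
import org.apache.spark.sql.execution.streaming.Sink
import org.apache.spark.sql.sources.StreamSinkProvider
import org.apache.spark.sql.streaming.OutputMode

// Hypothetical sink: the class name and constructor parameter are illustrative.
class ModeAwareSink(outputMode: OutputMode) extends Sink {
  // addBatch is the only write hook on a DSv1 Sink. Its signature carries no
  // output mode, so the mode is visible here only because we captured it.
  override def addBatch(batchId: Long, data: DataFrame): Unit = {
    val rows = data.collect()
    println(s"batch=$batchId mode=$outputMode rows=${rows.length}")
  }
}

// Hypothetical provider: forwards the output mode it receives to the sink.
class ModeAwareSinkProvider extends StreamSinkProvider {
  override def createSink(
      sqlContext: SQLContext,
      parameters: Map[String, String],
      partitionColumns: Seq[String],
      outputMode: OutputMode): Sink =
    new ModeAwareSink(outputMode)
}
```

   A sink that does not do this forwarding treats every batch the same way, 
which is why update/complete ends up behaving like append.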
   
   I already raised a related discussion on the dev mailing list months ago, but 
got no response. I wish we wouldn't ignore discussion threads on the dev mailing 
list.
   
http://apache-spark-developers-list.1001551.n3.nabble.com/Output-mode-in-Structured-Streaming-and-DSv1-sink-DSv2-table-tt30216.html#a30239
   
   > Users are LAAAAZZY. As a developer, I would also prefer that people 
explicitly create their tables first, but plenty of users complain about that 
workflow.
   
   I agree with this, but users don't always want a table to be created if it 
doesn't exist. That's the reason there's `append` in save mode, and we don't 
have an equivalent in the new approach. Yes, users are lazy, and for that very 
reason they don't always want to assume a new table could be created and 
provide all the information needed for table creation. If the table exists, the 
provided options are meaningless and just a burden (and also quite confusing if 
the existing table has different options).
   
   > Can't we parse the string partitions as expressions?
   
   The DSv1 interface doesn't allow providing an expression for partitioning. 
Please refer to the definition of DataSource. Parsing and interpreting the 
string partition columns would be entirely the data source's responsibility. 
This is quite different from what we do for DSv2. In other words, we can't 
fully leverage the create-table functionality of DSv2 in interfaces based on 
DSv1, such as DataStreamWriter.
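
   As a rough illustration of the difference, assuming Spark 3.x on the 
classpath: the DSv1 path carries partitioning as plain column-name strings, 
while DSv2 table creation takes `Transform` expressions. The object name and 
the column names below are made up for the example:

```scala
import org.apache.spark.sql.connector.expressions.{Expressions, Transform}

// Hypothetical holder object; the column names are illustrative only.
object PartitioningShapes {
  // DSv1 path: only column names, passed around as strings. Any richer
  // meaning would have to be parsed by the data source itself.
  val v1PartitionColumns: Seq[String] = Seq("event_date", "country")

  // DSv2 path: partitioning is expressed as transforms over columns.
  val v2Partitioning: Array[Transform] = Array(
    Expressions.days("event_time"),
    Expressions.bucket(16, "user_id"),
    Expressions.identity("country"))
}
```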

