HeartSaVioR edited a comment on pull request #30521:
URL: https://github.com/apache/spark/pull/30521#issuecomment-738508350


   >> So their only workaround is ensuring create table is made all the time, 
which is not a thing I can agree with.
   
   > For such users, they still need to ensure creating table is made all the 
time if we don't create it for them. Right? And we would ask other users to 
create table all the time as well.
   
   I'm not sure why you still treat these users as "second class". We're 
affecting ecosystem data sources that are happily adopting the shiny new DSv2 
API through their own efforts (even though it still feels like beta testing). 
To repeat, the DSv1 streaming writer interface sits "behind" a private package, 
and the ecosystem is "not" encouraged to use it. The root problem is that the 
file table is still v1, nothing else.
   
   I tend to worry about "surprise" moments when they are predictable. I worry 
even more about a table being created by mistake without the proper options. 
That's not an acceptable trade-off for usability. I have even stepped back on 
the default behavior, if we really want to keep only one method, as long as end 
users can still opt out of creating the table. Is it too hard for us to do that?
   
   > Does creating the table by default block any future improvement we can do? 
It sounds to me that you agree that DataStreamWriter is lacking on table 
support, and we need to add a new DataStreamWriterV2 similar to 
DataFrameWriterV2. I don't see this behavior would make any work we may do in 
future harder.
   
   I see we're trying to think in terms of v1 vs v2, but at any moment we must 
ensure the interface is reasonable for both v1 and v2. That said, if 
DataStreamWriterV2 is marked as a blocker for 3.1.0, I'm totally OK with it (as 
the gap must be addressed before releasing 3.1.0), but we're not expecting 
that, right? Please correct me if I'm mistaken - I'm happy to deal with 
SPARK-33638 during the QA period.
   
   If you strongly insist on having only one method that creates the table by 
default, with no workaround other than end users creating the table themselves, 
let's just disable automatic table creation for v2 tables (throw an exception 
if the table doesn't exist) and mark SPARK-33638 as a blocker for 3.2.0.
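   
   To make that fallback concrete, here is a rough sketch in plain Python. It 
is not Spark code: the dict catalog, `resolve_sink_table`, and 
`TableNotFoundError` are all hypothetical names used only for illustration, and 
keeping create-by-default for non-v2 targets is my assumption about the rest of 
the behavior.

```python
# Plain-Python model of the fallback described above; this is NOT Spark code.
# Every name here (TableNotFoundError, resolve_sink_table, the dict catalog)
# is hypothetical and exists only for illustration.


class TableNotFoundError(Exception):
    """Raised when a streaming write targets a v2 table that does not exist."""


def resolve_sink_table(catalog, name, is_v2):
    """Return the table entry a streaming write should target.

    catalog: dict mapping table name -> table description (stand-in for a
             real catalog)
    is_v2:   True when the target would be handled as a v2 table
    """
    if name in catalog:
        # Existing tables are used as-is, whichever API version they belong to.
        return catalog[name]
    if is_v2:
        # Fallback: never create a v2 table implicitly, so a table cannot
        # appear by mistake without the options its owner wanted.
        raise TableNotFoundError(
            f"Table '{name}' does not exist; create it before starting the query"
        )
    # Assumption: non-v2 targets keep the create-by-default behavior.
    catalog[name] = {"name": name, "auto_created": True}
    return catalog[name]
```

   With this shape, a missing v2 table fails fast instead of being created with 
default options, which is the "surprise" I'd like to avoid.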




