brkyvz commented on pull request #30521:
URL: https://github.com/apache/spark/pull/30521#issuecomment-736948942


   > In streaming query we only do append - there's no other options for 
creating table if we handle it.
   
   That's not necessarily true. You can also perform complete mode writes, which 
overwrite the entire data every time. I'd argue that the semantics for 
`toTable` should be: resolve the table if it exists and start writing to it. 
If it doesn't exist, create the table first, and then start writing to it.
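
To make the proposed semantics concrete, here is a toy sketch in plain Python. It is not Spark code: `Catalog`, `Table`, `resolve_or_create`, and `write_batch` are hypothetical names that only model the behaviour I'm describing (resolve if present, create if absent, then append or overwrite depending on the output mode).

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Hypothetical stand-in for a catalog table; rows are kept in memory."""
    name: str
    rows: list = field(default_factory=list)


class Catalog:
    """Hypothetical in-memory catalog, used only to illustrate the semantics."""

    def __init__(self):
        self._tables = {}

    def table_exists(self, name):
        return name in self._tables

    def create_table(self, name):
        self._tables[name] = Table(name)

    def load_table(self, name):
        return self._tables[name]


def resolve_or_create(catalog, name):
    """Return the existing table, creating it first if it is missing."""
    if not catalog.table_exists(name):
        catalog.create_table(name)
    return catalog.load_table(name)


def write_batch(table, batch, output_mode):
    """Apply one micro-batch: append adds rows, complete replaces them all."""
    if output_mode == "append":
        table.rows.extend(batch)
    elif output_mode == "complete":
        table.rows[:] = batch
    else:
        raise ValueError(f"unsupported output mode: {output_mode}")
```

The point of the sketch is that table creation is a one-time step before the first write, and that the write path already has to distinguish append from complete mode regardless of who created the table.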
   
   >  I don't think it's a difficult requirement for end users to create table 
in prior, hence I'd in favor of dealing with existing table only. 
   
   Users are LAAAAZZY. As a developer, I would also prefer that people 
explicitly create their tables first, but plenty of users complain about that 
workflow. Sometimes they just want to do some debugging or get started 
immediately. They don't want to go through 2-3 steps to create the table first.
   
   > No support for Transform
   
   Can't we parse the string partitions as expressions? If they write a 
function call such as `years(eventTime)` as a partition column, that would mean 
they want to partition by an expression. I haven't seen any user who has 
`(` or `)` in their partition column names - e.g. Delta Lake forbids it and no 
one has complained.
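
A rough sketch of the idea, again in plain Python rather than Spark's actual parser. `PartitionSpec` and `parse_partition` are hypothetical names, and the sketch assumes only single-argument calls such as `years(eventTime)`; anything else containing parentheses is rejected rather than guessed at.

```python
import re
from typing import NamedTuple, Optional


class PartitionSpec(NamedTuple):
    """Hypothetical result type: transform is None for a plain column."""
    transform: Optional[str]
    column: str


# Matches a single-argument call like "years(eventTime)".
_CALL = re.compile(r"^\s*([A-Za-z_]\w*)\s*\(\s*([A-Za-z_][\w.]*)\s*\)\s*$")


def parse_partition(text):
    """Treat "f(col)" as a transform on col, and anything else as a column name."""
    match = _CALL.match(text)
    if match:
        return PartitionSpec(match.group(1), match.group(2))
    if "(" in text or ")" in text:
        # Parentheses that do not form a simple call are treated as an error,
        # on the assumption that column names never contain them.
        raise ValueError(f"cannot parse partition expression: {text!r}")
    return PartitionSpec(None, text.strip())
```

With this, `parse_partition("years(eventTime)")` yields a transform named `years` on column `eventTime`, while `parse_partition("country")` stays a plain column. The ambiguity only exists if column names may contain parentheses, which is the case I'm arguing we can rule out.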
   

