brkyvz commented on pull request #30521: URL: https://github.com/apache/spark/pull/30521#issuecomment-736948942
> In streaming query we only do append - there's no other options for creating table if we handle it. That's not necessarily true. You can perform complete mode writes, which overwrites the entire data every time. I'd argue that the semantics for `toTable` would be to resolve the table if it exists, and start writing to it. If it doesn't exist, the table should be created first, then the writes should start happening on the table. > I don't think it's a difficult requirement for end users to create table in prior, hence I'd in favor of dealing with existing table only. Users are LAAAAZZY. As a developer, I would also prefer that people explicitly create their tables first, but plenty of users complain about that workflow. Sometimes they want to perform some debugging, get started immediately. They don't want to perform 2-3 steps to create the table first. > No support for Transform Can't we parse the string partitions as expressions? If they're making a function call such as `years(eventTime)` as a partition column that would mean that they want to partition by expressions. I haven't seen any user who has `(`, `)` in their partition column names - e.g. it's forbidden in Delta Lake to do so and no one has complained. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
