zsxwing commented on pull request #30521: URL: https://github.com/apache/spark/pull/30521#issuecomment-738493120
There are three types of users:

- Users who want the table created automatically if it does not exist, and for whom the default table we create here is sufficient. Their code will be shorter if we create the table automatically.
- Users who want the table created automatically if it does not exist, but find that DataStreamWriter doesn't have APIs to create the specific types of v2 tables they need. They need to create the table themselves no matter which behavior we decide on.
- Users who don't want the table created if it does not exist. They need to create the table themselves no matter which behavior we decide on.

IMO, since creating the table automatically lets the first type of users write less code and doesn't change the code for the second and third types, why not do this? In addition, as I already pointed out, this is consistent with the existing Kafka sink and file stream sink.

Regarding the implementation, I totally understand the limitations you pointed out. But as I said above, I don't see how this behavior would make any difference for the future work we will do. IMO, these limitations are not good reasons to block this.
