HeartSaVioR edited a comment on pull request #30521: URL: https://github.com/apache/spark/pull/30521#issuecomment-738336115
> Per #30521 (comment), for the partition column: right now the param only takes effect in the v1 sink. How do we address the conflict between user input and the partitioning of an existing table? If we want to let the data source decide, the V2 plan doesn't carry the partition info for now. I don't know which is the best way to handle it; neither seems good.

The configuration of partition columns was added without table support. Without table support, we don't know whether the data source has such partition information or not, so we're forced to "always" provide the information, even when it's unnecessary. (I don't want this to happen again with tables; that's one of the reasons I don't like the proposal of creating a table by default. But I'm fine with it if there's a way to avoid this, as I said in "lowest bar".)

With an existing table, the table should already have the partition information, hence the configuration is useless unless we mean to create the table.

In DataFrameWriterV2, once you provide partition information or a table property, you are no longer able to append. You are forced to create or replace, which should always respect the input or simply fail. There's no confusion there.

The more I revisit DataFrameWriterV2, the more I realize how much DataStreamWriter is lacking in table support. That was OK (and shouldn't be blamed), because there was no support for table writes, but that's no longer an excuse once we are adding it.

Anyway, it would be safer to follow what we do with SaveMode.Append in DataFrameWriter.

----------------------------------------------------------------
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
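The DataFrameWriterV2 behavior described in the comment above can be sketched as follows. This is a hedged illustration against the Spark 3.x API; `spark`, `df`, and the table name are placeholders, and the key point is that `partitionedBy` narrows the builder to `CreateTableWriter`, which exposes create/replace operations but no `append()`:

```scala
import org.apache.spark.sql.functions.col

// Append to an existing table: no partitioning is specified here;
// the table's own partitioning is authoritative.
df.writeTo("catalog.db.events").append()

// Provide partitioning: the builder is now a CreateTableWriter, so the
// only valid terminal operations are create / replace / createOrReplace,
// which either respect the given partitioning or fail.
df.writeTo("catalog.db.events")
  .partitionedBy(col("date"))
  .createOrReplace()

// df.writeTo("catalog.db.events").partitionedBy(col("date")).append()
// would not compile: CreateTableWriter has no append() method.
```

The separation between "append respects the table" and "create/replace respects the input" is enforced at the type level, which is the lack of ambiguity the comment contrasts with DataStreamWriter's partition-column option.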
