cloud-fan commented on PR #53173: URL: https://github.com/apache/spark/pull/53173#issuecomment-3573522716
I think the root cause is that Delta Lake claims to be a v2 source, which means `DataFrameWriter#saveAsTable` creates v2 commands for it. However, Delta Lake used to be a v1 source and still wants to retain the old `DataFrameWriter#saveAsTable` behavior, which is slightly different from the v2 commands. I understand the intention to avoid behavior changes, and Delta used to rely on a hack to identify commands coming from `DataFrameWriter#saveAsTable`, but that hack no longer works after some reasonable changes in Spark 4.1.

Instead of reverting that reasonable change in Spark, I'm more in favor of adding a way to identify v2 commands that originate from the v1 `DataFrameWriter#saveAsTable` API. We can add a new boolean flag to `TableProvider`, something like `boolean includeV1WriteOption`. Data sources that migrated from v1 to v2 would then have a way to retain the old v1 behavior.
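
As a rough illustration only, a minimal sketch of what such a flag might look like. The method name `includeV1WriteOption`, its default value, and the idea of signaling via a write option are all assumptions drawn from this proposal; none of this exists in Spark today.

```java
// Sketch only: a hypothetical addition to the TableProvider interface,
// illustrating the proposal above. Not an existing Spark API.
public interface TableProvider {
  // ... existing TableProvider methods (inferSchema, getTable, ...) ...

  /**
   * Hypothetical flag. When a source returns true, Spark could attach a
   * marker (for example, a write option) to v2 commands created by the v1
   * DataFrameWriter#saveAsTable path, so that sources which migrated from
   * v1 to v2 can detect those commands and keep their old v1 behavior.
   */
  default boolean includeV1WriteOption() {
    return false; // assumed default: no change for existing v2 sources
  }
}

// A source that migrated from v1 to v2 (e.g. Delta Lake) would opt in:
class MyMigratedProvider implements TableProvider {
  @Override
  public boolean includeV1WriteOption() {
    return true;
  }
  // ... rest of the implementation ...
}
```

With an opt-in default of `false`, sources that never had v1 behavior are unaffected, and only migrated sources pay attention to the marker.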
