juliuszsompolski commented on PR #53173:
URL: https://github.com/apache/spark/pull/53173#issuecomment-3575494492

   I have a slight preference for 2, which is a bit more verbose, but:
   1. From the Spark perspective, it's cleaner not to pollute TableProvider 
directly with something that is a workaround for one specific migration 
case. Mixins that add optional capabilities to interfaces are the usual way 
DSv2 interfaces seem to be structured. 
   2. From the Delta perspective, Delta will cross-compile against 4.0 and 4.1 
for a while. If it's a separate mixin, it's easier to shim: I just need to add 
a shim for it in the 4.0 cross-compile, and do nothing in the 4.1 
cross-compile. After we drop 4.0, no change is needed other than dropping the 
shim. If I instead add a field to TableProvider, I need to add an intermediate 
shim subclass of TableProvider that adds this field in 4.0, and then have the 
actual DeltaDataSource extend that shim instead of TableProvider; for 4.1 I 
then also need to add an empty shim; and all of it needs cleanup after 4.0 is 
dropped.
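
   To illustrate the mixin shape from point 1, here is a minimal sketch. It 
uses stand-in types rather than Spark's real interfaces: `TableProvider` below 
is a local placeholder, and `SupportsMigrationWorkaround`, `useWorkaround()` 
and both source classes are hypothetical names invented for this example, not 
the API that the PR will add.

```java
public class MixinShimSketch {
    // Local stand-in for Spark's TableProvider; the real interface differs.
    interface TableProvider {
        String shortName();
    }

    // HYPOTHETICAL mixin: an optional capability declared outside TableProvider.
    // The name and the method are assumptions made for illustration only.
    interface SupportsMigrationWorkaround {
        boolean useWorkaround();
    }

    // A source that opts in by implementing the mixin next to TableProvider.
    static class DeltaLikeSource implements TableProvider, SupportsMigrationWorkaround {
        @Override public String shortName() { return "delta-like"; }
        @Override public boolean useWorkaround() { return true; }
    }

    // A source that does not opt in needs no changes at all.
    static class PlainSource implements TableProvider {
        @Override public String shortName() { return "plain"; }
    }

    // Engine side: the capability is detected with an instanceof check,
    // so TableProvider itself stays untouched.
    static boolean usesWorkaround(TableProvider provider) {
        return provider instanceof SupportsMigrationWorkaround
            && ((SupportsMigrationWorkaround) provider).useWorkaround();
    }

    public static void main(String[] args) {
        System.out.println(usesWorkaround(new DeltaLikeSource()));
        System.out.println(usesWorkaround(new PlainSource()));
    }
}
```

   With this shape, a build against a version that lacks the mixin only needs 
a same-named empty interface as a shim, while a build against a version that 
has it needs nothing extra.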
   
   For 4, the Delta community is actively working on developing a proper V2 
datasource. See the already closed PRs in 
https://github.com/delta-io/delta/pulls?q=is%3Apr+dsv2+is%3Aclosed. But it 
won't be ready in time for 4.1...
   
   For 5, I agree with @cloud-fan that we shouldn't be reverting reasonable 
changes to Spark. Delta, and the fact that we let that horrible hack stay there 
for 6 years, are at fault here. Because this was detected so late in the 4.1 
process, and because the behaviour change is very unfriendly to users of open 
source Spark with open source Delta (it causes a silent overwrite of metadata 
where it was not overwritten before), I'd much prefer to be allowed to move 
forward with a narrowly scoped fix.
   
   I will prepare a PR for option 2.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

