potiuk commented on issue #15933:
URL: https://github.com/apache/airflow/issues/15933#issuecomment-970278184


   > Even now we have providers that have dependencies between each other - 
Postgres and AWS (for redshift), any transfer operator between two providers 
etc. The problem that you are worried about indeed needs attention but I'm 
happy with providing the answer "keep it aligned and up to date".
   
   Just to comment on that part - those cross-provider dependencies are (I 
believe) way simpler. They use very stable (by definition) APIs that are shared 
between providers now (Hooks). That's about it. And we have already had one 
case where a breaking change required documenting it in the changelog and 
adding a versioned cross-provider dependency, with quite some communication 
overhead. 
   
   What we are talking about in Google is deep integration with a number of 
common functionalities, revolving mostly around authentication but also 
covering other shared "common" functionality, where the expectation (and the 
current state) is that if we implement something for one service, it is 
implemented for all of them at once. This is far closer to the DBAPI issues 
than to cross-provider dependencies IMHO.
   
   Just to give a realistic example of where it would create problems: an 
example of the "breaking" change I am talking about here was adding 
"impersonation chain" to GoogleBaseHook and a number of "services": 
https://github.com/apache/airflow/pull/9915.  
   
   This functionality was added across pretty much all Google services (it 
was done before we released Airflow 2 and split providers). If we had not had 
all those changes in one release, the risk of introducing problems while 
implementing it in a fully backwards-compatible way in the common part would 
have been high. 
   
   You'd either:
   
   1) make all providers depend on the new version of 'commons-2.0' (which is 
equivalent to releasing the "google provider" now)
   
   or 
   
   2) in most "split" providers you'd have to handle the case where you have 
"old commons 1.0" without impersonation and "new commons 2.0" with it (and test 
it with both versions for all providers that use them!) 
   
   Case 2) is an absolute nightmare from the code/maintenance point of view. 
You cannot rely on having the new "common" code (it has to work with the 
already released commons-1.0). So basically what you have to do is properly 
handle the "if impersonation available" code in all the providers in a similar 
way. The terrible part of it is that you have to copy the if-ed code to all 
split providers that use impersonation - because you can only rely on the 
common code that is out there in commons 1.0.
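   
   To illustrate case 2), here is a minimal Python sketch of the kind of "if 
impersonation available" shim each split provider would have to carry. 
Everything in it is hypothetical: `BaseHookV1` and `BaseHookV2` stand in for 
the base hook shipped by "commons 1.0" and "commons 2.0", and 
`make_service_hook` stands in for one split provider. None of these are real 
Airflow APIs.

```python
import inspect


# Hypothetical stand-ins for the base hook in two released "commons" versions.
class BaseHookV1:
    """commons 1.0: no impersonation support."""

    def __init__(self, gcp_conn_id="google_cloud_default"):
        self.gcp_conn_id = gcp_conn_id


class BaseHookV2:
    """commons 2.0: accepts an impersonation chain."""

    def __init__(self, gcp_conn_id="google_cloud_default", impersonation_chain=None):
        self.gcp_conn_id = gcp_conn_id
        self.impersonation_chain = impersonation_chain


def supports_impersonation(base_cls):
    """Feature-detect by looking at the constructor signature."""
    return "impersonation_chain" in inspect.signature(base_cls.__init__).parameters


def make_service_hook(base_cls):
    """Build a (hypothetical) service hook on top of whichever commons is installed."""

    class ServiceHook(base_cls):
        def __init__(self, gcp_conn_id="google_cloud_default", impersonation_chain=None):
            # This branch is the "if-ed" code that would be copied into
            # every split provider that wants to offer impersonation.
            if supports_impersonation(base_cls):
                super().__init__(
                    gcp_conn_id=gcp_conn_id,
                    impersonation_chain=impersonation_chain,
                )
            else:
                if impersonation_chain is not None:
                    raise ValueError("impersonation_chain requires commons 2.0")
                super().__init__(gcp_conn_id=gcp_conn_id)

    return ServiceHook
```

   The branch cannot live in the common package itself, because commons 1.0 
is already released and cannot be changed, so each split provider would need 
its own copy of it and would have to be tested against both base classes.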
   
   There are still a number of authentication features we do not have, and at 
some point we might want to (or need to) add them. Surely - we could make it a 
"breaking" change and then release all providers depending only on the new 
commons. Such changes do not happen often, but IMHO they are super painful. 
   
   In that case our users completely lose the benefits of using "separately 
released" providers. Users who are used to upgrading just one "split" provider 
at a time will complain about exactly this: "I am using gcs 2.0, but in order 
to use this new feature from ads 4.0 I have to upgrade to gcs 5.0, because I 
have to upgrade "commons" to 2.0 and it is not compatible with my version. I 
still want to continue using gcs 2.0 and get the feature from ads 4.0 - how 
can I do it?" 
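   
   The complaint above boils down to a simple compatibility check. Here is a 
rough sketch, assuming each provider release works with exactly one major 
version of "commons"; the table only encodes the version numbers from the 
quote, and `REQUIRES_COMMONS` and `can_install_together` are made-up names for 
illustration.

```python
# Hypothetical metadata: the "commons" major version each provider release needs.
REQUIRES_COMMONS = {
    ("gcs", "2.0"): 1,
    ("gcs", "5.0"): 2,
    ("ads", "4.0"): 2,
}


def can_install_together(*releases):
    """True if all given releases agree on a single commons major version."""
    needed = {REQUIRES_COMMONS[release] for release in releases}
    return len(needed) == 1


# The combination the user wants is not installable...
wanted = can_install_together(("gcs", "2.0"), ("ads", "4.0"))
# ...while upgrading gcs as well resolves the conflict.
upgraded = can_install_together(("gcs", "5.0"), ("ads", "4.0"))
```

   With these assumptions `wanted` is False and `upgraded` is True: only one 
"commons" version can be installed at a time, so the user is forced to upgrade 
gcs as well.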
   
   IMHO it's much better to have our users get "used" to upgrading all the 
Google stuff together rather than risk these kinds of problems. 
   
   But once again - if someone would like to take the lead on that, I am happy 
to help set it up :).



