potiuk commented on PR #26162:
URL: https://github.com/apache/airflow/pull/26162#issuecomment-1238159017

   Wrong box :).
   
   Both of the examples you mentioned, @Taragolis, are clearly AWS ones; they
need no fix as they are in the right place.
   
   They are (and should be) in AWS. In both cases the "USER" of the
authentication (AWS) has a dependency on the functionality they use. Having
such functionality behind an optional AWS <> Google dependency is perfectly
fine and it is even "blessed" in our package system: the amazon provider has a
[google] extra and the google provider has an [amazon] extra. We even have an
exception specially foreseen for that. It is not yet heavily used, but once we
split providers into separate repos I was planning to apply it consistently in
all the places that do not have it.
   
   ```
   class AirflowOptionalProviderFeatureException(AirflowException):
       """Raise by providers when imports are missing for optional provider features."""
   ```
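
   A minimal sketch of how a provider could use that exception to guard an
optional cross-provider import. This is an illustration, not the pattern
Airflow mandates: the `AirflowException` stand-in is defined locally only so
the snippet runs without Airflow installed (in a real provider both classes
come from `airflow.exceptions`), and `load_optional_feature` is a hypothetical
helper name.

```python
import importlib


class AirflowException(Exception):
    """Local stand-in for airflow.exceptions.AirflowException (assumption)."""


class AirflowOptionalProviderFeatureException(AirflowException):
    """Raise by providers when imports are missing for optional provider features."""


def load_optional_feature(module_name: str):
    """Hypothetical helper: import a module that ships with an optional extra."""
    try:
        return importlib.import_module(module_name)
    except ImportError as e:
        # Translate the missing import into the dedicated exception so callers
        # can tell "optional extra not installed" apart from a genuine bug.
        raise AirflowOptionalProviderFeatureException(e) from e
```

   With this shape, code in the amazon provider that needs the [google] extra
fails with a clear, catchable error when the extra is absent instead of a bare
`ImportError`.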
   
   Why those two cases are different from this one is easy to explain, even if
the reason is not "technical": it comes more from looking at the landscape of
our providers from the business side of things than from pure interface/API
considerations, and it is based more on the likelihood of the code being better
maintained than on anything else. The decision where to put code that sits
in between is bound more to whether there are clear stakeholders behind the
provider who mostly maintain it and are interested in having this
functionality in.
   
   It is captured in a few places already. We have [Release process for 
providers](https://github.com/apache/airflow/blob/main/README.md#release-process-for-providers)
 but we also already used the same line of thought when we decided where to 
put transfer operators. See AIP-21 [Changes in import 
paths](https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-21%3A+Changes+in+import+paths)
 - it is also touched upon in AIP-8 - [Split Provider packages for Airflow 
2.0](https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-8+Split+Providers+into+Separate+Packages+for+Airflow+2.0).
 One of the decisions we made there is that the "transfer" operators are 
put in the "target" of the transfer when there is a clear stakeholder behind 
the target. And in this case circular references are unavoidable.
   
   To sum up the line of thought in one sentence: the idea is that the code 
should be put in the "provider" where there is a stakeholder that is most 
likely to maintain the code. 
   
   And `Postgres` and `MySQL` are different. They have no clear stakeholders
behind them that are interested in maintaining those providers. Even though
they are commercial databases with companies behind them, they are "commodity"
APIs and they are not interested in maintaining services. Neither Postgres nor
MySQL is interested in adding interfacing with AWS/GCP to the provider. But
both AWS and GCP are respectively interested in allowing AWS/GCP authentication
WITH the Postgres/MySQL provider they expose in their own Services offering.
   
   This was discussed at length when AIP-21 was discussed - there are
different kinds of providers. Simply speaking, SAML/GSSAPI are "protocols",
there are also "databases" (like Postgres/MySQL), and then there are
"Services", which are a higher layer of abstraction. While "Services" can use
"Protocols" and "Databases", neither "Protocols" nor "Databases" should use
"Services" - they can only be "used" by services.
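
   The layering rule above can be sketched as a tiny dependency check. This is
only an illustration: the `PROVIDER_KIND` mapping and the `may_use` function
are hypothetical names, the classification of each entry is my reading of the
paragraph, and nothing like this is claimed to exist in Airflow.

```python
# Hypothetical classification of a few providers into the kinds named above.
PROVIDER_KIND = {
    "amazon": "service",
    "google": "service",
    "postgres": "database",
    "mysql": "database",
    "gssapi": "protocol",  # illustrative entry for a "protocol"
}


def may_use(user: str, used: str) -> bool:
    """Return False only for the direction ruled out above.

    "Services" may use "Protocols" and "Databases"; a "Protocol" or
    "Database" should not use a "Service".
    """
    lower_layers = {"protocol", "database"}
    return not (
        PROVIDER_KIND[user] in lower_layers and PROVIDER_KIND[used] == "service"
    )


print(may_use("amazon", "postgres"))  # service -> database
print(may_use("postgres", "amazon"))  # database -> service
```

   Under this reading, AWS authentication code that works with Postgres
belongs on the "amazon" side, because only that direction of dependency is
allowed.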
   
   This will be much more visible and obvious when we split providers into
separate repos. Ideally, people who are from AWS should be able to subscribe
to that single "amazon" provider repo to get notifications about all the
AWS-specific changes they are interested in. And this authentication code
clearly falls into this "bucket".
   
   The Google Federated identity code is also a very good example here - while
it is clearly GCP-related, it is the AWS people who are interested in getting
it working and making sure they can connect to GCP (for example to be able to
pull data from it etc.). They will be maintaining the code, not the Google
people.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
