potiuk commented on issue #27292:
URL: https://github.com/apache/airflow/issues/27292#issuecomment-1454858434

   > the issue with splitting the provider is mostly that no one from Google 
picked it. Once someone picks it and starts working on it we will be able to 
overcome the tech difficulties. We don't know yet how the provider will be 
split but we do know it must be done.
   
   I am not so sure. Actually, using extras might be a much simpler approach 
that I **think** would solve most of the pain of getting all the libraries 
in, without introducing the **huge** hassle of extracting common code and 
using it from multiple "google" providers. If we do split the google provider, 
then the maintenance pain we had with `common.sql` will absolutely pale in 
comparison - and there were at least 4 or 5 traps in the common code 
extraction and maintenance which were really painful to protect against and 
fix. If we find a way to solve most of the users' dependency problems with 
extras, as suggested by @r-richmond (which I **think** is actually possible), 
then I see no reason to split the provider, to be honest.
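
   To make the extras idea concrete, here is a minimal sketch of how one 
provider package could expose per-service dependency groups. The extra names, 
the grouping, and the `resolve_requirements` helper are assumptions made for 
illustration only; they are not the provider's actual packaging metadata.

```python
# Hypothetical layout: one google provider package, with heavy client
# libraries moved into optional extras. Names and groupings are illustrative.
CORE_REQUIREMENTS = ["google-auth", "google-api-core"]

EXTRAS = {
    "bigquery": ["google-cloud-bigquery"],
    "storage": ["google-cloud-storage"],
    "pubsub": ["google-cloud-pubsub"],
}


def resolve_requirements(requested_extras):
    """Return the sorted dependencies implied by the requested extras."""
    requirements = set(CORE_REQUIREMENTS)
    for extra in requested_extras:
        if extra not in EXTRAS:
            raise ValueError(f"Unknown extra: {extra}")
        requirements.update(EXTRAS[extra])
    return sorted(requirements)


if __name__ == "__main__":
    # Roughly what installing the provider with the hypothetical
    # "bigquery" extra would pull in under this layout.
    print(resolve_requirements(["bigquery"]))
```

   Under this sketch there is still a single provider version and no separate 
"common" package, so there is nothing that can drift out of sync between 
releases.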
   
   Splitting the google provider will be a massive undertaking, and if we do 
that, it will take us more than a few iterations on multiple providers to 
solve most of the teething problems that we will not foresee when splitting. 
Those problems will keep coming back for as long as the common part of the 
google provider keeps evolving - we will keep breaking older versions of the 
"specific" providers whenever we release new common code. It is all but given 
that this will happen, and we have almost no way to protect against it.
   
   Look how small the common.sql "API" surface was and how many problems we had:
   
   * https://github.com/apache/airflow/pull/25430
   * https://github.com/apache/airflow/pull/25822
   * https://github.com/apache/airflow/pull/25855
   * https://github.com/apache/airflow/pull/25939
   * https://github.com/apache/airflow/pull/26758
   * https://github.com/apache/airflow/pull/26761
   * https://github.com/apache/airflow/pull/26944
   * https://github.com/apache/airflow/pull/27599
   * https://github.com/apache/airflow/pull/27843
   * https://github.com/apache/airflow/pull/27854
   * https://github.com/apache/airflow/pull/27912
   * https://github.com/apache/airflow/pull/28744
   
   Not all of those - but most - were directly caused by the decision to 
extract common code for a number of SQL operators. And the main reason those 
errors affected users is that there is no way to test a new release of 
"common" code with all possible releases of all possible providers that use 
it. You can at most test semi-thoroughly the latest versions of the providers 
and common code together. This is what we do. That's why splitting the google 
provider is SCARY: you will have an order of magnitude more similar problems 
and we will have no way to avoid them. And even more: Google common code will 
keep evolving at a much faster rate than the common.sql code. Our problems 
with common.sql stopped the moment it stopped changing. But Google common code 
will never stop changing. So the decision about splitting the google provider 
is not as "light" as you think.
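
   The testing gap can be sketched with a little arithmetic. The release 
histories below are made up, and the sketch assumes every combination of 
common and provider versions is installable (no version pins), which is a 
simplification.

```python
from itertools import product

# Hypothetical release histories; the version numbers are invented.
COMMON_VERSIONS = ["1.0", "1.1", "1.2"]
PROVIDER_VERSIONS = {
    "bigquery": ["1.0", "1.1", "1.2", "1.3"],
    "storage": ["1.0", "1.1"],
    "pubsub": ["1.0", "1.1", "1.2"],
}


def all_combinations(common_versions, provider_versions):
    """Yield every (common version, {provider: version}) a user could install."""
    names = sorted(provider_versions)
    version_lists = [provider_versions[name] for name in names]
    for common, *versions in product(common_versions, *version_lists):
        yield common, dict(zip(names, versions))


def latest_only(common_versions, provider_versions):
    """Return the single combination that release testing realistically covers."""
    latest = {name: versions[-1] for name, versions in provider_versions.items()}
    return common_versions[-1], latest


if __name__ == "__main__":
    total = sum(1 for _ in all_combinations(COMMON_VERSIONS, PROVIDER_VERSIONS))
    print(f"installable combinations: {total}")
    print(f"tested combination: {latest_only(COMMON_VERSIONS, PROVIDER_VERSIONS)}")
```

   Every new release of the common code or of any split provider multiplies 
the first number, while the tested set stays at one combination.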
   
   And that's why I am very, very sceptical about splitting it (otherwise I 
would have done that myself a long time ago). Of course using extras does not 
solve "all" problems - but I think most. It won't solve the case where you 
would like to use one provider version for one Google service and a different 
one for another. But - to be honest - if we get to the point where someone 
needs to do that, then we have a bigger issue, and this is one of those things 
that leads to more issues than it solves. I would very strongly prefer the 
situation where users have to modify their DAGs for google if they want to 
(for example) use new features from another service. Yes, it's a bit of pain 
for them - but far, far less pain for everyone else (including them) in the 
future, when incompatibilities in the common code would cause even more 
problems.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.