potiuk commented on PR #35861:
URL: https://github.com/apache/airflow/pull/35861#issuecomment-1826815324

   I am not going to block it in this form - but - as explained in a number of 
messages - I have a huge concern about 
   
   a) making it part of official airflow tooling 
   b) making it part of our repository in general
   c) and especially about making it part of the `airflow db` command (for me 
this is a "no-go")
   
   The problem is that if any issues are found that need to be fixed, we 
will have to RELEASE fixes for them. Users are not able to modify and 
manipulate the export file itself, so any problem with the migration will come 
to us as an issue to be solved, and the users' expectation will be that we 
release a fix. Currently we do not have a mechanism to do so. We release 
providers, airflow and the helm chart, and we have a formal process for that. 
Once we make it part of any of these, users will expect us to release a .1, .2 
version or whatever in case there are any issues to be fixed. 
   
   There are many problems if we try to support this as an "official tool" 
released by the Airflow PMC (and having it in our repo will make people expect 
that).
   
   For example, what happens if a new version of the sqlite library gets 
released that we will have to support?
   
   By the time we release airflow 2.8, 2.9, our DAG objects will be different 
from those in 2.7.3. So what should we do if we make it part of airflow `db`? 
Should we still support users dumping version 2.7.3 and importing it into a 
newer version, using a version of this script that imports airflow from 2.9? 
How do we test that it still works? How do we keep up with dependencies? What 
are the exact instructions the users should follow in this case?
   
   There will be implicit expectations that this tool should evolve together 
with airflow. So eventually we will have to add --version-from and 
--version-to and all the stuff, because users will be confused. Does this 
script work with the current version of airflow? Is this script only supposed 
to be used on 2.7.3? What if I take the version of that script released in 2.9 
and run it on my "2.7.3" database? Should I expect it to work? Should I get my 
target airflow db migrated to the latest version before using the sqlite dump 
to import the data? 
   
   What happens if someone has Airflow 2.5.3 on MSSQL? Can they migrate 
without upgrading to 2.7.3? Will it still work?
   
   etc. etc. 
   
   And most importantly - should we handle issues and questions from users 
when, for whatever reason, the sqlite export stops working (for example 
because they use sqlite 9.13 (imaginary), released 2 years from now)? Should 
we release new versions of that script when sqlite 9.13 gets released? I'd 
love to avoid all those questions and issues, to be honest.
   
   That's why I am hesitant to accept it in this form and as part of the 
"airflow" repository.
   
   PROPOSAL:
   
   I am quite ok to have this as an MSSQL migration tool for 2.7.3 -> 
Postgres/Sqlite 2.7.3 ONLY. Nothing else. No generic migration tool. We should 
limit it to specific versions of sqlite, airflow and whatever external 
dependencies are needed; ideally we should have a requirements.txt with fixed 
versions of everything that is needed to run it. AND we should have it outside 
of the airflow repo.
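   
   To illustrate what "limit it to specific versions" could look like in 
practice, here is a minimal sketch of a guard that such a script might run 
before doing anything else. It is not taken from the PR: the `PINNED` table, 
the function names and the error wording are all hypothetical, and only the 
2.7.3 Airflow pin comes from the proposal above.

```python
import sqlite3
from importlib import metadata

# Hypothetical pin table. Only the Airflow version comes from the proposal;
# a real tool would list every dependency from its requirements.txt.
PINNED = {"apache-airflow": "2.7.3"}


def installed_versions(names):
    """Return {distribution name: installed version, or None if missing}."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def find_mismatches(pinned, installed):
    """List readable problems; an empty list means every pin matches."""
    problems = []
    for name, wanted in pinned.items():
        got = installed.get(name)
        if got != wanted:
            problems.append(f"{name}: need {wanted}, found {got or 'nothing'}")
    return problems


def main():
    """Refuse to run unless the environment matches the pins exactly."""
    problems = find_mismatches(PINNED, installed_versions(PINNED))
    if problems:
        raise SystemExit("Refusing to run:\n" + "\n".join(problems))
    # The SQLite library version is reported for bug reports, not pinned here.
    print("Versions match; SQLite library is", sqlite3.sqlite_version)
```

   Calling `main()` at the top of the migration script would make it fail 
fast on anything other than the pinned environment, which is one way to keep 
the "2.7.3 ONLY" promise explicit instead of relying on documentation alone.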
   
   It's very easy (I can do it in minutes) to create a separate repository 
(`apache/airflow-mssql-2-7-3-migration` for example) where we could put ONLY 
this script and all the documentation/requirements.txt files and make it 
non-released. Just keep it there, separately from Airflow, fixing all the 
versions of everything and making it very clear that this is an MSSQL 
migration tool only for version 2.7.3. We should NOT RELEASE IT - just point 
people to that separate repository.
   
   I think then I will even be ok with using a sqlite db as the intermediary 
form, rather than a text file.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
