potiuk commented on PR #35861: URL: https://github.com/apache/airflow/pull/35861#issuecomment-1826815324
I am not going to block it in this form, but, as explained in a number of messages, I have a huge concern about a) making it part of official Airflow tooling, b) making it part of our repository in general, and c) especially making it part of the `airflow db` command (for me this is a "no-go").

The problem is that if any issues are found that need to be fixed, we will have to RELEASE fixes for them. Users are not able to modify and manipulate the export file itself, so any problem with the migration will come to us as an issue to be solved, and the user's expectation will be that we release a fix. Currently we do not have a mechanism to do so. We release providers, Airflow and the Helm chart, and we have a formal process for that. Once we make it part of any of these, users will expect us to release a .1, .2 version or whatever in case there are any issues to be fixed.

There are many problems if we try to support this as an "official tool" released by the Airflow PMC (and having it in our repo will make people expect that). For example, what happens if a new version of the sqlite library gets released and we have to support it? By the time we release Airflow 2.8 or 2.9, our DAG objects will be different than in 2.7.3. So what should we do if we make it part of `airflow db`? Should we still support users dumping version 2.7.3 and importing it with a version of this script that imports airflow from 2.9? How do we test that it still works? How do we keep up with dependencies? What are the exact instructions users should follow in this case?

There will be implicit expectations that this tool should evolve together with Airflow. So eventually we will have to add `--version-from` and `--version-to` and all the stuff, because users will be confused. Is this script working with the current version of Airflow? Is this script only supposed to be used on 2.7.3? What if I take the version of that script released in 2.9 and run it on my "2.7.3" database?
Should I expect it to work? Should I get my target Airflow db migrated to the latest version before using the sqlite dump to import the data? What happens if someone has a 2.5.3 version of Airflow on MSSQL? Can they migrate without upgrading to 2.7.3? Will it still work? etc. etc.

And most importantly: should we handle issues and questions from users when, for whatever reason, the sqlite export stops working (for example because they use sqlite 9.13 (imaginary) released 2 years from now)? Should we release new versions of that script when sqlite 9.13 gets released? I'd love to avoid all those questions and issues, to be honest. That's why I am hesitant to get it in in this form and as part of the "airflow" repository.

PROPOSAL: I am quite ok to have this as an MSSQL migration tool for 2.7.3 -> Postgres/Sqlite 2.7.3 ONLY. Nothing else. No generic migration tool. We should limit it to specific versions of sqlite, airflow and whatever external dependency is needed; ideally we should have a requirements.txt with fixed versions of everything that is needed to run it. AND we should have it outside of the airflow repo.

It's very easy (I can do it in minutes) to create a separate repository (`apache/airflow-mssql-2-7-3-migration` for example) where we could put ONLY this script and all the documentation/requirements.txt files, and make it non-released. Just keep it there, separately from Airflow, fixing all the versions of everything and making it very clear that this is an MSSQL migration tool only for version 2.7.3. We should NOT RELEASE IT, just point people to that separate repository.

I think then I will even be ok with using a sqlite db as the intermediary form instead of a text file.
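
To illustrate what "fixed versions of everything" could look like at runtime, here is a minimal sketch of a guard that refuses to run outside the pinned environment. It is not part of the PR: the names `PINNED`, `collect_installed`, `find_version_mismatches` and `ensure_supported` are hypothetical, and only the Airflow pin (2.7.3) comes from the proposal above; a real tool would also pin the tested sqlite version.

```python
import sqlite3
from importlib import metadata

# Hypothetical pin table. Only the Airflow version comes from the proposal;
# add a "sqlite" entry with whatever version the script was actually tested on.
PINNED = {"apache-airflow": "2.7.3"}


def collect_installed(packages):
    """Look up installed distribution versions plus the linked sqlite library version."""
    installed = {}
    for name in packages:
        try:
            installed[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed[name] = None  # not installed at all
    # The sqlite library is not a Python distribution, so read it from the stdlib module.
    installed["sqlite"] = sqlite3.sqlite_version
    return installed


def find_version_mismatches(installed, pinned=PINNED):
    """Return one message per pinned name whose installed version differs from the pin."""
    problems = []
    for name, wanted in pinned.items():
        found = installed.get(name)
        if found != wanted:
            problems.append(f"{name}: expected {wanted}, found {found}")
    return problems


def ensure_supported(installed, pinned=PINNED):
    """Exit before touching any database if the environment is not the pinned one."""
    problems = find_version_mismatches(installed, pinned)
    if problems:
        raise SystemExit("Unsupported environment:\n" + "\n".join(problems))
```

A migration script could call `ensure_supported(collect_installed(PINNED))` as its first step, so that running it against anything other than the pinned versions fails loudly instead of producing a questionable dump.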
