vitaly-krugl opened a new issue, #25209:
URL: https://github.com/apache/airflow/issues/25209

   ### Apache Airflow version
   
   2.3.2
   
   ### What happened
   
   Airflow uses unsafe practice in the implementation of multiple migration 
version scripts, including some recent ones that are guaranteed to fail in the 
future. Multiple migration version scripts import from Airflow itself, 
including imports of models.
   
   The problem with this approach is that while the logic implemented by those 
imports at the time when the version script is created works fine, this may no 
longer be the case some number of releases later, because the implementation of 
those Airflow imports will change, so that when those functions/methods/classes 
are executed in the future (when migrating from an older version of airflow to 
a newer one), the new implementation will no longer match the state of the  
user's system which is being upgraded.
   
   For example, if an older migration version script attempts to load a Model 
from database using the code of the target - newer - version of Airflow code, 
the fetching of the Model will if the new model class contains columns that 
didn't yet exist in the version of the system that is being upgraded by the 
user.
   
   This unsafe practice is used in many version scripts, including some of the 
most recent ones:
   
   ```
   $ git grep  import | grep airflow | grep model
   db_types.py:    from airflow.models.base import StringID
   env.py:from airflow import models, settings
   versions/0023_1_8_2_add_max_tries_column_to_task_instance.py:from 
airflow.models import DagBag
   versions/0054_1_10_10_add_dag_code_table.py:from airflow.models.dagcode 
import DagCode
   versions/0061_2_0_0_increase_length_of_pool_name.py:from airflow.models.base 
import COLLATION_ARGS
   versions/0064_2_0_0_add_unique_constraint_to_conn_id.py:from 
airflow.models.base import COLLATION_ARGS
   versions/0073_2_0_0_prefix_dag_permissions.py:from 
airflow.www.fab_security.sqla.models import Action, Permission, Resource
   
versions/0092_2_2_0_add_data_interval_start_end_to_dagmodel_and_dagrun.py:from 
airflow.migrations.db_types import TIMESTAMP
   versions/0099_2_3_0_add_task_log_filename_template_model.py:from 
airflow.migrations.utils import disable_sqlite_fkeys
   versions/0099_2_3_0_add_task_log_filename_template_model.py:from 
airflow.utils.sqlalchemy import UtcDateTime
   versions/0100_2_3_0_add_taskmap_and_map_id_on_taskinstance.py:from 
airflow.models.base import StringID
   ```
   
   Please note that this issue is not purely theoretical! Here is an actual 
example from a production system: #25075. Please don't invalidate the current 
ticket due to the example being from an older version of Airflow because the 
same error-prone practice is found in more recent migration version scripts, so 
this type of failure is guaranteed to happen again.
   
   ### What you think should happen instead
   
   The implementations of migration version scripts MUST NOT make use Airflow 
functions/methods/classes - such as Models - whose implementations could change 
in the future (as it did in the reported issue #25075) - such as columns having 
been renamed/added/removed in more recent versions - resulting in guaranteed 
database migration failures in the future.
   
   Also, to ensure robust database migration, Airflow maintainers - at a 
minimum - need to automate testing of migrations from various versions starting 
with a database where all tables are populated. Such test code needs to test 
migrations/upgrades starting from tables populated by different versions of 
airflow to ensure that migrations are safe and reliable.
   
   
   
   ### How to reproduce
   
   See issue #25075 as an example of reproducing. Please don't invalidate the 
current ticket due to the example being from an older version of Airflow 
because the same error-prone practice is found in more recent migration version 
scripts, so this type of failure is guaranteed to happen again.
   
   ### Operating System
   
   OSX and Linux
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-ftp==2.1.2
   apache-airflow-providers-http==2.1.2
   apache-airflow-providers-imap==2.2.3
   apache-airflow-providers-sqlite==2.1.3
   
   ### Deployment
   
   Virtualenv installation
   
   ### Deployment details
   
   _No response_
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to