[ https://issues.apache.org/jira/browse/AIRFLOW-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17129759#comment-17129759 ]
ASF subversion and git services commented on AIRFLOW-3973:
----------------------------------------------------------

Commit 5b48a5394ecf5aa1f2b50a00807e6149ade21968 in airflow's branch refs/heads/v1-10-stable from Elliott Shugerman
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=5b48a53 ]

[AIRFLOW-3973] Commit after each alembic migration (#4797)

If `Variable`s are used in DAGs, and Postgres is used for the
internal database, a fresh `$ airflow initdb` (or `$ airflow resetdb`)
spams the logs with error messages (but does not fail).

This commit corrects this by running each migration in a
separate transaction.

Co-authored-by: Elliott Shugerman <eeshuger...@medianewsgroup.com>
(cherry picked from commit ea95e9c7236969acc807c65de0f12633d04753a0)


> `airflow initdb` logs errors when `Variable` is used in DAGs and Postgres is used for the internal database
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-3973
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3973
>             Project: Apache Airflow
>          Issue Type: Bug
>            Reporter: Elliott Shugerman
>            Assignee: Elliott Shugerman
>            Priority: Minor
>             Fix For: 2.0.0
>
>
> h2. Notes:
>  * This does not occur if the database is already initialized. If it is, run
> `resetdb` instead to observe the bug.
>  * This does not occur with the default SQLite database.
>
> h2. Example
> {{ERROR [airflow.models.DagBag] Failed to import: /home/elliott/clean-airflow/dags/dag.py
> Traceback (most recent call last):
>   File "/home/elliott/.virtualenvs/airflow/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1236, in _execute_context
>     cursor, statement, parameters, context
>   File "/home/elliott/.virtualenvs/airflow/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 536, in do_execute
>     cursor.execute(statement, parameters)
> psycopg2.ProgrammingError: relation "variable" does not exist
> LINE 2: FROM variable}}
>
> h2. Explanation
> The first thing {{airflow initdb}} does is run the Alembic migrations. All
> migrations are run in one transaction. Most tables, including the
> {{variable}} table, are defined in the initial migration. A
> [later migration|https://github.com/apache/airflow/blob/master/airflow/migrations/versions/cc1e65623dc7_add_max_tries_column_to_task_instance.py]
> imports and initializes {{models.DagBag}}. Upon initialization, {{DagBag}}
> calls its {{collect_dags}} method, which scans the DAGs directory and
> attempts to load all DAGs it finds. When it loads a DAG that uses a
> {{Variable}}, it will query the database to see if that {{Variable}} is
> defined in the {{variable}} table. It's not clear to me how exactly the
> connection for that query is created, but I think it is apparent that it does
> _not_ use the same transaction that is used to run the migrations. Since the
> migrations are not yet complete, and all migrations are run in one
> transaction, the migration that creates the {{variable}} table has not yet
> been committed, and therefore the table does not exist to any other
> connection/transaction. This raises {{ProgrammingError}}, which is caught and
> logged by {{collect_dags}}.
>
> h2. Proposed Solution
> Run each Alembic migration in its own transaction. I will open a pull request
> which accomplishes this shortly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
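
The visibility problem described under "Explanation" and the effect of the proposed fix can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration: it uses neither Airflow nor Alembic, the two "migrations" and the helper names (`other_connection_sees`, `second_migration_sees_variable`, `toy_second`) are hypothetical, and SQLite from the standard library stands in for Postgres only so the sketch runs without a server. Explicit `BEGIN`/`COMMIT` is used so that a table created in an open transaction stays invisible to a second connection until the commit. It does not reproduce the Airflow bug itself, which the report says does not occur with SQLite.

```python
import os
import sqlite3
import tempfile


def other_connection_sees(db_path, table):
    """Query `table` from a brand-new connection, as a separate session would."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(f"SELECT 1 FROM {table}")
        return True
    except sqlite3.OperationalError as exc:
        if "no such table" not in str(exc):
            raise  # anything else (e.g. a lock timeout) is a real failure
        return False
    finally:
        conn.close()


def second_migration_sees_variable(commit_each):
    """Run two toy migrations; report what the second one's outside query sees."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "toy.db")
        # isolation_level=None: transactions are opened and closed explicitly.
        migrator = sqlite3.connect(db_path, isolation_level=None)
        try:
            migrator.execute("BEGIN")
            # Toy migration 1: creates the table (stand-in for the initial migration).
            migrator.execute("CREATE TABLE variable (key TEXT, val TEXT)")
            if commit_each:
                # Proposed fix: commit, then start a new transaction.
                migrator.execute("COMMIT")
                migrator.execute("BEGIN")
            # Toy migration 2: triggers a query on a *different* connection,
            # the way loading a DAG that uses a Variable would.
            seen = other_connection_sees(db_path, "variable")
            migrator.execute("CREATE TABLE toy_second (id INTEGER)")
            migrator.execute("COMMIT")
        finally:
            migrator.close()
        return seen


if __name__ == "__main__":
    print("one transaction for all migrations:",
          second_migration_sees_variable(commit_each=False))
    print("commit after each migration:",
          second_migration_sees_variable(commit_each=True))
```

With a single transaction the outside query should not find the table; with a commit after each step it should. In a real Alembic `env.py`, the documented way to get one transaction per migration is to pass `transaction_per_migration=True` to `context.configure()`. Whether the commit referenced above does exactly that is not shown in this message.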