[ https://issues.apache.org/jira/browse/AIRFLOW-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140716#comment-17140716 ]
ASF subversion and git services commented on AIRFLOW-3973:
----------------------------------------------------------
Commit 5bc50183e0934f7368d9cd991074b2b581114395 in airflow's branch
refs/heads/v1-10-test from Elliott Shugerman
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=5bc5018 ]
[AIRFLOW-3973] Commit after each alembic migration (#4797)
If `Variable`s are used in DAGs, and Postgres is used for the internal
database, a fresh `$ airflow initdb` (or `$ airflow resetdb`) spams the
logs with error messages (but does not fail).
This commit corrects this by running each migration in a separate
transaction.
Co-authored-by: Elliott Shugerman <[email protected]>
(cherry picked from commit ea95e9c7236969acc807c65de0f12633d04753a0)
(cherry picked from commit 5b48a5394ecf5aa1f2b50a00807e6149ade21968)
> `airflow initdb` logs errors when `Variable` is used in DAGs and Postgres is
> used for the internal database
> -----------------------------------------------------------------------------------------------------------
>
> Key: AIRFLOW-3973
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3973
> Project: Apache Airflow
> Issue Type: Bug
> Reporter: Elliott Shugerman
> Assignee: Elliott Shugerman
> Priority: Minor
> Fix For: 2.0.0
>
>
> h2. Notes:
> * This does not occur if the database is already initialized. If it is, run
> `resetdb` instead to observe the bug.
> * This does not occur with the default SQLite database.
> h2. Example
> {{ERROR [airflow.models.DagBag] Failed to import: /home/elliott/clean-airflow/dags/dag.py
> Traceback (most recent call last):
>   File "/home/elliott/.virtualenvs/airflow/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1236, in _execute_context
>     cursor, statement, parameters, context
>   File "/home/elliott/.virtualenvs/airflow/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 536, in do_execute
>     cursor.execute(statement, parameters)
> psycopg2.ProgrammingError: relation "variable" does not exist
> LINE 2: FROM variable}}
> h2. Explanation
> The first thing {{airflow initdb}} does is run the Alembic migrations. All
> migrations are run in one transaction. Most tables, including the
> {{variable}} table, are defined in the initial migration. A [later
> migration|https://github.com/apache/airflow/blob/master/airflow/migrations/versions/cc1e65623dc7_add_max_tries_column_to_task_instance.py]
> imports and initializes {{models.DagBag}}. Upon initialization, {{DagBag}}
> calls its {{collect_dags}} method, which scans the DAGs directory and
> attempts to load all DAGs it finds. When it loads a DAG that uses a
> {{Variable}}, it will query the database to see if that {{Variable}} is
> defined in the {{variable}} table. It's not clear to me how exactly the
> connection for that query is created, but I think it is apparent that it does
> _not_ use the same transaction that is used to run the migrations. Since the
> migrations are not yet complete, and all migrations are run in one
> transaction, the migration that creates the {{variable}} table has not yet
> been committed, and therefore the table is not visible to any other
> connection or transaction. The query raises {{ProgrammingError}}, which is
> caught and logged by {{collect_dags}}.
>
> h2. Proposed Solution
> Run each Alembic migration in its own transaction. I will open a pull request
> which accomplishes this shortly.
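
The visibility problem described in the explanation can be sketched with two
connections to the same database. This is only an illustration, not a
reproduction of the Airflow bug: it uses the standard-library sqlite3 module
with explicit BEGIN/COMMIT purely because it needs no server, whereas the
issue is reported against Postgres. The names `migrator`, `dag_loader`,
`table_visible` and `run_demo` are hypothetical and do not appear in Airflow.

```python
import os
import sqlite3
import tempfile


def table_visible(conn, table):
    """Return True if `table` can be queried on this connection."""
    try:
        conn.execute(f"SELECT 1 FROM {table}")
        return True
    except sqlite3.OperationalError:
        # SQLite reports "no such table" this way; with Postgres the
        # equivalent failure is the psycopg2.ProgrammingError in the issue.
        return False


def run_demo():
    """Show that a table created in an open transaction is invisible
    to a second connection until the transaction commits."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    # isolation_level=None turns off the module's implicit transaction
    # handling, so the BEGIN and COMMIT below are the only boundaries.
    migrator = sqlite3.connect(path, isolation_level=None)    # runs "migrations"
    dag_loader = sqlite3.connect(path, isolation_level=None)  # loads "DAGs"
    try:
        migrator.execute("BEGIN")
        migrator.execute("CREATE TABLE variable (key TEXT, val TEXT)")
        # A later step in the same uncommitted transaction triggers a
        # lookup on a different connection.
        seen_before_commit = table_visible(dag_loader, "variable")
        # Committing after the table-creating step makes it visible.
        migrator.execute("COMMIT")
        seen_after_commit = table_visible(dag_loader, "variable")
    finally:
        migrator.close()
        dag_loader.close()
    return seen_before_commit, seen_after_commit


if __name__ == "__main__":
    print(run_demo())
```

Committing after each step, as the proposed solution suggests, corresponds to
issuing the COMMIT before the second connection runs its query. Alembic offers
a `transaction_per_migration` option on `context.configure()` for this
purpose; whether the pull request uses that option is not stated in this
message.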