[
https://issues.apache.org/jira/browse/AIRFLOW-6609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17021345#comment-17021345
]
Chris Schmautz commented on AIRFLOW-6609:
-----------------------------------------
Either checking for the table's existence before the addition, or dropping the
table ahead of the addition, mitigates the error.
I'm not sure why the error came up - I ran 'airflow resetdb' numerous times at
the lower revision with no effect. I was able to step incrementally through
every revision until the 1.10.6 to 1.10.7 migration specifically.
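The drop-ahead mitigation can be sketched roughly as below. This is an illustrative helper, not Airflow code - the function name is my own, and the connection URL in the usage comment is a placeholder for your metadata database:

```python
# Hedged sketch of the drop-ahead workaround: remove a leftover
# serialized_dag table so the d38e04c12aa2 migration can recreate it.
# The helper name is illustrative, not part of Airflow.
from sqlalchemy import create_engine, inspect, text

def drop_leftover_serialized_dag(engine):
    """Drop serialized_dag only if it already exists; return True if dropped."""
    if "serialized_dag" in inspect(engine).get_table_names():
        with engine.begin() as conn:
            conn.execute(text("DROP TABLE serialized_dag"))
        return True
    return False

# Usage (URL is a placeholder for the Airflow metadata DB):
# engine = create_engine("postgresql+psycopg2://airflow:airflow@localhost:5432/airflow")
# drop_leftover_serialized_dag(engine)  # then re-run `airflow upgradedb`
```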
> Airflow upgradedb fails serialized_dag table add on revision id d38e04c12aa2
> ----------------------------------------------------------------------------
>
> Key: AIRFLOW-6609
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6609
> Project: Apache Airflow
> Issue Type: Bug
> Components: database
> Affects Versions: 1.10.7
> Reporter: Chris Schmautz
> Priority: Major
> Labels: database, postgres
>
> We're attempting an upgrade from 1.10.3 to 1.10.7 to use some of the great
> features available in later revisions; however, the upgrade from 1.10.6 to
> 1.10.7 is causing some heartburn.
> +Runtime environment:+
> - Docker containers for each runtime segment (webserver, scheduler, flower,
> postgres, redis, worker)
> - Using CeleryExecutor queued with Redis
> - Using Postgres backend
>
> +Steps to reproduce:+
> 1. Build base images for each version of Airflow between 1.10.3 and 1.10.7
> (if you want the full regression we performed)
> 2. Run 'airflow initdb' on revision 1.10.3
> 3. Start the containers, run some DAGs, produce metadata
> 4. Swap the base image from the 1.10.3 base to the 1.10.4 base
> 5. Run 'airflow upgradedb'
> 6. Validate success
> n. Repeat for each successive version; stepping from the 1.10.6 revision up
> to 1.10.7 produces the error below
>
> {code:java}
> INFO [alembic.runtime.migration] Running upgrade 6e96a59344a4 -> d38e04c12aa2, add serialized_dag table
> Revision ID: d38e04c12aa2
> Revises: 6e96a59344a4
> Create Date: 2019-08-01 14:39:35.616417
> Traceback (most recent call last):
>   File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1246, in _execute_context
>     cursor, statement, parameters, context
>   File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 581, in do_execute
>     cursor.execute(statement, parameters)
> psycopg2.errors.DuplicateTable: relation "serialized_dag" already exists
>
> The above exception was the direct cause of the following exception:
>
> Traceback (most recent call last):
>   File "/opt/anaconda/miniconda3/envs/airflow/bin/airflow", line 37, in <module>
>     args.func(args)
>   File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/airflow/utils/cli.py", line 75, in wrapper
>     return f(*args, **kwargs)
>   File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/airflow/bin/cli.py", line 1193, in upgradedb
>     db.upgradedb()
>   File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/airflow/utils/db.py", line 376, in upgradedb
>     command.upgrade(config, 'heads')
>   File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/alembic/command.py", line 298, in upgrade
>     script.run_env()
>   File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/alembic/script/base.py", line 489, in run_env
>     util.load_python_file(self.dir, "env.py")
>   File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/alembic/util/pyfiles.py", line 98, in load_python_file
>     module = load_module_py(module_id, path)
>   File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/alembic/util/compat.py", line 173, in load_module_py
>     spec.loader.exec_module(module)
>   File "<frozen importlib._bootstrap_external>", line 678, in exec_module
>   File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
>   File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/airflow/migrations/env.py", line 96, in <module>
>     run_migrations_online()
>   File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/airflow/migrations/env.py", line 90, in run_migrations_online
>     context.run_migrations()
>   File "<string>", line 8, in run_migrations
>   File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/alembic/runtime/environment.py", line 846, in run_migrations
>     self.get_context().run_migrations(**kw)
>   File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/alembic/runtime/migration.py", line 518, in run_migrations
>     step.migration_fn(**kw)
>   File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/airflow/migrations/versions/d38e04c12aa2_add_serialized_dag_table.py", line 54, in upgrade
>     sa.PrimaryKeyConstraint('dag_id'))
>   File "<string>", line 8, in create_table
>   File "<string>", line 3, in create_table
>   File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/alembic/operations/ops.py", line 1250, in create_table
>     return operations.invoke(op)
>   File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/alembic/operations/base.py", line 345, in invoke
>     return fn(self, operation)
>   File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/alembic/operations/toimpl.py", line 101, in create_table
>     operations.impl.create_table(table)
>   File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/alembic/ddl/impl.py", line 252, in create_table
>     self._exec(schema.CreateTable(table))
>   File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/alembic/ddl/impl.py", line 134, in _exec
>     return conn.execute(construct, *multiparams, **params)
>   File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 982, in execute
>     return meth(self, multiparams, params)
>   File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/sqlalchemy/sql/ddl.py", line 72, in _execute_on_connection
>     return connection._execute_ddl(self, multiparams, params)
>   File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1044, in _execute_ddl
>     compiled,
>   File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1250, in _execute_context
>     e, statement, parameters, cursor, context
>   File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1476, in _handle_dbapi_exception
>     util.raise_from_cause(sqlalchemy_exception, exc_info)
>   File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 398, in raise_from_cause
>     reraise(type(exception), exception, tb=exc_tb, cause=cause)
>   File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 152, in reraise
>     raise value.with_traceback(tb)
>   File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1246, in _execute_context
>     cursor, statement, parameters, context
>   File "/opt/anaconda/miniconda3/envs/airflow/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 581, in do_execute
>     cursor.execute(statement, parameters)
> sqlalchemy.exc.ProgrammingError: (psycopg2.errors.DuplicateTable) relation "serialized_dag" already exists
> [SQL:
> CREATE TABLE serialized_dag (
>     dag_id VARCHAR(250) NOT NULL,
>     fileloc VARCHAR(2000) NOT NULL,
>     fileloc_hash INTEGER NOT NULL,
>     data JSON NOT NULL,
>     last_updated TIMESTAMP WITHOUT TIME ZONE NOT NULL,
>     PRIMARY KEY (dag_id)
> )]
> (Background on this error at: http://sqlalche.me/e/f405)
> {code}
>
>
> It doesn't make much sense: there is [only one
> reference|https://github.com/apache/airflow/blob/1.10.7/airflow/migrations/versions/d38e04c12aa2_add_serialized_dag_table.py#L48]
> to this table's creation in the codebase, so I'm not sure why this migration
> is going awry.
> +Possible solutions:+
> - Instead of bailing out, it may be more productive to issue a warning when a
> step like this fails. The intent of the migration process is to say 'you
> can't run on version x', but here the outcome of the migration itself is
> what's confusing.
> - Migrations could check ahead for schema changes that are already applied
> (we did this for a bug found in later revisions, on a different backend,
> MSSQL); this adds some overhead, but metadata upgrades would at least be
> self-aware.
> - Something else I'm missing in the broader picture
>
> If the db truly already has the table, end users should still be able to
> upgrade their version, so it's odd to get an error while changing revisions
> if things are already in place for the future revision.
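The "check ahead" suggestion from the possible-solutions list above could look roughly like the sketch below. This is not the shipped d38e04c12aa2 migration (which uses Alembic's op.create_table); it is a plain-SQLAlchemy illustration where the table definition mirrors the CREATE TABLE statement from the traceback, and checkfirst=True makes the DDL a no-op when the relation already exists:

```python
# Sketch only: the schema mirrors the serialized_dag CREATE TABLE from the
# error output. The helper name is illustrative, not part of Airflow.
import sqlalchemy as sa

metadata = sa.MetaData()
serialized_dag = sa.Table(
    "serialized_dag", metadata,
    sa.Column("dag_id", sa.String(250), primary_key=True),
    sa.Column("fileloc", sa.String(2000), nullable=False),
    sa.Column("fileloc_hash", sa.Integer, nullable=False),
    sa.Column("data", sa.JSON, nullable=False),
    sa.Column("last_updated", sa.DateTime, nullable=False),
)

def create_serialized_dag_if_missing(engine):
    # checkfirst=True emits CREATE TABLE only when the table is absent,
    # which would sidestep the DuplicateTable error entirely.
    metadata.create_all(engine, tables=[serialized_dag], checkfirst=True)
```

An Alembic migration could apply the same guard by inspecting the bind for the table name before calling op.create_table.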
--
This message was sent by Atlassian Jira
(v8.3.4#803005)