[ https://issues.apache.org/jira/browse/AIRFLOW-1063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16622127#comment-16622127 ]
Morten Post commented on AIRFLOW-1063:
--------------------------------------
I am seeing this issue as well. What backend are you using?
> A manually-created DAG run can prevent a scheduled run to be created
> --------------------------------------------------------------------
>
> Key: AIRFLOW-1063
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1063
> Project: Apache Airflow
> Issue Type: Improvement
> Components: scheduler
> Affects Versions: 1.7.1.3
> Reporter: Vitor Baptista
> Priority: Major
>
> I manually created a DAG Run with the {{execution_date}} as {{2017-03-01
> 00:00:00}} on a monthly-recurrent DAG. After a while, I noticed that the
> scheduled run was never created and checked the scheduler's logs, finding
> this traceback:
> {quote}
> Process Process-475397:
> Traceback (most recent call last):
> File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in
> _bootstrap
> self.run()
> File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
> self._target(*self._args, **self._kwargs)
> File "/usr/local/lib/python2.7/dist-packages/airflow/jobs.py", line 664, in
> _do_dags
> dag = dagbag.get_dag(dag.dag_id)
> File "/usr/local/lib/python2.7/dist-packages/airflow/models.py", line 188,
> in get_dag
> orm_dag = DagModel.get_current(root_dag_id)
> File "/usr/local/lib/python2.7/dist-packages/airflow/models.py", line 2320,
> in get_current
> obj = session.query(cls).filter(cls.dag_id == dag_id).first()
> File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line
> 2690, in first
> ret = list(self[0:1])
> File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line
> 2482, in __getitem__
> return list(res)
> File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line
> 2790, in __iter__
> return self._execute_and_instances(context)
> File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line
> 2811, in _execute_and_instances
> close_with_result=True)
> File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line
> 2820, in _get_bind_args
> **kw
> File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line
> 2802, in _connection_from_session
> conn = self.session.connection(**kw)
> File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py",
> line 966, in connection
> execution_options=execution_options)
> File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py",
> line 971, in _connection_for_bind
> engine, execution_options)
> File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py",
> line 382, in _connection_for_bind
> self._assert_active()
> File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py",
> line 276, in _assert_active
> % self._rollback_exception
> InvalidRequestError: This Session's transaction has been rolled back due to a
> previous exception during flush. To begin a new transaction with this
> Session, first issue Session.rollback(). Original exception was:
> (psycopg2.IntegrityError)
> duplicate key value violates unique constraint
> "dag_run_dag_id_execution_date_key"
> DETAIL: Key (dag_id, execution_date)=(nct, 2017-03-01 00:00:00) already
> exists.
> [SQL: 'INSERT INTO dag_run (dag_id, execution_date, start_date, end_date,
> state, run_id, external_trigger, conf) VALUES (%(dag_id)s,
> %(execution_date)s, %(start_date)s, %(end_date)s, %(state)s, %(run_id)s,
> %(external_trigger)s, %(conf)s)
> RETURNING dag_run.id'] [parameters: {'end_date': None, 'run_id':
> u'scheduled__2017-03-01T00:00:00', 'execution_date': datetime.datetime(2017,
> 3, 1, 0, 0), 'external_trigger': False, 'state': u'running', 'conf': None,
> 'start_date': datetime.datetime(2017, 4, 3, 13, 48, 39, 168456),
> 'dag_id': 'nct'}]
> {quote}
> The problem is that the {{dag_run}} table requires the {{(dag_id,
> execution_date)}} pair to be unique, so the scheduler was stuck in a loop:
> it kept trying to create the scheduled DAG run and failed each time, because
> I had already created a run with the same {{execution_date}}. This was
> surprising. As a user, I would expect the scheduler either to create the
> scheduled run normally, even if a manual one exists for the same date, or to
> skip that execution date.
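
For anyone trying to reproduce the collision outside Airflow, here is a minimal sketch of the constraint described above. It is not Airflow's code: it uses an in-memory SQLite database as a stand-in for the Postgres backend in the traceback, the table is cut down to a few columns, and the helper try_create_run and the run id manual__2017-03-01T00:00:00 are made up for illustration. It only shows that a second insert with the same (dag_id, execution_date) pair is rejected, and that rolling back the failed transaction leaves the connection usable.

```python
import sqlite3


def create_schema(conn):
    # Simplified stand-in for the dag_run table (assumption: only the
    # columns needed to show the unique constraint are included).
    conn.execute(
        """
        CREATE TABLE dag_run (
            id INTEGER PRIMARY KEY,
            dag_id TEXT NOT NULL,
            execution_date TEXT NOT NULL,
            run_id TEXT NOT NULL,
            external_trigger INTEGER NOT NULL,
            UNIQUE (dag_id, execution_date)
        )
        """
    )


def try_create_run(conn, dag_id, execution_date, run_id, external_trigger):
    """Hypothetical helper: insert a run, return False if the pair exists."""
    try:
        # The connection context manager commits on success and rolls
        # back on an exception, so a failed insert does not poison
        # later statements on the same connection.
        with conn:
            conn.execute(
                "INSERT INTO dag_run "
                "(dag_id, execution_date, run_id, external_trigger) "
                "VALUES (?, ?, ?, ?)",
                (dag_id, execution_date, run_id, int(external_trigger)),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
create_schema(conn)

# The manually created run goes in first.
manual_ok = try_create_run(
    conn, "nct", "2017-03-01 00:00:00", "manual__2017-03-01T00:00:00", True
)
# The scheduler then attempts the same execution date and is rejected.
scheduled_ok = try_create_run(
    conn, "nct", "2017-03-01 00:00:00", "scheduled__2017-03-01T00:00:00", False
)

print(manual_ok, scheduled_ok)
```

In the traceback the follow-up InvalidRequestError appears because the SQLAlchemy session was reused after the failed flush without a rollback; the explicit rollback in the sketch is the part that avoids that second error, though it does not by itself decide whether the scheduled run should be skipped or allowed.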