Vitor Baptista created AIRFLOW-1063:
---------------------------------------

             Summary: A manually-created DAG run can prevent a scheduled run to 
be created
                 Key: AIRFLOW-1063
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1063
             Project: Apache Airflow
          Issue Type: Improvement
          Components: scheduler
    Affects Versions: Airflow 1.7.1.3
            Reporter: Vitor Baptista


I manually created a DAG run with {{execution_date}} set to {{2017-03-01 
00:00:00}} on a DAG that is scheduled to run monthly. After a while, I noticed 
that the scheduled run for that date was never created. Checking the 
scheduler's logs, I found this traceback:

{quote}
Process Process-475397:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python2.7/dist-packages/airflow/jobs.py", line 664, in 
_do_dags
    dag = dagbag.get_dag(dag.dag_id)
  File "/usr/local/lib/python2.7/dist-packages/airflow/models.py", line 188, in 
get_dag
    orm_dag = DagModel.get_current(root_dag_id)
  File "/usr/local/lib/python2.7/dist-packages/airflow/models.py", line 2320, 
in get_current
    obj = session.query(cls).filter(cls.dag_id == dag_id).first()
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 
2690, in first
    ret = list(self[0:1])
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 
2482, in __getitem__
    return list(res)
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 
2790, in __iter__
    return self._execute_and_instances(context)
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 
2811, in _execute_and_instances
    close_with_result=True)
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 
2820, in _get_bind_args
    **kw
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 
2802, in _connection_from_session
    conn = self.session.connection(**kw)
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 
966, in connection
    execution_options=execution_options)
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 
971, in _connection_for_bind
    engine, execution_options)
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 
382, in _connection_for_bind
    self._assert_active()
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 
276, in _assert_active
    % self._rollback_exception
InvalidRequestError: This Session's transaction has been rolled back due to a 
previous exception during flush. To begin a new transaction with this Session, 
first issue Session.rollback(). Original exception was: 
(psycopg2.IntegrityError)
 duplicate key value violates unique constraint 
"dag_run_dag_id_execution_date_key"
DETAIL:  Key (dag_id, execution_date)=(nct, 2017-03-01 00:00:00) already exists.
 [SQL: 'INSERT INTO dag_run (dag_id, execution_date, start_date, end_date, 
state, run_id, external_trigger, conf) VALUES (%(dag_id)s, %(execution_date)s, 
%(start_date)s, %(end_date)s, %(state)s, %(run_id)s, %(external_trigger)s, 
%(conf)s)
 RETURNING dag_run.id'] [parameters: {'end_date': None, 'run_id': 
u'scheduled__2017-03-01T00:00:00', 'execution_date': datetime.datetime(2017, 3, 
1, 0, 0), 'external_trigger': False, 'state': u'running', 'conf': None, 
'start_date': datetime.datetime(2017, 4, 3, 13, 48, 39, 168456), 'dag_id': 
'nct'}]
{quote}

The problem is that the {{dag_run}} table requires the {{(dag_id, 
execution_date)}} pair to be unique, so the scheduler was stuck in a loop: it 
kept trying to create a new scheduled DAG run and kept failing, because I had 
already created one with the same {{execution_date}}. This was surprising. As a 
user, I would expect the scheduler either to schedule the run normally, even if 
there is a manual one on the same date, or to skip that execution date.



