I just cleared the history for this DAG (using the CLI clear command), then attempted a backfill. I'm consistently getting a deadlock error, and I suspect this is really due to Celery failing to execute the task.
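For reference, the clear was along these lines (reconstructing the flags from memory, so the exact form may differ slightly):

$ airflow clear etl_gsn_daily_kpi_email --start_date 2016-01-01T12:00:00 --end_date 2016-01-02T12:00:00
# dates/flags approximate -- same window as the backfill below

Then the backfill: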
$ airflow backfill --ignore_first_depends_on_past --start_date 2016-01-01T12:00:00 --end_date 2016-01-02T12:00:00 etl_gsn_daily_kpi_email
...
BackfillJob is deadlocked. Some of the deadlocked tasks were unable to run because of "depends_on_past" relationships. Try running the backfill with the option "ignore_first_depends_on_past=True" or passing "-I" at the command line. These tasks were unable to run:
...

The DAG has depends_on_past: True, wait_for_downstream: True, start_date: datetime(2016, 1, 1, 12).

For each task in the DAG I see output messages saying that it was added to the queue, that Celery is queuing it, and then that it failed. For example, for one of the tasks, there are these three messages:

[2016-06-24 00:56:14,613] {base_executor.py:36} INFO - Adding to queue: airflow run etl_gsn_daily_kpi_email t_validate_gsncom_events_dau 2016-01-01T12:00:00 --pickle 18 --local
...
[2016-06-24 00:56:19,471] {celery_executor.py:64} INFO - [celery] queuing ('etl_gsn_daily_kpi_email', 't_validate_gsnmobile_events_dau', datetime.datetime(2016, 1, 1, 12, 0)) through celery, queue=default
...
[2016-06-24 00:56:29,480] {jobs.py:924} ERROR - Task instance ('etl_gsn_daily_kpi_email', 't_validate_gsnmobile_events_dau', datetime.datetime(2016, 1, 1, 12, 0)) failed

It seems like I'm in a state where the DAG succeeds when run by the scheduler, but fails when run from the CLI or web interface. Any ideas for debugging this would be much appreciated.

thanks,
Dennis

On Wed, Jun 15, 2016 at 1:44 PM Dennis O'Brien <den...@dennisobrien.net> wrote:
> Hi,
>
> Occasionally I try to force the final task in a DAG by clicking on
> the task in the web UI and clicking 'Run' (with 'Force' and 'Ignore
> Dependencies' selected). The UI shows 'Sent <TaskInstance: ...> to the
> message queue, it should start any moment now'. Going back to that task
> and selecting 'Log' I see something like this:
>
> [2016-06-15 20:11:44,172] {models.py:154} INFO - Filling up the DagBag from
> /home/airflow/workspace/verticadw/airflow/dags/etl_gsn_daily_kpi_email.py
> [2016-06-15 20:11:44,881] {base_hook.py:53} INFO - Using connection to:
> localhost
>
> No sign of an error, but no sign of success either. If I look at the Celery
> Flower dashboard, though, I can see this resulted in a failure with result:
> AirflowException('Celery command failed',)
>
> If I 'Clear' | 'Downstream' the task, the task succeeds. The task run
> in the scheduled DAG also succeeds.
>
> Example args for failed tasks:
> [u'airflow run etl_gsn_daily_kpi_email t_final_send_email
> 2016-06-14T10:00:00 --force --local -sd
> DAGS_FOLDER/etl_gsn_daily_kpi_email.py ']
> [u'airflow run etl_gsn_daily_kpi_email t_final_send_email
> 2016-06-14T10:00:00 -i --force --local -sd
> DAGS_FOLDER/etl_gsn_daily_kpi_email.py ']
>
> Example args for successful tasks:
> [u'airflow run etl_gsn_daily_kpi_email t_final_send_email
> 2016-06-14T10:00:00 --local -sd DAGS_FOLDER/etl_gsn_daily_kpi_email.py ']
>
> and the log:
>
> [2016-06-15 19:44:46,147] {models.py:1219} INFO - Executing
> <Task(PythonOperator): t_final_send_email> on 2016-06-14 10:00:00
> [2016-06-15 19:44:51,979] {python_operator.py:67} INFO - Done. Returned value
> was: [Airflow] GSN Games Metrics 2016-06-14 sent to ('dobr...@gsngames.com',)
>
> I looked in the logs (airflow-flower.log, airflow-scheduler.log,
> airflow-webserver.log, and airflow-worker.log) but didn't find any clues.
>
> I'm running airflow v1.7.1.3, using CeleryExecutor backed by redis.
>
> Celery portion of airflow.cfg:
> [celery]
> celery_app_name = airflow.executors.celery_executor
> celeryd_concurrency = 16
> worker_log_server_port = 8793
> broker_url = redis://localhost
> celery_result_backend = redis://localhost
> flower_port = 5555
> default_queue = default
>
> Any advice on debugging this?
>
> thanks,
> Dennis
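P.S. One debugging step I'm thinking of trying (not sure how much it will tell me) is to run the exact command the executor hands to Celery directly on a worker host, to see whether the failure is in the command itself or somewhere in the Celery/redis layer. Copying the command from the "Adding to queue" log line above:

$ airflow run etl_gsn_daily_kpi_email t_validate_gsncom_events_dau 2016-01-01T12:00:00 --pickle 18 --local
# command copied verbatim from the scheduler log; the pickle id would differ on a fresh run

If that fails in the same way, it would at least rule out the broker and point at the airflow run invocation itself.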