chasuner created AIRFLOW-1113:
---------------------------------
Summary: spark_submit_operator run again after an hour and log is
not realtime
Key: AIRFLOW-1113
URL: https://issues.apache.org/jira/browse/AIRFLOW-1113
Project: Apache Airflow
Issue Type: Bug
Components: contrib
Affects Versions: Airflow 1.8
Reporter: chasuner
I have a dag that contains a spark_submit_operator task, which is really wield
and confused me is that the spark_submit task run again after an hour the spark
task run, but actually the spark job has been successfully finished in about 47
min, but the subprocess of the spark run task is Sl state in the worker
machine; but another dag that contains another spark_submit task that runs
about 5 min works normally.
the log of the abnormal spark task:
[2017-04-15 07:01:06,048] {models.py:167} INFO - Filling up the DagBag from
/data/airflow/dags/dag_etl.py
[2017-04-15 07:01:06,742] {base_task_runner.py:112} INFO - Running: ['bash',
'-c', u'airflow run dag_etl spark_etl 2017-04-14T07:00:00 --job_id 1239 --raw
-sd DAGS_FOLDER/dag_etl.py']
[2017-04-15 07:01:07,844] {base_task_runner.py:95} INFO - Subtask: [2017-04-15
07:01:07,843] {__init__.py:57} INFO - Using executor CeleryExecutor
[2017-04-15 07:01:08,504] {base_task_runner.py:95} INFO - Subtask: [2017-04-15
07:01:08,504] {models.py:167} INFO - Filling up the DagBag from
/data/airflow/dags/dag_etl.py
[2017-04-15 07:01:09,423] {base_task_runner.py:95} INFO - Subtask: [2017-04-15
07:01:09,423] {models.py:1126} INFO - Dependencies all met for <TaskInstance:
dag_etl.spark_etl 2017-04-14 07:00:00 [queued]>
[2017-04-15 07:01:09,440] {base_task_runner.py:95} INFO - Subtask: [2017-04-15
07:01:09,440] {models.py:1126} INFO - Dependencies all met for <TaskInstance:
dag_etl.spark_etl 2017-04-14 07:00:00 [queued]>
[2017-04-15 07:01:09,441] {base_task_runner.py:95} INFO - Subtask: [2017-04-15
07:01:09,440] {models.py:1318} INFO -
[2017-04-15 07:01:09,441] {base_task_runner.py:95} INFO - Subtask:
--------------------------------------------------------------------------------
[2017-04-15 07:01:09,441] {base_task_runner.py:95} INFO - Subtask: Starting
attempt 1 of 2
[2017-04-15 07:01:09,441] {base_task_runner.py:95} INFO - Subtask:
--------------------------------------------------------------------------------
[2017-04-15 07:01:09,441] {base_task_runner.py:95} INFO - Subtask:
[2017-04-15 07:01:09,453] {base_task_runner.py:95} INFO - Subtask: [2017-04-15
07:01:09,453] {models.py:1342} INFO - Executing <Task(SparkSubmitOperator):
spark_etl> on 2017-04-14 07:00:00
[2017-04-15 07:01:09,501] {base_task_runner.py:95} INFO - Subtask: [2017-04-15
07:01:09,500] {base_hook.py:67} INFO - Using connection to: yarn
[2017-04-15 08:01:54,030] {models.py:167} INFO - Filling up the DagBag from
/data/airflow/dags/dag_etl.py
[2017-04-15 08:01:54,713] {base_task_runner.py:112} INFO - Running: ['bash',
'-c', u'airflow run dag_etl spark_etl 2017-04-14T07:00:00 --job_id 1243 --raw
-sd DAGS_FOLDER/dag_etl.py']
[2017-04-15 08:01:56,131] {base_task_runner.py:95} INFO - Subtask: [2017-04-15
08:01:56,130] {__init__.py:57} INFO - Using executor CeleryExecutor
[2017-04-15 08:01:56,867] {base_task_runner.py:95} INFO - Subtask: [2017-04-15
08:01:56,867] {models.py:167} INFO - Filling up the DagBag from
/data/airflow/dags/dag_etl.py
[2017-04-15 08:01:57,503] {base_task_runner.py:95} INFO - Subtask: [2017-04-15
08:01:57,502] {models.py:1120} INFO - Dependencies not met for <TaskInstance:
dag_etl.spark_etl 2017-04-14 07:00:00 [running]>, dependency 'Task Instance Not
Already Running' FAILED: Task is already running, it started on 2017-04-15
07:01:09.
[2017-04-15 08:01:57,512] {base_task_runner.py:95} INFO - Subtask: [2017-04-15
08:01:57,512] {models.py:1120} INFO - Dependencies not met for <TaskInstance:
dag_etl.spark_etl 2017-04-14 07:00:00 [running]>, dependency 'Task Instance
State' FAILED: Task is in the 'running' state which is not a valid state for
execution. The task must be cleared in order to be run.
[2017-04-15 08:01:59,104] {jobs.py:2148} WARNING - Recorded pid 26249 is not a
descendant of the current pid 9046
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)