[
https://issues.apache.org/jira/browse/AIRFLOW-3378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Victor updated AIRFLOW-3378:
----------------------------
Description:
I am runnig airflow 1.10.1rc2 on kubernetes with the LocalExecutior and
KubernetesPodOperator to run pods.
When a failure happens because the pod can't be created (see logs below), the
whole operator fails and if is_delete_operator_pod is set to True, the pod is
NOT deleted as expected.
Logs:
{noformat}
*** Reading local file:
/airflow/logs/demo/datapipe/2018-11-21T08:29:57.206456+00:00/1.log
[2018-11-21 08:30:02,027] {models.py:1361} INFO - Dependencies all met for
<TaskInstance: demo.datapipe 2018-11-21T08:29:57.206456+00:00 [queued]>
[2018-11-21 08:30:02,040] {models.py:1361} INFO - Dependencies all met for
<TaskInstance: demo.datapipe 2018-11-21T08:29:57.206456+00:00 [queued]>
[2018-11-21 08:30:02,041] {models.py:1573} INFO -
--------------------------------------------------------------------------------
Starting attempt 1 of 1
--------------------------------------------------------------------------------
[2018-11-21 08:30:02,066] {models.py:1595} INFO - Executing
<Task(KubernetesPodOperator): datapipe> on 2018-11-21T08:29:57.206456+00:00
[2018-11-21 08:30:02,067] {base_task_runner.py:118} INFO - Running: ['bash',
'-c', 'airflow run demo datapipe 2018-11-21T08:29:57.206456+00:00 --job_id 4
--raw -sd /usr/local/airflow/dags_volume/..data/demo.py --cfg_path
/tmp/tmp5qa5qyhs']
[2018-11-21 08:30:03,065] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe [2018-11-21 08:30:03,065] {settings.py:174} INFO -
setting.configure_orm(): Using pool settings. pool_size=5, pool_recycle=1800
[2018-11-21 08:30:03,734] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe [2018-11-21 08:30:03,732] {__init__.py:51} INFO - Using executor
LocalExecutor
[2018-11-21 08:30:04,038] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe [2018-11-21 08:30:04,038] {models.py:271} INFO - Filling up the DagBag
from /usr/local/airflow/dags_volume/..data/demo.py
[2018-11-21 08:30:04,590] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe [2018-11-21 08:30:04,590] {cli.py:484} INFO - Running <TaskInstance:
demo.datapipe 2018-11-21T08:29:57.206456+00:00 [running]> on host
infra-airflow-6d78c56489-r9trl
[2018-11-21 08:30:04,684] {logging_mixin.py:95} INFO - [2018-11-21
08:30:04,676] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:05,698] {logging_mixin.py:95} INFO - [2018-11-21
08:30:05,698] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:06,706] {logging_mixin.py:95} INFO - [2018-11-21
08:30:06,706] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:07,715] {logging_mixin.py:95} INFO - [2018-11-21
08:30:07,714] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:08,723] {logging_mixin.py:95} INFO - [2018-11-21
08:30:08,723] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:09,730] {logging_mixin.py:95} INFO - [2018-11-21
08:30:09,730] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:10,736] {logging_mixin.py:95} INFO - [2018-11-21
08:30:10,736] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:11,742] {logging_mixin.py:95} INFO - [2018-11-21
08:30:11,742] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:12,748] {logging_mixin.py:95} INFO - [2018-11-21
08:30:12,748] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:13,754] {logging_mixin.py:95} INFO - [2018-11-21
08:30:13,754] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:14,764] {logging_mixin.py:95} INFO - [2018-11-21
08:30:14,764] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:15,769] {logging_mixin.py:95} INFO - [2018-11-21
08:30:15,769] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:16,777] {logging_mixin.py:95} INFO - [2018-11-21
08:30:16,777] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:17,783] {logging_mixin.py:95} INFO - [2018-11-21
08:30:17,783] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:18,790] {logging_mixin.py:95} INFO - [2018-11-21
08:30:18,790] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:19,796] {logging_mixin.py:95} INFO - [2018-11-21
08:30:19,796] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:20,802] {logging_mixin.py:95} INFO - [2018-11-21
08:30:20,802] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:21,808] {logging_mixin.py:95} INFO - [2018-11-21
08:30:21,808] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:22,815] {logging_mixin.py:95} INFO - [2018-11-21
08:30:22,815] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:23,822] {logging_mixin.py:95} INFO - [2018-11-21
08:30:23,822] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:24,833] {logging_mixin.py:95} INFO - [2018-11-21
08:30:24,833] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:24,841] {models.py:1760} ERROR - Pod Launching failed: Pod
took too long to start
Traceback (most recent call last):
File
"/usr/local/lib/python3.6/site-packages/airflow/contrib/operators/kubernetes_pod_operator.py",
line 123, in execute
get_logs=self.get_logs)
File
"/usr/local/lib/python3.6/site-packages/airflow/contrib/kubernetes/pod_launcher.py",
line 86, in run_pod
raise AirflowException("Pod took too long to start")
airflow.exceptions.AirflowException: Pod took too long to start
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 1659,
in _run_raw_task
result = task_copy.execute(context=context)
File
"/usr/local/lib/python3.6/site-packages/airflow/contrib/operators/kubernetes_pod_operator.py",
line 135, in execute
raise AirflowException('Pod Launching failed: {error}'.format(error=ex))
airflow.exceptions.AirflowException: Pod Launching failed: Pod took too long to
start
[2018-11-21 08:30:24,844] {models.py:1791} INFO - Marking task as FAILED.
[2018-11-21 08:30:25,001] {logging_mixin.py:95} INFO - [2018-11-21
08:30:25,001] {email.py:127} INFO - Sent an alert email to
['[email protected]']
[2018-11-21 08:30:26,646] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe Traceback (most recent call last):
[2018-11-21 08:30:26,646] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe File
"/usr/local/lib/python3.6/site-packages/airflow/contrib/operators/kubernetes_pod_operator.py",
line 123, in execute
[2018-11-21 08:30:26,646] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe get_logs=self.get_logs)
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe File
"/usr/local/lib/python3.6/site-packages/airflow/contrib/kubernetes/pod_launcher.py",
line 86, in run_pod
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe raise AirflowException("Pod took too long to start")
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe airflow.exceptions.AirflowException: Pod took too long to start
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe During handling of the above exception, another exception occurred:
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe Traceback (most recent call last):
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe File "/usr/local/bin/airflow", line 32, in <module>
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe args.func(args)
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe File "/usr/local/lib/python3.6/site-packages/airflow/utils/cli.py",
line 74, in wrapper
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe return f(*args, **kwargs)
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe File "/usr/local/lib/python3.6/site-packages/airflow/bin/cli.py",
line 490, in run
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe _run(args, dag, ti)
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe File "/usr/local/lib/python3.6/site-packages/airflow/bin/cli.py",
line 406, in _run
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe pool=args.pool,
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe File "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py",
line 74, in wrapper
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe return func(*args, **kwargs)
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe File "/usr/local/lib/python3.6/site-packages/airflow/models.py",
line 1659, in _run_raw_task
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe result = task_copy.execute(context=context)
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe File
"/usr/local/lib/python3.6/site-packages/airflow/contrib/operators/kubernetes_pod_operator.py",
line 135, in execute
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe raise AirflowException('Pod Launching failed:
{error}'.format(error=ex))
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe airflow.exceptions.AirflowException: Pod Launching failed: Pod took
too long to start
[2018-11-21 08:30:27,054] {logging_mixin.py:95} INFO - [2018-11-21
08:30:27,053] {jobs.py:2627} INFO - Task exited with return code 1
{noformat}
was:
I am runnig airflow 1.10.1rc2 on kubernetes with the LocalExecutior and
KubernetesPodOperator to run pods.
When a failure happens because the pod can't be created (see logs below), the
whole operator fails and if is_delete_operator_pod is set to True, the pod is
NOT deleted as expected.
Logs:
{noformat}
*** Reading local file:
/airflow/logs/demo/datapipe/2018-11-21T08:29:57.206456+00:00/1.log
[2018-11-21 08:30:02,027] {models.py:1361} INFO - Dependencies all met for
<TaskInstance: demo.datapipe 2018-11-21T08:29:57.206456+00:00 [queued]>
[2018-11-21 08:30:02,040] {models.py:1361} INFO - Dependencies all met for
<TaskInstance: demo.datapipe 2018-11-21T08:29:57.206456+00:00 [queued]>
[2018-11-21 08:30:02,041] {models.py:1573} INFO -
--------------------------------------------------------------------------------
Starting attempt 1 of 1
--------------------------------------------------------------------------------
[2018-11-21 08:30:02,066] {models.py:1595} INFO - Executing
<Task(KubernetesPodOperator): datapipe> on 2018-11-21T08:29:57.206456+00:00
[2018-11-21 08:30:02,067] {base_task_runner.py:118} INFO - Running: ['bash',
'-c', 'airflow run demo datapipe 2018-11-21T08:29:57.206456+00:00 --job_id 4
--raw -sd /usr/local/airflow/dags_volume/..data/demo.py --cfg_path
/tmp/tmp5qa5qyhs']
[2018-11-21 08:30:03,065] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe [2018-11-21 08:30:03,065] {settings.py:174} INFO -
setting.configure_orm(): Using pool settings. pool_size=5, pool_recycle=1800
[2018-11-21 08:30:03,734] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe [2018-11-21 08:30:03,732] {__init__.py:51} INFO - Using executor
LocalExecutor
[2018-11-21 08:30:04,038] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe [2018-11-21 08:30:04,038] {models.py:271} INFO - Filling up the DagBag
from /usr/local/airflow/dags_volume/..data/demo.py
[2018-11-21 08:30:04,590] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe [2018-11-21 08:30:04,590] {cli.py:484} INFO - Running <TaskInstance:
demo.datapipe 2018-11-21T08:29:57.206456+00:00 [running]> on host
infra-airflow-6d78c56489-r9trl
[2018-11-21 08:30:04,684] {logging_mixin.py:95} INFO - [2018-11-21
08:30:04,676] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:05,698] {logging_mixin.py:95} INFO - [2018-11-21
08:30:05,698] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:06,706] {logging_mixin.py:95} INFO - [2018-11-21
08:30:06,706] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:07,715] {logging_mixin.py:95} INFO - [2018-11-21
08:30:07,714] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:08,723] {logging_mixin.py:95} INFO - [2018-11-21
08:30:08,723] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:09,730] {logging_mixin.py:95} INFO - [2018-11-21
08:30:09,730] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:10,736] {logging_mixin.py:95} INFO - [2018-11-21
08:30:10,736] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:11,742] {logging_mixin.py:95} INFO - [2018-11-21
08:30:11,742] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:12,748] {logging_mixin.py:95} INFO - [2018-11-21
08:30:12,748] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:13,754] {logging_mixin.py:95} INFO - [2018-11-21
08:30:13,754] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:14,764] {logging_mixin.py:95} INFO - [2018-11-21
08:30:14,764] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:15,769] {logging_mixin.py:95} INFO - [2018-11-21
08:30:15,769] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:16,777] {logging_mixin.py:95} INFO - [2018-11-21
08:30:16,777] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:17,783] {logging_mixin.py:95} INFO - [2018-11-21
08:30:17,783] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:18,790] {logging_mixin.py:95} INFO - [2018-11-21
08:30:18,790] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:19,796] {logging_mixin.py:95} INFO - [2018-11-21
08:30:19,796] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:20,802] {logging_mixin.py:95} INFO - [2018-11-21
08:30:20,802] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:21,808] {logging_mixin.py:95} INFO - [2018-11-21
08:30:21,808] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:22,815] {logging_mixin.py:95} INFO - [2018-11-21
08:30:22,815] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:23,822] {logging_mixin.py:95} INFO - [2018-11-21
08:30:23,822] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:24,833] {logging_mixin.py:95} INFO - [2018-11-21
08:30:24,833] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
event of type Pending
[2018-11-21 08:30:24,841] {models.py:1760} ERROR - Pod Launching failed: Pod
took too long to start
Traceback (most recent call last):
File
"/usr/local/lib/python3.6/site-packages/airflow/contrib/operators/kubernetes_pod_operator.py",
line 123, in execute
get_logs=self.get_logs)
File
"/usr/local/lib/python3.6/site-packages/airflow/contrib/kubernetes/pod_launcher.py",
line 86, in run_pod
raise AirflowException("Pod took too long to start")
airflow.exceptions.AirflowException: Pod took too long to start
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 1659,
in _run_raw_task
result = task_copy.execute(context=context)
File
"/usr/local/lib/python3.6/site-packages/airflow/contrib/operators/kubernetes_pod_operator.py",
line 135, in execute
raise AirflowException('Pod Launching failed: {error}'.format(error=ex))
airflow.exceptions.AirflowException: Pod Launching failed: Pod took too long to
start
[2018-11-21 08:30:24,844] {models.py:1791} INFO - Marking task as FAILED.
[2018-11-21 08:30:25,001] {logging_mixin.py:95} INFO - [2018-11-21
08:30:25,001] {email.py:127} INFO - Sent an alert email to
['[email protected]']
[2018-11-21 08:30:26,646] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe Traceback (most recent call last):
[2018-11-21 08:30:26,646] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe File
"/usr/local/lib/python3.6/site-packages/airflow/contrib/operators/kubernetes_pod_operator.py",
line 123, in execute
[2018-11-21 08:30:26,646] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe get_logs=self.get_logs)
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe File
"/usr/local/lib/python3.6/site-packages/airflow/contrib/kubernetes/pod_launcher.py",
line 86, in run_pod
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe raise AirflowException("Pod took too long to start")
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe airflow.exceptions.AirflowException: Pod took too long to start
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe During handling of the above exception, another exception occurred:
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe Traceback (most recent call last):
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe File "/usr/local/bin/airflow", line 32, in <module>
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe args.func(args)
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe File "/usr/local/lib/python3.6/site-packages/airflow/utils/cli.py",
line 74, in wrapper
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe return f(*args, **kwargs)
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe File "/usr/local/lib/python3.6/site-packages/airflow/bin/cli.py",
line 490, in run
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe _run(args, dag, ti)
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe File "/usr/local/lib/python3.6/site-packages/airflow/bin/cli.py",
line 406, in _run
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe pool=args.pool,
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe File "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py",
line 74, in wrapper
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe return func(*args, **kwargs)
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe File "/usr/local/lib/python3.6/site-packages/airflow/models.py",
line 1659, in _run_raw_task
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe result = task_copy.execute(context=context)
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe File
"/usr/local/lib/python3.6/site-packages/airflow/contrib/operators/kubernetes_pod_operator.py",
line 135, in execute
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe raise AirflowException('Pod Launching failed:
{error}'.format(error=ex))
[2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
datapipe airflow.exceptions.AirflowException: Pod Launching failed: Pod took
too long to start
[2018-11-21 08:30:27,054] {logging_mixin.py:95} INFO - [2018-11-21
08:30:27,053] {jobs.py:2627} INFO - Task exited with return code 1
{noformat}
> KubernetesPodOperator does not delete on timeout failure
> --------------------------------------------------------
>
> Key: AIRFLOW-3378
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3378
> Project: Apache Airflow
> Issue Type: Bug
> Components: kubernetes
> Affects Versions: 1.10.1
> Reporter: Victor
> Priority: Major
>
> I am runnig airflow 1.10.1rc2 on kubernetes with the LocalExecutior and
> KubernetesPodOperator to run pods.
>
> When a failure happens because the pod can't be created (see logs below), the
> whole operator fails and if is_delete_operator_pod is set to True, the pod is
> NOT deleted as expected.
>
> Logs:
> {noformat}
> *** Reading local file:
> /airflow/logs/demo/datapipe/2018-11-21T08:29:57.206456+00:00/1.log
> [2018-11-21 08:30:02,027] {models.py:1361} INFO - Dependencies all met for
> <TaskInstance: demo.datapipe 2018-11-21T08:29:57.206456+00:00 [queued]>
> [2018-11-21 08:30:02,040] {models.py:1361} INFO - Dependencies all met for
> <TaskInstance: demo.datapipe 2018-11-21T08:29:57.206456+00:00 [queued]>
> [2018-11-21 08:30:02,041] {models.py:1573} INFO -
> --------------------------------------------------------------------------------
> Starting attempt 1 of 1
> --------------------------------------------------------------------------------
> [2018-11-21 08:30:02,066] {models.py:1595} INFO - Executing
> <Task(KubernetesPodOperator): datapipe> on 2018-11-21T08:29:57.206456+00:00
> [2018-11-21 08:30:02,067] {base_task_runner.py:118} INFO - Running: ['bash',
> '-c', 'airflow run demo datapipe 2018-11-21T08:29:57.206456+00:00 --job_id 4
> --raw -sd /usr/local/airflow/dags_volume/..data/demo.py --cfg_path
> /tmp/tmp5qa5qyhs']
> [2018-11-21 08:30:03,065] {base_task_runner.py:101} INFO - Job 4: Subtask
> datapipe [2018-11-21 08:30:03,065] {settings.py:174} INFO -
> setting.configure_orm(): Using pool settings. pool_size=5, pool_recycle=1800
> [2018-11-21 08:30:03,734] {base_task_runner.py:101} INFO - Job 4: Subtask
> datapipe [2018-11-21 08:30:03,732] {__init__.py:51} INFO - Using executor
> LocalExecutor
> [2018-11-21 08:30:04,038] {base_task_runner.py:101} INFO - Job 4: Subtask
> datapipe [2018-11-21 08:30:04,038] {models.py:271} INFO - Filling up the
> DagBag from /usr/local/airflow/dags_volume/..data/demo.py
> [2018-11-21 08:30:04,590] {base_task_runner.py:101} INFO - Job 4: Subtask
> datapipe [2018-11-21 08:30:04,590] {cli.py:484} INFO - Running <TaskInstance:
> demo.datapipe 2018-11-21T08:29:57.206456+00:00 [running]> on host
> infra-airflow-6d78c56489-r9trl
> [2018-11-21 08:30:04,684] {logging_mixin.py:95} INFO - [2018-11-21
> 08:30:04,676] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
> event of type Pending
> [2018-11-21 08:30:05,698] {logging_mixin.py:95} INFO - [2018-11-21
> 08:30:05,698] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
> event of type Pending
> [2018-11-21 08:30:06,706] {logging_mixin.py:95} INFO - [2018-11-21
> 08:30:06,706] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
> event of type Pending
> [2018-11-21 08:30:07,715] {logging_mixin.py:95} INFO - [2018-11-21
> 08:30:07,714] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
> event of type Pending
> [2018-11-21 08:30:08,723] {logging_mixin.py:95} INFO - [2018-11-21
> 08:30:08,723] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
> event of type Pending
> [2018-11-21 08:30:09,730] {logging_mixin.py:95} INFO - [2018-11-21
> 08:30:09,730] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
> event of type Pending
> [2018-11-21 08:30:10,736] {logging_mixin.py:95} INFO - [2018-11-21
> 08:30:10,736] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
> event of type Pending
> [2018-11-21 08:30:11,742] {logging_mixin.py:95} INFO - [2018-11-21
> 08:30:11,742] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
> event of type Pending
> [2018-11-21 08:30:12,748] {logging_mixin.py:95} INFO - [2018-11-21
> 08:30:12,748] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
> event of type Pending
> [2018-11-21 08:30:13,754] {logging_mixin.py:95} INFO - [2018-11-21
> 08:30:13,754] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
> event of type Pending
> [2018-11-21 08:30:14,764] {logging_mixin.py:95} INFO - [2018-11-21
> 08:30:14,764] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
> event of type Pending
> [2018-11-21 08:30:15,769] {logging_mixin.py:95} INFO - [2018-11-21
> 08:30:15,769] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
> event of type Pending
> [2018-11-21 08:30:16,777] {logging_mixin.py:95} INFO - [2018-11-21
> 08:30:16,777] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
> event of type Pending
> [2018-11-21 08:30:17,783] {logging_mixin.py:95} INFO - [2018-11-21
> 08:30:17,783] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
> event of type Pending
> [2018-11-21 08:30:18,790] {logging_mixin.py:95} INFO - [2018-11-21
> 08:30:18,790] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
> event of type Pending
> [2018-11-21 08:30:19,796] {logging_mixin.py:95} INFO - [2018-11-21
> 08:30:19,796] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
> event of type Pending
> [2018-11-21 08:30:20,802] {logging_mixin.py:95} INFO - [2018-11-21
> 08:30:20,802] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
> event of type Pending
> [2018-11-21 08:30:21,808] {logging_mixin.py:95} INFO - [2018-11-21
> 08:30:21,808] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
> event of type Pending
> [2018-11-21 08:30:22,815] {logging_mixin.py:95} INFO - [2018-11-21
> 08:30:22,815] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
> event of type Pending
> [2018-11-21 08:30:23,822] {logging_mixin.py:95} INFO - [2018-11-21
> 08:30:23,822] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
> event of type Pending
> [2018-11-21 08:30:24,833] {logging_mixin.py:95} INFO - [2018-11-21
> 08:30:24,833] {pod_launcher.py:121} INFO - Event: datapipe-a5591f0c had an
> event of type Pending
> [2018-11-21 08:30:24,841] {models.py:1760} ERROR - Pod Launching failed: Pod
> took too long to start
> Traceback (most recent call last):
> File
> "/usr/local/lib/python3.6/site-packages/airflow/contrib/operators/kubernetes_pod_operator.py",
> line 123, in execute
> get_logs=self.get_logs)
> File
> "/usr/local/lib/python3.6/site-packages/airflow/contrib/kubernetes/pod_launcher.py",
> line 86, in run_pod
> raise AirflowException("Pod took too long to start")
> airflow.exceptions.AirflowException: Pod took too long to start
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
> File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 1659,
> in _run_raw_task
> result = task_copy.execute(context=context)
> File
> "/usr/local/lib/python3.6/site-packages/airflow/contrib/operators/kubernetes_pod_operator.py",
> line 135, in execute
> raise AirflowException('Pod Launching failed: {error}'.format(error=ex))
> airflow.exceptions.AirflowException: Pod Launching failed: Pod took too long
> to start
> [2018-11-21 08:30:24,844] {models.py:1791} INFO - Marking task as FAILED.
> [2018-11-21 08:30:25,001] {logging_mixin.py:95} INFO - [2018-11-21
> 08:30:25,001] {email.py:127} INFO - Sent an alert email to
> ['[email protected]']
> [2018-11-21 08:30:26,646] {base_task_runner.py:101} INFO - Job 4: Subtask
> datapipe Traceback (most recent call last):
> [2018-11-21 08:30:26,646] {base_task_runner.py:101} INFO - Job 4: Subtask
> datapipe File
> "/usr/local/lib/python3.6/site-packages/airflow/contrib/operators/kubernetes_pod_operator.py",
> line 123, in execute
> [2018-11-21 08:30:26,646] {base_task_runner.py:101} INFO - Job 4: Subtask
> datapipe get_logs=self.get_logs)
> [2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
> datapipe File
> "/usr/local/lib/python3.6/site-packages/airflow/contrib/kubernetes/pod_launcher.py",
> line 86, in run_pod
> [2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
> datapipe raise AirflowException("Pod took too long to start")
> [2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
> datapipe airflow.exceptions.AirflowException: Pod took too long to start
> [2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
> datapipe
> [2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
> datapipe During handling of the above exception, another exception occurred:
> [2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
> datapipe
> [2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
> datapipe Traceback (most recent call last):
> [2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
> datapipe File "/usr/local/bin/airflow", line 32, in <module>
> [2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
> datapipe args.func(args)
> [2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
> datapipe File
> "/usr/local/lib/python3.6/site-packages/airflow/utils/cli.py", line 74, in
> wrapper
> [2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
> datapipe return f(*args, **kwargs)
> [2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
> datapipe File "/usr/local/lib/python3.6/site-packages/airflow/bin/cli.py",
> line 490, in run
> [2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
> datapipe _run(args, dag, ti)
> [2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
> datapipe File "/usr/local/lib/python3.6/site-packages/airflow/bin/cli.py",
> line 406, in _run
> [2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
> datapipe pool=args.pool,
> [2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
> datapipe File "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py",
> line 74, in wrapper
> [2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
> datapipe return func(*args, **kwargs)
> [2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
> datapipe File "/usr/local/lib/python3.6/site-packages/airflow/models.py",
> line 1659, in _run_raw_task
> [2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
> datapipe result = task_copy.execute(context=context)
> [2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
> datapipe File
> "/usr/local/lib/python3.6/site-packages/airflow/contrib/operators/kubernetes_pod_operator.py",
> line 135, in execute
> [2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
> datapipe raise AirflowException('Pod Launching failed:
> {error}'.format(error=ex))
> [2018-11-21 08:30:26,647] {base_task_runner.py:101} INFO - Job 4: Subtask
> datapipe airflow.exceptions.AirflowException: Pod Launching failed: Pod took
> too long to start
> [2018-11-21 08:30:27,054] {logging_mixin.py:95} INFO - [2018-11-21
> 08:30:27,053] {jobs.py:2627} INFO - Task exited with return code 1
> {noformat}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)