[jira] [Commented] (AIRFLOW-5447) KubernetesExecutor hangs on task queueing

2019-09-17 Thread Henry Cohen (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16931681#comment-16931681
 ] 

Henry Cohen commented on AIRFLOW-5447:
--

Thank you guys so much for working on this. Any idea how long until the fix is 
out, now that it's been merged?

> KubernetesExecutor hangs on task queueing
> -
>
> Key: AIRFLOW-5447
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5447
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: executor-kubernetes
>Affects Versions: 1.10.4, 1.10.5
> Environment: Kubernetes version v1.14.3, Airflow version 1.10.4-1.10.5
>Reporter: Henry Cohen
>Assignee: Daniel Imberman
>Priority: Blocker
>
> Starting in 1.10.4, and continuing in 1.10.5, when using the 
> KubernetesExecutor, with the webserver and scheduler running in the 
> kubernetes cluster, tasks are scheduled, but when added to the task queue, 
> the executor process hangs indefinitely. Based on log messages, it appears to 
> be stuck at this line 
> https://github.com/apache/airflow/blob/v1-10-stable/airflow/contrib/executors/kubernetes_executor.py#L761





[jira] [Commented] (AIRFLOW-5447) KubernetesExecutor hangs on task queueing

2019-09-13 Thread Henry Cohen (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16929191#comment-16929191
 ] 

Henry Cohen commented on AIRFLOW-5447:
--

If it helps, my pod running the webserver and scheduler is on a node with 5 
CPUs and 6 GB of memory.
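
For what it's worth, the node figure is not necessarily what the scheduler container itself gets. A minimal sketch to read the container's own limits from inside the pod (it assumes cgroup v1 paths as typically mounted in Kubernetes containers; nothing here is taken from this report):
{code:python}
# Minimal sketch: read cgroup v1 limits from inside the scheduler container to
# see what it can actually use, as opposed to the node-level 5 CPUs / 6 GB.
from pathlib import Path

def read_int(path):
    # Return the integer in a cgroup file, or None if the file is missing.
    p = Path(path)
    return int(p.read_text().strip()) if p.exists() else None

mem_limit = read_int("/sys/fs/cgroup/memory/memory.limit_in_bytes")
cpu_quota = read_int("/sys/fs/cgroup/cpu/cpu.cfs_quota_us")
cpu_period = read_int("/sys/fs/cgroup/cpu/cpu.cfs_period_us")

if mem_limit is not None:
    print("memory limit: %.2f GiB" % (mem_limit / 2**30))
if cpu_quota and cpu_period and cpu_quota > 0:
    print("cpu limit: %.2f cores" % (cpu_quota / cpu_period))
{code}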






[jira] [Commented] (AIRFLOW-5447) KubernetesExecutor hangs on task queueing

2019-09-12 Thread Henry Cohen (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928907#comment-16928907
 ] 

Henry Cohen commented on AIRFLOW-5447:
--

[~dimberman] py3






[jira] [Commented] (AIRFLOW-5447) KubernetesExecutor hangs on task queueing

2019-09-12 Thread Henry Cohen (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928804#comment-16928804
 ] 

Henry Cohen commented on AIRFLOW-5447:
--

This line in particular is what led me, through my investigation, to 
https://github.com/apache/airflow/blob/v1-10-stable/airflow/contrib/executors/kubernetes_executor.py#L761:
{noformat}
[2019-09-12 17:56:05,186] kubernetes_executor.py:764 INFO - Add task 
('example_subdag_operator', 'start', datetime.datetime(2019, 9, 10, 0, 0, 
tzinfo=), 1) with command ['airflow', 
'run', 'example_subdag_operator', 'start', '2019-09-10T00:00:00+00:00', 
'--local', '--pool', 'default_pool', '-sd', 
'/usr/local/lib/python3.7/site-packages/airflow/example_dags/example_subdag_operator.py']
 with executor_config {}{noformat}
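
For context, that INFO line is emitted just before the executor hands the task to its internal queue. The snippet below is only a rough paraphrase of that hand-off around the linked kubernetes_executor.py#L761; the class and attribute names are approximations for illustration, not a copy of the upstream source:
{code:python}
# Rough paraphrase of the queueing step the log line above points at
# (around kubernetes_executor.py#L761 on the v1-10-stable branch).
# Names and structure are approximations for illustration only.
import logging
from multiprocessing import Queue

log = logging.getLogger(__name__)


class KubernetesExecutorSketch:
    """Stand-in for the part of the executor that enqueues work."""

    def __init__(self):
        # Tasks are handed to the Kubernetes scheduling/watcher machinery
        # through this queue; a consumer is expected to drain it.
        self.task_queue = Queue()

    def execute_async(self, key, command, executor_config=None):
        # Corresponds to the "Add task ... with command ... with
        # executor_config ..." INFO message in the excerpt above.
        log.info('Add task %s with command %s with executor_config %s',
                 key, command, executor_config or {})
        # If nothing ever drains task_queue (or the surrounding
        # synchronization never proceeds), the executor appears stuck at
        # this hand-off, which matches the reported symptom.
        self.task_queue.put((key, command, executor_config))


if __name__ == '__main__':
    logging.basicConfig(level=logging.INFO)
    ex = KubernetesExecutorSketch()
    ex.execute_async(('example_subdag_operator', 'start'), ['airflow', 'run'])
    print(ex.task_queue.get())  # drain so the sketch exits cleanly
{code}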






[jira] [Comment Edited] (AIRFLOW-5447) KubernetesExecutor hangs on task queueing

2019-09-12 Thread Henry Cohen (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928788#comment-16928788
 ] 

Henry Cohen edited comment on AIRFLOW-5447 at 9/12/19 6:00 PM:
---

This is a sample of what I see when running the example DAGs: they queue, but 
when the first one tries to start, it just sits; eventually the processes die 
and the scheduler hangs.
{noformat}
[2019-09-12 17:56:03,034] kubernetes_executor.py:698 INFO - TaskInstance: 
 found in 
queued state but was not launched, rescheduling
 [2019-09-12 17:56:03,043] scheduler_job.py:1376 INFO - Resetting orphaned 
tasks for active dag runs
 [2019-09-12 17:56:03,085] base_job.py:308 INFO - Reset the following 30 
TaskInstances:
 [2019-09-12 17:56:03,092] dag_processing.py:545 INFO - Launched 
DagFileProcessorManager with pid: 35
 [2019-09-12 17:56:03,093] scheduler_job.py:1390 DEBUG - Starting Loop...
 [2019-09-12 17:56:03,093] scheduler_job.py:1401 DEBUG - Harvesting DAG parsing 
results
 [2019-09-12 17:56:03,093] scheduler_job.py:1403 DEBUG - Harvested 0 SimpleDAGs
 [2019-09-12 17:56:03,093] scheduler_job.py:1438 DEBUG - Heartbeating the 
executor
 [2019-09-12 17:56:03,093] base_executor.py:124 DEBUG - 0 running task instances
 [2019-09-12 17:56:03,094] base_executor.py:125 DEBUG - 0 in queue
 [2019-09-12 17:56:03,094] base_executor.py:126 DEBUG - 96 open slots
 [2019-09-12 17:56:03,094] base_executor.py:135 DEBUG - Calling the  sync method
 [2019-09-12 17:56:03,100] scheduler_job.py:1459 DEBUG - Ran scheduling loop in 
0.01 seconds
 [2019-09-12 17:56:03,101] scheduler_job.py:1462 DEBUG - Sleeping for 1.00 
seconds
 [2019-09-12 17:56:03,107] settings.py:54 INFO - Configured default timezone 

 [2019-09-12 17:56:03,109] settings.py:327 DEBUG - Failed to import 
airflow_local_settings.
 Traceback (most recent call last):
 File "/usr/local/lib/python3.7/site-packages/airflow/settings.py", line 315, 
in import_local_settings
 import airflow_local_settings
 ModuleNotFoundError: No module named 'airflow_local_settings'
 [2019-09-12 17:56:03,111] logging_config.py:47 INFO - Successfully imported 
user-defined logging config from log_config.LOGGING_CONFIG
 [2019-09-12 17:56:03,120] settings.py:170 DEBUG - Setting up DB connection 
pool (PID 35)
 [2019-09-12 17:56:03,121] settings.py:213 INFO - settings.configure_orm(): 
Using pool settings. pool_size=5, max_overflow=10, pool_recycle=1800, pid=35
 [2019-09-12 17:56:03,289] settings.py:238 DEBUG - Disposing DB connection pool 
(PID 45)
 [2019-09-12 17:56:03,356] settings.py:238 DEBUG - Disposing DB connection pool 
(PID 41)
 [2019-09-12 17:56:04,101] scheduler_job.py:1474 DEBUG - Sleeping for 0.99 
seconds to prevent excessive logging
 [2019-09-12 17:56:04,126] scheduler_job.py:257 DEBUG - Waiting for 

 [2019-09-12 17:56:04,127] scheduler_job.py:257 DEBUG - Waiting for 

 [2019-09-12 17:56:04,162] settings.py:238 DEBUG - Disposing DB connection pool 
(PID 55)
 [2019-09-12 17:56:04,223] settings.py:238 DEBUG - Disposing DB connection pool 
(PID 58)
 [2019-09-12 17:56:05,095] scheduler_job.py:1390 DEBUG - Starting Loop...
 [2019-09-12 17:56:05,095] scheduler_job.py:1401 DEBUG - Harvesting DAG parsing 
results
 [2019-09-12 17:56:05,097] dag_processing.py:637 DEBUG - Received message of 
type DagParsingStat
 [2019-09-12 17:56:05,098] dag_processing.py:637 DEBUG - Received message of 
type SimpleDag
 [2019-09-12 17:56:05,098] dag_processing.py:637 DEBUG - Received message of 
type SimpleDag
 [2019-09-12 17:56:05,099] dag_processing.py:637 DEBUG - Received message of 
type SimpleDag
 [2019-09-12 17:56:05,099] dag_processing.py:637 DEBUG - Received message of 
type SimpleDag
 [2019-09-12 17:56:05,100] dag_processing.py:637 DEBUG - Received message of 
type DagParsingStat
 [2019-09-12 17:56:05,101] dag_processing.py:637 DEBUG - Received message of 
type DagParsingStat
 [2019-09-12 17:56:05,101] scheduler_job.py:1403 DEBUG - Harvested 4 SimpleDAGs
 [2019-09-12 17:56:05,128] scheduler_job.py:921 INFO - 5 tasks up for execution:
 [2019-09-12 17:56:05,138] scheduler_job.py:953 INFO - Figuring out tasks to 
run in Pool(name=default_pool) with 128 open slots and 5 task instances ready 
to be queued
 [2019-09-12 17:56:05,139] scheduler_job.py:981 INFO - DAG 
example_subdag_operator has 0/48 running and queued tasks
 [2019-09-12 17:56:05,139] scheduler_job.py:981 INFO - DAG 
latest_only_with_trigger has 0/48 running and queued tasks
 [2019-09-12 17:56:05,139] scheduler_job.py:981 INFO - DAG 
latest_only_with_trigger has 1/48 running and queued tasks
 [2019-09-12 17:56:05,139] scheduler_job.py:981 INFO - DAG 
latest_only_with_trigger has 2/48 running and queued tasks
 [2019-09-12 17:56:05,139] scheduler_job.py:981 INFO - DAG 
latest_only_with_trigger has 3/48 running and queued tasks
 [2019-09-12 17:56:05,139] scheduler_job.py:257 DEBUG - Waiting for 
{noformat}
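
To pin down the "processes die and the scheduler hangs" part from inside the scheduler pod, a minimal sketch along these lines could list the scheduler's children and their states. It assumes psutil is installed in the image and that SCHEDULER_PID is filled in by whoever runs it; neither detail comes from this report:
{code:python}
# Diagnostic sketch: list the scheduler's child processes to see whether the
# helpers seen in the log above (e.g. PIDs 35, 41, 45) are still alive once
# the hang sets in. Assumes `pip install psutil` and a known scheduler PID.
import psutil

SCHEDULER_PID = 1  # assumption: adjust to the real `airflow scheduler` PID

scheduler = psutil.Process(SCHEDULER_PID)
print(scheduler.pid, scheduler.status(), ' '.join(scheduler.cmdline()))

# Zombie or missing children while the parent keeps looping would match
# "the processes die and the scheduler hangs" described above.
for child in scheduler.children(recursive=True):
    try:
        print(' ', child.pid, child.status(), ' '.join(child.cmdline()))
    except psutil.Error:
        print(' ', child.pid, '<process info unavailable>')
{code}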


[jira] [Updated] (AIRFLOW-5447) KubernetesExecutor hangs on task queueing

2019-09-09 Thread Henry Cohen (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Cohen updated AIRFLOW-5447:
-
Priority: Blocker  (was: Critical)






[jira] [Updated] (AIRFLOW-5447) KubernetesExecutor hangs on task queueing

2019-09-09 Thread Henry Cohen (Jira)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Cohen updated AIRFLOW-5447:
-
Priority: Critical  (was: Major)






[jira] [Created] (AIRFLOW-5447) KubernetesExecutor hangs on task queueing

2019-09-09 Thread Henry Cohen (Jira)
Henry Cohen created AIRFLOW-5447:


 Summary: KubernetesExecutor hangs on task queueing
 Key: AIRFLOW-5447
 URL: https://issues.apache.org/jira/browse/AIRFLOW-5447
 Project: Apache Airflow
  Issue Type: Bug
  Components: executor-kubernetes
Affects Versions: 1.10.5, 1.10.4
 Environment: Kubernetes version v1.14.3, Airflow version 1.10.4-1.10.5
Reporter: Henry Cohen
Assignee: Daniel Imberman


Starting in 1.10.4, and continuing in 1.10.5, when using the 
KubernetesExecutor, with the webserver and scheduler running in the kubernetes 
cluster, tasks are scheduled, but when added to the task queue, the executor 
process hangs indefinitely. Based on log messages, it appears to be stuck at 
this line 
https://github.com/apache/airflow/blob/v1-10-stable/airflow/contrib/executors/kubernetes_executor.py#L761



--
This message was sent by Atlassian Jira
(v8.3.2#803003)