Shlomi Cohen created AIRFLOW-6032:
-------------------------------------
Summary: Sagemaker sensors with log printing are causing worker to
stuck
Key: AIRFLOW-6032
URL: https://issues.apache.org/jira/browse/AIRFLOW-6032
Project: Apache Airflow
Issue Type: Bug
Components: scheduler
Affects Versions: 1.10.5
Reporter: Shlomi Cohen
Hi,
We are trying to use the SageMaker sensors to wait on long-running tasks in
SageMaker.
The problem is that the scheduler fills up with sensors and stops functioning.
We tried changing the sensor mode to "reschedule" and also lowering its
priority below that of other tasks, but that didn't help.
The symptom: with a sensor that pokes once every 60 seconds and a task that
takes 20 minutes, only one line like this appears in the log:
{{Rescheduling task, marking task as UP_FOR_RESCHEDULE}}
Writing our own sensor works as expected, and the line above appears as many
times as needed until the job finishes.
It looks like something is wrong with the AwsHook, which uses a connection to
get the logs, or something along those lines.
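
For reference, a minimal sketch of the status-only check a custom sensor like ours could use in its poke method. It does not stream logs. It assumes the status is read from the TrainingJobStatus field of SageMaker's DescribeTrainingJob response; the function name and the injected `describe` callable are hypothetical, chosen so the logic can be exercised without AWS access.

```python
from typing import Callable, Dict

# Values of the SageMaker TrainingJobStatus field that end the wait.
FAILED_STATES = {"Failed", "Stopped"}
DONE_STATES = {"Completed"}


def training_job_finished(describe: Callable[[str], Dict], job_name: str) -> bool:
    """Return True once the job has completed, False while it is still running.

    `describe` is a hypothetical injection point. In a real sensor it could be
    something like:
        lambda name: boto3.client("sagemaker").describe_training_job(
            TrainingJobName=name)
    """
    status = describe(job_name)["TrainingJobStatus"]
    if status in FAILED_STATES:
        # Fail the task instead of polling a job that will never complete.
        raise RuntimeError(
            "SageMaker job %s ended with status %s" % (job_name, status)
        )
    # "InProgress" and "Stopping" fall through here and keep the sensor waiting.
    return status in DONE_STATES
```

A sensor's poke method would return this value directly; in reschedule mode, each False result should release the worker slot until the next poke.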
--
This message was sent by Atlassian Jira
(v8.3.4#803005)