[GitHub] [airflow] vandonr-amz opened a new pull request, #28528: Fixes to how DebugExecutor handles sensors

GitBox Wed, 21 Dec 2022 15:12:05 -0800


vandonr-amz opened a new pull request, #28528:
URL: https://github.com/apache/airflow/pull/28528


   Current behavior of the Debug Executor with sensors:
   - When the sensor code is executed, the (copy of the) task is in 
"reschedule" mode, so the sensor releases it and asks for it to be rescheduled 
at `now + poke_interval`.
   - The original task object stays in "poke" mode, so when the rescheduling 
event is received it isn't handled properly, and the task is rescheduled for 
immediate execution (instead of being rescheduled in the future). This takes 
place here: 
https://github.com/apache/airflow/blob/681835a67c89784944f41fce86099bcb2c3a0614/airflow/ti_deps/deps/ready_to_reschedule.py#L47-L52
   
   Because of this, the `poke` method is effectively called in a tight loop, 
hammering whichever API it's querying. If the sensor waits long enough, this 
eventually leads to trouble such as rate-limiting, which hampers debugging, 
the very purpose of this executor.
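   To make the mismatch concrete, here is a toy model of the dependency check, 
assuming (as described above) that it lets a task through immediately whenever 
the task object it inspects is not in "reschedule" mode. `ToyTask` and 
`ready_to_run` are hypothetical names for illustration, not Airflow's API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ToyTask:
    """Hypothetical stand-in for a sensor task; not an Airflow class."""

    mode: str  # "poke" or "reschedule"

    @property
    def reschedule(self) -> bool:
        return self.mode == "reschedule"


def ready_to_run(task: ToyTask, now: datetime, reschedule_date: datetime) -> bool:
    """Toy version of a ready-to-reschedule check (assumed behavior)."""
    if not task.reschedule:
        # Not in reschedule mode: the requested date is never consulted,
        # so the task is considered ready right away.
        return True
    # In reschedule mode: wait until the requested date has passed.
    return now >= reschedule_date


now = datetime(2022, 12, 21, 15, 0, 0)
requested = now + timedelta(seconds=60)  # now + poke_interval

# The original task object is still in "poke" mode: ready immediately.
poke_result = ready_to_run(ToyTask("poke"), now, requested)
# A task that really is in "reschedule" mode waits for the requested date.
reschedule_result = ready_to_run(ToyTask("reschedule"), now, requested)
print(poke_result, reschedule_result)
```

In this model the "poke" task is always ready, which is what produces the 
tight loop when the executed copy keeps asking to be rescheduled.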
   
   In this change, I propose a fix that permanently modifies the `mode` of the 
sensor if the executor is the debug one, so that everything is handled in 
"reschedule" mode.
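   A minimal sketch of the difference between switching only the executed 
copy and switching the task itself. `ToySensor`, `prepare_copy_only` and 
`prepare_permanent` are hypothetical helpers, and the string comparison on the 
executor name is an assumption made for the example; the actual change may be 
structured differently.

```python
import copy


class ToySensor:
    """Hypothetical sensor with only the attribute relevant here."""

    def __init__(self, mode: str = "poke") -> None:
        self.mode = mode


def prepare_copy_only(task: ToySensor, executor_name: str) -> ToySensor:
    """Behavior described above: only the executed copy is switched."""
    task_copy = copy.copy(task)
    if executor_name == "DebugExecutor":
        task_copy.mode = "reschedule"
    return task_copy


def prepare_permanent(task: ToySensor, executor_name: str) -> ToySensor:
    """Proposed idea: switch the original, so every copy agrees with it."""
    if executor_name == "DebugExecutor":
        task.mode = "reschedule"
    return copy.copy(task)


before = ToySensor()
executed_before = prepare_copy_only(before, "DebugExecutor")
print(before.mode, executed_before.mode)  # the two objects disagree

after = ToySensor()
executed_after = prepare_permanent(after, "DebugExecutor")
print(after.mode, executed_after.mode)  # both are in "reschedule" mode
```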
   
   While testing this change, I noticed that the Executor itself also spins in 
a tight loop when it has no task to execute, leading to unnecessary resource 
usage and huge log files. With limited knowledge of how executors work, I'm 
proposing a poor man's fix for this here as well, where the executor would 
sleep for 500 ms if there are no tasks ready to be executed.
   I think this time is short enough that humans won't have to wait too long 
for their tasks to be picked up when ready, and also long enough that the 
amount of logs is manageable and can be reasonably scrolled.
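
   The idle back-off can be sketched as follows. This is a simplified 
stand-alone loop step, not the executor's real code: `run_once`, the task 
queue, and the injectable `sleep` parameter are assumptions made so the 
example can run on its own.

```python
import time
from collections import deque
from typing import Callable, Deque

IDLE_SLEEP_SECONDS = 0.5  # the 500 ms proposed above


def run_once(
    ready_tasks: Deque[Callable[[], None]],
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run one ready task, or back off briefly when nothing is ready.

    Returns True if a task ran, False if the step was idle.
    """
    if not ready_tasks:
        # Nothing to do: sleep instead of spinning and flooding the logs.
        sleep(IDLE_SLEEP_SECONDS)
        return False
    task = ready_tasks.popleft()
    task()
    return True


# Usage: record sleeps instead of actually waiting.
slept = []
ran = []
idle_result = run_once(deque(), sleep=slept.append)
busy_result = run_once(deque([lambda: ran.append("task")]), sleep=slept.append)
print(idle_result, busy_result, slept, ran)
```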


