Re: [I] Retry if failed from queued should be separate from try_number [airflow]
Bowrna commented on issue #38304: URL: https://github.com/apache/airflow/issues/38304#issuecomment-2228571504 @collinmcnulty I have tagged you in this link https://github.com/apache/airflow/pull/39398#issuecomment-2227829264 and we can have further conversation on this part in that PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [I] Retry if failed from queued should be separate from try_number [airflow]
collinmcnulty commented on issue #38304: URL: https://github.com/apache/airflow/issues/38304#issuecomment-2228548855 @Bowrna Mine were running on celery executor, but I think the solution ought to be agnostic to the executor used.
Re: [I] Retry if failed from queued should be separate from try_number [airflow]
Bowrna commented on issue #38304: URL: https://github.com/apache/airflow/issues/38304#issuecomment-2227803308 @collinmcnulty I know it has been a long time, but can you share the executor type on which these task instances were running?
Re: [I] Retry if failed from queued should be separate from try_number [airflow]
potiuk commented on issue #38304: URL: https://github.com/apache/airflow/issues/38304#issuecomment-2056306973 That's a good understanding; it should likely be done somewhere there.
Re: [I] Retry if failed from queued should be separate from try_number [airflow]
Bowrna commented on issue #38304: URL: https://github.com/apache/airflow/issues/38304#issuecomment-2050826701 > The thing is that task_instance is not created yet because ... the task is in the queue. So what needs to happen is that the whole logic should happen in the scheduler, because it's the scheduler (and, to be precise, the executor) that realizes that the task is in the queued state. And it should be a different handling in the executor, not in the task instance; that's the whole complexity of the task. @potiuk The task_instance would be created with its state set to queued, right? I understand that this should be handled in the executor part, the way _fail_tasks_stuck_in_queued is handled. But I can see that the TI models are queried to find the queued tasks, which suggests that the task_instance object has already been created. If I am misunderstanding your point, please let me know. https://github.com/apache/airflow/blob/0af5d923d99591576b3758ab3c694d02dbe152bf/airflow/jobs/scheduler_job_runner.py#L1541-L1576
Re: [I] Retry if failed from queued should be separate from try_number [airflow]
Bowrna commented on issue #38304: URL: https://github.com/apache/airflow/issues/38304#issuecomment-2049541069 Got it ... let me check where this is handled in the executor part where the failed task is moved to the queue again but the count is deducted from the try_number.
Re: [I] Retry if failed from queued should be separate from try_number [airflow]
potiuk commented on issue #38304: URL: https://github.com/apache/airflow/issues/38304#issuecomment-2049503845 The thing is that task_instance is not created yet because ... the task is in the queue. So what needs to happen is that the whole logic should happen in the scheduler, because it's the scheduler (and, to be precise, the executor) that realizes that the task is in the queued state. And it should be a different handling in the executor, not in the task instance; that's the whole complexity of the task.
Re: [I] Retry if failed from queued should be separate from try_number [airflow]
Bowrna commented on issue #38304: URL: https://github.com/apache/airflow/issues/38304#issuecomment-2048824149 The retry logic is handled here in taskinstance.py. The task is marked failed, and the code then checks whether it is eligible for retry; if yes, it is queued again. But having separate logic, such as FAILED_IN_QUEUE and TRY_NUMBER_FOR_QUEUE, to handle tasks that failed in the queue makes sense to me for now. If you see another way, please let me know. https://github.com/apache/airflow/blob/b6ff085679c283cd3ccc3edf20dd3e6b0eaec967/airflow/models/taskinstance.py#L2992-L3015
Re: [I] Retry if failed from queued should be separate from try_number [airflow]
Bowrna commented on issue #38304: URL: https://github.com/apache/airflow/issues/38304#issuecomment-2046334014 @potiuk Currently, if the task is stuck in the queue for too long, we fail the task. If we keep a separate try_number for tasks that failed in the queue, we may not know at retry time why the task failed: whether it failed during the run or because it was stuck in the queue. How do you think we can handle this case? https://github.com/apache/airflow/blob/0af5d923d99591576b3758ab3c694d02dbe152bf/airflow/jobs/scheduler_job_runner.py#L1541-L1576
Re: [I] Retry if failed from queued should be separate from try_number [airflow]
potiuk commented on issue #38304: URL: https://github.com/apache/airflow/issues/38304#issuecomment-2036386909 Assigned you
Re: [I] Retry if failed from queued should be separate from try_number [airflow]
Bowrna commented on issue #38304: URL: https://github.com/apache/airflow/issues/38304#issuecomment-2036328564 Could I check this issue @potiuk?
Re: [I] Retry if failed from queued should be separate from try_number [airflow]
AMK9978 commented on issue #38304: URL: https://github.com/apache/airflow/issues/38304#issuecomment-2028903102 @potiuk Unfortunately, I have had difficulties running the project and its tests, although the code for this feature may not be complex. I may open an issue about my problem because I didn't find anything relevant to it. I unassigned myself.
Re: [I] Retry if failed from queued should be separate from try_number [airflow]
AMK9978 commented on issue #38304: URL: https://github.com/apache/airflow/issues/38304#issuecomment-2010174034 @potiuk As a new contributor to Airflow, I would like to work on this issue!
[I] Retry if failed from queued should be separate from try_number [airflow]
collinmcnulty opened a new issue, #38304: URL: https://github.com/apache/airflow/issues/38304

### Description

I think Airflow should have a configurable number of attempts for re-attempting to launch a task if it was killed for being stuck in queued for too long. Currently, such re-attempts consume task retries, but these are conceptually distinct from a task failing to run at all.

### Use case/motivation

On a certain task that happens to not be idempotent, an Airflow user sets retries to zero intentionally, as a human will need to examine if the task can be safely retried or if manual intervention is necessary. However, if the same task is killed for being stuck in queued, the task never started, so the lack of idempotency does not matter and the task should definitely be re-attempted. Airflow currently does not allow a user to express this set of preferences.

### Related issues

_No response_

### Are you willing to submit a PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
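
The preference described in the use case above (zero retries for a non-idempotent task, but re-attempts when it never left the queue) could be sketched as two independent counters. This is only an illustration in plain Python, not Airflow code: `FailureKind`, `AttemptBudget`, and `max_queued_attempts` are hypothetical names invented for this sketch, and only `retries` corresponds to an existing task setting.

```python
from dataclasses import dataclass
from enum import Enum


class FailureKind(Enum):
    """Hypothetical classification of why an attempt ended."""
    RAN_AND_FAILED = "ran_and_failed"    # the task started and then failed
    STUCK_IN_QUEUED = "stuck_in_queued"  # the task was killed before it ever started


@dataclass
class AttemptBudget:
    retries: int               # mirrors the existing per-task `retries` setting
    max_queued_attempts: int   # hypothetical new setting, not an Airflow option
    run_failures: int = 0
    queued_failures: int = 0

    def record_failure(self, kind: FailureKind) -> bool:
        """Record one failure and return True if the task should be attempted again."""
        if kind is FailureKind.STUCK_IN_QUEUED:
            # Failures in the queue draw only on their own budget.
            self.queued_failures += 1
            return self.queued_failures <= self.max_queued_attempts
        # Failures during the run draw only on the normal retry budget.
        self.run_failures += 1
        return self.run_failures <= self.retries


# The non-idempotent task from the use case: never retry a real run,
# but allow two re-attempts if it was killed while still queued.
budget = AttemptBudget(retries=0, max_queued_attempts=2)
print(budget.record_failure(FailureKind.STUCK_IN_QUEUED))  # True
print(budget.record_failure(FailureKind.RAN_AND_FAILED))   # False
```

Keeping the two counters separate means a task that never started cannot exhaust the retry budget that guards against unsafe re-runs, which is the distinction the issue asks for. Where such counters would live (scheduler, executor, or task instance) is left open here.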