Re: [PR] Raising Exception if dag is not in active state while triggering dag [airflow]

via GitHub Sat, 09 Mar 2024 00:41:37 -0800


jscheffl commented on PR #37836:
URL: https://github.com/apache/airflow/pull/37836#issuecomment-1986796235


   > If we do this it must be consistent between CLI and API.
   > 
   > > I think there were related discussions in the past about how to treat a 
paused DAG when it is triggered
   > 
   > I think this was about the trigger DAG from the UI (if it should 
automatically set the DAG to active in case it's paused) cc @jscheffl I guess you 
would know?
   
   I don't remember an explicit discussion or decision about this. At least in 
the UI we handle this explicitly: if the DAG is not enabled, the user has the 
option (on by default) to enable the DAG while triggering, but is not forced 
to do so. (Plus a bit of polish: if the DAG is already active, the option is 
not displayed because it is not needed.)
   
   My personal opinion is that there are valid use cases for triggering a DAG 
even if it is not active. Two that come directly to mind:
   - We use the API as an interface to other systems, which can use it to inject 
workload. But there might be administrative work going on, such that we as Ops 
want the option to temporarily turn off scheduling of a DAG to prevent errors 
or overload. In such a case I would not want to block triggering, or to need 
an additional upstream queue to buffer calls. I see it as a really cool USP 
of Airflow that the DAG runs can be used as a queue.
   - When testing a DAG via the UI/interactively, you might want to prepare test 
data/runs by triggering them and adding the DAG runs to the queue w/o launching 
them immediately. Once all test data is prepared, you enable the DAG (e.g. 
create a couple of runs and test for concurrency; in our case, prepare DAG runs 
for a small load test and see how long processing takes).
   
   There might be more. But I would like to avoid changing this behavior ad hoc 
w/o discussion or a design decision, as it touches and changes the external API 
and its behavior. If it is only about intuitiveness, then I'd point to the 
documentation; it would be fair to extend the docs on this.
   
   I think it would be totally legit to extend the API, like the UI, with an 
option covering potentially three use cases:
   1. Call it, create a DAG run and "I don't care what the state is" == like 
today
   2. Call it, create a DAG run and, if the DAG is disabled, turn on DAG 
scheduling == like possible and default in UI
   3. Call it and raise an exception to caller to signal that no workload is 
accepted == as proposed in this PR
   
   For backwards compatibility, option 1 would need to be the default.
   And of course besides UI make API + CLI consistent :-D
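
To make the three options concrete, here is a rough, Airflow-independent sketch of how such a switch could behave. All names in it (`PausedPolicy`, `FakeDag`, `DagPausedError`, `trigger`) are made up for illustration and are not part of the Airflow API, CLI, or this PR; it only models the decision logic, with option 1 as the default.

```python
from dataclasses import dataclass, field
from enum import Enum


class PausedPolicy(Enum):
    """Hypothetical switch for what to do when a paused DAG is triggered."""
    IGNORE = "ignore"    # option 1: create the run, leave the DAG paused (like today)
    UNPAUSE = "unpause"  # option 2: create the run and turn scheduling on (like the UI default)
    REJECT = "reject"    # option 3: refuse and signal the caller (as proposed in the PR)


class DagPausedError(Exception):
    """Hypothetical error raised when a paused DAG rejects new workload."""


@dataclass
class FakeDag:
    """Stand-in for a DAG; not an Airflow class."""
    dag_id: str
    is_paused: bool = False
    queued_runs: list = field(default_factory=list)


def trigger(dag: FakeDag, run_id: str,
            policy: PausedPolicy = PausedPolicy.IGNORE) -> str:
    """Queue a run for `dag`, applying `policy` if the DAG is paused."""
    if dag.is_paused:
        if policy is PausedPolicy.REJECT:
            raise DagPausedError(f"DAG {dag.dag_id!r} is paused")
        if policy is PausedPolicy.UNPAUSE:
            dag.is_paused = False
    # With IGNORE the run simply waits in the queue until the DAG is enabled.
    dag.queued_runs.append(run_id)
    return run_id
```

With the default kept at `IGNORE`, existing callers would see no change; API and CLI could expose the same switch to stay consistent with each other.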


