jscheffl commented on PR #37836: URL: https://github.com/apache/airflow/pull/37836#issuecomment-1986796235
> If we do this it must be consistent between CLI and API.
>
> > I think there were related discussions in the past about how to treat a paused DAG when it is triggered
>
> I think this was about the trigger DAG from the UI (if it should automatically set the DAG to active in case it's paused) cc @jscheffl I guess you would know?

I don't recall an explicit discussion or decision about this. At least in the UI we handle it explicitly: if the DAG is not enabled, the user has the option (on by default) to enable the DAG while triggering, but is not forced to use it. (A small nicety: if the DAG is already active, the option is not displayed because it is not needed.)

My personal opinion is that there are valid use cases for triggering a DAG even if it is not active. Two that come to mind directly:

- We use the API as an interface to other systems, and they can use the API to inject workload. But there might be administrative things going on, such that we as Ops want the option to temporarily turn off scheduling of a DAG to prevent errors or overload. In such a case I would not like to block triggering, or to need an additional upstream queue to buffer calls. In particular, I see it as a really cool USP of Airflow that DAG runs can be used as a queue.
- When testing a DAG via the UI/interactively, you might want to prepare test data/runs by triggering them and adding the DAG runs to the queue without launching them immediately. Once all test data is prepared (e.g. create a couple of runs and test for concurrency; in our case, prepare DAG runs for a small load test and see how long processing takes), the DAG can then be enabled.

There might be more. But I would like to avoid changing this behavior ad hoc, without discussion or a design decision, as it touches and changes the external API and its behavior. If it is only about intuitiveness, then I'd refer to documentation. It would be fair to extend the docs on this.

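To make the "DAG runs as a queue" point concrete, here is a toy sketch of the behavior described above. It is not Airflow code: `ToyDag`, `trigger` and `schedule` are hypothetical names invented for illustration, and the real scheduler is far more involved. It only shows that runs are accepted while the DAG is paused and are picked up once it is enabled.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ToyDag:
    """Hypothetical stand-in for a DAG; not an Airflow class."""

    dag_id: str
    is_paused: bool = False
    queued_runs: deque = field(default_factory=deque)
    started_runs: list = field(default_factory=list)

    def trigger(self, run_id: str) -> None:
        # Triggering always records a run, whether the DAG is paused or not.
        self.queued_runs.append(run_id)

    def schedule(self) -> None:
        # Queued runs are only picked up while the DAG is not paused.
        if self.is_paused:
            return
        while self.queued_runs:
            self.started_runs.append(self.queued_runs.popleft())


dag = ToyDag("load_test", is_paused=True)
for i in range(3):
    dag.trigger(f"run_{i}")

dag.schedule()  # paused: nothing starts, the runs stay queued
queued_while_paused = len(dag.queued_runs)

dag.is_paused = False
dag.schedule()  # enabled: the buffered runs are picked up in order
```

In this model the paused DAG acts as the buffer, which is why an upstream queue is not needed in the first use case.
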

I think it would be totally legit to extend the API to be like the UI and provide an option covering the potentially three use cases:

1. Call it, create a DAG run and "I don't care what the state is" == like today
2. Call it, create a DAG run and, if the DAG is disabled, turn on DAG scheduling == as possible and default in the UI
3. Call it and raise an exception to the caller to signal that no workload is accepted == as proposed in this PR

For backwards compatibility, option 1 would need to be the default. And of course, besides the UI, make API + CLI consistent :-D

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
