I agree with Elad. If I pause a DAG, I expect whatever is running to
continue and the scheduler to stop scheduling dag runs and new tasks. I
don't think we should change the behaviour to fix a chart. That feels
backwards.

Separately: In my view, choosing whether to prevent or run new tasks from the
same dag run after a dag is paused could have gone either way (if you
consider the job to be the entire dag run, then letting the job finish feels
ideal), but I can see why the choice was made, and at this point I think
it's what people expect.

On Thu, Apr 24, 2025 at 2:09 PM Elad Kalif <elad...@apache.org> wrote:

> I am -1 on changing this, especially just to solve the duration calculation.
> The current behavior is key for the drain use case, which is very useful.
>
> I don't think I will change my -1 before
> https://github.com/apache/airflow/issues/22006 is resolved
>
>
> On Thu, Apr 24, 2025 at 7:56 PM Brent Bovenzi <br...@astronomer.io.invalid>
> wrote:
>
> > Yeah, if we do a similar endpoint we should filter it to only include
> > unpaused Dags. We do check if the dag is paused during auto refresh in a
> > lot of places.
> >
> > On Fri, Apr 18, 2025 at 3:44 PM Pedro Nunes Leal
> > <pedro.n.l...@tecnico.ulisboa.pt.invalid> wrote:
> >
> > > On 2025-04-03 19:28, Brent Bovenzi wrote:
> > > > The issue is that duration is based off of start and end dates. If there
> > > > is no end date we usually default to now. But that is misleading when a
> > > > dag run is running but the dag is paused.
> > > > Let me take a look at where we use duration in the 3.0 UI and see if we
> > > > can reduce that confusion. We don't have the "5 longest dag runs" in our
> > > > new dashboard page, which replaces cluster activity. If we wanted that
> > > > feature again, we should be mindful of this and filter out paused dags
> > > > in the API request.
> > > >
> > > >
> > > >
> > > > On Thu, Apr 3, 2025, 1:27 PM Pedro Nunes Leal
> > > > <pedro.n.l...@tecnico.ulisboa.pt.invalid> wrote:
> > > >
> > > > >> On 2025-03-31 22:26, Jens Scheffler wrote:
> > > >> > Hi,
> > > >> >
> > > >> > thanks for working on the bug and raising a PR to fix it.
> > > >> >
> > > > >> > As other committers also commented, I think from a product view I'd
> > > > >> > expect a different resolution. We use "Pause DAG" in most cases for
> > > > >> > administrative or infrastructure problems, to prevent further
> > > > >> > failures and/or to drain infra to switch some backend.
> > > >> >
> > > > >> > I assume when we pause a long-running DAG that is in between
> > > > >> > execution of tasks, we really want to "pause" scheduling; we don't
> > > > >> > want to set it to failed. That would also not be correct, because
> > > > >> > once we un-pause, the running DAGs should continue to work. I see no
> > > > >> > reason for marking this failed and then manually chasing after it to
> > > > >> > reset the state later.
> > > >> >
> > > > >> > My view on this is that, as also proposed in the discussion of the
> > > > >> > bug, we should rather filter the paused DAG from cluster activity
> > > > >> > reporting such that paused DAGs are not reported with excessive
> > > > >> > runtime. Also, if later un-paused, it would be "right" that the
> > > > >> > overall DAG runtime was longer than normal (I would not expect to
> > > > >> > deduct the paused time from the runtime of the DAG).
> > > >> >
> > > > >> > If I want (as operator/admin) to really terminate existing running
> > > > >> > instances, I'd rather walk through Browse -> DAG Runs -> Filter for
> > > > >> > running with the paused DAG id and mark them as failed explicitly.
> > > >> >
> > > >> > Jens
> > > >> >
> > > >> > On 31.03.25 20:50, Pedro Nunes Leal wrote:
> > > >> >> Hello everyone,
> > > >> >>
> > > >> >> Currently, I'm trying to fix this bug:
> > > >> >> https://github.com/apache/airflow/issues/44443
> > > >> >>
> > > > >> >> Basically, the issue is that the DAGs stay stuck in running even
> > > > >> >> though they are paused.
> > > > >> >> Consequently, the duration of the dag run keeps increasing even
> > > > >> >> though the DAG is paused.
> > > >> >>
> > > > >> >> My proposal to solve this problem is to change the DAG's state from
> > > > >> >> running to failed when it is paused, so that its duration stops
> > > > >> >> increasing.
> > > >> >>
> > > >> >> Since this can be an impactful change, I would like to hear what
> > > >> >> others think about it.
> > > >> >>
> > > >> >> Link for the Pull Request:
> > > >> >> https://github.com/apache/airflow/pull/47557
> > > >> >>
> > > >> >>
> > > >> >>
> > ---------------------------------------------------------------------
> > > >> >> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > > >> >> For additional commands, e-mail: dev-h...@airflow.apache.org
> > > >> >>
> > > >> >
> > > >> >
> > > >> That can be a better approach.
> > > >>
> > > > >> However, if I'm not mistaken, the code related to the cluster
> > > > >> activity page doesn't exist in Airflow 3 (the version where I'm
> > > > >> trying to make the changes).
> > > >>
> > > > >> So what should I do in this case?
> > > > >> Is there any other way to solve this problem that doesn't involve
> > > > >> cluster activity?
> > > >>
> > > > >> Changing to the queued state instead of failed was my proposal at
> > > > >> the beginning, and it really pauses the DAG.
> > > > >> This is the type of solution I was thinking of because, as I said
> > > > >> before in the pull request, I feel that the cluster activity behavior
> > > > >> is just a symptom of a bigger problem (the DAGs don't really pause,
> > > > >> they just keep running).
> > > >>
> > > >>
> > > >>
> > > >>
> > > Hello,
> > >
> > > Any update related to the use of duration in the 3.0 UI?
> > >
> > > Maybe this bug isn't really an issue if cluster activity was removed in
> > > the newer version, and it's just something to keep in mind in case
> > > something similar to cluster activity is implemented in the 3.0 UI.
> > >
> > > From what I understand, the current behavior of staying in running with
> > > the duration increasing is what is expected from the pause
> > > functionality.
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > > For additional commands, e-mail: dev-h...@airflow.apache.org
> > >
> > >
> >
>
