Great discussion. I generally agree with the "Custom scheduling class" / subclass approach which would run as part of the "scheduler" set of processes, rather than an internal DAG approach.
I do think it would be good to have boundaries on what information this class would operate on and at what frequency. This is primarily from a performance standpoint, though it could be argued that there are security concerns as well. Specifically from the "what information would this have access to" perspective, I think defining that interface would help clarify some of the use cases and make sure they are covered. One input I was thinking about for the "sunset" example is location. I was originally thinking of a timezone, but location is more specific than that.

On Thu, Jan 21, 2021 at 10:35 AM Ash Berlin-Taylor <a...@apache.org> wrote:

> It shouldn't need something that complex (or to my mind hacky) as an
> internal DAG.
>
> The way the scheduler works now it just looks at two columns on the dag
> (model) table called I think "next_dagrun_after" (which is the earliest
> date that the dag run can be created) and "next execution date" (which is
> the value to put in the execution date of the dag run when it's created).
>
> Both these values are set by the dag parser process, which has full access
> to run code. Whatever interface we use for defining new schedule expressions
> should run in the existing process, much like how James C did in a subclass.
>
> Ash
>
> On 21 January 2021 18:21:58 GMT, Daniel Imberman <
> daniel.imber...@gmail.com> wrote:
>>
>> I think James' idea sounds like a pretty good idea. What would you all
>> think of us doing something similar to how we handle smart sensors for how
>> we implement this? Have an internal DAG that reads all custom timetables
>> and triggers a DAG if the function returns True? Seems like a pretty
>> simple/customizable solution.
>>
>> On Wed, Jan 20, 2021 at 5:52 PM, James Timmins <ja...@astronomer.io>
>> wrote:
>>
>> Django provides a really good model for allowing users to customize the
>> behavior of Class Based Views.
>> It's in line w/ what Daniel/Kaxil and co are
>> saying about a consistent backend class. It uses a standard base class as
>> well as a default concrete implementation. Customization then only requires
>> setting an explicit class if you're overriding the default.
>>
>> Seems that the interface is more important than the backend mechanism to
>> make this work. There are multiple ways to make this work internally, but
>> the interface should be in line with future plans for hooks/extensible
>> areas.
>>
>> Just to make things concrete, here's my understanding of what that would
>> look like / what they're suggesting.
>>
>> *BaseTimetable abstract class*
>> - Defines a `*get_next_execution_time*` method. This method accepts one
>> argument, an arbitrary datetime value. Based on that datetime, this method
>> returns the next time the DAG should start. This makes it easy to schedule
>> past events, and also makes it easy to print out a "dry run" of execution
>> times for testing purposes.
>> - Defines a `*_check_timetable_arguments*` method that looks for any
>> existing timetable args in the DAG and makes sure they're used by whatever
>> Timetable class is selected. Error checking.
>>
>> *CronTimetable* - Default TimetableClass. Built on BaseTimetable.
>>
>> If they want a different timetable, they can just extend BaseTimetable
>> and define a custom `get_next_execution_time` method. Then pass the class
>> into the DAG constructor under the `timetable_class` argument. So for
>> `sunset` or `sunrise`, they could easily create a `SolarTimetable` class
>> and pass that in.
>>
>> `get_next_execution_time` can then be called whenever DAGs are parsed or
>> whenever tasks run.
>>
>> On Wed, Jan 20, 2021 at 3:53 PM James Coder <jcode...@gmail.com> wrote:
>>
>>> Kaxil you beat me to it. I actually have a dag where I achieve an
>>> irregular schedule by overriding DAG.next_dagrun_info().
>>> If that method
>>> were swapped out for an object it may be a semi-easy way to make the
>>> schedule “pluggable”.
>>>
>>> James Coder
>>>
>>> On Jan 20, 2021, at 6:37 PM, Kaxil Naik <kaxiln...@gmail.com> wrote:
>>>
>>> 
>>> "CronBackend" / "ScheduleIntervalBackend" :D similar to Xcom and Secrets
>>> Backend
>>>
>>> Would be definitely good to have Custom Schedule intervals using
>>> functions/class that is Serializable too.
>>>
>>>
>>> On Wed, Jan 20, 2021 at 11:02 PM QP Hou <q...@scribd.com.invalid> wrote:
>>>
>>>> On Wed, Jan 20, 2021 at 10:22 AM Daniel Imberman
>>>> <daniel.imber...@gmail.com> wrote:
>>>> >
>>>> > I love the idea of allowing users to create their own scheduling
>>>> > objects/scheduling python functions. They could either live in the
>>>> > scheduler or as a separate process that trips some value in the DB when it
>>>> > is “true”. Would be great from a “marketplace” standpoint as well as users
>>>> > could post their custom scheduling objects for others to use.
>>>> >
>>>>
>>>> I like this idea as well, a quick escape hatch for custom and complex
>>>> scheduling behaviors without having to wait for upstream support.
>>>>
>>>
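
To make the interface James Timmins describes above a bit more concrete, here is a rough Python sketch of how I read it. This is only an illustration of the proposal in this thread, not an existing Airflow API. `BaseTimetable`, `get_next_execution_time` and `SolarTimetable` are the names from his message; `FixedIntervalTimetable` (a stand-in for the default `CronTimetable` so that no cron parser is needed), `dry_run`, `fake_sunset` and the `sunset_fn` parameter are hypothetical names I made up. The `_check_timetable_arguments` hook is left out. Note that the solar case takes a latitude and longitude, which is the "location" input mentioned at the top of this message.

```python
from abc import ABC, abstractmethod
from datetime import date, datetime, timedelta, timezone
from typing import Callable, List


class BaseTimetable(ABC):
    """Interface from the thread: map a datetime to the next start time."""

    @abstractmethod
    def get_next_execution_time(self, after: datetime) -> datetime:
        """Return the first scheduled time strictly later than `after`."""


class FixedIntervalTimetable(BaseTimetable):
    """Hypothetical stand-in for the default CronTimetable (no cron parsing)."""

    def __init__(self, anchor: datetime, interval: timedelta) -> None:
        self.anchor = anchor
        self.interval = interval

    def get_next_execution_time(self, after: datetime) -> datetime:
        if after < self.anchor:
            return self.anchor
        # Count whole intervals elapsed since the anchor, then step one further.
        steps = (after - self.anchor) // self.interval + 1
        return self.anchor + steps * self.interval


# Hypothetical signature: (day, latitude, longitude) -> timezone-aware sunset.
SunsetFn = Callable[[date, float, float], datetime]


class SolarTimetable(BaseTimetable):
    """Runs at sunset for a location; the sunset calculation is injected."""

    def __init__(self, latitude: float, longitude: float, sunset_fn: SunsetFn) -> None:
        self.latitude = latitude
        self.longitude = longitude
        self.sunset_fn = sunset_fn

    def get_next_execution_time(self, after: datetime) -> datetime:
        # Start one day early so a timezone offset cannot hide today's sunset.
        for offset in range(-1, 3):
            day = after.date() + timedelta(days=offset)
            candidate = self.sunset_fn(day, self.latitude, self.longitude)
            if candidate > after:
                return candidate
        raise ValueError("no sunset found near the given datetime")


def dry_run(timetable: BaseTimetable, start: datetime, count: int) -> List[datetime]:
    """List the next `count` execution times after `start`, for testing."""
    times = []
    current = start
    for _ in range(count):
        current = timetable.get_next_execution_time(current)
        times.append(current)
    return times


def fake_sunset(day: date, latitude: float, longitude: float) -> datetime:
    """Placeholder: pretends sunset is always 18:00 UTC, ignoring location."""
    return datetime(day.year, day.month, day.day, 18, 0, tzinfo=timezone.utc)


if __name__ == "__main__":
    start = datetime(2021, 1, 21, 12, 0, tzinfo=timezone.utc)
    solar = SolarTimetable(52.5, 13.4, fake_sunset)
    for when in dry_run(solar, start, 3):
        print(when.isoformat())
```

A real `SolarTimetable` would replace `fake_sunset` with an actual astronomical calculation. Injecting it as a plain function keeps the timetable itself limited to the inputs it is handed, which fits the point about bounding what information the class operates on.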