Great discussion.

I generally agree with the "Custom scheduling class" / subclass approach
which would run as part of the "scheduler" set of processes, rather than an
internal DAG approach.

I do think it would be good to have boundaries on what information this
class would operate on and at what frequency. This is primarily from a
performance standpoint, though it could be argued that there are security
concerns with that as well.

Specifically from the "what information would this have access to"
perspective, I think defining that interface would help clarify some of
the use cases and make sure that they are covered. One input I was
thinking about for the "sunset" example is location. I was originally
thinking of a timezone, but location is more specific than that.
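
To make the interface question a bit more concrete, here is a rough Python
sketch, not anything that exists in Airflow today. `BaseTimetable` and
`get_next_execution_time` follow the names James T proposed further down the
thread. `ScheduleContext`, `FixedIntervalTimetable`, the `sunset_for`
callable, and the choice to pass a context object instead of a bare datetime
are all my own assumptions for illustration. The idea is that the context
object is the boundary: a timetable only sees what is in it.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from datetime import date, datetime, timedelta, timezone
from typing import Callable, Optional


@dataclass(frozen=True)
class ScheduleContext:
    """Hypothetical: the only information a timetable is allowed to see."""

    after: datetime                    # find the first run strictly after this
    latitude: Optional[float] = None   # location, more specific than a timezone
    longitude: Optional[float] = None


class BaseTimetable(ABC):
    """Hypothetical base class, using the name proposed in this thread."""

    @abstractmethod
    def get_next_execution_time(self, ctx: ScheduleContext) -> datetime:
        """Return the next time the DAG should start."""


class FixedIntervalTimetable(BaseTimetable):
    """Simple stand-in for a default implementation (assumption, not cron)."""

    def __init__(self, interval: timedelta) -> None:
        self.interval = interval

    def get_next_execution_time(self, ctx: ScheduleContext) -> datetime:
        return ctx.after + self.interval


class SolarTimetable(BaseTimetable):
    """Runs at sunset. The sunset calculation is injected, not implemented."""

    def __init__(self, sunset_for: Callable[[date, float, float], datetime]) -> None:
        self.sunset_for = sunset_for

    def get_next_execution_time(self, ctx: ScheduleContext) -> datetime:
        if ctx.latitude is None or ctx.longitude is None:
            raise ValueError("SolarTimetable needs a location in the context")
        day = ctx.after.date()
        sunset = self.sunset_for(day, ctx.latitude, ctx.longitude)
        if sunset <= ctx.after:
            # Today's sunset has already passed, so use tomorrow's.
            next_day = day + timedelta(days=1)
            sunset = self.sunset_for(next_day, ctx.latitude, ctx.longitude)
        return sunset


def fake_sunset(day: date, latitude: float, longitude: float) -> datetime:
    """Placeholder for a real astronomical calculation: always 18:00 UTC."""
    return datetime(day.year, day.month, day.day, 18, 0, tzinfo=timezone.utc)


if __name__ == "__main__":
    ctx = ScheduleContext(
        after=datetime(2021, 1, 21, 12, 0, tzinfo=timezone.utc),
        latitude=37.77,
        longitude=-122.42,
    )
    print(SolarTimetable(fake_sunset).get_next_execution_time(ctx))
```

Because the method is a pure function of the context, it can be called
repeatedly for a "dry run" of future run times, and the scheduler side can
decide how often it is willing to call it.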



On Thu, Jan 21, 2021 at 10:35 AM Ash Berlin-Taylor <a...@apache.org> wrote:

> It shouldn't need something as complex (or to my mind hacky) as an
> internal DAG.
>
> The way the scheduler works now, it just looks at two columns on the dag
> (model) table called, I think, "next_dagrun_after" (which is the earliest
> date that the dag run can be created) and "next execution date" (which is
> the value to put in the execution date of the dag run when it's created).
>
> Both of these values are set by the dag parser process, which has full
> access to run code. Whatever interface we use for defining new schedule
> expressions should run in that existing process, much like what James C
> did in a subclass.
>
> Ash
>
> On 21 January 2021 18:21:58 GMT, Daniel Imberman <
> daniel.imber...@gmail.com> wrote:
>>
>> I think James's idea sounds pretty good. What would you all
>> think of us doing something similar to how we handle smart sensors for how
>> we implement this? Have an internal DAG that reads all custom timetables
>> and triggers a DAG if the function returns True? Seems like a pretty
>> simple/customizable solution.
>>
>> On Wed, Jan 20, 2021 at 5:52 PM, James Timmins <ja...@astronomer.io>
>> wrote:
>>
>> Django provides a really good model for allowing users to customize the
>> behavior of Class Based Views. It's in line w/ what Daniel/Kaxil and co are
>> saying about a consistent backend class. It uses a standard base class as
>> well as a default concrete implementation. Customization then only requires
>> setting an explicit class if you're overriding the default.
>>
>> Seems that the interface is more important than the backend mechanism to
>> make this work. There are multiple ways to make this work internally, but
>> the interface should be in line with future plans for hooks/extensible
>> areas.
>>
>> Just to make things concrete, here's my understanding of what that would
>> look like / what they're suggesting.
>>
>> *BaseTimetable abstract class*
>> - Defines a `get_next_execution_time` method. This method accepts one
>> argument, an arbitrary datetime value. Based on that datetime, this method
>> returns the next time the DAG should start. This makes it easy to schedule
>> past events, and also makes it easy to print out a "dry run" of execution
>> times for testing purposes.
>> - Defines a `_check_timetable_arguments` method that looks for any
>> existing timetable args in the DAG and makes sure they're used by whatever
>> Timetable class is selected. This is for error checking.
>>
>> *CronTimetable* - Default TimetableClass. Built on BaseTimetable.
>>
>> If they want a different timetable, they can just extend BaseTimetable
>> and define a custom `get_next_execution_time` method. Then pass the class
>> into the DAG constructor under the `timetable_class` argument. So for
>> `sunset` or `sunrise`, they could easily create a `SolarTimetable` class
>> and pass that in.
>>
>> `get_next_execution_time` can then be called whenever DAGs are parsed or
>> whenever tasks run.
>>
>> On Wed, Jan 20, 2021 at 3:53 PM James Coder <jcode...@gmail.com> wrote:
>>
>>> Kaxil you beat me to it. I actually have a dag where I achieve an
>>> irregular schedule by overriding DAG.next_dagrun_info(). If that method
>>> were swapped out for an object it may be a semi-easy way to make the
>>> schedule “pluggable”.
>>>
>>> James Coder
>>>
>>> On Jan 20, 2021, at 6:37 PM, Kaxil Naik <kaxiln...@gmail.com> wrote:
>>>
>>>
>>> "CronBackend" / "ScheduleIntervalBackend" :D similar to Xcom and Secrets
>>> Backend
>>>
>>> It would definitely be good to have custom schedule intervals using
>>> functions/classes that are serializable too.
>>>
>>>
>>> On Wed, Jan 20, 2021 at 11:02 PM QP Hou <q...@scribd.com.invalid> wrote:
>>>
>>>> On Wed, Jan 20, 2021 at 10:22 AM Daniel Imberman
>>>> <daniel.imber...@gmail.com> wrote:
>>>> >
>>>> > I love the idea of allowing users to create their own scheduling
>>>> objects/scheduling python functions. They could either live in the
>>>> scheduler or as a separate process that trips some value in the DB when it
>>>> is “true”. Would be great from a “marketplace” standpoint as well as users
>>>> could post their custom scheduling objects for others to use.
>>>> >
>>>>
>>>> I like this idea as well, a quick escape hatch for custom and complex
>>>> scheduling behaviors without having to wait for upstream support.
>>>>
>>>
