My only concern with tying this to the dag_parsing process is that the process might miss SLAs because it takes too long to loop around. I could imagine that a separate thread or component that can read either TimeTable objects or SmartSensor objects and run them might make sense. Ultimately I don’t see anything about SmartSensors that specifically needs to run in a DAG. It could just as easily be a while loop or something embarrassingly parallel (as sensors/timetables shouldn’t depend on each other).
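To make the separate-loop idea concrete, here is a rough sketch of a standalone poller that evaluates independent checks in parallel. Everything in it is an assumption for illustration: `due_dag_ids`, `poll_forever`, and the `checks` mapping are hypothetical names, not Airflow APIs, and a real component would read timetables from the DB rather than from a dict.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timezone


def due_dag_ids(checks, now, max_workers=8):
    """Return the ids whose check fires at `now`.

    `checks` maps a dag_id to a callable(now) -> bool (hypothetical shape).
    The checks do not depend on each other, so they can run in parallel.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda item: (item[0], item[1](now)), checks.items())
        return sorted(dag_id for dag_id, is_due in results if is_due)


def poll_forever(checks, trigger, interval_seconds=5.0, max_iterations=None):
    """Plain while loop: evaluate every check, trigger what is due, sleep, repeat.

    `trigger` is a hypothetical callback (for example, "create a DAG run").
    `max_iterations` exists only so the loop can be stopped in tests.
    """
    done = 0
    while max_iterations is None or done < max_iterations:
        for dag_id in due_dag_ids(checks, datetime.now(timezone.utc)):
            trigger(dag_id)
        done += 1
        time.sleep(interval_seconds)
```

Because the loop interval is independent of how long DAG parsing takes, the SLA concern reduces to how long the slowest single check takes.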
On Thu, Jan 21, 2021 at 11:07 AM, Vikram Koka <vik...@astronomer.io> wrote:

Great discussion. I generally agree with the "Custom scheduling class" / subclass approach, which would run as part of the "scheduler" set of processes rather than as an internal DAG. I do think it would be good to have boundaries on what information this class would operate on and at what frequency. This is primarily from a performance standpoint, though it could be argued that there are security concerns as well. Specifically from the "what information would this have access to" perspective, I think that interface would be helpful in clarifying some of the use cases and making sure that those are covered. One example I was thinking about in the "sunset" case is location. I was originally thinking of a timezone, but this is more specific than that.

On Thu, Jan 21, 2021 at 10:35 AM Ash Berlin-Taylor <a...@apache.org> wrote:

It shouldn't need something as complex (or to my mind hacky) as an internal DAG. The way the scheduler works now, it just looks at two columns on the dag (model) table called, I think, "next_dagrun_after" (the earliest date that the dag run can be created) and "next execution date" (the value to put in the execution date of the dag run when it's created). Both these values are set by the dag parser process, which has full access to run code. Whatever interface we use for defining new schedule expressions should run in the existing process, much like what James C did in a subclass.

Ash

On 21 January 2021 18:21:58 GMT, Daniel Imberman <daniel.imber...@gmail.com> wrote:

I think James's idea sounds pretty good. What would you all think of implementing this similarly to how we handle smart sensors? Have an internal DAG that reads all custom timetables and triggers a DAG if the function returns True? Seems like a pretty simple/customizable solution.
On Wed, Jan 20, 2021 at 5:52 PM, James Timmins <ja...@astronomer.io> wrote:

Django provides a really good model for allowing users to customize the behavior of Class Based Views. It's in line w/ what Daniel/Kaxil and co are saying about a consistent backend class. It uses a standard base class as well as a default concrete implementation. Customization then only requires setting an explicit class if you're overriding the default.

Seems that the interface is more important than the backend mechanism to make this work. There are multiple ways to make this work internally, but the interface should be in line with future plans for hooks/extensible areas.

Just to make things concrete, here's my understanding of what that would look like / what they're suggesting.

BaseTimetable abstract class
- Defines a `get_next_execution_time` method. This method accepts one argument, an arbitrary datetime value. Based on that datetime, this method returns the next time the DAG should start. This makes it easy to schedule past events, and also makes it easy to print out a "dry run" of execution times for testing purposes.
- Defines a `_check_timetable_arguments` method that looks for any existing timetable args in the DAG and makes sure they're used by whatever Timetable class is selected. Error checking.

CronTimetable
- Default Timetable class. Built on BaseTimetable.

If they want a different timetable, they can just extend BaseTimetable and define a custom `get_next_execution_time` method. Then pass the class into the DAG constructor under the `timetable_class` argument. So for `sunset` or `sunrise`, they could easily create a `SolarTimetable` class and pass that in.

`get_next_execution_time` can then be called whenever DAGs are parsed or whenever tasks run.

On Wed, Jan 20, 2021 at 3:53 PM James Coder <jcode...@gmail.com> wrote:

Kaxil you beat me to it.
I actually have a dag where I achieve an irregular schedule by overriding DAG.next_dagrun_info(). If that method were swapped out for an object, it may be a semi-easy way to make the schedule “pluggable”.

James Coder

On Jan 20, 2021, at 6:37 PM, Kaxil Naik <kaxiln...@gmail.com> wrote:

"CronBackend" / "ScheduleIntervalBackend" :D similar to Xcom and Secrets Backend.

It would definitely be good to have custom schedule intervals using functions/classes that are serializable too.

On Wed, Jan 20, 2021 at 11:02 PM QP Hou <q...@scribd.com.invalid> wrote:

On Wed, Jan 20, 2021 at 10:22 AM Daniel Imberman <daniel.imber...@gmail.com> wrote:
>
> I love the idea of allowing users to create their own scheduling
> objects/scheduling python functions. They could either live in the scheduler
> or as a separate process that trips some value in the DB when it is “true”.
> Would be great from a “marketplace” standpoint as well, as users could post
> their custom scheduling objects for others to use.
>

I like this idea as well: a quick escape hatch for custom and complex scheduling behaviors without having to wait for upstream support.
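For reference, a minimal sketch of the BaseTimetable interface James Timmins outlines earlier in the thread. Only `BaseTimetable`, `get_next_execution_time`, and `_check_timetable_arguments` come from his message; `supported_args`, `FixedIntervalTimetable`, and `dry_run` are hypothetical names added for illustration, and the fixed-interval class stands in for the proposed CronTimetable so that no cron parser is needed. None of this is an existing Airflow API.

```python
from abc import ABC, abstractmethod
from datetime import datetime, timedelta


class BaseTimetable(ABC):
    """Base class following the interface proposed in the thread (a sketch)."""

    # Names of the scheduling arguments this timetable understands (assumed).
    supported_args = frozenset()

    def __init__(self, **timetable_args):
        self._check_timetable_arguments(timetable_args)
        self.args = timetable_args

    def _check_timetable_arguments(self, timetable_args):
        # Error checking: reject arguments the selected timetable would ignore.
        unused = set(timetable_args) - set(self.supported_args)
        if unused:
            raise ValueError(f"{type(self).__name__} does not use: {sorted(unused)}")

    @abstractmethod
    def get_next_execution_time(self, after: datetime) -> datetime:
        """Return the next time the DAG should start, given any datetime."""


class FixedIntervalTimetable(BaseTimetable):
    """Stand-in for the default timetable: runs every `interval`."""

    supported_args = frozenset({"interval"})

    def get_next_execution_time(self, after: datetime) -> datetime:
        return after + self.args["interval"]


def dry_run(timetable, start, count):
    """List the next `count` execution times after `start`, for testing."""
    times = []
    current = start
    for _ in range(count):
        current = timetable.get_next_execution_time(current)
        times.append(current)
    return times
```

A `SolarTimetable` would subclass `BaseTimetable` the same way, with something like a location in `supported_args`, which is where Vikram's point about bounding the inputs comes in.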