My only concern with tying this to the dag_parsing process is that that process 
might miss SLAs because it takes too long to loop around. I could imagine that 
a separate thread or component that reads either TimeTable objects or 
SmartSensor objects and runs them might make sense.
Ultimately I don’t see anything about SmartSensors that specifically needs to 
run in a DAG. It could just as easily be a while loop or something 
embarrassingly parallel (as sensors/timetables shouldn’t depend on each other).
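
To make the "while loop" idea concrete, here is a rough sketch of what such a component might look like. Everything in it is hypothetical: `poll_once`, `run_forever`, the `checks` mapping and the `trigger` callback are names made up for illustration and are not Airflow APIs. It only assumes that each timetable/sensor can be reduced to an independent function of the current time.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timezone
from typing import Callable, Dict, List

# Hypothetical shape: one independent "is it due?" callable per DAG id.
Check = Callable[[datetime], bool]


def poll_once(checks: Dict[str, Check], now: datetime) -> List[str]:
    """Evaluate every check independently and return the ids that are due."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # The checks do not depend on each other, so they can run in parallel.
        results = pool.map(lambda item: (item[0], item[1](now)), checks.items())
        return sorted(dag_id for dag_id, due in results if due)


def run_forever(
    checks: Dict[str, Check],
    trigger: Callable[[str], None],
    stop: threading.Event,
    interval: float = 1.0,
) -> None:
    """Plain polling loop that could live in its own thread or process."""
    while not stop.is_set():
        for dag_id in poll_once(checks, datetime.now(timezone.utc)):
            trigger(dag_id)
        # Sleep between passes, but wake up promptly if asked to stop.
        stop.wait(interval)
```

Because this loop does nothing but evaluate checks, how long one pass takes depends only on the checks themselves and not on how long DAG parsing takes, which is the SLA concern above.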

On Thu, Jan 21, 2021 at 11:07 AM, Vikram Koka <vik...@astronomer.io> wrote:
Great discussion.
I generally agree with the "Custom scheduling class" / subclass approach which 
would run as part of the "scheduler" set of processes, rather than an internal 
DAG approach.
I do think it would be good to have boundaries on what information this class 
would operate on and at what frequency. This is primarily from a performance 
standpoint, though it could be argued that there are security concerns with 
that as well.
Specifically from the "what information would this have access to" perspective, 
I think that interface would be helpful in clarifying some of the use cases and 
making sure that those are covered. One example I was thinking about in the 
"sunset" example is location. I was originally thinking of a timezone, but this 
is more specific than that.


On Thu, Jan 21, 2021 at 10:35 AM Ash Berlin-Taylor < a...@apache.org > wrote:
It shouldn't need something as complex (or to my mind hacky) as an internal 
DAG.

The way the scheduler works now, it just looks at two columns on the dag (model) 
table called, I think, "next_dagrun_after" (the earliest date at which the dag 
run can be created) and "next execution date" (the value to put in the 
execution date of the dag run when it's created).
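
As a rough illustration of that two-column check (a sketch only: the field names below are taken from the description above and may not match the real column names, and `DagRow` / `runs_to_create` are made up for this example rather than scheduler code):

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass
class DagRow:
    """Hypothetical stand-in for one row of the dag (model) table."""
    dag_id: str
    next_dagrun_after: Optional[datetime]    # earliest time the run may be created
    next_execution_date: Optional[datetime]  # execution date to stamp on that run


def runs_to_create(rows: List[DagRow], now: datetime) -> List[Tuple[str, datetime]]:
    """Return (dag_id, execution_date) for every DAG whose next run is due."""
    due = []
    for row in rows:
        # A DAG with no upcoming run has nothing to schedule.
        if row.next_dagrun_after is None or row.next_execution_date is None:
            continue
        if row.next_dagrun_after <= now:
            due.append((row.dag_id, row.next_execution_date))
    return due
```

The point is that the scheduler side needs no knowledge of how the two values were computed, so a new schedule expression only has to change whatever fills them in.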

Both these values are set by the dag parser process, which has full access to 
run code. Whatever interface we add for defining new schedule expressions 
should run in that existing process, much like what James C did in a subclass.

Ash

On 21 January 2021 18:21:58 GMT, Daniel Imberman < daniel.imber...@gmail.com > 
wrote:
I think James's idea sounds like a pretty good idea. What would you all think 
of us doing something similar to how we handle smart sensors for how we 
implement this? Have an internal DAG that reads all custom timetables and 
triggers a DAG if the function returns True? Seems like a pretty 
simple/customizable solution.
On Wed, Jan 20, 2021 at 5:52 PM, James Timmins < ja...@astronomer.io > wrote:
Django provides a really good model for allowing users to customize the 
behavior of Class Based Views. It's in line w/ what Daniel/Kaxil and co are 
saying about a consistent backend class. It uses a standard base class as well 
as a default concrete implementation. Customization then only requires setting 
an explicit class if you're overriding the default.
Seems that the interface is more important than the backend mechanism to make 
this work. There are multiple ways to make this work internally, but the 
interface should be in line with future plans for hooks/extensible areas.
Just to make things concrete, here's my understanding of what that would look 
like / what they're suggesting.
BaseTimetable abstract class
- Defines a `get_next_execution_time` method. This method accepts one 
  argument, an arbitrary datetime value. Based on that datetime, this method 
  returns the next time the DAG should start. This makes it easy to schedule 
  past events, and also makes it easy to print out a "dry run" of execution 
  times for testing purposes.
- Defines a `_check_timetable_arguments` method that looks for any existing 
  timetable args in the DAG and makes sure they're used by whatever Timetable 
  class is selected. Error checking.
CronTimetable
- Default TimetableClass. Built on BaseTimetable.
If they want a different timetable, they can just extend BaseTimetable and 
define a custom `get_next_execution_time` method. Then pass the class into the 
DAG constructor under the `timetable_class` argument. So for `sunset` or 
`sunrise`, they could easily create a `SolarTimetable` class and pass that in.
`get_next_execution_time` can then be called whenever DAGs are parsed or 
whenever tasks run.
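
A minimal sketch of that interface, to make the shape easier to discuss. None of this exists in Airflow: `BaseTimetable` and `get_next_execution_time` follow the names proposed above, while `dry_run` and `DailyAtHourTimetable` are invented here as a toy stand-in for a real cron or solar implementation.

```python
from abc import ABC, abstractmethod
from datetime import datetime, timedelta
from typing import List


class BaseTimetable(ABC):
    """Proposed base class (hypothetical, not an existing Airflow API)."""

    @abstractmethod
    def get_next_execution_time(self, after: datetime) -> datetime:
        """Return the first start time strictly after `after`."""

    def dry_run(self, start: datetime, count: int) -> List[datetime]:
        """List the next `count` start times, e.g. to eyeball a schedule."""
        times = []
        current = start
        for _ in range(count):
            current = self.get_next_execution_time(current)
            times.append(current)
        return times


class DailyAtHourTimetable(BaseTimetable):
    """Toy concrete class: one run per day at a fixed hour."""

    def __init__(self, hour: int) -> None:
        self.hour = hour

    def get_next_execution_time(self, after: datetime) -> datetime:
        candidate = after.replace(hour=self.hour, minute=0, second=0, microsecond=0)
        # If today's slot has already passed (or is exactly now), use tomorrow's.
        if candidate <= after:
            candidate += timedelta(days=1)
        return candidate


timetable = DailyAtHourTimetable(hour=6)
for run_time in timetable.dry_run(datetime(2021, 1, 20, 5, 0), count=3):
    print(run_time.isoformat())
```

A `SolarTimetable` would have the same shape, with the body of `get_next_execution_time` replaced by a sunrise/sunset calculation for a given location.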
On Wed, Jan 20, 2021 at 3:53 PM James Coder < jcode...@gmail.com > wrote:
Kaxil, you beat me to it. I actually have a dag where I achieve an irregular 
schedule by overriding DAG.next_dagrun_info(). If that method were swapped out 
for an object, it may be a semi-easy way to make the schedule “pluggable”.

James Coder
On Jan 20, 2021, at 6:37 PM, Kaxil Naik < kaxiln...@gmail.com > wrote:

"CronBackend" / "ScheduleIntervalBackend" :D similar to Xcom and Secrets Backend
It would definitely be good to have custom schedule intervals using 
functions/classes that are serializable too.

On Wed, Jan 20, 2021 at 11:02 PM QP Hou <q...@scribd.com.invalid> wrote:
On Wed, Jan 20, 2021 at 10:22 AM Daniel Imberman
< daniel.imber...@gmail.com > wrote:
>
> I love the idea of allowing users to create their own scheduling 
> objects/scheduling python functions. They could either live in the scheduler 
> or as a separate process that trips some value in the DB when it is “true”. 
> Would be great from a “marketplace” standpoint as well, as users could post 
> their custom scheduling objects for others to use.
>

I like this idea as well, a quick escape hatch for custom and complex
scheduling behaviors without having to wait for upstream support.
