Well, this is a discussion thread, so let's discuss!

Vikram, what kind of implications are you thinking of?  Maybe you could
provide a little more detail about your concerns?

There are certainly alternatives, and of course each of them has tradeoffs.

Here are the options I see:

1. consolidate to schedule (deprecate schedule_interval and timetable; the
proposal under discussion in this thread)
2. do nothing (keep 3 different scheduling params)
3. consolidate to schedule_interval (deprecate timetable)
4. don't deprecate timetable, but remove schedule_on (AIP-48 has added it
in main) and take List[Dataset] in `schedule_interval`

Which would you propose we go with?

If we have a veto against 1 (and again, we're in hypothetical territory
since this is still a discussion thread), my vote would definitely be to go
with 3.

The param `schedule_interval` as catch all is not *sooo *bad, but
`schedule` definitely seems better.  From a new user's perspective, having
misleading names is confusing, and can be a negative when looking at
adoption.  And for a tenured user, it can nag at you.  And dealing with
deprecations as an Airflow user, well it's sort of part of the job, though
it can be bothersome, certainly.

Just to look at / compare feasibility.

Deprecating schedule_interval, it would be more prominent than some other
deprecations because it's a DAG param.  That said, I think it's pretty
straightforward to handle.  For one, it's at the cardinality of DAG and not
*task*, so there is less to change with than some other deprecations.  And
being kwargs-only, it's an easy find and replace that could be automated
with an `upgrade-check` script.  It's very easy to detect, either to warn
about, or ultimately, after upgrade, to fail the DAG parse (which forces
user to fix).  Perhaps we could even experiment with a deprecation fix
tool, so that users could get help fixing deprecations, not necessarily in
the context of an upgrade, since we're not looking at 3.0 any time soon.

We could compare with execution_date, if execution_date is to be removed in
3.0.  I think the deprecation of execution_date will be a more difficult
thing for users to handle.  For one, potentially higher cardinality, since
it is used in tasks.  But also, the use of execution_date can be harder to
detect, since it can be in templates and elsewhere in custom operators.  *(With
schedule_interval, since it's a top-level dag kwarg, it will be easier.)*
And execution_date was replaced with `logical_date` for similar reasons --
avoiding user confusion.




On Fri, Jul 29, 2022 at 2:42 PM Vikram Koka <vik...@astronomer.io.invalid>
wrote:

> -1 from me.
>
> Though I agree in principle with the idea of consolidation, I don't think
> we should be doing this yet until we understand the implications
> completely.
> I am really not in favor of deprecation of the existing params, unless
> there is really no alternative.
>
>
> On Fri, Jul 29, 2022 at 2:37 PM Daniel Standish
> <daniel.stand...@astronomer.io.invalid> wrote:
>
>> So far, seems all in favor.
>>
>> I will just highlight, in case it's not clear.... when we release this
>> (presumably 2.4), basically every single dag in existence will start
>> emitting deprecation warnings, and prior to 3.0, basically every dag in
>> existence will need to be updated.
>>
>> Thankfully, for most people, this should be an easy update since DAG is
>> kwargs only, IIRC.
>>
>>
>>
>> On Fri, Jul 29, 2022, 2:27 PM Brent Bovenzi <br...@astronomer.io.invalid>
>> wrote:
>>
>>> +1 to consolidating and "schedule".
>>>
>>> On Fri, Jul 29, 2022, 10:12 PM Drew Hubl <drew.h...@astronomer.io.invalid>
>>> wrote:
>>>
>>>> +1 for ‘schedule', and another +1 to the importance of consolidating to
>>>> one being more important than the name of that one
>>>>
>>>> On Jul 29, 2022, at 1:58 PM, Josh Fell <
>>>> josh.d.f...@astronomer.io.INVALID> wrote:
>>>>
>>>> Consolidating to a single scheduling parameter is a big +1 from me.
>>>> `schedule` seems like a nice catch-all.
>>>>
>>>> On Fri, Jul 29, 2022 at 9:10 AM Jed Cunningham <
>>>> jedcunning...@apache.org> wrote:
>>>>
>>>>> +1 on moving to a single `schedule` param.
>>>>>
>>>>
>>>>

Reply via email to