It's not addressed explicitly in the vote.  Personally I think we should
add a deprecation warning immediately.  I understand that there is the
thought to "give users time to make the change" before adding the warning.
But deprecation warnings are probably the most effective, most relied upon
mechanism by which we signal that change is required.  I don't really see a
benefit to delaying.  And even though 3.0 is not anywhere in sight, some
users will probably end up staying on 2.4 until 3.0 đŸ¤·.  Adding the warning
immediately maximizes the time that users have to make the change, and
maximizes the likelihood that users will receive the warning.  That said, I
would certainly not veto if that's what the group wants to do.  And if it
seems that folks are in favor of this, I can amend the vote thread to add
that provision.

On Sat, Aug 6, 2022 at 12:01 PM Ash Berlin-Taylor <a...@apache.org> wrote:

> One question: are we deprecating the existing parameters straight away or
> leaving it without a warning for one release? (I think someone talked about
> that at one point)
>
> On 6 August 2022 19:46:02 BST, Daniel Standish
> <daniel.stand...@astronomer.io.INVALID> wrote:
>>
>> Well, thanks for forcing us to make the case :)
>>
>> I'll initiate a lazy consensus "vote"
>>
>>
>> On Fri, Aug 5, 2022 at 1:01 PM Vikram Koka <vik...@astronomer.io.invalid>
>> wrote:
>>
>>> Fair points all.
>>> I withdraw my objection and agree with the consolidation proposal.
>>>
>>> Vikram
>>>
>>>
>>> On Fri, Aug 5, 2022 at 10:06 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>>>
>>>> Let me also add to why I think we should consolidate to one param::
>>>>
>>>> * "schedule_interval"
>>>> * "schedule_on"
>>>> * "timetable"
>>>>
>>>> Imagine you are a new user coming to Airflow.
>>>>
>>>> Do you immediately recognise which one is which ? Do you see what the
>>>> difference is?
>>>>
>>>> * "Schedule_on" = "Every Sunday" is equally good as "schedule_on"
>>>> "dataset".
>>>> * "timetable" - what even is a timetable? Is it a "cron TAB" ? Or what?
>>>>
>>>> For me even if we wanted to have different names, the current set of
>>>> names for those params is just terribly ambiguous.
>>>>
>>>> Making it "schedule" and acting depending on what you pass to it is so
>>>> much better and simpler.
>>>>
>>>> J.
>>>>
>>>>
>>>> On Fri, Aug 5, 2022 at 6:57 PM Daniel Standish
>>>> <daniel.stand...@astronomer.io.invalid> wrote:
>>>>
>>>>> @Vikram, yeah, I understand that concern.  Making a breaking change is
>>>>> something we shouldn't do carelessly, without consideration, without good
>>>>> reason.  But I think we have a good reason here.
>>>>>
>>>>> Deprecations are somewhat regular in airflow.  In 2.0 we moved a lot
>>>>> of imports.  We also changed the CLI semantics.  Probably there were other
>>>>> breaking changes that I'm forgetting *(granted sometimes we didn't
>>>>> remove the backward-compatibility immediately in 2.0, and that remains an
>>>>> option here)*.  Probably just about every dag in existence had to be
>>>>> changed then too.  And that's what a major release is for.
>>>>>
>>>>> In 3.0, it looks like we're deprecating execution_date...  That's even
>>>>> bigger.  So, were we to do this, (1) moving to `schedule` wouldn't be the
>>>>> only notable breaking change and (2) by comparison it's a pretty simple
>>>>> change.  And if, as part of the upgrade to a major release, you have to do
>>>>> renames and such anyway, the marginal cost of one more isn't massive.
>>>>>
>>>>> I guess my take is that we can't be chained to our previous decisions
>>>>> forever.  And we have to have the flexibility to adapt our interfaces so
>>>>> that they still make sense and are consistent and don't cause excessive
>>>>> eyebrow wrinkling.  We have adopted semantic versioning to provide a
>>>>> mechanism for signaling when breaking changes are coming.  And we can add
>>>>> tooling to help users along the way.
>>>>>
>>>>> Additionally, we can't *only *look at the burden of such a change for
>>>>> users. We must also weigh this against the burden that we maintain by
>>>>> *not* changing, i.e. by having too many params, or ill-named params.
>>>>>
>>>>>
>>>>> On Thu, Aug 4, 2022 at 2:57 PM Jed Cunningham <
>>>>> jedcunning...@apache.org> wrote:
>>>>>
>>>>>> I'm not in favor of reusing `schedule_interval` for datasets (or
>>>>>> folding `timetable` into it either), primarily because it doesn't feel 
>>>>>> like
>>>>>> a good name to describe what it does. If we don't have the appetite to
>>>>>> consolidate, my vote is we stick with 3 separate params. So preferably 
>>>>>> #1,
>>>>>> if not then #2 for me.
>>>>>>
>>>>>> I'm less concerned about the "everyone has to update every DAG"
>>>>>> issue, as it only becomes a problem for Airflow 3, folks can freely 
>>>>>> ignore
>>>>>> it until then, and when they do update it it's a simple find/replace.
>>>>>>
>>>>>>
>>>>>> On Thu, Aug 4, 2022 at 2:17 PM Andrew Gibbs <
>>>>>> a.r.gibbs.massma...@gmail.com> wrote:
>>>>>>
>>>>>>> I'll just +1 that IMO a UI reminder is a great option.
>>>>>>>
>>>>>>> 2.3.0 added an alert when robots.txt was being scanned, which was
>>>>>>> great, because I ignored it for a while, but eventually caved and looked
>>>>>>> into it. This I would see as more of the same. I currently get loads of
>>>>>>> warnings about things in my config file that have moved; but typically 
>>>>>>> only
>>>>>>> if I have to do a `airflow db upgrade` or `airflow dags backfill`. I see
>>>>>>> the alert, think "I must fix those", and then the Homer effect kicks in,
>>>>>>> and something else takes up the space in my brain. Having them as a 
>>>>>>> banner
>>>>>>> in the GUI would 100% get me to sort it sooner or later.
>>>>>>>
>>>>>>> On Thu, Aug 4, 2022 at 9:32 PM Vikram Koka
>>>>>>> <vik...@astronomer.io.invalid> wrote:
>>>>>>>
>>>>>>>> Hi Daniel,
>>>>>>>>
>>>>>>>> I completely agree with wanting to consolidate, but I am very
>>>>>>>> concerned about people needing to change their existing code.
>>>>>>>> As Ping and Felix also said, I worry that people ignore deprecation
>>>>>>>> warnings, especially since the original code authors don't really look 
>>>>>>>> at
>>>>>>>> logs or other pipeline artifacts unless something is broken.
>>>>>>>>
>>>>>>>> With respect to the options you outlined, I am personally leaning
>>>>>>>> towards (4) i.e. "don't deprecate timetable, but remove schedule_on 
>>>>>>>> (AIP-48
>>>>>>>> has added it in main) and take List[Dataset] in `schedule_interval`". I
>>>>>>>> could be convinced with (3), since I think that "timetable" is 
>>>>>>>> relatively
>>>>>>>> recent and have not seen as much usage for this invocation yet. I am
>>>>>>>> definitely curious about other perspectives on this as well.
>>>>>>>>
>>>>>>>> I have a hard time with option (1) of deprecating "schedule
>>>>>>>> interval", because of the historical nature of it and the code changes 
>>>>>>>> that
>>>>>>>> people may have to make to deal with it.
>>>>>>>>
>>>>>>>> Thanks for raising this topic and the discussion,
>>>>>>>>
>>>>>>>> Vikram
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Aug 4, 2022 at 10:24 AM Daniel Standish
>>>>>>>> <daniel.stand...@astronomer.io.invalid> wrote:
>>>>>>>>
>>>>>>>>> I agree with those (including myself) who have suggested that it
>>>>>>>>> would be great to have a deprecation fix tool.
>>>>>>>>>
>>>>>>>>> I also think that UI notifications would be valuable.  Another
>>>>>>>>> option would be to provide a less intrusive indicator somewhere (like 
>>>>>>>>> when
>>>>>>>>> chrome lets you know you need to update), and then you click on it 
>>>>>>>>> and it
>>>>>>>>> takes you to a page that lists out all the deprecation warnings.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Aug 4, 2022 at 1:03 AM Jarek Potiuk <ja...@potiuk.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> I know we are distracting a bit from the main subject, but I
>>>>>>>>>> think it's worth it.
>>>>>>>>>>
>>>>>>>>>> I think this is a fact of life that people don't remove
>>>>>>>>>> deprecations. But we have a choice of "not doing anything about it" 
>>>>>>>>>> and
>>>>>>>>>> complaining. Or doing something that might improve it.
>>>>>>>>>>
>>>>>>>>>> I usually err on the side of "doing something" when I see a
>>>>>>>>>> problem. I am a doer :).
>>>>>>>>>>
>>>>>>>>>> And I have a possibly controversial proposal. I got the
>>>>>>>>>> inspiration from node.js old versions. Whenever an old version of 
>>>>>>>>>> Node is
>>>>>>>>>> being used, you have a 20 seconds delay artificially added and 
>>>>>>>>>> information
>>>>>>>>>> that in order to remove those 20 seconds you need to upgrade.
>>>>>>>>>>
>>>>>>>>>> Why don't we do that for some really old/serious deprecations ?
>>>>>>>>>> Maybe not 20 seconds. Maybe adding an unremovable RED warning in the 
>>>>>>>>>> UI if
>>>>>>>>>> you have any deprecation unhandled. Something that is annoying 
>>>>>>>>>> enough that
>>>>>>>>>> will cause people to make an action but not intrusive enough to break
>>>>>>>>>> things.
>>>>>>>>>>
>>>>>>>>>> I think this is a very basic psychological trait:
>>>>>>>>>>
>>>>>>>>>> If it does not cost the user anything to leave the deprecations,
>>>>>>>>>> there will be a big number of people who will do that.
>>>>>>>>>> We need to make sure that deprecations are costing our users "a
>>>>>>>>>> little" and that they are aware that they can remove the cost (and 
>>>>>>>>>> know how
>>>>>>>>>> - this is super important that we only show such a message with clear
>>>>>>>>>> instructions on what to do).
>>>>>>>>>>
>>>>>>>>>> Such a message might be annoying in many ways:
>>>>>>>>>>
>>>>>>>>>> * it can obstruct significant part of the screen
>>>>>>>>>> * It can be flashy and had very annoying colours (think bold pink
>>>>>>>>>> fonts on flash-green background for example)
>>>>>>>>>> * It can not allow to do anything before the user closes it with
>>>>>>>>>> a REALLY small 'x' that it is difficult to point at (think cookie 
>>>>>>>>>> banners).
>>>>>>>>>> And when they click they can be greeted with "You know - you can 
>>>>>>>>>> actually
>>>>>>>>>> remove this warning by doing THIS".
>>>>>>>>>> * and we have probably more options - our UX specialists might
>>>>>>>>>> have some good ideas here
>>>>>>>>>>
>>>>>>>>>> For anyone who complains about "people do not act on
>>>>>>>>>> deprecation." - yeah this is true.
>>>>>>>>>> But one universal truth there is that if we won't act, this will
>>>>>>>>>> change nothing in the behaviour of our users - this is 100% certain.
>>>>>>>>>>
>>>>>>>>>> If we do something - it might improve it. We will not get a 100%
>>>>>>>>>> success for sure, but we can at least do something to improve it.
>>>>>>>>>>
>>>>>>>>>> WDYT?
>>>>>>>>>>
>>>>>>>>>> J.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Aug 4, 2022 at 6:52 AM Felix Uellendall
>>>>>>>>>> <felue...@pm.me.invalid> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Ping,
>>>>>>>>>>>
>>>>>>>>>>> I had this idea about auto-fixing deprecation warnings too when
>>>>>>>>>>> I started the airflint project.
>>>>>>>>>>> I totally agree with what you said that often people ignore
>>>>>>>>>>> these warnings and therefore a cli tool would come in very handy. 
>>>>>>>>>>> It could
>>>>>>>>>>> look similar to the airflint project and also be based on the python
>>>>>>>>>>> refactor package, but I would keep it out of this discussion for 
>>>>>>>>>>> now.
>>>>>>>>>>>
>>>>>>>>>>> Regarding the schedule arg, I don’t have a strong opinion there
>>>>>>>>>>> but also tend to having one schedule arg, because of simplicity
>>>>>>>>>>> for the dag author.
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Felix
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Sent from Proton Mail for iOS
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Aug 4, 2022 at 00:23, Ping Zhang <pin...@umich.edu>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi Daniel,
>>>>>>>>>>>
>>>>>>>>>>> +1 for ‘schedule' and thanks for bringing this up.
>>>>>>>>>>>
>>>>>>>>>>> I agree with Vikram, we should be very careful about deprecating
>>>>>>>>>>> existing params even if we have warnings around it. Not sure if 
>>>>>>>>>>> this is a
>>>>>>>>>>> general case, but I notice that people ignore warnings all the time 
>>>>>>>>>>> until
>>>>>>>>>>> it starts to break. Thus I am thinking if airflow can have a cli 
>>>>>>>>>>> tool to
>>>>>>>>>>> help users to batch update these deprecating params just like 
>>>>>>>>>>> python's
>>>>>>>>>>> futurize tool. This can greatly ease the burden of upgrading the 
>>>>>>>>>>> airflow
>>>>>>>>>>> version.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Ping
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Aug 3, 2022 at 3:13 PM Daniel Standish
>>>>>>>>>>> <daniel.stand...@astronomer.io.invalid> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> In case you forgot about us @Vikram, gentle nudge
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Aug 1, 2022 at 8:40 AM Daniel Standish <
>>>>>>>>>>>> daniel.stand...@astronomer.io> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Well, this is a discussion thread, so let's discuss!
>>>>>>>>>>>>>
>>>>>>>>>>>>> Vikram, what kind of implications are you thinking of? Maybe
>>>>>>>>>>>>> you could provide a little more detail about your concerns?
>>>>>>>>>>>>>
>>>>>>>>>>>>> There are certainly alternatives, and of course each of them
>>>>>>>>>>>>> has tradeoffs.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Here are the options I see:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1. consolidate to schedule (deprecate schedule_interval and
>>>>>>>>>>>>> timetable; the proposal under discussion in this thread)
>>>>>>>>>>>>> 2. do nothing (keep 3 different scheduling params)
>>>>>>>>>>>>> 3. consolidate to schedule_interval (deprecate timetable)
>>>>>>>>>>>>> 4. don't deprecate timetable, but remove schedule_on (AIP-48
>>>>>>>>>>>>> has added it in main) and take List[Dataset] in 
>>>>>>>>>>>>> `schedule_interval`
>>>>>>>>>>>>>
>>>>>>>>>>>>> Which would you propose we go with?
>>>>>>>>>>>>>
>>>>>>>>>>>>> If we have a veto against 1 (and again, we're in hypothetical
>>>>>>>>>>>>> territory since this is still a discussion thread), my vote would
>>>>>>>>>>>>> definitely be to go with 3.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The param `schedule_interval` as catch all is not *sooo *bad,
>>>>>>>>>>>>> but `schedule` definitely seems better. From a new user's 
>>>>>>>>>>>>> perspective,
>>>>>>>>>>>>> having misleading names is confusing, and can be a negative when 
>>>>>>>>>>>>> looking at
>>>>>>>>>>>>> adoption. And for a tenured user, it can nag at you. And dealing 
>>>>>>>>>>>>> with
>>>>>>>>>>>>> deprecations as an Airflow user, well it's sort of part of the 
>>>>>>>>>>>>> job, though
>>>>>>>>>>>>> it can be bothersome, certainly.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Just to look at / compare feasibility.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Deprecating schedule_interval, it would be more prominent than
>>>>>>>>>>>>> some other deprecations because it's a DAG param. That said, I 
>>>>>>>>>>>>> think it's
>>>>>>>>>>>>> pretty straightforward to handle. For one, it's at the 
>>>>>>>>>>>>> cardinality of DAG
>>>>>>>>>>>>> and not *task*, so there is less to change with than some
>>>>>>>>>>>>> other deprecations. And being kwargs-only, it's an easy find and 
>>>>>>>>>>>>> replace
>>>>>>>>>>>>> that could be automated with an `upgrade-check` script. It's very 
>>>>>>>>>>>>> easy to
>>>>>>>>>>>>> detect, either to warn about, or ultimately, after upgrade, to 
>>>>>>>>>>>>> fail the DAG
>>>>>>>>>>>>> parse (which forces user to fix). Perhaps we could even 
>>>>>>>>>>>>> experiment with a
>>>>>>>>>>>>> deprecation fix tool, so that users could get help fixing 
>>>>>>>>>>>>> deprecations, not
>>>>>>>>>>>>> necessarily in the context of an upgrade, since we're not looking 
>>>>>>>>>>>>> at 3.0
>>>>>>>>>>>>> any time soon.
>>>>>>>>>>>>>
>>>>>>>>>>>>> We could compare with execution_date, if execution_date is to
>>>>>>>>>>>>> be removed in 3.0. I think the deprecation of execution_date will 
>>>>>>>>>>>>> be a more
>>>>>>>>>>>>> difficult thing for users to handle. For one, potentially higher
>>>>>>>>>>>>> cardinality, since it is used in tasks. But also, the use of 
>>>>>>>>>>>>> execution_date
>>>>>>>>>>>>> can be harder to detect, since it can be in templates and 
>>>>>>>>>>>>> elsewhere in
>>>>>>>>>>>>> custom operators. *(With schedule_interval, since it's a
>>>>>>>>>>>>> top-level dag kwarg, it will be easier.)* And execution_date
>>>>>>>>>>>>> was replaced with `logical_date` for similar reasons -- avoiding 
>>>>>>>>>>>>> user
>>>>>>>>>>>>> confusion.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Jul 29, 2022 at 2:42 PM Vikram Koka
>>>>>>>>>>>>> <vik...@astronomer.io.invalid> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> -1 from me.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Though I agree in principle with the idea of consolidation, I
>>>>>>>>>>>>>> don't think we should be doing this yet until we understand the
>>>>>>>>>>>>>> implications completely.
>>>>>>>>>>>>>> I am really not in favor of deprecation of the existing
>>>>>>>>>>>>>> params, unless there is really no alternative.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Jul 29, 2022 at 2:37 PM Daniel Standish
>>>>>>>>>>>>>> <daniel.stand...@astronomer.io.invalid> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So far, seems all in favor.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I will just highlight, in case it's not clear.... when we
>>>>>>>>>>>>>>> release this (presumably 2.4), basically every single dag in 
>>>>>>>>>>>>>>> existence will
>>>>>>>>>>>>>>> start emitting deprecation warnings, and prior to 3.0, 
>>>>>>>>>>>>>>> basically every dag
>>>>>>>>>>>>>>> in existence will need to be updated.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thankfully, for most people, this should be an easy update
>>>>>>>>>>>>>>> since DAG is kwargs only, IIRC.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Jul 29, 2022, 2:27 PM Brent Bovenzi
>>>>>>>>>>>>>>> <br...@astronomer.io.invalid> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> +1 to consolidating and "schedule".
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Jul 29, 2022, 10:12 PM Drew Hubl
>>>>>>>>>>>>>>>> <drew.h...@astronomer.io.invalid> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> +1 for ‘schedule', and another +1 to the importance of
>>>>>>>>>>>>>>>>> consolidating to one being more important than the name of 
>>>>>>>>>>>>>>>>> that one
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Jul 29, 2022, at 1:58 PM, Josh Fell <
>>>>>>>>>>>>>>>>> josh.d.f...@astronomer.io.INVALID> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Consolidating to a single scheduling parameter is a big +1
>>>>>>>>>>>>>>>>> from me. `schedule` seems like a nice catch-all.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Fri, Jul 29, 2022 at 9:10 AM Jed Cunningham <
>>>>>>>>>>>>>>>>> jedcunning...@apache.org> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> +1 on moving to a single `schedule` param.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>

Reply via email to