OK so we have a consensus of 3.  Should I create a voting thread for this?
Namely, formally deprecating non-json-serializable params, for removal in
3.0?

On Thu, Nov 11, 2021 at 2:40 PM Jarek Potiuk <[email protected]> wrote:

> Indeed - if we want to really "deprecate" (and drop in 3.0) support
> for non-serializable params, then JSON serialization is the way to go.
>
> The only "benefit" of using YAML is "set" support but if we are going
> to "deprecate" non-serializable params, then we can easily include
> "set" in being 'non-serializable" and use JSON. There is a good reason
> why JSON does not support sets (because in serialized form it is
> exactly the same as list - there is no difference, really)
>
> J.
>
> On Thu, Nov 11, 2021 at 9:24 PM Kaxil Naik <[email protected]> wrote:
> >
> > -1 for breaking it again. We should go ahead with a deprecation route.
> JSON serializable makes sense, I am not fully convinced if YAML
> serializable is any better !
> >
> > Another note is the current params which Daniel fixed now use
> Serialization from our DAG Serialization -
> https://github.com/apache/airflow/blob/7622f5e08261afe5ab50a08a6ca0804af8c7c7fe/airflow/serialization/serialized_objects.py#L289-L330
> so it currently supports Timedelta, Timezone, Datetime, Tuple etc objects.
> >
> > But I agree with Daniel that we should deprecate and only support JSON
> Serializable objects to make it fully featured like overriding it via CLI,
> API and Webserver.
> >
> > Regards,
> > Kaxil
> >
> > On Thu, Nov 11, 2021 at 6:51 PM Daniel Standish <[email protected]>
> wrote:
> >>
> >> Yeah I agree with you.
> >>
> >> The one other thing I'll mention is the other use case that was raised
> in an issue was `datetime` which like set is also not json-serializable,
> but unlike set would probably not be yaml-serializable.
> >>
> >> But yeah let's see if others can help establish a consensus.
> >>
> >> Small note: another thing sortof in support of your position is that,
> if you can't override the param from UI and CLI and these other means
> (because it's not expecting something that can be serialize that way), then
> you don't even need it to be a `param` but it could just as easily be an
> operator or task arg instead.  I.e. if  you're staying in python you can
> use python; but what's special about params is they can be set from
> outside.  The other side of this though is that probably arbitrary dag
> params probably _did_ work with trigger dag run operator.
> >>
> >>
> >> On Thu, Nov 11, 2021 at 10:06 AM Jarek Potiuk <[email protected]> wrote:
> >>>
> >>> > So you would say in 2.2.3 we "break" that again?  Not wait for 3.0
> because, even if it was perhaps an accident, support was there?
> >>>
> >>> Yep. If others agree this is the way to go, I'd be happy to. We had
> >>> some other changes that worked "accidentally" but were never stated
> >>> that they work this way. I think it's a pretty good assumption (even
> >>> if it is implicit) that "params" set for dag triggering are "data" and
> >>> not "code". It could be python callable of course, but I think it's
> >>> kinda "abuse" - especially that it excludes triggering via CLI/UI.
> >>>
> >>> The thing is that we do not "specify" what is our "stable  API" and
> >>> what is not also in many other places, there is a certain ambiguity
> >>> for some of them. Of course it's not only whether it is "specified" or
> >>> "not", it's also much more "whether a lot of people could interpret
> >>> and use it in this way". I think (but maybe others can chime in)  - it
> >>> would be reasonable to assume that using callables or other Python
> >>> Code is expected when we already have:
> >>> a) CLI with string/JSON input
> >>> b) UI with string/JSON input
> >>> c) ability of using JSON-schema to actually verify the parameters
> >>> (this last one actually shows a clear intention of having those
> >>> parameters "data only")..
> >>>
> >>> J.
>

Reply via email to