Indeed - if we want to really "deprecate" (and drop in 3.0) support for non-serializable params, then JSON serialization is the way to go.
The only "benefit" of using YAML is "set" support but if we are going to "deprecate" non-serializable params, then we can easily include "set" in being 'non-serializable" and use JSON. There is a good reason why JSON does not support sets (because in serialized form it is exactly the same as list - there is no difference, really) J. On Thu, Nov 11, 2021 at 9:24 PM Kaxil Naik <[email protected]> wrote: > > -1 for breaking it again. We should go ahead with a deprecation route. JSON > serializable makes sense, I am not fully convinced if YAML serializable is > any better ! > > Another note is the current params which Daniel fixed now use Serialization > from our DAG Serialization - > https://github.com/apache/airflow/blob/7622f5e08261afe5ab50a08a6ca0804af8c7c7fe/airflow/serialization/serialized_objects.py#L289-L330 > so it currently supports Timedelta, Timezone, Datetime, Tuple etc objects. > > But I agree with Daniel that we should deprecate and only support JSON > Serializable objects to make it fully featured like overriding it via CLI, > API and Webserver. > > Regards, > Kaxil > > On Thu, Nov 11, 2021 at 6:51 PM Daniel Standish <[email protected]> wrote: >> >> Yeah I agree with you. >> >> The one other thing I'll mention is the other use case that was raised in an >> issue was `datetime` which like set is also not json-serializable, but >> unlike set would probably not be yaml-serializable. >> >> But yeah let's see if others can help establish a consensus. >> >> Small note: another thing sortof in support of your position is that, if you >> can't override the param from UI and CLI and these other means (because it's >> not expecting something that can be serialize that way), then you don't even >> need it to be a `param` but it could just as easily be an operator or task >> arg instead. I.e. if you're staying in python you can use python; but >> what's special about params is they can be set from outside. The other side >> of this though is that probably arbitrary dag params probably _did_ work >> with trigger dag run operator. >> >> >> On Thu, Nov 11, 2021 at 10:06 AM Jarek Potiuk <[email protected]> wrote: >>> >>> > So you would say in 2.2.3 we "break" that again? Not wait for 3.0 >>> > because, even if it was perhaps an accident, support was there? >>> >>> Yep. If others agree this is the way to go, I'd be happy to. We had >>> some other changes that worked "accidentally" but were never stated >>> that they work this way. I think it's a pretty good assumption (even >>> if it is implicit) that "params" set for dag triggering are "data" and >>> not "code". It could be python callable of course, but I think it's >>> kinda "abuse" - especially that it excludes triggering via CLI/UI. >>> >>> The thing is that we do not "specify" what is our "stable API" and >>> what is not also in many other places, there is a certain ambiguity >>> for some of them. Of course it's not only whether it is "specified" or >>> "not", it's also much more "whether a lot of people could interpret >>> and use it in this way". I think (but maybe others can chime in) - it >>> would be reasonable to assume that using callables or other Python >>> Code is expected when we already have: >>> a) CLI with string/JSON input >>> b) UI with string/JSON input >>> c) ability of using JSON-schema to actually verify the parameters >>> (this last one actually shows a clear intention of having those >>> parameters "data only").. >>> >>> J.
