Very good question.

Maybe we can solve it differently. We could recommend JSON and the
basic datasets that JSON supports, but it should be possible for
Airflow to support YAML as input format for params.

Every JSON is also valid YAML and you can use Yaml parser to parse
JSON. Yaml supports sets (though the syntax is cumbersome)
https://yaml.org/type/set.html . So if we support YAML for both API
and UI, if you REALLY want to use set - you will be able to .

My proposal is that all the examples etc. could show JSON and we
should support it, but there should be an asterisk (*) that if you
want advanced features like YAML, you could use YAML. Then we would
not have to deal with deprecation which is problematic as we could
only remove this feature in Airflow 3.

J.

On Wed, Nov 10, 2021 at 7:14 PM Daniel Standish <[email protected]> wrote:
>
> Prior to 2.2.0, you could use non-json-serializable params in a dag. Here's 
> an example with `set`:
>
> @dag.task(params={"a": {1, 2, 3}, "b": [3, 4, 5]})
> def set_param(intersection):
>     print(intersection)
>
>
> set_param("{{ params.a.intersection(params.b).pop() }}")
>
> In 2.2.0 this was broken, and in 2.2.2rc1 it should be fixed.
>
> There's a problem though.  An important part of params is the ability to 
> override them when triggering a dag from the web UI or CLI.  But params 
> supplied through either mechanism should be json-serializable.
>
> So on one hand we allow arbitrary params, and on the other hand we require 
> them to be json serializable.  We can see how this causes a problem in the 
> above example: if you provide `a` as a `list` in the UI, it won't have the 
> method `intersection`.
>
> So the question: should we deprecate non-json-serializable params?
>
> My feeling is yes.  But I'm not sure how widely this flexibility is used in 
> the wild.
>

Reply via email to