mblakely commented on issue #36369:
URL: https://github.com/apache/airflow/issues/36369#issuecomment-1912713796
@jscheffl sorry for the delay.
That is a great question. At a high level, the goal is to create an optional
parameter so that the code the DAG calls can define the default.
Our use case has a library of code that is sometimes (increasingly, usually)
run from Airflow but also needs to be run in other environments.
For example, imagine a file like:
```
# data_processing.py
from pathlib import Path

def process(image_directory: Path, should_parallelize: bool = True, ...):
    ...
```
We created a DAG as a wrapper around that function, using Airflow Params to
parameterize some of the values passed to the function the DAG delegates to:
```
# data_processing_dag.py
from data_processing import process
...
with DAG(
    "process_data",
    params={
        "image_directory": Param(type="string",
                                 default="/path/to/a/shared/directory"),
        # "null" is allowed so the value can be left out of the DAG conf
        "should_parallelize": Param(type=["boolean", "null"]),
    },
):
    @task
    def do_stuff(params, **kwargs):
        process(image_directory=Path(params["image_directory"]),
                should_parallelize=params.get("should_parallelize"))
    do_stuff()
```
Our actual setup is different, so please don't nitpick the code above; the
goal is to give a rough sense of how we use this.
This allows us to keep the default behavior defined in a centralized place
(in the `process` function) while still allowing the DAG to override it. If
null is not allowed, then it is not possible to remove the value from the DAG
conf and successfully submit it.
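As a rough illustration of that pattern, here is a minimal sketch in plain Python (no Airflow import) of forwarding only the non-null values, so that the callee's own default applies when the param is left as null. `drop_nulls` is a hypothetical helper, and the `process` below is a simplified stand-in for the library function, not the real one.

```python
from pathlib import Path
from typing import Any, Mapping


def process(image_directory: Path, should_parallelize: bool = True) -> bool:
    # Stand-in for the library function; it returns the flag it ended up
    # with so the effect of the default is easy to see.
    return should_parallelize


def drop_nulls(params: Mapping[str, Any], keys: list[str]) -> dict[str, Any]:
    """Hypothetical helper: keep only the listed keys whose value is not None."""
    return {k: params[k] for k in keys if params.get(k) is not None}


# Simulates the params a DAG run would receive when the value is left as null.
params = {
    "image_directory": "/path/to/a/shared/directory",
    "should_parallelize": None,
}

overrides = drop_nulls(params, ["should_parallelize"])
result = process(image_directory=Path(params["image_directory"]), **overrides)
```

With the null value dropped, `process` is called without `should_parallelize` and uses its own default; an explicit `False` is not `None`, so it would still be forwarded as an override.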
We have other similar use cases where we have "optional" types of strings,
numbers, integers, etc.
There are a few more complicated ones like:
```
default={"dry_run": None},
type="object",
required=["dry_run"],
properties={"dry_run": {"type": ["boolean", "null"]}},
```
or
```
default=[[50, 48], [50, 72], [70, 48], [70, 72]],
type=["null", "array"],
items={
    "type": "array",
    "prefixItems": [
        {"type": "integer"},
        {"type": "integer"},
    ],  # first and second element must be ints
    "items": False,  # no additional items can be added to the array
},
```
As for the other union types, I agree that would add a ton of complexity.
For our use cases, I can't think of a need to allow for different types besides
a type and null.
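Since every case here is "one type plus null", the schema keywords could be generated rather than repeated. A minimal sketch, assuming a hypothetical `nullable` helper (not part of Airflow) that only builds the keyword arguments as a plain dict:

```python
from typing import Any


def nullable(json_type: str, **schema: Any) -> dict[str, Any]:
    """Hypothetical helper: keyword arguments for a "type or null" schema.

    Extra JSON Schema keywords (e.g. maxLength) are passed through unchanged.
    """
    return {"type": [json_type, "null"], "default": None, **schema}


optional_bool = nullable("boolean")
optional_name = nullable("string", maxLength=64)
```

The resulting dict could then be unpacked into a param definition, e.g. `Param(**nullable("boolean"))`, assuming `Param` accepts `default` and the schema keywords as keyword arguments the way the snippets above use it.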