alexkruc commented on PR #25280:
URL: https://github.com/apache/airflow/pull/25280#issuecomment-1204924763
> I wonder if it’s better to only allow links on the DAG, or alternatively
> design the DAG interface like this instead:
>
> ```python
> with DAG(..., owner_links={"owner1": link1, "owner2": link2, ...}):
>     task1 = Operator(..., owner="owner2")
> ```
>
> This would work a bit better with duplicates, and avoid weird cases like
> this:
>
> ```python
> with DAG(..., owner={"name": "uranusjr", "link": "https://uranusjr.com"}):
>     task = Operator(..., owner={"name": "uranusjr", "link": "https://uranusjr.io"})  # Oops!
> ```

@uranusjr We discussed this a bit in the issue description
(https://github.com/apache/airflow/issues/24728), and one of my suggestions
there was to add this parameter at the DAG level. BUT - with a single
DAG-level owner you lose the ability to set multiple owners for a DAG, and
some teams use DAGs in a multi-team manner to define one flow with several
team responsibilities.

Regarding your suggestion, a DAG-level parameter that holds a "map" of all
the owners and links sounds OK, but IMO it's a bit counterintuitive, because
you have to specify the owner twice: once in the map with the link, and a
second time on the task itself.

The way I currently implemented it keeps the link (if you need one) in a
single place, right where you define the owner. But if you think we should
redesign it, I can. It might actually be a bit easier to develop, haha.
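
To make the "owner written twice" point concrete, here is a minimal sketch in plain Python (no Airflow imports) of what the map-based design implies. `link_for` and `unused_link_owners` are hypothetical helpers made up for illustration, and the `example.com` links are placeholders; none of this is Airflow API or code from this PR:

```python
from typing import Dict, Iterable, List, Optional


def link_for(owner: str, owner_links: Dict[str, str]) -> Optional[str]:
    """Return the link registered for an owner name, or None if there is none."""
    return owner_links.get(owner)


def unused_link_owners(task_owners: Iterable[str], owner_links: Dict[str, str]) -> List[str]:
    """Names present in the DAG-level map but set on no task (e.g. a typo)."""
    used = set(task_owners)
    return sorted(name for name in owner_links if name not in used)


# Hypothetical DAG-level map: each name is written here and again on the task.
owner_links = {
    "owner1": "https://example.com/owner1",
    "owner2": "https://example.com/owner2",
}
task_owners = ["owner2", "owner3"]  # owners as set on the individual tasks

links = [link_for(name, owner_links) for name in task_owners]
unused = unused_link_owners(task_owners, owner_links)
```

With a map, the owner name becomes a key that has to match in two places, so a mismatch shows up either as a task owner without a link or as a map entry no task uses.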

About your example: currently `owner` on a DAG is not a parameter, it's an
attribute. You can't set an owner like this:
```python
with DAG(..., owner=<MY_COOL_OWNER>)
```
You have to set it in `default_args`, which propagate to each task anyway.
The `owner` attribute on the DAG object returns a string with all the owners
defined in the DAG. If you set an owner in `default_args` and then set
`task.owner` to something else, no error is raised, because the task-level
`owner` overrides the `default_args` value. I also tried it with the
following DAG:
```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

# Dict-style owners rely on the behaviour proposed in this PR.
owner_obj = {"name": "test_owner_link", "link": "https://www.google.com/search?q=blabla"}
default_args = {
    "owner": owner_obj,
    "start_date": datetime(2021, 9, 9),
    "retries": 1,
    "execution_timeout": timedelta(minutes=30),
}
dag = DAG("test_dag_with_links", default_args=default_args, schedule_interval=None, catchup=False)

with dag:
    # Inherits owner_obj from default_args.
    task = BashOperator(task_id="task_with_link", bash_command="echo Hello")
    # Plain string owner, no link.
    task2 = BashOperator(task_id="task_with_no_link", bash_command='echo "Hello Again"', owner="bla")
    # Different owner with a mailto link.
    task3 = BashOperator(
        task_id="task_with_another_link",
        bash_command='echo "Hello Again"',
        owner={"name": "test_owner_email", "link": "mailto:[email protected]?subject=Mail from Our Site"},
    )
    # Same owner name as owner_obj, but with a different link.
    task4 = BashOperator(
        task_id="task_with_slack_link",
        bash_command='echo "Hello Again"',
        owner={"name": "test_owner_link", "link": "https://www.google.com/search?q=blublu"},
    )
```
Here we set `owner_obj` in the default args, set a few other owners, and in
`task4` reuse the owner name from `owner_obj`, but with a different link.
This flow does not fail, but you're right that it's a bit misleading, because
only the link from `task4` is kept (the reason is that `task.owner` takes
precedence over `default_args`).

I think it's on the user to understand that `default_args` propagate to the
tasks, and how that works.
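
As a rough illustration of that behaviour, here is a sketch in plain Python that mimics the precedence rule and shows one way a single link per owner name can end up being kept. It is not the PR's implementation: `resolve_owner` and `collect_links` are hypothetical names, and the "last entry wins" rule is an assumption made for the example.

```python
from typing import Any, Dict, List, Optional, Union

Owner = Union[str, Dict[str, str]]


def resolve_owner(default_args: Dict[str, Any], task_kwargs: Dict[str, Any]) -> Optional[Owner]:
    """A task-level owner wins; otherwise fall back to default_args."""
    return task_kwargs.get("owner", default_args.get("owner"))


def collect_links(owners: List[Optional[Owner]]) -> Dict[str, str]:
    """Build name -> link; a later entry with the same name replaces an earlier one."""
    links: Dict[str, str] = {}
    for owner in owners:
        # String owners and missing owners carry no link and are skipped.
        if isinstance(owner, dict) and "link" in owner:
            links[owner["name"]] = owner["link"]
    return links


default_args = {"owner": {"name": "test_owner_link", "link": "https://www.google.com/search?q=blabla"}}
tasks = [
    {},                # inherits the default owner
    {"owner": "bla"},  # string owner, no link
    {"owner": {"name": "test_owner_link", "link": "https://www.google.com/search?q=blublu"}},
]

resolved = [resolve_owner(default_args, kwargs) for kwargs in tasks]
links = collect_links(resolved)
```

Because the map is keyed by owner name, the second link for `test_owner_link` silently replaces the first, which is the misleading case described above.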

So in conclusion, if you think that adding a map inside `DAG(...,
owner_links={...})` is the right thing, we can do that, but I think you
should set a link in the place where you use it, and not write the owner
twice.

The map can also lead to some "issues" if several teams share the same DAG
and each team is in charge of some of the tasks. In that case no one would
want to touch the `DAG` object, since each team disclaims ownership of the
DAG structure itself and is only in charge of individual tasks inside it.