Lee-W commented on code in PR #40868:
URL: https://github.com/apache/airflow/pull/40868#discussion_r1688140014


##########
airflow/datasets/__init__.py:
##########
@@ -271,6 +306,20 @@ def iter_datasets(self) -> Iterator[tuple[str, Dataset]]:
                 yield k, v
                 seen.add(k)
 
+    def iter_dag_deps(self, *, source: str, target: str) -> Iterator[DagDependency]:
+        """
+        Iterate dataset, dataset aliases and their resolved datasets  as dag dependency.
+
+        :meta private:
+        """
+        dag_deps: set[DagDependency] = set()

Review Comment:
   Got your point! Yep, the deduplication no longer happens in
`iter_dag_dependencies`, but it still happens in
`airflow/serialization/serialized_objects.py`, so I think it's still not
"strictly in the UI layer (i.e. somewhere in API or frontend)". In the
"current" (before this PR) version, we use `iter_datasets` in
[detect_dag_dependencies](https://github.com/apache/airflow/blob/9ec9eb79a0cc845d86e7380c73269d2ee1d3c210/airflow/serialization/serialized_objects.py#L963).
On the other hand, the
[outlet part](https://github.com/apache/airflow/blob/9ec9eb79a0cc845d86e7380c73269d2ee1d3c210/airflow/serialization/serialized_objects.py#L946-L955)
does not deduplicate.
   
   I think what you suggest is better: let the API layer do the deduplication
job. I'll probably do it
[here](https://github.com/apache/airflow/blob/9ec9eb79a0cc845d86e7380c73269d2ee1d3c210/airflow/www/views.py#L3498),
but I would like to check whether changing this behavior (which affects the
serialized DAG) is actually ok. Thanks!
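
   As a rough sketch of what deduplicating in the API layer could look like,
assuming the dependency objects are hashable: the `DagDependency` class below
is a simplified stand-in written for illustration, not the real Airflow class,
and `dedupe_dependencies` is a hypothetical helper name.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable


# Simplified stand-in for illustration only; the field names are assumptions
# and this is not the class defined in Airflow itself.
@dataclass(frozen=True)
class DagDependency:
    source: str
    target: str
    dependency_type: str
    dependency_id: str


def dedupe_dependencies(deps: Iterable[DagDependency]) -> list[DagDependency]:
    """Drop repeated dependencies while keeping first-seen order (hypothetical helper)."""
    # A dict preserves insertion order, so fromkeys() works as an ordered set.
    return list(dict.fromkeys(deps))


# The serialized DAG may contain the same dependency more than once.
deps = [
    DagDependency("dataset", "dag_a", "dataset", "s3://bucket/x"),
    DagDependency("dataset-alias:al", "dag_a", "dataset", "s3://bucket/x"),
    DagDependency("dataset", "dag_a", "dataset", "s3://bucket/x"),
]
unique = dedupe_dependencies(deps)
```

   With this shape, the serialized DAG keeps whatever the iterator yields, and
only the view that renders the dependency graph collapses duplicates.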



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
