Copilot commented on code in PR #67288:
URL: https://github.com/apache/airflow/pull/67288#discussion_r3283777112


##########
task-sdk/tests/task_sdk/definitions/test_taskgroup.py:
##########
@@ -957,3 +957,147 @@ def test_getitem_missing_is_key_error(self):
 
         with pytest.raises(KeyError):
             tg["nonexistent"]
+
+
+# --- topological_sort: cross-shape correctness ---
+#
+# Mirrors the shapes covered by the benchmark gist referenced from PR #67288
+# (https://gist.github.com/shahar1/9c61dc9f34f7e77cd29cfb9d67af7ceb).
+# Wall-clock timing is intentionally not asserted here — CI runners are too
+# variable for ms thresholds to be meaningful. Run dev/bench_topological_sort
+# manually to gauge performance.

Review Comment:
   The comment suggests running `dev/bench_topological_sort`, but there is no 
such script in the repository (no matches under `dev/`). This makes the 
guidance misleading; either add the referenced benchmark script or update the 
comment to point at an existing benchmark entrypoint (or remove the 
instruction).
   



##########
airflow-core/src/airflow/serialization/definitions/taskgroup.py:
##########
@@ -217,35 +216,86 @@ def iter_mapped_task_groups(self) -> 
Iterator[SerializedMappedTaskGroup]:
 
     def topological_sort(self) -> list[DAGNode]:
         """
-        Sorts children in topographical order.
+        Sort children topologically — a task always comes after its upstream 
dependencies.
 
-        A task in the result would come after any of its upstream dependencies.
+        See ``TaskGroup.topological_sort`` in task-sdk for the algorithm. 
Cycle handling:
+        cycles are caught at deserialization time, so they should never reach 
this code; if
+        one does we raise ``ValueError`` rather than silently looping forever.

Review Comment:
   The docstring claims cycles are caught at deserialization time, but there 
does not appear to be a cycle check in the deserialization code path (e.g. 
`airflow/serialization/serialized_objects.py` has no `check_cycle` usage). 
Please adjust the wording to reflect actual behavior (e.g. treat cycles as 
“unexpected/corrupt input” and document that this method raises on detection).
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to