henry3260 commented on code in PR #67252:
URL: https://github.com/apache/airflow/pull/67252#discussion_r3278992172


##########
providers/common/ai/tests/unit/common/ai/durable/test_storage.py:
##########
@@ -35,7 +35,8 @@ def tmp_cache_path(tmp_path):
 
 
 @pytest.fixture
-def storage(tmp_cache_path):
+def storage(tmp_cache_path, hook_lineage_collector):
+    # ObjectStoragePath IO records hook lineage on compat Airflow versions; 
keep it isolated.

Review Comment:
   > The same `ObjectStoragePath` pollution pattern exists in two other places 
in this provider, and they'll keep flaking the openlineage extractor test on 
Compat 3.0/3.1 even after this fix lands:
   > 
   > * 
`providers/common/ai/tests/unit/common/ai/operators/test_agent.py::TestAgentOperatorDurable::test_execute_durable_wraps_model_and_cleans_up`
 patches `_get_base_path` to a real `file://` path (line 469) and lets 
`DurableStorage` write through it, so the hook lineage collector picks up the 
same `Dataset` entries that issue [Hook lineage collector pollution: 
ObjectStoragePath-using tests leak into openlineage extractor testsĀ 
#67044](https://github.com/apache/airflow/issues/67044) describes.
   > * `providers/common/ai/tests/unit/common/ai/utils/test_file_analysis.py` 
is heavier: most tests in `TestBuildFileAnalysisRequest`, 
`TestFileAnalysisHelpers`, and `TestFormatReaders` (e.g. 
`test_text_file_analysis`, `test_detect_file_format`, 
`test_render_parquet_uses_lazy_import`, ...) construct 
`ObjectStoragePath(str(path))` against real `tmp_path` files and read through 
them, which registers inputs on the singleton collector.
   > 
   > Would you mind extending this PR to either (a) add 
`hook_lineage_collector` to the affected fixtures/tests there too, or (b) 
introduce an autouse fixture at the 
`providers/common/ai/tests/unit/common/ai/conftest.py` level that resets the 
collector for every test in this provider? Option (b) matches the "more durable 
fix" the issue calls out and would prevent the next `ObjectStoragePath`-using 
test from re-introducing the same leak.
   
   good catch! applied option b



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to