kaxil commented on code in PR #67252:
URL: https://github.com/apache/airflow/pull/67252#discussion_r3277909699


##########
providers/common/ai/tests/unit/common/ai/durable/test_storage.py:
##########
@@ -35,7 +35,8 @@ def tmp_cache_path(tmp_path):
 
 
 @pytest.fixture
-def storage(tmp_cache_path):
+def storage(tmp_cache_path, hook_lineage_collector):
+    # ObjectStoragePath IO records hook lineage on compat Airflow versions; 
keep it isolated.

Review Comment:
   The same `ObjectStoragePath` pollution pattern exists in two other places in 
this provider, and they'll keep flaking the openlineage extractor test on 
Compat 3.0/3.1 even after this fix lands:
   
   - 
`providers/common/ai/tests/unit/common/ai/operators/test_agent.py::TestAgentOperatorDurable::test_execute_durable_wraps_model_and_cleans_up`
 patches `_get_base_path` to a real `file://` path (line 469) and lets 
`DurableStorage` write through it, so the hook lineage collector picks up the 
same `Dataset` entries that issue #67044 describes.
   - `providers/common/ai/tests/unit/common/ai/utils/test_file_analysis.py` is 
heavier: most tests in `TestBuildFileAnalysisRequest`, 
`TestFileAnalysisHelpers`, and `TestFormatReaders` (e.g. 
`test_text_file_analysis`, `test_detect_file_format`, 
`test_render_parquet_uses_lazy_import`, ...) construct 
`ObjectStoragePath(str(path))` against real `tmp_path` files and read through 
them, which registers inputs on the singleton collector.
   
   Would you mind extending this PR to either (a) add `hook_lineage_collector` 
to the affected fixtures/tests there too, or (b) introduce an autouse fixture 
at the `providers/common/ai/tests/unit/common/ai/conftest.py` level that resets 
the collector for every test in this provider? Option (b) matches the "more 
durable fix" the issue calls out and would prevent the next 
`ObjectStoragePath`-using test from re-introducing the same leak.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to