jason810496 commented on PR #45260: URL: https://github.com/apache/airflow/pull/45260#issuecomment-2564284850
When attempting to remove the lineage logic from the core module, I noticed that it causes failures in tests related to **OpenLineage listener capturing hook-level lineage** (https://github.com/apache/airflow/pull/41482).

For example, removing `apply_lineage` and `prepare_lineage` from `BaseOperator`:
https://github.com/apache/airflow/blob/0efd9e6a2fa8bde0f6c14e88951b44badca063a2/airflow/models/baseoperator.py#L705-L739

results in the following test failure:
```
FAILED providers/tests/openlineage/extractors/test_manager.py::test_extractor_manager_gets_data_from_pythonoperator - assert 0 == 1
 +  where 0 = len([])
 +    where [] = HookLineage(inputs=[], outputs=[]).outputs
```
https://github.com/apache/airflow/blob/0efd9e6a2fa8bde0f6c14e88951b44badca063a2/providers/tests/openlineage/extractors/test_manager.py#L301

It seems OpenLineage is still coupled with the lineage module, so the lineage logic might need to be moved to `compat.lineage` for now (or to the OpenLineage module in a future PR).

After some experimentation, I found that implementing an `on_load` callback in the `OpenLineageProviderPlugin` to monkey-patch the core module at runtime prevents the OpenLineage test failures, even with the lineage module removed from the core.
> Based on https://airflow.apache.org/docs/apache-airflow/stable/authoring-and-scheduling/plugins.html#interface

I’m not sure whether this is a suitable long-term solution for maintaining OpenLineage compatibility while cleaning up the lineage module.

cc @Lee-W @uranusjr
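
For illustration, here is a minimal, self-contained sketch of the `on_load` monkey-patching idea described above. It does not import Airflow: `BaseOperator`, the two decorators, and `OpenLineagePluginSketch` are simplified stand-ins I made up for this example (a real plugin would subclass `AirflowPlugin`, whose `on_load` classmethod is described in the plugin docs linked above), and the `lineage_events` context key is purely hypothetical.

```python
import functools


class BaseOperator:
    """Stand-in for the core operator after the lineage decorators are removed."""

    def __init__(self, task_id):
        self.task_id = task_id

    def pre_execute(self, context):
        pass

    def execute(self, context):
        return f"ran {self.task_id}"

    def post_execute(self, context, result=None):
        pass


def prepare_lineage(func):
    """Hypothetical wrapper: record a 'prepare' event before calling func."""

    @functools.wraps(func)
    def wrapper(self, context, *args, **kwargs):
        context.setdefault("lineage_events", []).append(("prepare", self.task_id))
        return func(self, context, *args, **kwargs)

    return wrapper


def apply_lineage(func):
    """Hypothetical wrapper: record an 'apply' event after calling func."""

    @functools.wraps(func)
    def wrapper(self, context, *args, **kwargs):
        result = func(self, context, *args, **kwargs)
        context.setdefault("lineage_events", []).append(("apply", self.task_id))
        return result

    return wrapper


class OpenLineagePluginSketch:
    """Stand-in plugin; only the on_load classmethod shape mirrors Airflow."""

    _patched = False

    @classmethod
    def on_load(cls, *args, **kwargs):
        # Guard so that loading the plugin twice does not wrap the methods twice.
        if cls._patched:
            return
        BaseOperator.pre_execute = prepare_lineage(BaseOperator.pre_execute)
        BaseOperator.post_execute = apply_lineage(BaseOperator.post_execute)
        cls._patched = True


# Simulate plugin loading, then run one task through its lifecycle.
OpenLineagePluginSketch.on_load()
context = {}
op = BaseOperator("demo")
op.pre_execute(context)
op.post_execute(context, result=op.execute(context))
print(context["lineage_events"])
```

The sketch also hints at why this may be fragile as a long-term approach: the patch changes `BaseOperator` for every operator in the process, and it only takes effect if the plugin is loaded before any task runs.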
