mobuchowski commented on code in PR #67901:
URL: https://github.com/apache/airflow/pull/67901#discussion_r3342698086
##########
providers/openlineage/src/airflow/providers/openlineage/plugins/listener.py:
##########
@@ -834,7 +836,17 @@ def _terminate_with_wait(self, process: psutil.Process):
     def _fork_execute(self, callable, callable_name: str):
         self.log.debug("Will fork to execute OpenLineage process.")
-        pid = os.fork()
+        with warnings.catch_warnings():
+            # On Python 3.12+, os.fork() in a multi-threaded process emits a
+            # DeprecationWarning. The fork here is intentional and the child
+            # takes precautions (ORM reconfiguration, os._exit) so the warning
+            # is safe to suppress.
+            warnings.filterwarnings(
+                "ignore",
+                message=".*use of fork\\(\\) may lead to deadlocks in the child",
+                category=DeprecationWarning,
+            )
Review Comment:
This is something I'd need to see truly well tested.
The forking approach stems from real issues we had when sharing memory, for
example deadlocks caused by a bug in the Snowflake library. The problem with
this class of bugs is that they basically brick the Airflow installation, and
when the failure is repeatable, the only fix is to remove the OL integration,
which is not good for other reasons...
I believe the real solution would be to split the integration in two. The
first part, running in the same process as the task, would "collect" the data
in some serializable format. The second part, running in a separate process
(or, as initially thought, in a separate Airflow component like the triggerer,
though that is not doable in an edge-executor-like environment), would parse
that data, perform network requests, build OL events, and emit them to the
configured backend. The issue with that solution is that it's giant and
basically a total rework.
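To make the split concrete, here is a minimal sketch of the idea, not a proposal for the actual implementation. Every name in it (`CollectedTaskData`, `collect`, `emit_in_separate_process`, `EMITTER_SOURCE`) is hypothetical, and the "event" it builds is a toy dictionary rather than a real OL event. It only shows the shape: the task process produces plain JSON, and a freshly started interpreter, which shares no memory with the task, does the parsing and emitting.

```python
import json
import subprocess
import sys
from dataclasses import asdict, dataclass


@dataclass
class CollectedTaskData:
    """Hypothetical plain-data snapshot gathered inside the task process."""

    dag_id: str
    task_id: str
    state: str


# Hypothetical emitter program run in a brand-new interpreter. A real one
# would build OL events and perform the network requests; this one only
# turns the payload into a toy dictionary and prints it as JSON.
EMITTER_SOURCE = """
import json, sys
payload = json.load(sys.stdin)
event = {
    "eventType": payload["state"].upper(),
    "job": payload["dag_id"] + "." + payload["task_id"],
}
json.dump(event, sys.stdout)
"""


def collect(dag_id: str, task_id: str, state: str) -> str:
    """Part one: runs in the task process and returns only serializable data."""
    return json.dumps(asdict(CollectedTaskData(dag_id, task_id, state)))


def emit_in_separate_process(serialized: str, timeout: float = 30.0) -> dict:
    """Part two: hand the JSON to a separate process over stdin.

    The child starts from a clean interpreter, so it inherits no locks,
    threads, or library state from the task process.
    """
    result = subprocess.run(
        [sys.executable, "-c", EMITTER_SOURCE],
        input=serialized,
        capture_output=True,
        text=True,
        timeout=timeout,  # bound how long a misbehaving emitter can run
        check=True,  # raise CalledProcessError if the emitter fails
    )
    return json.loads(result.stdout)
```

With this shape, a hang or crash in the emitting side is contained by the timeout and the process boundary. The hard part, and the reason this is a total rework, is deciding what the serializable format has to contain so that the second part never needs access to the live task objects.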