gershnik commented on code in PR #45124:
URL: https://github.com/apache/airflow/pull/45124#discussion_r1905810569


##########
airflow/dag_processing/manager.py:
##########
@@ -181,7 +180,15 @@ def _run_processor_manager(
         # to iterate the child processes
 
         set_new_process_group()
-        setproctitle("airflow scheduler -- DagFileProcessorManager")
+
+        # setproctitle causes issue on Mac OS: https://github.com/benoitc/gunicorn/issues/3021
+        os_type = sys.platform
+        if os_type == "darwin":
+            log.info("Mac OS detected, skipping setproctitle")
+        else:
+            from setproctitle import setproctitle
+            setproctitle("airflow scheduler -- DagFileProcessorManager")

Review Comment:
   @ashb Thank you, this is super helpful! (I am still unable to reproduce this 
even once - just tried it again after updating macOS)
   
   With regard to threads, setproctitle doesn't by itself create any threads. 
Apple frameworks do create threads internally for their XPC communication with 
Launch Services, but those sit dormant unless functionality that uses them is 
invoked. Also see below.
   
   The crash happens in a child process post-fork, on the main thread, early 
during setproctitle initialization, in a 
[CFBundleGetBundleWithIdentifier](https://developer.apple.com/documentation/corefoundation/1537139-cfbundlegetbundlewithidentifier?language=objc)
 call. setproctitle makes this call with a static argument[^1]:
   ```cpp
   CFBundleGetBundleWithIdentifier(CFSTR("com.apple.LaunchServices"))
   ```
   so it references no caller-supplied memory that can become invalid somehow. 
Thus, it is the _internal_ memory of CoreFoundation that is somehow corrupted 
at the time of this call. In other words the crash is "impossible" unless 
CoreFoundation itself is in a broken state.
   Also note that any threads Apple might create haven't been started yet - 
this call happens long before such functionality is invoked. 
   
   All of this, combined with the fact that the crash is very non-deterministic, 
suggests that setproctitle is a victim here of something (potentially itself) 
using Apple APIs on another thread in parallel with fork.  
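   
   To illustrate the general hazard (not to reproduce this crash), here is a 
minimal Python sketch of what "another thread was in the middle of something 
at fork time" does to a child. The `threading.Lock` is only a stand-in for 
internal CoreFoundation state, and `child_sees_lock_held` is a hypothetical 
helper name, not anything from Airflow or setproctitle.

```python
import os
import threading


def child_sees_lock_held() -> bool:
    """Fork while a helper thread holds a lock.

    Returns True if the forked child observes the lock as held.
    The lock is a stand-in for library-internal state (an assumption
    made for illustration; it is not CoreFoundation itself).
    """
    lock = threading.Lock()
    acquired = threading.Event()
    release = threading.Event()

    def holder() -> None:
        # Hold the lock until the parent tells us to let go.
        with lock:
            acquired.set()
            release.wait()

    helper = threading.Thread(target=holder, daemon=True)
    helper.start()
    acquired.wait()  # the helper thread now owns the lock

    pid = os.fork()
    if pid == 0:
        # Child: only the forking thread was copied. The helper thread does
        # not exist here, but the lock's state was copied as-is.
        held = not lock.acquire(blocking=False)
        os._exit(1 if held else 0)

    # Parent: let the helper finish, then collect the child's verdict.
    release.set()
    helper.join()
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 1
```

   In the child the lock stays held with no thread left to release it, so 
any code that needs it finds state it cannot make sense of. Whether something 
comparable happens inside CoreFoundation here is exactly the open question.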
   
   So the question is whether this is what is going on. Are there any calls 
to setproctitle (including importing it) or any other Apple-using library in 
the parent process that can happen in parallel with fork? 
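   
   One cheap way to start answering that is to log which Python threads are 
alive right before the fork. This is only a sketch: `other_live_threads` is a 
hypothetical helper, and it sees only threads known to Python's `threading` 
module, so native threads started inside Apple frameworks would not show up.

```python
import threading


def other_live_threads() -> list[str]:
    """Names of live Python threads other than the calling thread.

    Limitation: native threads that were not created through Python's
    threading module are invisible to threading.enumerate().
    """
    current = threading.current_thread()
    return [t.name for t in threading.enumerate() if t is not current]
```

   Calling this immediately before the fork and logging a non-empty result 
would at least show which Python-level code could be running in parallel.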
   
   [Update]
   @potiuk - just realized that your comment indicates that this is actually a 
known issue that has other manifestations, correct?
   
   [^1]: `CFSTR` is an Apple macro to produce a statically allocated CFString 
constant


