vanshaj2023 commented on PR #49462: URL: https://github.com/apache/arrow/pull/49462#issuecomment-4187882868
Thanks for the review, @pitrou!

**1. Why does this only appear in `arrow-json-test` and not other tests?**

The crash surfaces in `ReaderTest.MultipleChunksParallel` because that test creates a **brand-new `ThreadPool`** and immediately dispatches work to it. The race window is extremely narrow: when `LaunchWorkersUnlocked` spawns a new thread, that thread calls `SetCurrentThreadPool(this)`, writing to a `thread_local` before MinGW's `__emutls` has finished initializing TLS for the new thread. The write then dereferences a stale or invalid pointer and segfaults.

Other tests that use the global default thread pool (created at startup via `ThreadPool::MakeEternal`) don't hit this because the pool is already **warm** by the time those tests run, so no new threads need to be spawned during that vulnerable window.

The race can be reproduced directly by calling `ThreadPool::Make(N)` in a tight loop on MinGW, or more reliably by running the test in a shell loop until it fails:

```sh
while ./arrow-json-test --gtest_filter=ReaderTest.MultipleChunksParallel; do :; done
```

The `TlsPreservation` test added in `thread_pool_test.cc` exercises the same code path (`OwnsThisThread()` → `GetCurrentThreadPool()` → `TlsGetValue`) directly from the ThreadPool tests.

**2. Is there an upstream MinGW/GCC issue?**

Yes, this is tracked as [GCC Bug #78605](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78605), which documents the `__emutls` race condition during thread startup. I've updated the code comment in `thread_pool.cc` to reference this upstream bug alongside the Arrow issue link.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
