Re: [PR] GH-49272: [C++][CI] Fix intermittent segfault in arrow-json-test with MinGW [arrow]

via GitHub Thu, 26 Mar 2026 17:36:35 -0700


Copilot commented on code in PR #49462:
URL: https://github.com/apache/arrow/pull/49462#discussion_r2998400313



##########
cpp/src/arrow/json/reader_test.cc:
##########
@@ -290,6 +290,32 @@ TEST(ReaderTest, MultipleChunksParallel) {
   AssertTablesEqual(*serial, *threaded);
 }
 
+// Regression test for intermittent threading crashes on MinGW.
+// Run this test multiple times manually to stress-test:
+//   while build/debug/arrow-json-test \
+//       --gtest_filter=ReaderTest.MultipleChunksParallelRegression; do :; done
+// See https://github.com/apache/arrow/issues/49272
+TEST(ReaderTest, MultipleChunksParallelRegression) {
+  int64_t count = 1 << 10;
+  ParseOptions parse_options;
+  parse_options.unexpected_field_behavior = UnexpectedFieldBehavior::InferType;
+  ReadOptions read_options;
+  read_options.block_size = static_cast<int>(count / 2);
+  read_options.use_threads = true;
+

Review Comment:
   The PR description says a new stress test runs the parallel path 20 times 
("MultipleChunksParallelStress"), but the added test here is named 
MultipleChunksParallelRegression and only runs the read once. Either update the 
test to match the described stress behavior/name, or update the PR description 
so they stay consistent.



##########
cpp/src/arrow/util/thread_pool.cc:
##########
@@ -630,9 +631,44 @@ void ThreadPool::CollectFinishedWorkersUnlocked() {
   state_->finished_workers_.clear();
 }
 
+// MinGW's __emutls implementation for C++ thread_local has known race 
conditions
+// during thread creation that can cause segfaults. Use native Win32 TLS 
instead.
+// See https://github.com/apache/arrow/issues/49272
+#  ifdef __MINGW32__
+
+namespace {
+DWORD GetPoolTlsIndex() {
+  static DWORD index = [] {
+    DWORD i = TlsAlloc();
+    if (i == TLS_OUT_OF_INDEXES) {
+      ARROW_LOG(FATAL) << "TlsAlloc failed for thread pool TLS: "
+                       << WinErrorMessage(GetLastError());
+    }
+    return i;
+  }();
+  return index;
+}
+}  // namespace
+
+static ThreadPool* GetCurrentThreadPool() {
+  // TlsGetValue() clears the calling thread's last-error state on success.
+  DWORD last_error = GetLastError();
+  auto* pool = static_cast<ThreadPool*>(TlsGetValue(GetPoolTlsIndex()));
+  SetLastError(last_error);
+  return pool;
+}
+
+static void SetCurrentThreadPool(ThreadPool* pool) {
+  TlsSetValue(GetPoolTlsIndex(), pool);

Review Comment:
   On MinGW, SetCurrentThreadPool() ignores the return value of TlsSetValue(). 
If this call fails, OwnsThisThread() will silently return false inside worker 
threads, which can lead to confusing behavior and make failures hard to 
diagnose. Consider checking the BOOL return value and logging/aborting with 
GetLastError() (similar to the TlsAlloc failure handling).
   ```suggestion
     BOOL ok = TlsSetValue(GetPoolTlsIndex(), pool);
     if (!ok) {
       ARROW_LOG(FATAL) << "TlsSetValue failed for thread pool TLS: "
                        << WinErrorMessage(GetLastError());
     }
   ```



##########
cpp/src/arrow/util/thread_pool_test.cc:
##########
@@ -660,6 +661,34 @@ TEST_F(TestThreadPool, OwnsCurrentThread) {
   ASSERT_FALSE(one_failed);
 }
 
+#ifdef _WIN32
+TEST_F(TestThreadPool, OwnsThisThreadPreservesLastError) {
+#  ifndef ARROW_ENABLE_THREADING
+  GTEST_SKIP() << "Test requires threading support";
+#  endif
+  auto pool = this->MakeThreadPool(4);
+
+  // Verify from outside the pool: OwnsThisThread() must not clobber
+  // the calling thread's last-error state.
+  ::SetLastError(ERROR_FILE_NOT_FOUND);
+  ASSERT_FALSE(pool->OwnsThisThread());
+  ASSERT_EQ(::GetLastError(), static_cast<DWORD>(ERROR_FILE_NOT_FOUND));
+
+  // Verify from inside a pool thread.
+  std::atomic<bool> error_preserved{true};
+  ASSERT_OK(pool->Spawn([&] {
+    ::SetLastError(ERROR_ACCESS_DENIED);
+    ASSERT_TRUE(pool->OwnsThisThread());
+    if (::GetLastError() != ERROR_ACCESS_DENIED) {
+      error_preserved = false;
+    }
+  }));

Review Comment:
   This test uses ASSERT_TRUE inside a ThreadPool worker thread. GTest 
assertions from non-main threads are not reliably reported and can lead to 
false positives (e.g., if the assertion fails, error_preserved may remain true 
and the test can still pass). Prefer recording the result in an atomic/Status 
and asserting on the main thread after Shutdown().



##########
cpp/src/arrow/json/reader_test.cc:
##########
@@ -290,6 +290,32 @@ TEST(ReaderTest, MultipleChunksParallel) {
   AssertTablesEqual(*serial, *threaded);
 }
 
+// Regression test for intermittent threading crashes on MinGW.
+// Run this test multiple times manually to stress-test:
+//   while build/debug/arrow-json-test \
+//       --gtest_filter=ReaderTest.MultipleChunksParallelRegression; do :; done
+// See https://github.com/apache/arrow/issues/49272
+TEST(ReaderTest, MultipleChunksParallelRegression) {
+  int64_t count = 1 << 10;
+  ParseOptions parse_options;
+  parse_options.unexpected_field_behavior = UnexpectedFieldBehavior::InferType;
+  ReadOptions read_options;
+  read_options.block_size = static_cast<int>(count / 2);
+  read_options.use_threads = true;
+
+  std::string json;
+  for (int i = 0; i < count; ++i) {
+    json += "{\"a\":" + std::to_string(i) + "}\n";
+  }
+
+  std::shared_ptr<io::InputStream> input;
+  ASSERT_OK(MakeStream(json, &input));
+  ASSERT_OK_AND_ASSIGN(auto reader, TableReader::Make(default_memory_pool(), 
input,
+                                                      read_options, 
parse_options));
+  ASSERT_OK_AND_ASSIGN(auto table, reader->Read());

Review Comment:
   MultipleChunksParallelRegression largely duplicates the JSON generation and 
reader setup logic from MultipleChunksParallel. To reduce duplication (and keep 
the regression test aligned with future changes), consider extracting a small 
helper that builds the input/reader and returns the read table, or reusing the 
existing test logic with a loop.
   ```suggestion
     ASSERT_OK_AND_ASSIGN(auto table,
                          ReadToTable(std::move(json), read_options, 
parse_options));
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] GH-49272: [C++][CI] Fix intermittent segfault in arrow-json-test with MinGW [arrow]

Reply via email to