vanshaj2023 commented on code in PR #49462:
URL: https://github.com/apache/arrow/pull/49462#discussion_r2893682926


##########
cpp/src/arrow/json/reader_test.cc:
##########
@@ -290,6 +290,32 @@ TEST(ReaderTest, MultipleChunksParallel) {
   AssertTablesEqual(*serial, *threaded);
 }
 
+// Stress test for intermittent threading crashes on MinGW.
+// See https://github.com/apache/arrow/issues/49272
+TEST(ReaderTest, MultipleChunksParallelStress) {
+  constexpr int kTrials = 20;
+  for (int trial = 0; trial < kTrials; ++trial) {
+    int64_t count = 1 << 10;
+    ParseOptions parse_options;
+    parse_options.unexpected_field_behavior = 
UnexpectedFieldBehavior::InferType;
+    ReadOptions read_options;
+    read_options.block_size = static_cast<int>(count / 2);
+    read_options.use_threads = true;
+
+    std::string json;
+    for (int i = 0; i < count; ++i) {
+      json += "{\"a\":" + std::to_string(i) + "}\n";
+    }
+
+    std::shared_ptr<io::InputStream> input;
+    ASSERT_OK(MakeStream(json, &input));
+    ASSERT_OK_AND_ASSIGN(auto reader, TableReader::Make(default_memory_pool(), 
input,
+                                                        read_options, 
parse_options));
+    ASSERT_OK_AND_ASSIGN(auto table, reader->Read());
+    ASSERT_EQ(table->num_rows(), count);
+  }
+}

Review Comment:
   No, it is not too heavy, each iteration is identical to the existing 
MultipleChunksParallel test (1024 small JSON lines), just repeated 20 times to 
increase the chance of hitting the race condition. It finishes in a few seconds.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to