vanshaj2023 commented on issue #49272:
URL: https://github.com/apache/arrow/issues/49272#issuecomment-3902522493

   Hi @raulcd 
   I'd like to work on this issue.
   
   After reviewing the repository, I suspect that the segfault in 
`ReaderTest.MultipleChunksParallel` is a thread-safety problem in the 
parallel JSON reader. This is how I plan to investigate and fix it:
   
   **Implementation Approach:**
   1. Look for race conditions in the parallel chunk processing in 
`cpp/src/arrow/json/reader.cc`, where multiple threads may access shared 
resources
   2. Check memory management around the `ThreadedTaskGroup` usage within the 
JSON reader for a potential use-after-free or improper synchronization
   3. Compare MinGW thread handling with MSVC, reviewing thread pool 
initialization and cleanup
   4. Add mutex guards around shared state modifications in the parallel 
parsing code
   5. Verify that chunked JSON buffers stay alive for the full duration of 
concurrent reads
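   
   To make steps 4 and 5 concrete, here is a minimal sketch of the pattern I 
have in mind. It is not Arrow code: `ChunkCollector` and 
`ParseChunksInParallel` are hypothetical names, plain `std::thread` stands in 
for the thread pool, and "parsing" is reduced to measuring each chunk's size. 
It only shows shared state written under a mutex and each task co-owning its 
buffer so the buffer cannot be freed while a worker still reads it.
   
   ```cpp
   #include <cassert>
   #include <cstddef>
   #include <memory>
   #include <mutex>
   #include <string>
   #include <thread>
   #include <vector>

   // Hypothetical stand-in for state shared by parallel parse tasks.
   // This is NOT an Arrow class.
   class ChunkCollector {
    public:
     explicit ChunkCollector(std::size_t num_chunks) : sizes_(num_chunks, 0) {}

     // Every write to shared state happens while holding the mutex.
     void Record(std::size_t index, std::size_t parsed_bytes) {
       std::lock_guard<std::mutex> lock(mutex_);
       assert(index < sizes_.size());
       sizes_[index] = parsed_bytes;
       ++completed_;
     }

     std::size_t completed() const {
       std::lock_guard<std::mutex> lock(mutex_);
       return completed_;
     }

     std::size_t total_bytes() const {
       std::lock_guard<std::mutex> lock(mutex_);
       std::size_t total = 0;
       for (std::size_t s : sizes_) total += s;
       return total;
     }

    private:
     mutable std::mutex mutex_;
     std::vector<std::size_t> sizes_;
     std::size_t completed_ = 0;
   };

   // Hypothetical driver: one thread per chunk (a real reader would use a pool).
   std::size_t ParseChunksInParallel(const std::vector<std::string>& chunks,
                                     ChunkCollector* collector) {
     std::vector<std::thread> workers;
     workers.reserve(chunks.size());
     for (std::size_t i = 0; i < chunks.size(); ++i) {
       // The task captures the shared_ptr by value, so the buffer stays alive
       // until the task finishes, even if the producer drops its reference.
       std::shared_ptr<const std::string> buffer =
           std::make_shared<std::string>(chunks[i]);
       workers.emplace_back([collector, i, buffer] {
         collector->Record(i, buffer->size());  // placeholder for real parsing
       });
     }
     // Join every worker before reading the results.
     for (auto& worker : workers) worker.join();
     return collector->total_bytes();
   }
   ```
   
   Calling `ParseChunksInParallel` with a few small JSON strings and a 
`ChunkCollector` sized to match should leave `completed()` equal to the 
number of chunks. If the real bug is a lifetime issue, the equivalent fix 
would be to make each task hold its own reference to the buffer it parses.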
   
   The intermittent nature of the failure and the fact that it occurs only on 
MinGW suggest platform-specific threading or memory alignment issues in the 
parallel reader implementation.
   
   Could you please assign this issue to me? Does this approach sound 
reasonable, or am I missing something? Any specific areas in the JSON reader 
code I should prioritize?
   
   Thanks!

