zanmato1984 commented on issue #39951:
URL: https://github.com/apache/arrow/issues/39951#issuecomment-1943219652

   > Enabling ASAN makes this crash go away. Enabling TSAN results in some 
reports which I describe in #40068, #40069. Building in debug mode results in 
[this 
assertion](https://github.com/apache/arrow/blob/0dbbd43ca9133912d1809394727784560cc5e797/cpp/src/arrow/compute/util.cc#L38)
 firing.
   > 
   > Lowering `arrow::dataset::ScanOptions::batch_size` to 16 also fixes the 
crash (and lowering to 1024 does not).
   
   Thanks for the experiments. Though I can only guess what was happening, I 
think we are making progress.
   
   First, I don't think the errors reported by TSAN are related to this 
crash, but the assertion that fired is. It indicates that an Arrow-managed, 
stack-like temp buffer overflowed, which may be causing the subsequent 
unexpected behavior. That also explains why lowering `batch_size` makes the 
crash go away: smaller batches require less temp space. I can't explain why 
ASAN makes the crash go away, except that it slows the program down 
significantly, which would reduce the chance of hitting the crash.
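   
   To illustrate the suspected mechanism, here is a minimal sketch of a 
fixed-size, stack-like temp buffer. It is not Arrow's actual implementation: 
`TempStackSketch`, `BatchFits`, the 4096-byte capacity, and the 
"bytes per row" cost model are all hypothetical and chosen only to show how 
a larger batch can ask for more scratch space than the buffer holds. Unlike 
a debug-only assertion, this sketch checks the bound in every build.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for a stack-like temp buffer.
// This is NOT Arrow's implementation; names and sizes are illustrative.
class TempStackSketch {
 public:
  explicit TempStackSketch(std::size_t capacity) : buffer_(capacity), top_(0) {}

  // Returns a pointer to `num_bytes` of scratch space, or nullptr if the
  // request would run past the end of the buffer.
  std::uint8_t* Alloc(std::size_t num_bytes) {
    if (num_bytes > buffer_.size() - top_) return nullptr;
    std::uint8_t* out = buffer_.data() + top_;
    top_ += num_bytes;
    return out;
  }

  // Releases the most recent allocation (LIFO order, like a stack).
  void Release(std::size_t num_bytes) {
    assert(num_bytes <= top_);
    top_ -= num_bytes;
  }

  std::size_t used() const { return top_; }

 private:
  std::vector<std::uint8_t> buffer_;
  std::size_t top_;
};

// Assumed cost model: each row of a batch needs `bytes_per_row` of scratch.
// Returns true if the whole batch fits in the remaining temp space.
bool BatchFits(TempStackSketch& stack, std::size_t batch_rows,
               std::size_t bytes_per_row) {
  const std::size_t needed = batch_rows * bytes_per_row;
  std::uint8_t* scratch = stack.Alloc(needed);
  if (scratch == nullptr) return false;
  stack.Release(needed);
  return true;
}
```

   With a 4096-byte stack and 8 bytes per row in this toy model, a 16-row 
batch fits while a 1024-row batch does not. If the bound is only checked by 
an assertion that is compiled out in release builds, the oversized request 
would instead write past the end of the buffer.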
   
   To verify whether the assertion is the root cause, could you try something 
similar to #40007 and see if it resolves the issue?


