vigneshsiva11 commented on PR #9362:
URL: https://github.com/apache/arrow-rs/pull/9362#issuecomment-3909469984

   Hi @alamb,
   
   You're absolutely right - this PR doesn't actually implement batch splitting 
as described. After further investigation, I realized that:
   
   **This PR (#9362) only:**
   - Moves the overflow check to happen BEFORE buffer mutation
   - Prevents buffer corruption by detecting overflow early
   - Returns a clean error instead of potentially panicking
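
To illustrate the check-before-mutate ordering, here is a minimal sketch. It is not the code from this PR: `OffsetBuffer`, `with_limit`, and `try_push` are hypothetical names, and the configurable `limit` is an assumption added only so the overflow path can be exercised without allocating 2 GiB.

```rust
/// Hypothetical stand-in for a binary buffer that uses i32 offsets.
pub struct OffsetBuffer {
    offsets: Vec<i32>,
    values: Vec<u8>,
    /// Largest value-buffer length allowed (never above i32::MAX).
    limit: usize,
}

impl OffsetBuffer {
    /// Assumed constructor: the limit is clamped so offsets always fit in i32.
    pub fn with_limit(limit: usize) -> Self {
        Self {
            offsets: vec![0],
            values: Vec::new(),
            limit: limit.min(i32::MAX as usize),
        }
    }

    /// Appends one value, validating the new end offset BEFORE touching
    /// either buffer, so a failed push leaves the state unchanged.
    pub fn try_push(&mut self, data: &[u8]) -> Result<(), String> {
        let new_len = self
            .values
            .len()
            .checked_add(data.len())
            .ok_or_else(|| "value length overflows usize".to_string())?;
        if new_len > self.limit {
            return Err(format!(
                "offset overflow: {} bytes exceeds limit {}",
                new_len, self.limit
            ));
        }
        // Only mutate once the check has passed.
        self.values.extend_from_slice(data);
        self.offsets.push(new_len as i32); // safe: new_len <= limit <= i32::MAX
        Ok(())
    }

    /// Number of values stored so far.
    pub fn len(&self) -> usize {
        self.offsets.len() - 1
    }

    /// Number of bytes in the value buffer.
    pub fn values_len(&self) -> usize {
        self.values.len()
    }
}
```

Because the error is returned before any write, a caller can recover and keep using the buffer after a rejected push.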
   
   **This PR does NOT:**
   - Emit smaller batches
   - Stop decoding early to avoid overflow
   - Change the batch size
   
   **The complete solution is in PR #9369:**
   https://github.com/apache/arrow-rs/pull/9369
   
   PR #9369 includes the same defensive overflow check from this PR, PLUS the 
full implementation that actually splits RecordBatches when binary offsets 
would overflow. It properly detects the overflow condition during decoding and 
stops early, emitting smaller batches as described.
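
As a rough illustration of the stop-early idea (not the implementation in #9369), the sketch below groups rows into batches so that the total byte length of each batch stays within an offset limit. The function `split_rows` and its `limit` parameter are hypothetical; a real reader would work on decoded pages rather than a slice of lengths.

```rust
use std::ops::Range;

/// Splits rows into consecutive batches whose total byte length does not
/// exceed `limit`. `lengths[i]` is the byte length of row `i`.
/// Returns the row-index range of each batch.
pub fn split_rows(lengths: &[usize], limit: usize) -> Result<Vec<Range<usize>>, String> {
    let mut batches = Vec::new();
    let mut start = 0; // first row of the batch being built
    let mut total = 0usize; // bytes accumulated in the current batch

    for (i, &len) in lengths.iter().enumerate() {
        // A single row larger than the limit can never fit in any batch.
        if len > limit {
            return Err(format!(
                "row {} has {} bytes, more than the limit {}",
                i, len, limit
            ));
        }
        // Stop the current batch early if adding this row would overflow.
        if total.saturating_add(len) > limit {
            batches.push(start..i);
            start = i;
            total = 0;
        }
        total += len;
    }

    // Emit whatever remains as the final (possibly smaller) batch.
    if start < lengths.len() {
        batches.push(start..lengths.len());
    }
    Ok(batches)
}
```

With a limit of 8 bytes, rows of lengths 4, 4, 4, 2, 6 would be emitted as three batches covering rows 0..2, 2..4, and 4..5.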
   
   I'll close this PR in favor of #9369, which is the comprehensive solution 
that addresses issue #7973 as intended.
   
   Thank you for catching this!


