Re: [PR] feat(csharp/src/Drivers/Apache/Spark): Enable trimming arrow batches to limit by default [arrow-adbc]

via GitHub Fri, 14 Mar 2025 10:49:16 -0700


ccstevens commented on PR #2613:
URL: https://github.com/apache/arrow-adbc/pull/2613#issuecomment-2725334630


   We should not enable this. It should perform strictly worse for cloud 
fetch, because it causes Spark to do a read-modify-write of the last cloud 
fetch file. It is better for the client to simply read the last file and 
process only up to the result row count provided in the result manifest.
   
   If this is being flipped on to make sure that only the exact rows are 
returned for direct results, then we should instead fix the bug that causes 
direct results to return extra rows.
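
   For illustration, the client-side alternative described above (read the last file in full, but stop processing at the row count from the result manifest) could look roughly like the sketch below. It is not the driver's code: the function name `take_up_to_row_count` and the `total_rows` parameter are hypothetical, and plain Python lists stand in for Arrow record batches.

```python
from typing import Iterable, Iterator, List


def take_up_to_row_count(batches: Iterable[List[int]], total_rows: int) -> Iterator[List[int]]:
    """Yield batches unchanged until total_rows rows have been emitted.

    Hypothetical helper: `batches` stands in for the Arrow batches read from
    the cloud fetch files, and `total_rows` for the row count reported in the
    result manifest. Only the final batch is ever sliced, and the slicing
    happens on the client, so the server never has to rewrite the last file.
    """
    remaining = total_rows
    for batch in batches:
        if remaining <= 0:
            break  # manifest row count reached; ignore any trailing rows
        if len(batch) > remaining:
            yield batch[:remaining]  # trim only the last, oversized batch
            remaining = 0
        else:
            yield batch
            remaining -= len(batch)


# Three batches of three rows each, but the manifest reports seven rows.
trimmed = list(take_up_to_row_count([[1, 2, 3], [4, 5, 6], [7, 8, 9]], 7))
```

   With this approach the extra rows in the last file cost the client one slice, while the server writes each file exactly once.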


