pitrou commented on PR #13442:
URL: https://github.com/apache/arrow/pull/13442#issuecomment-1171321955

   > If we don't have any real world benchmarks that can help estimate scaling 
characteristics on large files, then we should be able to tell whether the 16MB 
buffer size affects the micro benchmarks.
   
   Right, but that wouldn't tell us much otherwise.
   
   
   > In the case where a file has rows larger than 1MB, it's required to set 
the block size anyway, so I'm not following what the counter argument to this 
PR is regarding the performance of large files
   
   A large file does not necessarily have large rows. Large CSV files often 
have very small rows, and I expect at least _some_ large JSON files 
to be similar.
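
As a rough illustration of this point (a sketch only, not taken from the PR; the helper name `row_size_stats` and the sample data are hypothetical), the following builds line-delimited JSON in memory with many tiny rows and compares the total size with the largest single row:

```python
import io
import json


def row_size_stats(stream):
    """Return (total_bytes, max_row_bytes, row_count) for a line-delimited byte stream."""
    total = 0
    largest = 0
    count = 0
    for line in stream:
        total += len(line)
        largest = max(largest, len(line))
        count += 1
    return total, largest, count


# Hypothetical data: 100,000 small JSON rows, one per line.
buf = io.BytesIO()
for i in range(100_000):
    buf.write(json.dumps({"id": i, "v": i % 7}).encode() + b"\n")
buf.seek(0)

total, largest, count = row_size_stats(buf)
print(f"rows={count} total={total} bytes, largest row={largest} bytes")
```

In this made-up case the file is roughly 2 MB in total while no row exceeds a few dozen bytes, so the block size needed to hold one row says little about how large the file is.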

