Re: [PR] GH-48636: [C++][Parquet] Improve parquet reading using multi threads [arrow]

via GitHub Thu, 11 Jun 2026 19:30:52 -0700


wgtmac commented on PR #50158:
URL: https://github.com/apache/arrow/pull/50158#issuecomment-4686798399


   TBH, I don't think this is a good approach; we've tried it in the past. 
The main gotcha is that the cost of reading different columns varies 
significantly by nature. For example, strings take longer to decompress and 
decode, while integers are smaller and faster. If the file is on a cloud 
object store, most of the time is spent blocked waiting on I/O, which may 
exhaust the thread pool if the file has many columns.
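
To make the imbalance concrete, here is a minimal sketch that simulates one task per column on a fixed-size pool. It is not Arrow code: `SimulatedMakespan` is a hypothetical helper, and the per-column costs in the usage note are made-up numbers chosen only for illustration.

```cpp
#include <algorithm>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical helper, not part of Arrow: simulates one task per column on a
// fixed pool of `workers` threads and returns the time at which the last task
// finishes. Tasks are taken in order, and each one goes to whichever worker
// becomes free first (greedy list scheduling).
double SimulatedMakespan(const std::vector<double>& column_costs, int workers) {
  // Min-heap holding the time at which each worker becomes free.
  std::priority_queue<double, std::vector<double>, std::greater<double>> free_at;
  for (int i = 0; i < workers; ++i) {
    free_at.push(0.0);
  }

  double makespan = 0.0;
  for (double cost : column_costs) {
    const double start = free_at.top();  // earliest available worker
    free_at.pop();
    const double finish = start + cost;
    free_at.push(finish);
    makespan = std::max(makespan, finish);
  }
  return makespan;
}
```

With one assumed "string" column costing 80 units and seven "integer" columns costing 10 units each, `SimulatedMakespan({80, 10, 10, 10, 10, 10, 10, 10}, 4)` returns 80, even though the total work of 150 units split evenly over 4 threads would be 37.5. The slowest column sets the wall-clock time, and the other threads sit idle once the cheap columns are done.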


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
