XiangpengHao commented on PR #6921:
URL: https://github.com/apache/arrow-rs/pull/6921#issuecomment-2627551092
> I think it has the possibility to cache the decoded pages needed for the entire row group

To clarify, it will only cache up to 2 pages per column: https://github.com/apache/arrow-rs/pull/6921/files#diff-e32cd78c497a3b6a5e49e47d1f7e44590071042201e5bb2c3c20de1c734ff6e5R321-R322

> if this will affect memory usage during query

It will, as we need to fetch the projected columns into memory before applying filters. We may see slightly higher memory usage because of that.

> the next steps for this PR?

This PR comes from a caching-related research project, which is currently being heavily measured for various performance metrics (CPU usage, memory usage, etc.). Those performance numbers will definitely help us better understand the trade-offs of this PR. My plan is to push this further after we submit the paper (hopefully before March).

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
