Apache9 commented on pull request #3675: URL: https://github.com/apache/hbase/pull/3675#issuecomment-918715725
> For a standalone Java program reading a ~5G file in a single JVM (... using the mapreduce snapshot APIs), this change improved run time from 90s to 30s. In a distributed system, it only had about 15% improvement (network became the bottleneck -- that's where HBASE-26274 came into play).

It is a bit surprising to me that there could be a 15% impact on performance. I had assumed there would be little difference, as we only read a very small amount of data with pread. Mind sharing more details here, such as the HFile block size? IIRC, the default config is to switch to stream after reading 4 HFile block sizes.

And I see you have already provided the test code, so let me take a look as well. Maybe we should file an issue about the performance problem with pread switching to stream.

> Merged to master, branch-2, and branch-2.4.

I think we also need this on branch-2.3? It has not been EOL yet.

Thanks.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
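
For reference, the pread-to-stream threshold mentioned in the comment can be worked out with a small sketch. This is not HBase code: the class and method names are hypothetical, the 64 KB block size is an assumed typical default (the real value is set per column family), and the strict "greater than" comparison is an assumption. I believe the setting involved is `hbase.storescanner.pread.max.bytes`, but that name should be verified against the HBase version in use.

```java
public class PreadSwitchSketch {
    // Assumption: 64 KB HFile block size; the actual value is per column family.
    static final long ASSUMED_BLOCK_SIZE = 64L * 1024;

    // From the comment: the scanner switches to stream after reading 4 block sizes.
    static final int BLOCKS_BEFORE_SWITCH = 4;

    // Number of bytes a scanner may read with pread before switching.
    static long preadMaxBytes(long blockSize) {
        return BLOCKS_BEFORE_SWITCH * blockSize;
    }

    // Hypothetical check; the exact comparison used by HBase is an assumption here.
    static boolean shouldSwitchToStream(long bytesRead, long blockSize) {
        return bytesRead > preadMaxBytes(blockSize);
    }

    public static void main(String[] args) {
        long threshold = preadMaxBytes(ASSUMED_BLOCK_SIZE);
        System.out.println("pread threshold in bytes: " + threshold);

        // A ~5G scan passes the threshold almost immediately.
        long fiveGiB = 5L * 1024 * 1024 * 1024;
        System.out.println("switch for 5 GiB scan: "
            + shouldSwitchToStream(fiveGiB, ASSUMED_BLOCK_SIZE));
    }
}
```

Under these assumptions the threshold is 256 KB, a tiny fraction of a ~5G file, which is why only a small share of the data would be expected to go through pread.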
