Apache9 commented on pull request #3675:
URL: https://github.com/apache/hbase/pull/3675#issuecomment-918715725


   > For a standalone Java program reading a ~5G file in a single JVM (... 
using the mapreduce snapshot APIs), this change improved run time from 90s to 
30s. In a distributed system, it only had about 15% improvement (network became 
the bottleneck -- that's where HBASE-26274 came into play).
   
   It is a bit surprising to me that there could be a 15% impact on performance. I 
had supposed there would be little difference, as we only read a very 
small amount of data with pread. Mind sharing more details here, such as the 
HFile block size or anything else? IIRC, the default config is to switch to 
stream after reading 4 HFile blocks' worth of data. I see you have already provided 
the test code, so let me take a look as well. Maybe we should file an issue about 
the performance problem with pread switching to stream.
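
   To make the expectation above concrete, here is a minimal sketch of the 
pread-to-stream switch as described in this comment. It is a simplified model, 
not HBase source code: the class and method names are hypothetical, the 64 KiB 
block size is an assumed default, and the "4 block sizes" multiplier is taken 
from the IIRC statement above (the governing setting is, as far as I recall, 
`hbase.storescanner.pread.max.bytes`).

```java
// Simplified, hypothetical model of the pread -> stream switch; not HBase code.
final class PreadSwitchSketch {
    // Assumption: default HFile block size of 64 KiB.
    static final long ASSUMED_BLOCK_SIZE = 64L * 1024;
    // From the comment above: switch to stream after reading 4 block sizes.
    static final int BLOCKS_BEFORE_SWITCH = 4;

    enum ReadMode { PREAD, STREAM }

    // Number of bytes a scanner may read with pread before switching.
    static long preadMaxBytes(long blockSize) {
        return BLOCKS_BEFORE_SWITCH * blockSize;
    }

    // Read mode in effect once bytesRead bytes have been consumed.
    static ReadMode modeAfter(long bytesRead, long blockSize) {
        return bytesRead > preadMaxBytes(blockSize) ? ReadMode.STREAM : ReadMode.PREAD;
    }

    // Upper bound on the fraction of a file that is read with pread.
    static double preadFraction(long fileSize, long blockSize) {
        return Math.min(1.0, (double) preadMaxBytes(blockSize) / fileSize);
    }
}
```

   Under these assumptions the threshold is 256 KiB, so for a file of roughly 
5 GiB the pread portion is well under 0.01% of the bytes read, which is why a 
15% difference looks surprising.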
   
   > Merged to master, branch-2, and branch-2.4.
   
   I think we also need this on branch-2.3? It has not reached EOL yet.
   
   Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

