[ 
https://issues.apache.org/jira/browse/HDFS-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16781270#comment-16781270
 ] 

Zheng Hu commented on HDFS-3246:
--------------------------------

bq. what benchmarks or tools would you run to ensure it reduces GC overhead? 
As HBASE-21879 describes, we are observing GC pressure on the HBase side when 
running the YCSB benchmark. Almost all of the HBase read/write path is off-heap 
now, with one exception: the pread of an HBase data block from HDFS. Because 
there is no ByteBuffer pread interface, HBase can only read a data block onto 
the heap (see [1]). In the 100% Get case, graphs [2] and [3] indicate that the 
p999 latency is almost the same as the G1 young GC STW pause (~100 ms), so we 
have reason to believe that the on-heap pread increases GC overhead. Verifying 
the reduced GC overhead through HBase does not seem like a good approach, 
though: its stack is too deep and there are too many variables. Is there a 
simple HDFS tool for this? 
(Of course, I'll provide a final benchmark after HBASE-21879 is fixed, but that 
will take some time, and this issue needs to be fixed first.)

1. http://openinx.github.io/images/hbase-offheap-onheap.png
2. https://issues.apache.org/jira/secure/attachment/12958391/gc-data-before-HBASE-21879.png
3. https://issues.apache.org/jira/secure/attachment/12958487/QPS-latencies-before-HBASE-21879.png
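
To make the on-heap versus off-heap distinction above concrete, here is a minimal sketch. It is not an HDFS tool and does not use any HDFS API: it uses a local temp file and java.nio's FileChannel positional read as a stand-in for pread. The class name PreadSketch, the pread helper, and the 64 KB block size are all hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PreadSketch {

    // Hypothetical helper: positional read that fills dst starting at the given
    // file offset, without moving the channel's own position.
    // Returns the number of bytes actually read.
    public static int pread(FileChannel ch, long position, ByteBuffer dst) throws IOException {
        int total = 0;
        while (dst.hasRemaining()) {
            int n = ch.read(dst, position + total);
            if (n < 0) {
                break; // reached end of file
            }
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        int blockSize = 64 * 1024; // assumed block size, for illustration only
        int blocks = 16;
        Path file = Files.createTempFile("pread-sketch", ".bin");
        try {
            Files.write(file, new byte[blockSize * blocks]);
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
                // On-heap variant: each block lands in a fresh heap buffer, which
                // becomes young-generation garbage once the block is dropped.
                long heapBytes = 0;
                for (int i = 0; i < blocks; i++) {
                    ByteBuffer heap = ByteBuffer.allocate(blockSize);
                    heapBytes += pread(ch, (long) i * blockSize, heap);
                }

                // Off-heap variant: one direct buffer is reused for every block,
                // so the block payload never occupies the Java heap.
                ByteBuffer direct = ByteBuffer.allocateDirect(blockSize);
                long directBytes = 0;
                for (int i = 0; i < blocks; i++) {
                    direct.clear();
                    directBytes += pread(ch, (long) i * blockSize, direct);
                }

                System.out.println("bytes read via heap buffers:  " + heapBytes);
                System.out.println("bytes read via direct buffer: " + directBytes);
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Running something like this with GC logging enabled (for example -Xlog:gc on JDK 9+) would be one rough way to compare the allocation behaviour of the two loops, although a local file says nothing about the actual HDFS client path.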


> pRead equivalent for direct read path
> -------------------------------------
>
>                 Key: HDFS-3246
>                 URL: https://issues.apache.org/jira/browse/HDFS-3246
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client, performance
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Henry Robinson
>            Assignee: Sahil Takiar
>            Priority: Major
>         Attachments: HDFS-3246.001.patch, HDFS-3246.002.patch, 
> HDFS-3246.003.patch, HDFS-3246.004.patch
>
>
> There is no pread equivalent in ByteBufferReadable. We should consider adding 
> one. It would be relatively easy to implement for the distributed case 
> (certainly compared to HDFS-2834), since DFSInputStream does most of the 
> heavy lifting.



