[ 
https://issues.apache.org/jira/browse/HBASE-21879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16767152#comment-16767152
 ] 

Zheng Hu commented on HBASE-21879:
----------------------------------

Thanks for the info, [~ram_krish]. I think we can handle this separately for 
HBase and HDFS: 
1.  For HBase, we can still re-implement FSReaderImpl#readAtOffset as a 
ByteBuffer-based method. The stream read can read directly into an abstract 
ByteBuffer (allocated from the ByteBufferPool), while the positional read 
keeps the on-heap byte[] but wraps it in a HeapByteBuffer. Then, at least, the 
stream scan will no longer copy bytes on-heap. 
2.  Send a pull request to HDFS so that the HDFS client supports a ByteBuffer 
positional read. I think it won't be too hard, because we already have 
DFSInputStream#read(ByteBuffer). I'm not sure which future HDFS version will 
support the ByteBuffer pread interface, though, so HBase may need an HDFS 
version upgrade. Once that is available, we can easily refactor the pread 
into a ByteBuffer pread. 
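
To make point 1 concrete, here is a rough sketch of what a ByteBuffer-based 
readAtOffset could look like. It is only an illustration under assumptions: 
BlockReadSketch, BufferStream and InMemoryStream are hypothetical stand-ins 
(not HBase or HDFS classes), and the caller-supplied buffer stands in for one 
taken from the ByteBufferPool. The stream path fills the supplied buffer 
directly, while the pread path still reads into an on-heap byte[] and wraps it. 

```java
import java.io.IOException;
import java.nio.ByteBuffer;

/** Sketch only: every name here is a hypothetical stand-in, not an HBase/HDFS class. */
class BlockReadSketch {

  /** Models a stream with a ByteBuffer stream read but only a byte[] positional read. */
  interface BufferStream {
    void seek(long pos) throws IOException;

    /** Stream read into dst; returns bytes read, or -1 at end of stream. */
    int read(ByteBuffer dst) throws IOException;

    /** Positional read into an on-heap array; returns bytes read, or -1 at end of stream. */
    int read(long pos, byte[] buf, int off, int len) throws IOException;
  }

  /** Reads exactly len bytes starting at fileOffset and returns them as a ByteBuffer. */
  static ByteBuffer readAtOffset(BufferStream in, ByteBuffer pooled, long fileOffset,
      int len, boolean pread) throws IOException {
    if (pread) {
      // Positional read: keep the on-heap byte[] for now, but expose it as a ByteBuffer.
      byte[] onHeap = new byte[len];
      int done = 0;
      while (done < len) {
        int n = in.read(fileOffset + done, onHeap, done, len - done);
        if (n < 0) {
          throw new IOException("Premature EOF after " + done + " bytes");
        }
        done += n;
      }
      return ByteBuffer.wrap(onHeap);
    }
    // Stream read: fill the supplied (possibly off-heap) buffer, no intermediate byte[].
    if (len > pooled.capacity()) {
      throw new IllegalArgumentException("Buffer too small for " + len + " bytes");
    }
    in.seek(fileOffset);
    pooled.clear();
    pooled.limit(len);
    while (pooled.hasRemaining()) {
      if (in.read(pooled) < 0) {
        throw new IOException("Premature EOF at buffer position " + pooled.position());
      }
    }
    pooled.flip();
    return pooled;
  }

  /** In-memory stream used only to exercise the sketch. */
  static final class InMemoryStream implements BufferStream {
    private final byte[] data;
    private int pos;

    InMemoryStream(byte[] data) {
      this.data = data;
    }

    @Override
    public void seek(long p) {
      pos = (int) p;
    }

    @Override
    public int read(ByteBuffer dst) {
      if (pos >= data.length) {
        return -1;
      }
      int n = Math.min(dst.remaining(), data.length - pos);
      dst.put(data, pos, n);
      pos += n;
      return n;
    }

    @Override
    public int read(long p, byte[] buf, int off, int len) {
      if (p >= data.length) {
        return -1;
      }
      int n = Math.min(len, data.length - (int) p);
      System.arraycopy(data, (int) p, buf, off, n);
      return n;
    }
  }
}
```

With a direct buffer passed in, the stream path returns that same buffer 
filled and flipped, so callers see a ByteBuffer in both cases and only the 
pread path still allocates on-heap. 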

> Read HFile's block to ByteBuffer directly instead of to byte for reducing 
> young gc purpose
> ------------------------------------------------------------------------------------------
>
>                 Key: HBASE-21879
>                 URL: https://issues.apache.org/jira/browse/HBASE-21879
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Zheng Hu
>            Assignee: Zheng Hu
>            Priority: Major
>             Fix For: 3.0.0, 2.2.0, 2.3.0, 2.1.4
>
>         Attachments: QPS-latencies-before-HBASE-21879.png, 
> gc-data-before-HBASE-21879.png
>
>
> In HFileBlock#readBlockDataInternal,  we have the following: 
> {code}
> @VisibleForTesting
> protected HFileBlock readBlockDataInternal(FSDataInputStream is, long offset,
>     long onDiskSizeWithHeaderL, boolean pread, boolean verifyChecksum,
>     boolean updateMetrics)
>  throws IOException {
>  // .....
>   // TODO: Make this ByteBuffer-based. Will make it easier to go to HDFS with
>   // BBPool (offheap).
>   byte [] onDiskBlock = new byte[onDiskSizeWithHeader + hdrSize];
>   int nextBlockOnDiskSize = readAtOffset(is, onDiskBlock, preReadHeaderSize,
>       onDiskSizeWithHeader - preReadHeaderSize, true, offset +
>       preReadHeaderSize, pread);
>   if (headerBuf != null) {
>         // ...
>   }
>   // ...
>  }
> {code}
> In the read path, we still read the block from the HFile into an on-heap 
> byte[], then copy that on-heap byte[] to the off-heap bucket cache 
> asynchronously. In my 100% get performance test, I also observed frequent 
> young GCs; the largest memory footprint in the young gen should be the 
> on-heap block byte[].
> In fact, we can read the HFile's block into a ByteBuffer directly instead of 
> into a byte[] to reduce young GC. We did not implement this before because 
> the older HDFS client had no ByteBuffer reading interface, but HDFS 2.7+ 
> supports it now, so I think we can fix this now. 
> Will provide a patch and some perf comparison for this. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
