[jira] [Commented] (HBASE-21879) Read HFile's block to ByteBuffer directly instead of to byte for reducing young gc purpose

ramkrishna.s.vasudevan (JIRA) Thu, 21 Feb 2019 20:34:55 -0800


    [ 
https://issues.apache.org/jira/browse/HBASE-21879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774760#comment-16774760
 ]


ramkrishna.s.vasudevan commented on HBASE-21879:
------------------------------------------------

bq.And you can get a ByteBuffer from a netty ByteBuf, by calling the nioBuffer 
method, no different from our ByteBuff. And we have CompositeByteBuf where we 
can have multiple ByteBuf combined.

Thanks [~Apache9]. Yes in the recent years seeing some Netty code - I was 
thinking while typing the above comment that your reply on using nioBuffer or 
CompositeByteBuf will be the answer for it. The ref count and the resource 
leaking detection may be different so I could be wrong there. 

Ya it will be a big project. The Cell,, Cellcomparators, CellUtils all needs to 
be changed and that will alone  be a big change. doing it in a seperate branch 
will be better. 
Thanks for the useful discussions here.


> Read HFile's block to ByteBuffer directly instead of to byte for reducing 
> young gc purpose
> ------------------------------------------------------------------------------------------
>
>                 Key: HBASE-21879
>                 URL: https://issues.apache.org/jira/browse/HBASE-21879
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Zheng Hu
>            Assignee: Zheng Hu
>            Priority: Major
>             Fix For: 3.0.0, 2.2.0, 2.3.0, 2.1.4
>
>         Attachments: QPS-latencies-before-HBASE-21879.png, 
> gc-data-before-HBASE-21879.png
>
>
> In HFileBlock#readBlockDataInternal,  we have the following: 
> {code}
> @VisibleForTesting
> protected HFileBlock readBlockDataInternal(FSDataInputStream is, long offset,
>     long onDiskSizeWithHeaderL, boolean pread, boolean verifyChecksum, 
> boolean updateMetrics)
>  throws IOException {
>  // .....
>   // TODO: Make this ByteBuffer-based. Will make it easier to go to HDFS with 
> BBPool (offheap).
>   byte [] onDiskBlock = new byte[onDiskSizeWithHeader + hdrSize];
>   int nextBlockOnDiskSize = readAtOffset(is, onDiskBlock, preReadHeaderSize,
>       onDiskSizeWithHeader - preReadHeaderSize, true, offset + 
> preReadHeaderSize, pread);
>   if (headerBuf != null) {
>         // ...
>   }
>   // ...
>  }
> {code}
> In the read path,  we still read the block from hfile to on-heap byte[], then 
> copy the on-heap byte[] to offheap bucket cache asynchronously,  and in my  
> 100% get performance test, I also observed some frequent young gc,  The 
> largest memory footprint in the young gen should be the on-heap block byte[].
> In fact, we can read HFile's block to ByteBuffer directly instead of to 
> byte[] for reducing young gc purpose. we did not implement this before, 
> because no ByteBuffer reading interface in the older HDFS client, but 2.7+ 
> has supported this now,  so we can fix this now. I think. 
> Will provide an patch and some perf-comparison for this. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HBASE-21879) Read HFile's block to ByteBuffer directly instead of to byte for reducing young gc purpose

Reply via email to