[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

Enis Soztutar (JIRA) Fri, 29 Jan 2016 11:09:07 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15124005#comment-15124005
 ]


Enis Soztutar commented on HBASE-15180:
---------------------------------------

bq. Maybe BB based then. Again, I am trying to experiment with reading the 
request not just into one single large BB. From the pool we might get 
fixed-size smaller BBs (say 64 KB or so) and read into as many of those BBs as 
needed. The CellScanner then needs to work on a set of BBs (like the 
MultiByteBuff stuff). Then again, even this BB-based API is an issue. 
Continuing with an InputStream-based API gives us the freedom to experiment 
with these different data structures.
We are still creating an InputStream per request, though that is not the end of 
the world. At the time of the request we do know the RPC buffer size required. 
I was trying to reuse the BBPool from Hadoop, which can return a buffer at 
least as large as the request size.
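
To illustrate what "a buffer at least as large as the request size" could 
mean, here is a minimal sketch of a capacity-keyed pool. It is not Hadoop's 
pool and not part of the patch: the class name SizedBufferPoolSketch and its 
methods are hypothetical, and it only handles heap buffers.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of a pool that hands out buffers of at least the requested size. */
final class SizedBufferPoolSketch {
    // Free buffers grouped by capacity; TreeMap allows a "smallest capacity >= n" lookup.
    private final TreeMap<Integer, ArrayDeque<ByteBuffer>> free = new TreeMap<>();

    /** Returns a cleared buffer whose capacity is at least minCapacity. */
    synchronized ByteBuffer getBuffer(int minCapacity) {
        Map.Entry<Integer, ArrayDeque<ByteBuffer>> e = free.ceilingEntry(minCapacity);
        if (e == null) {
            // Nothing pooled is large enough, so allocate exactly what the request needs.
            return ByteBuffer.allocate(minCapacity);
        }
        ByteBuffer bb = e.getValue().pollFirst();
        if (e.getValue().isEmpty()) {
            free.remove(e.getKey());
        }
        bb.clear();
        return bb;
    }

    /** Returns a buffer to the pool once the request has been processed. */
    synchronized void putBuffer(ByteBuffer bb) {
        free.computeIfAbsent(bb.capacity(), k -> new ArrayDeque<>()).addLast(bb);
    }

    public static void main(String[] args) {
        SizedBufferPoolSketch pool = new SizedBufferPoolSketch();
        ByteBuffer first = pool.getBuffer(100);
        pool.putBuffer(first);
        // A smaller request can reuse the pooled, larger buffer.
        ByteBuffer reused = pool.getBuffer(64);
        System.out.println(reused == first);
    }
}
```

Because the request size is known before the read, the caller can ask for it 
directly and read the whole request into one buffer, at the cost of sometimes 
holding a buffer larger than needed.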

bq. Yes, we have MSLAB enabled by default. I agree that doing the MSLAB check 
in the RPC layer looks ugly. I wanted to avoid referring to the request's read 
byte[] (from memstore cells) when someone turns MSLAB off. So what do you say? 
Remove this check?
We should open a separate issue to remove the option for disabling MSLAB. Then 
this patch can assume that MSLAB is always enabled.

> Reduce garbage created while reading Cells from Codec Decoder
> -------------------------------------------------------------
>
>                 Key: HBASE-15180
>                 URL: https://issues.apache.org/jira/browse/HBASE-15180
>             Project: HBase
>          Issue Type: Sub-task
>          Components: regionserver
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>             Fix For: 2.0.0
>
>         Attachments: HBASE-15180.patch, HBASE-15180_V2.patch
>
>
> In KeyValueDecoder#parseCell (the default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. It first creates 
> a byte[] of length 4 to read the cell length, then allocates a second byte[] 
> of that length, reads the cell bytes into it, and creates a KV over it.
> On the server we actually read each request into a byte[], and the 
> CellScanner is created over a ByteArrayInputStream wrapping that array. On 
> the write path MSLAB is ON by default, so while adding Cells to the memstore 
> we copy the Cell bytes into MSLAB memory chunks (default size 2 MB) and 
> recreate the Cells over those bytes. So there is no issue if the Decoder 
> creates Cells directly over the byte[] read from the RPC. There is no need to 
> create and copy 2 byte[]s for every Cell in the request.
> My plan is to make a Cell-aware ByteArrayInputStream which can read Cells 
> directly from it.
> The same Codec path is also used on the client side. There it is better to 
> avoid this direct Cell creation and keep copying into smaller byte[]s. I plan 
> to introduce something like a CodecContext, associated with every Codec 
> instance, which can tell whether the context is server or client.
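
The difference between the two decode paths described above can be sketched 
as follows. This is a simplified illustration, not HBase code: CellRef, 
decodeWithCopy, decodeInPlace and frame are hypothetical names, and a "cell" 
here is just a 4-byte big-endian length followed by that many bytes.

```java
import java.io.ByteArrayInputStream;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch contrasting copy-per-cell decoding with in-place decoding. */
final class CellDecodeSketch {

    /** A cell viewed as a slice (offset, length) of some backing array. */
    static final class CellRef {
        final byte[] buf;
        final int offset;
        final int length;

        CellRef(byte[] buf, int offset, int length) {
            this.buf = buf;
            this.offset = offset;
            this.length = length;
        }
    }

    /** Copying path: allocates a 4-byte length array and a fresh array for every cell. */
    static List<CellRef> decodeWithCopy(byte[] request) {
        List<CellRef> cells = new ArrayList<>();
        ByteArrayInputStream in = new ByteArrayInputStream(request);
        while (in.available() > 0) {
            byte[] lenBytes = new byte[4];
            if (in.read(lenBytes, 0, 4) != 4) {
                throw new IllegalArgumentException("truncated length");
            }
            int len = ByteBuffer.wrap(lenBytes).getInt();
            byte[] cellBytes = new byte[len];
            if (len > 0 && in.read(cellBytes, 0, len) != len) {
                throw new IllegalArgumentException("truncated cell");
            }
            cells.add(new CellRef(cellBytes, 0, len));
        }
        return cells;
    }

    /** In-place path: every cell points into the request array, so nothing is copied. */
    static List<CellRef> decodeInPlace(byte[] request) {
        List<CellRef> cells = new ArrayList<>();
        ByteBuffer bb = ByteBuffer.wrap(request);
        while (bb.remaining() >= 4) {
            int len = bb.getInt();
            if (len < 0 || len > bb.remaining()) {
                throw new IllegalArgumentException("truncated cell");
            }
            cells.add(new CellRef(request, bb.position(), len));
            bb.position(bb.position() + len);
        }
        return cells;
    }

    /** Test helper: frames each payload as a 4-byte length followed by its bytes. */
    static byte[] frame(byte[]... payloads) {
        int total = 0;
        for (byte[] p : payloads) {
            total += 4 + p.length;
        }
        ByteBuffer bb = ByteBuffer.allocate(total);
        for (byte[] p : payloads) {
            bb.putInt(p.length).put(p);
        }
        return bb.array();
    }

    public static void main(String[] args) {
        byte[] request = frame(new byte[] {1, 2, 3}, new byte[] {9});
        System.out.println(decodeWithCopy(request).get(0).buf == request);
        System.out.println(decodeInPlace(request).get(0).buf == request);
    }
}
```

The in-place variant is only safe when something downstream copies the bytes 
before the request array is released or reused, which is the role the 
description assigns to the MSLAB copy on the server write path.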



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
