[ https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15124145#comment-15124145 ]
stack commented on HBASE-15180:
-------------------------------
bq. Even with G1, this is unexpected. Is there a theoretical explanation?
Why do you say that, [~enis]? Here's a SWAG. MSLAB exists to prevent heap
fragmentation under CMS. Every Cell gets copied into a SLAB, and the SLABs
themselves have to be allocated (no reuse). G1GC avoids fragmentation by
copying live objects to a new region when a region becomes fragmented. Many
small copies plus SLAB allocations cost more than the relatively coarse
copies G1GC does. To be verified...
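
To make the cost argument concrete, here is a minimal sketch of slab-style
allocation: each cell is copied into the current chunk, and a fresh chunk is
allocated when the current one is full. This is not the HBase MSLAB code. The
class SlabSketch, its fields, and copyIn are hypothetical names used only for
illustration, and the 2 MB default is taken from the issue description.

```java
/** Hypothetical illustration only; this is not the HBase MSLAB implementation. */
public class SlabSketch {
    /** Default chunk size mentioned in the issue description (2 MB). */
    static final int DEFAULT_CHUNK_SIZE = 2 * 1024 * 1024;

    private final int chunkSize;
    private byte[] current;   // chunk currently being filled
    private int next;         // next free offset in the current chunk
    int chunksAllocated;      // how many chunks were allocated (never reused)
    long bytesCopied;         // total cell bytes copied into chunks

    SlabSketch(int chunkSize) {
        this.chunkSize = chunkSize;
    }

    /** Copies the cell bytes into the current chunk and returns their offset. */
    int copyIn(byte[] cell) {
        if (cell.length > chunkSize) {
            throw new IllegalArgumentException("cell larger than chunk");
        }
        if (current == null || next + cell.length > chunkSize) {
            // Chunk is full (or missing): allocate a new one, no reuse.
            current = new byte[chunkSize];
            next = 0;
            chunksAllocated++;
        }
        System.arraycopy(cell, 0, current, next, cell.length);
        int offset = next;
        next += cell.length;
        bytesCopied += cell.length;
        return offset;
    }

    public static void main(String[] args) {
        SlabSketch slab = new SlabSketch(DEFAULT_CHUNK_SIZE);
        for (int i = 0; i < 100_000; i++) {
            slab.copyIn(new byte[64]);   // one small copy per cell
        }
        System.out.println("chunks=" + slab.chunksAllocated
            + " bytesCopied=" + slab.bytesCopied);
    }
}
```

The per-cell System.arraycopy and the periodic chunk allocation are the two
costs the guess above refers to. Whether they outweigh G1GC's region copies
still has to be measured.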
> Reduce garbage created while reading Cells from Codec Decoder
> -------------------------------------------------------------
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
> Issue Type: Sub-task
> Components: regionserver
> Reporter: Anoop Sam John
> Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch
>
>
> In KeyValueDecoder#parseCell (the default Codec decoder) we use
> KeyValueUtil#iscreate to read cells from the InputStream. There we first
> create a byte[] of length 4 and read the cell length into it, then create a
> second byte[] of the cell's length, read the cell bytes into it, and create a
> KV over it.
> On the server we actually read each request into a byte[], and the CellScanner
> is created on top of a ByteArrayInputStream over that byte[]. MSLAB is ON by
> default in the write path, so while adding Cells to the memstore we copy the
> Cell bytes into MSLAB memory chunks (default size 2 MB) and recreate the Cells
> over those bytes. So there is no issue if the Decoder creates Cells directly
> over the byte[] read from the RPC. There is no need to create 2 byte[]s and do
> a copy for every Cell in the request.
> My plan is to make a Cell-aware ByteArrayInputStream from which Cells can be
> read directly.
> The same Codec path is also used on the client side. There it is better to
> avoid this direct Cell creation and keep copying into smaller byte[]s. The
> plan is to introduce something like a CodecContext, associated with every
> Codec instance, that indicates whether the context is server or client.
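
The difference between the two decode paths described above can be sketched
as follows. This is only an illustration under assumptions, not the patch:
CellDecodeSketch, CellRef, frame, decodeWithCopy and decodeInPlace are
hypothetical names, and the wire format is simplified to a 4-byte big-endian
length followed by the cell bytes.

```java
import java.io.ByteArrayInputStream;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names are HBase APIs. */
public class CellDecodeSketch {

    /** A view of one cell inside the shared request buffer (no copy). */
    static final class CellRef {
        final byte[] buf;
        final int offset;
        final int length;

        CellRef(byte[] buf, int offset, int length) {
            this.buf = buf;
            this.offset = offset;
            this.length = length;
        }
    }

    /** Builds a request buffer: [4-byte length][payload] for each payload. */
    static byte[] frame(byte[]... payloads) {
        int total = 0;
        for (byte[] p : payloads) {
            total += 4 + p.length;
        }
        ByteBuffer out = ByteBuffer.allocate(total);
        for (byte[] p : payloads) {
            out.putInt(p.length).put(p);
        }
        return out.array();
    }

    /** Copying path: two new byte[]s and one copy for every cell. */
    static List<byte[]> decodeWithCopy(byte[] request) {
        List<byte[]> cells = new ArrayList<>();
        ByteArrayInputStream in = new ByteArrayInputStream(request);
        while (in.available() > 0) {
            byte[] lenBytes = new byte[4];            // first allocation
            in.read(lenBytes, 0, lenBytes.length);
            int len = ByteBuffer.wrap(lenBytes).getInt();
            byte[] cell = new byte[len];              // second allocation
            in.read(cell, 0, len);                    // copy of the cell bytes
            cells.add(cell);
        }
        return cells;
    }

    /** In-place path: cells are offsets into the request byte[] itself. */
    static List<CellRef> decodeInPlace(byte[] request) {
        List<CellRef> cells = new ArrayList<>();
        int pos = 0;
        while (pos < request.length) {
            int len = ByteBuffer.wrap(request, pos, 4).getInt();
            pos += 4;
            cells.add(new CellRef(request, pos, len));
            pos += len;
        }
        return cells;
    }

    public static void main(String[] args) {
        byte[] request = frame("row1".getBytes(), "row22".getBytes());
        System.out.println(decodeWithCopy(request).size() + " cells copied");
        System.out.println(decodeInPlace(request).size() + " cells referenced");
    }
}
```

The in-place variant keeps the whole request byte[] reachable for as long as
any cell refers to it. That is acceptable on the server write path, where
MSLAB copies the bytes again anyway, and it is the reason the description
keeps the copying path on the client side.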