[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15127259#comment-15127259
 ] 

stack commented on HBASE-15180:
-------------------------------

bq. Codec and its Decoder being non private, we can not directly change the 
params/return types.

Codec was done long time ago. It was kept minimal until use cases were better 
known. We are probably only user. Lets look at a case-by-case basis. At least 
we can add methods and deprecate the old.

bq. And same Decoder used in WAL reading as well. 

Good. That was intent that it would be used everywhere. Pity it is not used 
reading and writing hfiles.

bq. Now we are returning a BAIS object from this IpcUtil method.

I see what you are saying. Rather than BAIS, instead a CIS, one that does Cells 
more natively. That sounds good.  As long as the CellIS is an IS, we can use 
Codec Interfaces.

Rather than pass a boolean to the method to do direct or tags, could you return 
a different Codec implementation? A server-side Codec and/or tags-capable 
(would be better if seriialization figured if tags were present rather than a 
meta boolean passed in by the server)? Would we still need to do context if 
serverside codec and clientside codec?

Looking at CellUtil#createCell, its content is same as what is in 
CellReadableByteArrayInputStream; its duplicated code?

We are pivoting on the underlying Stream being a BAIS. Will it always be a 
BAIS? Will it ever be a DBB?


bq. While reading from WAL, same path of flow is executed and then the IS wont 
be CellInputStream type.

Why not and should it?


bq.  Only diff is instead of BAIS we make CellBAIS object so we save copy.

Could the createCell save a copy in same way? Look see if a BAIS and if so, use 
its buffer and offset creating the Cell?

bq. As u suggest we can have new config which can be turned on at server side 
and off at client side/

Pardon me. Config is a bad idea.  Does the caller have enough context to make a 
server vs client Decoder? How did we do this for tags? It had to know if server 
vs client?  We could also pull createCellScanner out of IPCUtil. Maybe 
createCellScanner does not belong in IPCUtil? We can then in RPCClient or 
RPCServer create different cellscanners.  Or, each creates an IPCUtil. We could 
pass in a 'Context' on construction of IPCUtil and in it it would say if Client 
or Server.

The logic in CellReadablePBIS is same as for PBIS?  We use PBIS just so we can 
figure if EOF.







> Reduce garbage created while reading Cells from Codec Decoder
> -------------------------------------------------------------
>
>                 Key: HBASE-15180
>                 URL: https://issues.apache.org/jira/browse/HBASE-15180
>             Project: HBase
>          Issue Type: Sub-task
>          Components: regionserver
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>             Fix For: 2.0.0
>
>         Attachments: HBASE-15180.patch, HBASE-15180_V2.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to