[
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15127620#comment-15127620
]
Anoop Sam John commented on HBASE-15180:
----------------------------------------
bq.I see what you are saying. Rather than BAIS, instead a CIS, one that does
Cells more natively. That sounds good. As long as the CellIS is an IS, we can
use Codec Interfaces.
Yes
bq.Rather than pass a boolean to the method to do direct or tags, could you
return a different Codec implementation? A server-side Codec and/or
tags-capable (would be better if serialization figured if tags were present
rather than a meta boolean passed in by the server)? Would we still need to do
context if serverside codec and clientside codec?
Yes, let me see. I agree that passing the boolean throughout is ugly. As of
now we don't support passing tags from client to server or the reverse. The
Codec has to serialize tags only for Replication, which is why we have a
separate Codec (KVCodecWithTags). But both of these Codecs use the same
BaseDecoder and Cell creation paths. Let me see how we can solve this.
bq.We are pivoting on the underlying Stream being a BAIS. Will it always be a
BAIS? Will it ever be a DBB?
It can be a DBB later, yes, once we start reading the request into a (pooled)
DBB. That said, we have no hard need for the underlying IS to be a BAIS. That
is the reason why I did not add something like a new API in Codec that asks the
Decoder to work on a byte[] or similar. Keeping it IS based gives us the
freedom to change the underlying data structure. We can make a ByteBufferIS
from which Cells can be read, and create Cells directly (without a copy) over
the underlying DBB. (We have OffheapCell now.)
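
A rough sketch of what such a ByteBuffer backed stream could look like. This is not code from the patch: the class name ByteBufferCellInputStream and the method readSlice are hypothetical, a plain 4-byte big-endian length prefix is assumed as the framing, and a ByteBuffer view stands in for a Cell created over the DBB.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

// Hypothetical stream over a (possibly direct) ByteBuffer. It is still an
// InputStream, so stream-based decoder interfaces keep working, but it can
// also hand out views that share the buffer's memory instead of copying it.
class ByteBufferCellInputStream extends InputStream {
  private final ByteBuffer buf;

  ByteBufferCellInputStream(ByteBuffer source) {
    // duplicate() shares the content but keeps an independent position/limit.
    this.buf = source.duplicate();
  }

  @Override
  public int read() {
    return buf.hasRemaining() ? (buf.get() & 0xff) : -1;
  }

  @Override
  public int available() {
    return buf.remaining();
  }

  // Assumed framing: 4-byte big-endian length, then that many payload bytes.
  // Returns a view of the payload; no payload bytes are copied.
  ByteBuffer readSlice() throws IOException {
    if (buf.remaining() < 4) {
      throw new EOFException("missing length prefix");
    }
    int len = buf.getInt();
    if (len < 0 || len > buf.remaining()) {
      throw new EOFException("truncated record, length " + len);
    }
    ByteBuffer view = buf.duplicate();
    view.limit(view.position() + len);
    ByteBuffer slice = view.slice();
    buf.position(buf.position() + len); // skip past the payload
    return slice;
  }
}
```

Since the view shares memory with the request buffer, this only works if the buffer is not recycled while the view is still in use.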
bq.Could the createCell save a copy in same way? Look see if a BAIS and if so,
use its buffer and offset creating the Cell?
You mean the createCell in CellUtil? No. Even if the IS is a BAIS, we cannot
directly make a Cell (without any copy). BAIS does not expose its backing
byte[] buffer, so we would need an indirect way of grabbing the byte[] from it.
That is why I made CellBAIS extend BAIS, with an extra API to create a Cell
directly from its underlying buffer (without any copy).
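
To illustrate the idea (a minimal sketch, not the class from the patch): ByteArrayInputStream keeps its array in the protected fields buf, pos and count, so a subclass can reach them even though callers cannot. The names CellSliceInputStream, Slice and readSlice are hypothetical, the framing is assumed to be a 4-byte big-endian length prefix, and Slice stands in for a Cell built over the backing array.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;

// Hypothetical offset/length view onto a backing array; stands in for a
// Cell that references the array instead of owning a copy.
final class Slice {
  final byte[] array;
  final int offset;
  final int length;

  Slice(byte[] array, int offset, int length) {
    this.array = array;
    this.offset = offset;
    this.length = length;
  }
}

// Hypothetical BAIS subclass. The protected fields buf, pos and count of
// ByteArrayInputStream are visible here, so no copy is needed.
class CellSliceInputStream extends ByteArrayInputStream {
  CellSliceInputStream(byte[] buf) {
    super(buf);
  }

  // Assumed framing: 4-byte big-endian length, then that many payload bytes.
  synchronized Slice readSlice() throws IOException {
    if (count - pos < 4) {
      throw new EOFException("missing length prefix");
    }
    int len = ((buf[pos] & 0xff) << 24)
        | ((buf[pos + 1] & 0xff) << 16)
        | ((buf[pos + 2] & 0xff) << 8)
        | (buf[pos + 3] & 0xff);
    if (len < 0 || len > count - pos - 4) {
      throw new EOFException("truncated record, length " + len);
    }
    Slice slice = new Slice(buf, pos + 4, len); // view only, no allocation of payload bytes
    pos += 4 + len;
    return slice;
  }
}
```

The two per-record byte[] allocations (the length array and the payload array) disappear; only the small view object is created.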
{quote}
bq.While reading from WAL, same path of flow is executed and then the IS wont
be CellInputStream type.
Why not and should it?
{quote}
From that stream we cannot make a Cell directly without a copy. The
underlying stream is the one from DFS. That said, CellInputStream is a stream
from which we can make a Cell DIRECTLY WITHOUT COPY. Is the name confusing?
From other streams we can also read Cells, but with a copy.
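
For contrast, the copy path that works on any InputStream looks roughly like this. It is a simplified sketch of the behavior the issue description attributes to KeyValueUtil#iscreate, not the HBase code itself; CopyingCellReader and readRecord are hypothetical names, and the returned byte[] stands in for the KV that would be created over it.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: reads one length-prefixed record from an arbitrary
// stream (for example one backed by DFS), copying the payload.
final class CopyingCellReader {
  private CopyingCellReader() {}

  static byte[] readRecord(InputStream in) throws IOException {
    // DataInputStream does no buffering of its own, so wrapping per call
    // does not swallow bytes that belong to the next record.
    DataInputStream data = new DataInputStream(in);
    int len = data.readInt(); // 4-byte big-endian length; EOFException at end of stream
    if (len < 0) {
      throw new IOException("negative record length: " + len);
    }
    byte[] bytes = new byte[len]; // fresh array per record: this is the copy
    data.readFully(bytes);
    return bytes;
  }
}
```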
Yes, I agree, I also don't prefer new configs. Actually we can avoid this and
any context. We can have 2 paths for creating the Decoder, one in the client
and one in the server. We can even move it out of IPCUtil as well. Let me see.
> Reduce garbage created while reading Cells from Codec Decoder
> -------------------------------------------------------------
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
> Issue Type: Sub-task
> Components: regionserver
> Reporter: Anoop Sam John
> Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch
>
>
> In KeyValueDecoder#parseCell (the default Codec decoder) we use
> KeyValueUtil#iscreate to read cells from the InputStream. There we first
> create a byte[] of length 4 and read the cell length into it, then create an
> array of the cell's length, read the cell bytes into it, and create a KV.
> Actually, in the server we read the requests into a byte[], and the
> CellScanner is created on top of a ByteArrayInputStream over it. By default
> MSLAB usage is ON in the write path, so while adding Cells to the memstore we
> copy the Cell bytes into MSLAB memory chunks (default 2 MB size) and recreate
> the Cells over those bytes. So there is no issue if we create Cells directly
> over the RPC read byte[] here in the Decoder. There is no need to create 2
> byte[]s and copy for every Cell in the request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells
> directly from it.
> The same Codec path is used on the client side as well. There it is better
> to avoid this direct Cell creation and keep copying into smaller byte[]s. The
> plan is to introduce something like a CodecContext associated with every
> Codec instance, which can say whether the context is server or client.