[
https://issues.apache.org/jira/browse/HBASE-14882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15476403#comment-15476403
]
Xiang Li commented on HBASE-14882:
----------------------------------
Hi [~anoop.hbase], I have some questions, could you please take a look at them
when you have time?
1. KeyValue implements 5 more interfaces beside Cell, they are HeapSize,
Cloneable, SettableSequenceId, SettableTimestamp, Streamable(Streamable is only
in master branch for 2.0.0).
Do I need to implement them for IndividualBytesFieldCell?
I think SettableSequenceId is a must, because I found an error when testing my
patch without SettableSequenceId implemented.
When HBase is re-started, the following exception can be found in RS log when
master asks the RS to open a region
{code}
java.io.IOException: java.lang.UnsupportedOperationException: Cell is not of
type org.apache.hadoop.hbase.SettableSequenceId
at org.apache.hadoop.hbase.CellUtil.setSequenceId(CellUtil.java:676)
at
org.apache.hadoop.hbase.regionserver.wal.FSWALEntry.stampRegionSequenceId(FSWALEntry.java:124)
at
org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.append(FSHLog.java:1853)
at
org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1750)
at
org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1672)
at
com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.UnsupportedOperationException: Cell is not of type
org.apache.hadoop.hbase.SettableSequenceId
... 9 more
{code}
SettableSequenceId and SettableTimestamp are server-side code to me, but Put is
client-side code. I am just wondering why the Cell implementation used in
Put(in client side) are expected to implement some server-side code.
2. About Streamable interface which is added by HBASE-13754.
(1) Do I need to implement Streamable as well?
Regarding the description of HBASE-13754, you mentioned:
bq. For other cell implementation, we will call getXXXLength() and
getXXXArray() and write each component one after the other
It can improve the efficiency if I have a single backing array as a whole and
implement Streamable, but it is still ok if it is not implemented. I will just
call getXXXArray(), getXXXLength and getXXXOffset() and write each field one by
one if I override Streamable#write(). So it is not a must, do I get it right?
(2) I read your discussion with Stack but still has a question for not having a
method to do de-serialization (that is, read()) to pair with write().
Regarding your comment in HBASE-13754 when explaining your idea why there is
only write() but no read()
{quote}
I think that is the correct path especially if we add the method to Cell
interface itself. All the Cell impls to have default constructor. The Codec
will create a Cell object of its choice and ask the object to make its data
from the stream. But with the new interface type I thought to just leave it as
we are not sure whether all Cells will impl the new interface
{quote}
Do you mean that if one needs to do the de-serialization against my new Cell
implementation - IndividualBytesFieldCell, he/she needs to cut each field from
stream and then call the constructor to build a new Cell or a new
IndividualBytesFieldCell? But in this way, he/she needs to read and fully
understand what I code in IndividualBytesFieldCell#write(). Does it not quite
follow the rule of encapsulation and not exposing the internal implementation?
I mean, he/she must know clearly about the internal fields of
IndividualBytesFieldCell and how I code write(), in order to implement read()
correctly.
> Provide a Put API that adds the provided family, qualifier, value without
> copying
> ---------------------------------------------------------------------------------
>
> Key: HBASE-14882
> URL: https://issues.apache.org/jira/browse/HBASE-14882
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 1.2.0
> Reporter: Jerry He
> Assignee: Xiang Li
> Fix For: 2.0.0
>
> Attachments: HBASE-14882.master.000.patch,
> HBASE-14882.master.001.patch
>
>
> In the Put API, we have addImmutable()
> {code}
> /**
> * See {@link #addColumn(byte[], byte[], byte[])}. This version expects
> * that the underlying arrays won't change. It's intended
> * for usage internal HBase to and for advanced client applications.
> */
> public Put addImmutable(byte [] family, byte [] qualifier, byte [] value)
> {code}
> But in the implementation, the family, qualifier and value are still being
> copied locally to create kv.
> Hopefully we should provide an API that truly uses immutable family,
> qualifier and value.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)