[
https://issues.apache.org/jira/browse/HBASE-14882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15721141#comment-15721141
]
Xiang Li commented on HBASE-14882:
----------------------------------
[~anoop.hbase] I uploaded patch 005 for the master branch to address your
comments, and I have some questions 8-)
1. The following changes were made according to your comments:
1.1. Updated write(OutputStream out, boolean withTags) to avoid a local copy
of the byte[], with reference to ValueAndTagRewriteCell
1.2. Updated heapOverhead() to
(a) Include array headers in heapOverhead()
(b) Use TIMESTAMP_TYPE_SIZE as the sum of the sizes of timestamp and type
(c) Make FIXED_OVERHEAD a static final, calculated when the class is
initialized
1.3. Updated deepClone() to return a KeyValue object
1.4. Corrected the indentation; all updated files have been checked
1.5. Opened a new JIRA, HBASE-17254, to track a possible update for
alignment when calculating heapOverhead()
2. Some questions
2.1. When calculating heapOverhead(), I don't think I can make the whole value
a constant and have heapOverhead() return that constant directly. There are
2 parts: the first part is FIXED_OVERHEAD, which can be a constant. But the
second part, the array headers for all backing byte arrays, has to be
calculated after the instance has been created, because for family, qualifier,
value and tags, ClassSize.ARRAY is added only when the array is not null.
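
A minimal sketch of the split described in 2.1, not the code in the patch. The
class name HeapOverheadSketch is hypothetical, the numeric sizes are placeholder
assumptions (ClassSize derives the real values from the JVM), and the row array
is assumed to be always non-null:

```java
import java.util.Objects;

/** Illustration only: this is not HBase's cell class. */
final class HeapOverheadSketch {
    // Assumed placeholder sizes, chosen only to make the sketch runnable.
    static final long OBJECT_HEADER = 16;
    static final long REFERENCE = 8;
    static final long ARRAY_HEADER = 24; // stands in for ClassSize.ARRAY
    static final long TIMESTAMP_TYPE_SIZE = Long.BYTES + Byte.BYTES;

    // Constant part: evaluated once, when the class is initialized.
    static final long FIXED_OVERHEAD =
        OBJECT_HEADER + 5 * REFERENCE + TIMESTAMP_TYPE_SIZE;

    private final byte[] row;       // assumed to be always non-null
    private final byte[] family;    // may be null
    private final byte[] qualifier; // may be null
    private final byte[] value;     // may be null
    private final byte[] tags;      // may be null

    HeapOverheadSketch(byte[] row, byte[] family, byte[] qualifier,
                       byte[] value, byte[] tags) {
        this.row = Objects.requireNonNull(row, "row");
        this.family = family;
        this.qualifier = qualifier;
        this.value = value;
        this.tags = tags;
    }

    /** Per-instance part: one array header for each backing array that exists. */
    long heapOverhead() {
        long overhead = FIXED_OVERHEAD + ARRAY_HEADER; // header of the row array
        if (family != null) overhead += ARRAY_HEADER;
        if (qualifier != null) overhead += ARRAY_HEADER;
        if (value != null) overhead += ARRAY_HEADER;
        if (tags != null) overhead += ARRAY_HEADER;
        return overhead;
    }

    public static void main(String[] args) {
        byte[] b = new byte[1];
        System.out.println(new HeapOverheadSketch(b, b, b, b, b).heapOverhead());
        System.out.println(new HeapOverheadSketch(b, null, null, b, null).heapOverhead());
    }
}
```

Two instances that differ only in which arrays are null report different
overheads, which is why the result cannot be a single class-level constant.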
2.2. In write(OutputStream out, boolean withTags), I return
getSerializedSize(withTags) directly as the number of bytes written. I saw that
you calculate len in ValueAndTagRewriteCell's write() by adding up the sizes
after each write to the output stream. Your method is the safest, but
returning getSerializedSize(withTags) would be more concise. Do you think
it is safe to return getSerializedSize(withTags) directly? Based on my tests, it
is safe, but I am not sure whether there are conditions I did not cover. Please
advise.
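
To illustrate question 2.2, here is a rough sketch comparing the two ways of
reporting the byte count. It does not use the real KeyValue layout: the class
WriteLengthSketch, the method names writeCounting and sizesAgree, and the
simplified format (two int lengths, key, value, optional short-prefixed tags)
are all assumptions made for the example:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

/** Illustration only: a simplified layout, not HBase's serialization format. */
final class WriteLengthSketch {
    private final byte[] key;   // stands in for the serialized key
    private final byte[] value;
    private final byte[] tags;  // may be null

    WriteLengthSketch(byte[] key, byte[] value, byte[] tags) {
        this.key = key;
        this.value = value;
        this.tags = tags;
    }

    private boolean hasTags(boolean withTags) {
        return withTags && tags != null && tags.length > 0;
    }

    /** Size computed up front from the field lengths. */
    int getSerializedSize(boolean withTags) {
        int size = 2 * Integer.BYTES + key.length + value.length;
        if (hasTags(withTags)) {
            size += Short.BYTES + tags.length;
        }
        return size;
    }

    /** Writes the cell and sums the length after each individual write. */
    int writeCounting(OutputStream out, boolean withTags) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        int len = 0;
        data.writeInt(key.length);   len += Integer.BYTES;
        data.writeInt(value.length); len += Integer.BYTES;
        data.write(key);             len += key.length;
        data.write(value);           len += value.length;
        if (hasTags(withTags)) {
            data.writeShort(tags.length); len += Short.BYTES;
            data.write(tags);             len += tags.length;
        }
        data.flush();
        return len;
    }

    /** True when the counted length, the precomputed size, and the actual output agree. */
    static boolean sizesAgree(WriteLengthSketch cell, boolean withTags) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            int counted = cell.writeCounting(out, withTags);
            return counted == out.size() && counted == cell.getSerializedSize(withTags);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        WriteLengthSketch cell = new WriteLengthSketch(new byte[7], new byte[3], new byte[5]);
        System.out.println(sizesAgree(cell, true) && sizesAgree(cell, false));
    }
}
```

In this sketch the two values agree only because getSerializedSize() and the
write path apply the same tag condition. A check like sizesAgree in a unit test
would catch the case where the two drift apart, for example if one of them
treats empty tags differently from null tags.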
> Provide a Put API that adds the provided family, qualifier, value without
> copying
> ---------------------------------------------------------------------------------
>
> Key: HBASE-14882
> URL: https://issues.apache.org/jira/browse/HBASE-14882
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 1.2.0
> Reporter: Jerry He
> Assignee: Xiang Li
> Fix For: 2.0.0
>
> Attachments: HBASE-14882.master.000.patch,
> HBASE-14882.master.001.patch, HBASE-14882.master.002.patch,
> HBASE-14882.master.003.patch, HBASE-14882.master.004.patch,
> HBASE-14882.master.005.patch
>
>
> In the Put API, we have addImmutable()
> {code}
> /**
> * See {@link #addColumn(byte[], byte[], byte[])}. This version expects
> * that the underlying arrays won't change. It's intended
> * for usage internal HBase to and for advanced client applications.
> */
> public Put addImmutable(byte [] family, byte [] qualifier, byte [] value)
> {code}
> But in the implementation, the family, qualifier and value are still being
> copied locally to create kv.
> Hopefully we should provide an API that truly uses immutable family,
> qualifier and value.