[ https://issues.apache.org/jira/browse/HBASE-14882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15721141#comment-15721141 ]
Xiang Li commented on HBASE-14882:
----------------------------------

[~anoop.hbase] I uploaded patch 005 for the master branch to address your comments, and I have some questions 8-)

1. The following changes were made according to your comments
1.1. Update write(OutputStream out, boolean withTags) to avoid a local copy of the byte[], with reference to ValueAndTagRewriteCell
1.2. Update heapOverhead() to
(a) Consider array headers in heapOverhead
(b) Use TIMESTAMP_TYPE_SIZE as the sum of the sizes of timestamp and type
(c) Make FIXED_OVERHEAD a static final, so it is calculated when the class is initialized
1.3. Update deepClone() to return a KeyValue object
1.4. Correct the indentation; all updated files have been checked
1.5. A new JIRA, HBASE-17254, is opened to track a possible update for alignment when calculating heapOverhead()

2. Some questions
2.1. When calculating heapOverhead(), I do not think I can make the whole result a constant and have heapOverhead() return that constant directly. There are 2 parts. The first part is FIXED_OVERHEAD, which can be constant. The second part is the array headers for all backing byte arrays, and I have to calculate it after the instance has been created: for family, qualifier, value and tags, ClassSize.ARRAY is added if the array is not null and is not added if it is null.
2.2. In write(OutputStream out, boolean withTags), I return getSerializedSize(withTags) directly as the number of bytes written. I saw that you calculate len in ValueAndTagRewriteCell's write() by adding up the size after each write to the output stream. Your method is the safest way, but returning getSerializedSize(withTags) might be more concise. Do you think it is safe to return getSerializedSize(withTags) directly? Based on my tests it is safe, but I am not sure whether there are conditions I did not cover. Please advise.
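To make questions 2.1 and 2.2 concrete, here is a minimal sketch of the two approaches. It is not the code in the patch: SketchCell, its size constants and its simplified serialization (arrays written back to back, with no length prefixes) are all assumptions for illustration. In HBase the sizes would come from ClassSize.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical stand-in for a cell backed by separate byte arrays; not HBase code. */
class SketchCell {
    // Assumed sizes, for illustration only (HBase derives these in ClassSize).
    static final long OBJECT_HEADER = 16;
    static final long REFERENCE = 8;
    static final long ARRAY_HEADER = 24;
    static final long TIMESTAMP_TYPE_SIZE = 8 + 1; // long timestamp + byte type

    // Constant part: object header, five array references, timestamp and type.
    static final long FIXED_OVERHEAD = OBJECT_HEADER + 5 * REFERENCE + TIMESTAMP_TYPE_SIZE;

    private final byte[] row;       // assumed never null
    private final byte[] family;    // the remaining arrays may be null
    private final byte[] qualifier;
    private final byte[] value;
    private final byte[] tags;

    SketchCell(byte[] row, byte[] family, byte[] qualifier, byte[] value, byte[] tags) {
        this.row = row;
        this.family = family;
        this.qualifier = qualifier;
        this.value = value;
        this.tags = tags;
    }

    /** Question 2.1: the array headers depend on which arrays this instance holds. */
    long heapOverhead() {
        long overhead = FIXED_OVERHEAD + ARRAY_HEADER; // header of the row array
        for (byte[] array : new byte[][] {family, qualifier, value, tags}) {
            if (array != null) {
                overhead += ARRAY_HEADER;
            }
        }
        return overhead;
    }

    private static int length(byte[] array) {
        return array == null ? 0 : array.length;
    }

    /** Size computed up front, from the array lengths alone. */
    int getSerializedSize(boolean withTags) {
        int size = length(row) + length(family) + length(qualifier) + length(value);
        return withTags ? size + length(tags) : size;
    }

    private static int writeArray(OutputStream out, byte[] array) throws IOException {
        if (array == null) {
            return 0;
        }
        out.write(array);
        return array.length;
    }

    /** Question 2.2, the counting variant: add up what was actually written. */
    int write(OutputStream out, boolean withTags) throws IOException {
        int written = writeArray(out, row);
        written += writeArray(out, family);
        written += writeArray(out, qualifier);
        written += writeArray(out, value);
        if (withTags) {
            written += writeArray(out, tags);
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        SketchCell cell = new SketchCell(new byte[] {1}, new byte[] {2, 3}, null,
                new byte[] {4, 5, 6}, null);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int written = cell.write(out, true);
        System.out.println("heapOverhead=" + cell.heapOverhead());
        System.out.println("written=" + written + " serializedSize=" + cell.getSerializedSize(true));
    }
}
```

In this sketch the two values agree because write() and getSerializedSize() walk the same fields under the same withTags condition. Returning getSerializedSize(withTags) directly would stay safe only as long as the two methods are kept in step.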
> Provide a Put API that adds the provided family, qualifier, value without copying
> ---------------------------------------------------------------------------------
>
>                 Key: HBASE-14882
>                 URL: https://issues.apache.org/jira/browse/HBASE-14882
>             Project: HBase
>          Issue Type: Improvement
>   Affects Versions: 1.2.0
>            Reporter: Jerry He
>            Assignee: Xiang Li
>             Fix For: 2.0.0
>
>         Attachments: HBASE-14882.master.000.patch,
>                      HBASE-14882.master.001.patch,
>                      HBASE-14882.master.002.patch,
>                      HBASE-14882.master.003.patch,
>                      HBASE-14882.master.004.patch,
>                      HBASE-14882.master.005.patch
>
>
> In the Put API, we have addImmutable()
> {code}
>  /**
>   * See {@link #addColumn(byte[], byte[], byte[])}. This version expects
>   * that the underlying arrays won't change. It's intended
>   * for usage internal HBase to and for advanced client applications.
>   */
>  public Put addImmutable(byte [] family, byte [] qualifier, byte [] value)
> {code}
> But in the implementation, the family, qualifier and value are still being
> copied locally to create kv.
> Hopefully we should provide an API that truly uses immutable family,
> qualifier and value.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)