[ 
https://issues.apache.org/jira/browse/HBASE-12313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14184820#comment-14184820
 ] 

Anoop Sam John commented on HBASE-12313:
----------------------------------------

bq. The sizings that are in this patch as I see it cause no problem; perhaps a 
slight overcount but it's for metrics only – not for anything important (You 
agree?)
Yes, I am OK with it, Stack.

bq. No you are right but 'do we need to fix it?' It is ok that the size 
calculated is 'rough', approx
As you can see below, estimatedSerializedSizeOfKey() returns only the length of 
the key parts when the cell is a KeyValue. But for a non-KeyValue Cell, we end 
up adding the value and tags lengths as well. This won't be a slight 
difference, as the value length can normally be very large. I am concerned 
about this.
{code}
+  private static int getSumOfKeyElementLengths(final Cell cell) {
+    return cell.getRowLength() + cell.getFamilyLength() +
+    cell.getQualifierLength() +
+    cell.getValueLength() +
+    cell.getTagsLength() +
+    KeyValue.TIMESTAMP_TYPE_SIZE;
+  }
+
+  public static int estimatedSerializedSizeOfKey(final Cell cell) {
+    if (cell instanceof KeyValue) return ((KeyValue)cell).getKeyLength();
+    // This will be a low estimate.  Will do for now.
+    return getSumOfKeyElementLengths(cell);
+  }
{code}
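
To illustrate the concern, here is a minimal, self-contained sketch of a key-only sum that leaves out the value and tags lengths. It is not the patch's code: CellLengths is a hypothetical stand-in for the length accessors of Cell so the example compiles without HBase, and TIMESTAMP_TYPE_SIZE is assumed to mirror KeyValue.TIMESTAMP_TYPE_SIZE (8-byte timestamp plus 1-byte type).

```java
public class KeySizeSketch {
    /** Hypothetical stand-in for the length accessors of HBase's Cell. */
    public interface CellLengths {
        int getRowLength();
        int getFamilyLength();
        int getQualifierLength();
        int getValueLength();
        int getTagsLength();
    }

    // Assumption: mirrors KeyValue.TIMESTAMP_TYPE_SIZE (8-byte timestamp + 1-byte type).
    public static final int TIMESTAMP_TYPE_SIZE = 8 + 1;

    /** Sums only the key components; value and tags are deliberately excluded. */
    public static int sumOfKeyElementLengths(CellLengths cell) {
        return cell.getRowLength()
            + cell.getFamilyLength()
            + cell.getQualifierLength()
            + TIMESTAMP_TYPE_SIZE;
    }

    /** The sum as written in the quoted patch, kept here only for comparison. */
    public static int sumIncludingValueAndTags(CellLengths cell) {
        return sumOfKeyElementLengths(cell)
            + cell.getValueLength()
            + cell.getTagsLength();
    }

    /** Builds a CellLengths with fixed lengths, for demonstration only. */
    public static CellLengths cell(final int row, final int family, final int qualifier,
                                   final int value, final int tags) {
        return new CellLengths() {
            public int getRowLength() { return row; }
            public int getFamilyLength() { return family; }
            public int getQualifierLength() { return qualifier; }
            public int getValueLength() { return value; }
            public int getTagsLength() { return tags; }
        };
    }

    public static void main(String[] args) {
        // A small key with a 1000-byte value and 10 bytes of tags.
        CellLengths c = cell(3, 1, 2, 1000, 10);
        System.out.println("key-only sum:        " + sumOfKeyElementLengths(c));
        System.out.println("with value and tags: " + sumIncludingValueAndTags(c));
    }
}
```

With a 1000-byte value, the two sums differ by roughly the value length, which is why the non-KeyValue path is not just a slight overcount. The key-only sum still omits the row-length and family-length prefix bytes that a serialized KeyValue key carries, so it remains an approximation.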


> Redo the hfile index length optimization so cell-based rather than serialized 
> KV key
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-12313
>                 URL: https://issues.apache.org/jira/browse/HBASE-12313
>             Project: HBase
>          Issue Type: Sub-task
>          Components: regionserver, Scanners
>            Reporter: stack
>            Assignee: stack
>         Attachments: 
> 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
> 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
> 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
> 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 
> 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 12313v5.txt
>
>
> Trying to remove API that returns the 'key' of a KV serialized into a byte 
> array is thorny.
> I tried to move the first and last key serializations and the hfile 
> index entries over to Cell, but the patch was turning massive.  Here is a 
> smaller patch that just redoes the optimization that tries to find 'short' 
> midpoints between the last key of the last block and the first key of the 
> next block, so that it is Cell-based rather than byte-array-based (presuming 
> keys serialized in a certain way).  Adds unit tests, which we didn't have 
> before.
> Also removes CellKey.  Not needed... at least not yet.  It's just a utility 
> for toString.



