[
https://issues.apache.org/jira/browse/HBASE-12313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14184820#comment-14184820
]
Anoop Sam John commented on HBASE-12313:
----------------------------------------
bq. The sizings that are in this patch, as I see it, cause no problem; perhaps a
slight overcount, but it's for metrics only, not for anything important (you
agree?)
Yes, I am OK with it, Stack.
bq. No, you are right, but 'do we need to fix it?' It is OK that the size
calculated is 'rough', approximate.
As you can see below, estimatedSerializedSizeOfKey() returns only the lengths of
the key parts when the cell is a KeyValue. But when it is a non-KV Cell, we end
up adding the value and tags lengths as well. This won't be a slight change, as
the value length can normally be very large. I am concerned about this.
{code}
+ private static int getSumOfKeyElementLengths(final Cell cell) {
+ return cell.getRowLength() + cell.getFamilyLength() +
+ cell.getQualifierLength() +
+ cell.getValueLength() +
+ cell.getTagsLength() +
+ KeyValue.TIMESTAMP_TYPE_SIZE;
+ }
+
+ public static int estimatedSerializedSizeOfKey(final Cell cell) {
+ if (cell instanceof KeyValue) return ((KeyValue)cell).getKeyLength();
+ // This will be a low estimate. Will do for now.
+ return getSumOfKeyElementLengths(cell);
+ }
{code}
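To make the size of the discrepancy concrete, here is a minimal standalone sketch, not the patch itself. The CellLengths interface and the cell() helper are hypothetical stand-ins for the length accessors of Cell, and TIMESTAMP_TYPE_SIZE is assumed to be 9 (an 8-byte timestamp plus a 1-byte type). It compares the sum as written in the patch with a sum over the key parts only.

```java
public class KeySizeSketch {
    /** Hypothetical stand-in for the length accessors of a Cell. */
    interface CellLengths {
        int getRowLength();
        int getFamilyLength();
        int getQualifierLength();
        int getValueLength();
        int getTagsLength();
    }

    // Assumption: 8-byte timestamp plus 1-byte type.
    static final int TIMESTAMP_TYPE_SIZE = 8 + 1;

    /** Sum as written in the quoted patch: value and tags lengths are included. */
    static int sumAsInPatch(CellLengths c) {
        return c.getRowLength() + c.getFamilyLength() + c.getQualifierLength()
            + c.getValueLength() + c.getTagsLength() + TIMESTAMP_TYPE_SIZE;
    }

    /** Sum over the key parts only: row, family, qualifier, timestamp and type. */
    static int sumOfKeyPartsOnly(CellLengths c) {
        return c.getRowLength() + c.getFamilyLength() + c.getQualifierLength()
            + TIMESTAMP_TYPE_SIZE;
    }

    /** Hypothetical helper that builds a cell from fixed lengths. */
    static CellLengths cell(int row, int family, int qualifier, int value, int tags) {
        return new CellLengths() {
            public int getRowLength() { return row; }
            public int getFamilyLength() { return family; }
            public int getQualifierLength() { return qualifier; }
            public int getValueLength() { return value; }
            public int getTagsLength() { return tags; }
        };
    }

    public static void main(String[] args) {
        // A cell with a 10-byte row, 2-byte family, 8-byte qualifier and 4 KB value.
        CellLengths c = cell(10, 2, 8, 4096, 0);
        System.out.println("as in patch:    " + sumAsInPatch(c));
        System.out.println("key parts only: " + sumOfKeyPartsOnly(c));
    }
}
```

With a 4 KB value, the first figure is dominated almost entirely by the value length, which is why the difference is not a slight overcount for non-KV cells.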
> Redo the hfile index length optimization so cell-based rather than serialized
> KV key
> ------------------------------------------------------------------------------------
>
> Key: HBASE-12313
> URL: https://issues.apache.org/jira/browse/HBASE-12313
> Project: HBase
> Issue Type: Sub-task
> Components: regionserver, Scanners
> Reporter: stack
> Assignee: stack
> Attachments:
> 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch,
> 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch,
> 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch,
> 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch,
> 0001-HBASE-12313-Redo-the-hfile-index-length-optimization.patch, 12313v5.txt
>
>
> Trying to remove the API that returns the 'key' of a KV serialized into a byte
> array is thorny.
> I tried to move the first and last key serializations and the hfile
> index entries over to Cell, but the patch was turning massive. Here is a smaller
> patch that just redoes the optimization that tries to find 'short' midpoints
> between the last key of the last block and the first key of the next block, so
> that it is Cell-based rather than byte-array-based (presuming keys serialized
> in a certain way). Adds unit tests, which we didn't have before.
> Also removes CellKey. Not needed... at least not yet. It's just a utility for
> toString.