[jira] [Commented] (HBASE-16840) Reuse cell's timestamp and type in ScanQueryMatcher

Yu Li (JIRA) Mon, 07 Nov 2016 05:03:08 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-16840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15644122#comment-15644122
 ]


Yu Li commented on HBASE-16840:
-------------------------------

OK, since there is a V2 patch and there were no more comments from you after
a ping, I thought you were fine with it...

From my point of view, there is some computation in {{KeyValue#getTypeByte}}:
{code}
  public byte getTypeByte() {
    return this.bytes[this.offset + getKeyLength() - 1 + ROW_OFFSET];
  }
{code}

And take {{MajorCompactionScanQueryMatcher}} for example, now we have:
{code}
    if (CellUtil.isDelete(cell)) {
      ...
    }
    ...
    return columns.checkVersions(cell, timestamp, cell.getTypeByte(),
      mvccVersion > maxReadPointToTrackVersions);
{code}
So {{cell.getTypeByte}} is invoked twice (once inside {{CellUtil.isDelete(cell)}}
and once for {{checkVersions}}), which repeats the computation above. The patch
here avoids one of those two computations.
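
To make the double read concrete, here is a minimal, self-contained sketch. It does not use HBase classes: {{SketchCell}}, {{matchReadingTwice}} and {{matchReadingOnce}} are hypothetical names, the buffer layout is a simplified assumption (two 4-byte length fields followed by a key whose last byte is the type), and the type codes 4, 8 and 14 are assumed to mirror {{KeyValue.Type}}. It only counts how often the type byte is recomputed.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for a KeyValue-like cell; this is NOT the HBase class.
final class SketchCell {
  // Assumption: two 4-byte length fields (key length, value length) precede the key.
  static final int ROW_OFFSET = 8;

  private final byte[] bytes;
  private final int offset;
  int typeByteReads = 0; // counts how often the type byte is recomputed

  SketchCell(byte[] bytes, int offset) {
    this.bytes = bytes;
    this.offset = offset;
  }

  // Builds [keyLength:int][valueLength:int][key bytes, last byte = type].
  static SketchCell of(int keyLength, byte type) {
    ByteBuffer buf = ByteBuffer.allocate(ROW_OFFSET + keyLength);
    buf.putInt(keyLength);
    buf.putInt(0); // empty value in this sketch
    byte[] raw = buf.array();
    raw[ROW_OFFSET + keyLength - 1] = type;
    return new SketchCell(raw, 0);
  }

  int getKeyLength() {
    return ByteBuffer.wrap(bytes, offset, 4).getInt();
  }

  // Same arithmetic as the getTypeByte quoted above, plus a read counter.
  byte getTypeByte() {
    typeByteReads++;
    return bytes[offset + getKeyLength() - 1 + ROW_OFFSET];
  }
}

final class TypeByteReuseSketch {
  // Assumed codes, mirroring KeyValue.Type.Delete and KeyValue.Type.DeleteFamily.
  static final byte DELETE = 8;
  static final byte DELETE_FAMILY = 14;

  static boolean isDelete(byte type) {
    return DELETE <= type && type <= DELETE_FAMILY;
  }

  // Current shape: the delete check and the version check each read the type byte.
  static byte matchReadingTwice(SketchCell cell) {
    if (isDelete(cell.getTypeByte())) {
      // delete handling elided in this sketch
    }
    return cell.getTypeByte(); // stands in for the argument passed to checkVersions
  }

  // Patched shape: read the type byte once and reuse the local.
  static byte matchReadingOnce(SketchCell cell) {
    byte type = cell.getTypeByte();
    if (isDelete(type)) {
      // delete handling elided in this sketch
    }
    return type;
  }

  public static void main(String[] args) {
    SketchCell a = SketchCell.of(16, (byte) 4); // 4 is the assumed Put code
    SketchCell b = SketchCell.of(16, (byte) 4);
    matchReadingTwice(a);
    matchReadingOnce(b);
    System.out.println(a.typeByteReads + " vs " + b.typeByteReads); // prints "2 vs 1"
  }
}
```

Both variants hand the same type byte to the version check; the only difference is the number of reads per cell.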

OTOH, {{CellUtil}} already has an {{isDelete(byte)}} overload, and there are
already some invocations of it:
{code}
  /**
   * @return True if a delete type, a {@link KeyValue.Type#Delete} or a
   *         {@link KeyValue.Type#DeleteFamily} or a
   *         {@link KeyValue.Type#DeleteColumn} KeyValue type.
   */
  public static boolean isDelete(final byte type) {
    return Type.Delete.getCode() <= type
        && type <= Type.DeleteFamily.getCode();
  }
{code}
So I guess switching to it won't make the code any more confusing?
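
For reference, a small standalone sketch of what that range check accepts. The numeric codes below are assumptions meant to mirror {{KeyValue.Type}} (Put=4, Delete=8, DeleteFamilyVersion=10, DeleteColumn=12, DeleteFamily=14, Maximum=255) and should be checked against the source; {{IsDeleteRangeSketch}} is a hypothetical class, not part of HBase.

```java
// Hypothetical standalone class; the codes are assumed to mirror KeyValue.Type.
final class IsDeleteRangeSketch {
  static final byte PUT = 4;
  static final byte DELETE = 8;
  static final byte DELETE_FAMILY_VERSION = 10;
  static final byte DELETE_COLUMN = 12;
  static final byte DELETE_FAMILY = 14;
  static final byte MAXIMUM = (byte) 255; // -1 as a signed Java byte

  // Same comparison as the isDelete(byte) method quoted above.
  static boolean isDelete(byte type) {
    return DELETE <= type && type <= DELETE_FAMILY;
  }

  public static void main(String[] args) {
    byte[] codes = {PUT, DELETE, DELETE_FAMILY_VERSION, DELETE_COLUMN, DELETE_FAMILY, MAXIMUM};
    for (byte code : codes) {
      // Mask with 0xFF so the code prints as an unsigned value.
      System.out.println((code & 0xFF) + " -> " + isDelete(code));
    }
  }
}
```

Under these assumed codes, every value from Delete through DeleteFamily passes the check, including DeleteFamilyVersion, while Put and Maximum do not (Maximum is negative as a signed byte).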

I'd rather treat this as a minor improvement (I have already changed the
priority to Minor) than as a big performance enhancement. I'd like to confirm:
do we still need a micro-benchmark to prove that computing once is better than
computing twice here? Or could you take another look at the v2 patch? Thanks.

> Reuse cell's timestamp and type in ScanQueryMatcher
> ---------------------------------------------------
>
>                 Key: HBASE-16840
>                 URL: https://issues.apache.org/jira/browse/HBASE-16840
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: binlijin
>            Assignee: binlijin
>            Priority: Minor
>         Attachments: HBASE-16840_master.patch, HBASE-16840_master_V2.patch
>
>
> Reuse cell's timestamp and type in ScanQueryMatcher, this is useful for 
> KeyValue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
