[GitHub] [hbase] ramkrish86 commented on a change in pull request #2582: HBASE-25187 Improve SizeCachedKV variants initialization

2020-11-23 Thread GitBox


ramkrish86 commented on a change in pull request #2582:
URL: https://github.com/apache/hbase/pull/2582#discussion_r529255627



##
File path: 
hbase-common/src/main/java/org/apache/hadoop/hbase/SizeCachedByteBufferKeyValue.java
##
@@ -0,0 +1,90 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase;
+
+import java.nio.ByteBuffer;
+
+import org.apache.hadoop.hbase.util.Bytes;
+import org.apache.yetus.audience.InterfaceAudience;
+
+/**
+ * This Cell is an implementation of {@link ByteBufferExtendedCell} where the data resides in
+ * off heap/ on heap ByteBuffer
+ */
+@InterfaceAudience.Private
+public class SizeCachedByteBufferKeyValue extends ByteBufferKeyValue {
+
+  public static final int FIXED_OVERHEAD = Bytes.SIZEOF_SHORT + Bytes.SIZEOF_INT;
+  private short rowLen;
+  private int keyLen;
+
+  public SizeCachedByteBufferKeyValue(ByteBuffer buf, int offset, int length, long seqId,
+      int keyLen) {
+    super(buf, offset, length);
+    // We will read all these cached values at least once. Initialize now itself so that we can
+    // avoid uninitialized checks with every time call
+    this.rowLen = super.getRowLength();
+    this.keyLen = keyLen;
+    setSequenceId(seqId);
+  }
+
+  public SizeCachedByteBufferKeyValue(ByteBuffer buf, int offset, int length, long seqId,
+      int keyLen, short rowLen) {
+    super(buf, offset, length);
+    // We will read all these cached values at least once. Initialize now itself so that we can
+    // avoid uninitialized checks with every time call
+    this.rowLen = rowLen;
+    this.keyLen = keyLen;
+    setSequenceId(seqId);
+  }
+
+  @Override
+  public short getRowLength() {
+    return rowLen;
+  }
+
+  @Override
+  public int getKeyLength() {
+    return this.keyLen;
+  }
+
+  @Override
+  public long heapSize() {
+    return super.heapSize() + FIXED_OVERHEAD;
+  }
+
+  /**
+   * Override by just returning the length for saving cost of method dispatching. If not, it will
+   * call {@link ExtendedCell#getSerializedSize()} firstly, then forward to
+   * {@link SizeCachedKeyValue#getSerializedSize(boolean)}. (See HBASE-21657)
+   */
+  @Override
+  public int getSerializedSize() {
+    return this.length;
+  }
+
+  @Override
+  public boolean equals(Object other) {
+    return super.equals(other);

Review comment:
   Actually it is not needed; we had to add it only to satisfy the SpotBugs report.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] ramkrish86 commented on a change in pull request #2582: HBASE-25187 Improve SizeCachedKV variants initialization

2020-11-09 Thread GitBox


ramkrish86 commented on a change in pull request #2582:
URL: https://github.com/apache/hbase/pull/2582#discussion_r519709637



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderImpl.java
##
@@ -554,8 +559,10 @@ protected int blockSeek(Cell key, boolean seekBefore) {
   + " path=" + reader.getPath());
 }
 offsetFromPos += Bytes.SIZEOF_LONG;
+rowLen = ((blockBuffer.getByteAfterPosition(offsetFromPos) & 0xff) << 8)
+^ (blockBuffer.getByteAfterPosition(offsetFromPos + 1) & 0xff);
 blockBuffer.asSubByteBuffer(blockBuffer.position() + offsetFromPos, klen, pair);
-bufBackedKeyOnlyKv.setKey(pair.getFirst(), pair.getSecond(), klen);
+bufBackedKeyOnlyKv.setKey(pair.getFirst(), pair.getSecond(), klen, (short)rowLen);

Review comment:
   `Use BB#getShortAfterPosition() instead of having extra logic here. That will be even better perf-wise in both MBB and SBB`
   
   You mean, instead of calling getByteAfterPosition() twice, just use getShortAfterPosition()?
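
To make the two options concrete, here is a minimal sketch using a plain `java.nio.ByteBuffer` as a stand-in for the block buffer. The class and method names are hypothetical and this is not the HBase `ByteBuff` API; it only illustrates that two masked single-byte reads and one big-endian short read decode the same value.

```java
import java.nio.ByteBuffer;

public class RowLenDecodeSketch {
  // Mirrors the shape of the decode in the diff: two single-byte reads,
  // high byte shifted left by 8 and combined with the low byte.
  public static short decodeWithTwoByteReads(ByteBuffer buf, int offset) {
    int hi = buf.get(offset) & 0xff;
    int lo = buf.get(offset + 1) & 0xff;
    return (short) ((hi << 8) ^ lo);
  }

  // One absolute short read. A freshly allocated ByteBuffer is big-endian,
  // so this yields the same value as the two-byte decode above.
  public static short decodeWithShortRead(ByteBuffer buf, int offset) {
    return buf.getShort(offset);
  }

  public static void main(String[] args) {
    ByteBuffer buf = ByteBuffer.allocate(16);
    buf.putShort(8, (short) 300); // pretend the row length lives at offset 8
    System.out.println(decodeWithTwoByteReads(buf, 8));
    System.out.println(decodeWithShortRead(buf, 8));
  }
}
```

Whether the single read is actually faster for the multi-buffer and single-buffer cases would need to be measured; the sketch only shows that the results are equivalent.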









[GitHub] [hbase] ramkrish86 commented on a change in pull request #2582: HBASE-25187 Improve SizeCachedKV variants initialization

2020-11-05 Thread GitBox


ramkrish86 commented on a change in pull request #2582:
URL: https://github.com/apache/hbase/pull/2582#discussion_r518546602



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderImpl.java
##
@@ -790,23 +797,28 @@ public Cell getCell() {
 // we can handle the 'no tags' case.
 if (currTagsLen > 0) {
   ret = new SizeCachedKeyValue(blockBuffer.array(),
-  blockBuffer.arrayOffset() + blockBuffer.position(), cellBufSize, seqId);
+  blockBuffer.arrayOffset() + blockBuffer.position(), cellBufSize, seqId, currKeyLen,
+  rowLen);
 } else {
   ret = new SizeCachedNoTagsKeyValue(blockBuffer.array(),
-  blockBuffer.arrayOffset() + blockBuffer.position(), cellBufSize, seqId);
+  blockBuffer.arrayOffset() + blockBuffer.position(), cellBufSize, seqId, currKeyLen,
+  rowLen);
 }
   } else {
 ByteBuffer buf = blockBuffer.asSubByteBuffer(cellBufSize);
 if (buf.isDirect()) {
-  ret = currTagsLen > 0 ? new ByteBufferKeyValue(buf, buf.position(), cellBufSize, seqId)
-  : new NoTagsByteBufferKeyValue(buf, buf.position(), cellBufSize, seqId);
+  ret = currTagsLen > 0
+  ? new SizeCachedByteBufferKeyValue(buf, buf.position(), cellBufSize, seqId,
+  currKeyLen, rowLen)
+  : new SizeCachedNoTagsByteBufferKeyValue(buf, buf.position(), cellBufSize, seqId,
+  currKeyLen, rowLen);
 } else {
   if (currTagsLen > 0) {
 ret = new SizeCachedKeyValue(buf.array(), buf.arrayOffset() + buf.position(),
-cellBufSize, seqId);
+cellBufSize, seqId, currKeyLen, rowLen);

Review comment:
   >>So why to change the critical part of HFile reader adding new byte[] state
   
   Sorry, when you say byte[] state, do you mean that we are adding a new byte[] as a state variable?













[GitHub] [hbase] ramkrish86 commented on a change in pull request #2582: HBASE-25187 Improve SizeCachedKV variants initialization

2020-11-05 Thread GitBox


ramkrish86 commented on a change in pull request #2582:
URL: https://github.com/apache/hbase/pull/2582#discussion_r518546426



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderImpl.java
##
@@ -790,23 +797,28 @@ public Cell getCell() {
 // we can handle the 'no tags' case.
 if (currTagsLen > 0) {
   ret = new SizeCachedKeyValue(blockBuffer.array(),
-  blockBuffer.arrayOffset() + blockBuffer.position(), cellBufSize, seqId);
+  blockBuffer.arrayOffset() + blockBuffer.position(), cellBufSize, seqId, currKeyLen,
+  rowLen);
 } else {
   ret = new SizeCachedNoTagsKeyValue(blockBuffer.array(),
-  blockBuffer.arrayOffset() + blockBuffer.position(), cellBufSize, seqId);
+  blockBuffer.arrayOffset() + blockBuffer.position(), cellBufSize, seqId, currKeyLen,
+  rowLen);
 }
   } else {
 ByteBuffer buf = blockBuffer.asSubByteBuffer(cellBufSize);
 if (buf.isDirect()) {
-  ret = currTagsLen > 0 ? new ByteBufferKeyValue(buf, buf.position(), cellBufSize, seqId)
-  : new NoTagsByteBufferKeyValue(buf, buf.position(), cellBufSize, seqId);
+  ret = currTagsLen > 0
+  ? new SizeCachedByteBufferKeyValue(buf, buf.position(), cellBufSize, seqId,
+  currKeyLen, rowLen)
+  : new SizeCachedNoTagsByteBufferKeyValue(buf, buf.position(), cellBufSize, seqId,
+  currKeyLen, rowLen);
 } else {
   if (currTagsLen > 0) {
 ret = new SizeCachedKeyValue(buf.array(), buf.arrayOffset() + buf.position(),
-cellBufSize, seqId);
+cellBufSize, seqId, currKeyLen, rowLen);

Review comment:
   If you look at my previous commit, I also read the rowLen in blockSeek() and 
readKeyValue() so that it can be passed to the SizeCachedKV variants. That was 
mainly to ensure that we get rowLen during KV parsing itself rather than at KV 
creation. In the latest commit, since I was already making that rowLen-related 
change, I changed the BBKV as well so that we don't need to parse it there.









[GitHub] [hbase] ramkrish86 commented on a change in pull request #2582: HBASE-25187 Improve SizeCachedKV variants initialization

2020-11-05 Thread GitBox


ramkrish86 commented on a change in pull request #2582:
URL: https://github.com/apache/hbase/pull/2582#discussion_r518036683



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderImpl.java
##
@@ -790,23 +797,28 @@ public Cell getCell() {
 // we can handle the 'no tags' case.
 if (currTagsLen > 0) {
   ret = new SizeCachedKeyValue(blockBuffer.array(),
-  blockBuffer.arrayOffset() + blockBuffer.position(), cellBufSize, seqId);
+  blockBuffer.arrayOffset() + blockBuffer.position(), cellBufSize, seqId, currKeyLen,
+  rowLen);
 } else {
   ret = new SizeCachedNoTagsKeyValue(blockBuffer.array(),
-  blockBuffer.arrayOffset() + blockBuffer.position(), cellBufSize, seqId);
+  blockBuffer.arrayOffset() + blockBuffer.position(), cellBufSize, seqId, currKeyLen,
+  rowLen);
 }
   } else {
 ByteBuffer buf = blockBuffer.asSubByteBuffer(cellBufSize);
 if (buf.isDirect()) {
-  ret = currTagsLen > 0 ? new ByteBufferKeyValue(buf, buf.position(), cellBufSize, seqId)
-  : new NoTagsByteBufferKeyValue(buf, buf.position(), cellBufSize, seqId);
+  ret = currTagsLen > 0
+  ? new SizeCachedByteBufferKeyValue(buf, buf.position(), cellBufSize, seqId,
+  currKeyLen, rowLen)
+  : new SizeCachedNoTagsByteBufferKeyValue(buf, buf.position(), cellBufSize, seqId,
+  currKeyLen, rowLen);
 } else {
   if (currTagsLen > 0) {
 ret = new SizeCachedKeyValue(buf.array(), buf.arrayOffset() + buf.position(),
-cellBufSize, seqId);
+cellBufSize, seqId, currKeyLen, rowLen);

Review comment:
   Here too the rowLen is already available. Is there any specific reason you 
feel this need not be decoded?









[GitHub] [hbase] ramkrish86 commented on a change in pull request #2582: HBASE-25187 Improve SizeCachedKV variants initialization

2020-11-05 Thread GitBox


ramkrish86 commented on a change in pull request #2582:
URL: https://github.com/apache/hbase/pull/2582#discussion_r517935274



##
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderImpl.java
##
@@ -554,8 +559,10 @@ protected int blockSeek(Cell key, boolean seekBefore) {
   + " path=" + reader.getPath());
 }
 offsetFromPos += Bytes.SIZEOF_LONG;
+rowLen = ((blockBuffer.getByteAfterPosition(offsetFromPos) & 0xff) << 8)
+^ (blockBuffer.getByteAfterPosition(offsetFromPos + 1) & 0xff);
 blockBuffer.asSubByteBuffer(blockBuffer.position() + offsetFromPos, klen, pair);
-bufBackedKeyOnlyKv.setKey(pair.getFirst(), pair.getSecond(), klen);
+bufBackedKeyOnlyKv.setKey(pair.getFirst(), pair.getSecond(), klen, (short)rowLen);

Review comment:
   IMO the rowLen is getting decoded in the setKey method anyway, and we can 
reuse that same value for rowLen. So I think it is better to decode it earlier, 
because that rowLen decoding happens per cell.
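
As a rough illustration of this point, the sketch below uses a hypothetical key-only cell (not the actual HBase class) with two `setKey` overloads: one decodes the row length from the buffer itself, the other accepts a value the caller has already decoded while parsing. It assumes the key begins with a 2-byte row length.

```java
import java.nio.ByteBuffer;

public class KeyOnlyCellSketch {
  private ByteBuffer buf;
  private int offset;
  private int length;
  private short rowLen;

  // Older shape: the cell decodes the row length itself on every setKey call.
  public void setKey(ByteBuffer buf, int offset, int length) {
    setKey(buf, offset, length, buf.getShort(offset));
  }

  // Newer shape: the caller decoded rowLen while parsing and hands it over,
  // so the cell does not have to read the buffer again.
  public void setKey(ByteBuffer buf, int offset, int length, short rowLen) {
    this.buf = buf;
    this.offset = offset;
    this.length = length;
    this.rowLen = rowLen;
  }

  public short getRowLength() {
    return rowLen;
  }

  public int getKeyLength() {
    return length;
  }

  public static void main(String[] args) {
    ByteBuffer key = ByteBuffer.allocate(8);
    key.putShort(0, (short) 3);      // assumed layout: key starts with a 2-byte row length
    short rowLen = key.getShort(0);  // decoded once by the hypothetical reader
    KeyOnlyCellSketch cell = new KeyOnlyCellSketch();
    cell.setKey(key, 0, key.capacity(), rowLen);
    System.out.println(cell.getRowLength());
  }
}
```

With the second overload the reader can keep the decoded value and pass the same `rowLen` on when it later constructs the full cell.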









[GitHub] [hbase] ramkrish86 commented on a change in pull request #2582: HBASE-25187 Improve SizeCachedKV variants initialization

2020-11-05 Thread GitBox


ramkrish86 commented on a change in pull request #2582:
URL: https://github.com/apache/hbase/pull/2582#discussion_r517919646



##
File path: 
hbase-common/src/main/java/org/apache/hadoop/hbase/SizeCachedKeyValue.java
##
@@ -39,12 +39,22 @@
   private short rowLen;
   private int keyLen;
 
-  public SizeCachedKeyValue(byte[] bytes, int offset, int length, long seqId) {
+  public SizeCachedKeyValue(byte[] bytes, int offset, int length, long seqId, int keyLen) {
 super(bytes, offset, length);
 // We will read all these cached values at least once. Initialize now itself so that we can
 // avoid uninitialized checks with every time call
-rowLen = super.getRowLength();
-keyLen = super.getKeyLength();
+this.rowLen = super.getRowLength();

Review comment:
   I actually tried this, but calling super.getRowLength() from the this() 
constructor call is not allowed. Hence I went with this simple approach.
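
The restriction can be reproduced in isolation. The following is a simplified sketch with hypothetical classes (not the HBase ones): Java does not allow an instance method to be used as an argument to an explicit `this(...)` or `super(...)` call, so the two constructors each initialize the fields directly.

```java
public class ConstructorChainSketch {
  public static class Base {
    protected final byte[] bytes;

    public Base(byte[] bytes) {
      this.bytes = bytes;
    }

    // Decodes a big-endian 2-byte length from the start of the array.
    public short getRowLength() {
      return (short) (((bytes[0] & 0xff) << 8) | (bytes[1] & 0xff));
    }
  }

  public static class Cached extends Base {
    private final short rowLen;
    private final int keyLen;

    public Cached(byte[] bytes, int keyLen) {
      // this(bytes, keyLen, super.getRowLength()); // rejected by the compiler:
      // the object cannot be referenced before the constructor call completes.
      super(bytes);
      this.rowLen = super.getRowLength();
      this.keyLen = keyLen;
    }

    public Cached(byte[] bytes, int keyLen, short rowLen) {
      super(bytes);
      this.rowLen = rowLen;
      this.keyLen = keyLen;
    }

    @Override
    public short getRowLength() {
      return rowLen;
    }

    public int getKeyLength() {
      return keyLen;
    }
  }

  public static void main(String[] args) {
    byte[] raw = {0x01, 0x2C, 0, 0};
    System.out.println(new Cached(raw, 4).getRowLength());
    System.out.println(new Cached(raw, 4, (short) 7).getRowLength());
  }
}
```

One possible alternative would be a static helper that decodes the length from the raw array, since static methods may be called inside `this(...)`; whether that is preferable to duplicating two assignments is a judgment call.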




