saintstack commented on a change in pull request #2747:
URL: https://github.com/apache/hbase/pull/2747#discussion_r541302819
##########
File path:
hbase-common/src/main/java/org/apache/hadoop/hbase/ByteBufferKeyOnlyKeyValue.java
##########
@@ -161,11 +161,11 @@ private int getTimestampOffset() {
@Override
public byte getTypeByte() {
- return ByteBufferUtils.toByte(this.buf, this.offset + this.length - 1);
+ return getTypeByte(this.length);
Review comment:
this.length is updated as we parse? Presumes we read Cell parts in
order... first the row and then the CF and then the qualifier? We are not
allowed to just and read the qualifier w/o first reading row?
##########
File path:
hbase-common/src/main/java/org/apache/hadoop/hbase/ByteBufferKeyOnlyKeyValue.java
##########
@@ -161,11 +161,11 @@ private int getTimestampOffset() {
@Override
public byte getTypeByte() {
- return ByteBufferUtils.toByte(this.buf, this.offset + this.length - 1);
+ return getTypeByte(this.length);
}
@Override
- public void setSequenceId(long seqId) throws IOException {
+ public void setSequenceId(long seqId) {
throw new IllegalArgumentException("This is a key only Cell");
Review comment:
Good.
##########
File path:
hbase-common/src/main/java/org/apache/hadoop/hbase/ByteBufferKeyOnlyKeyValue.java
##########
@@ -292,4 +287,49 @@ public long heapSize() {
}
return ClassSize.align(FIXED_OVERHEAD);
}
+
+ @Override
+ public int getFamilyLengthPosition(int rowLen) {
Review comment:
This stuff used to be private. Making it public exposes the
implementation. You fellows did a mountain of work making it so we could CHANGE
the implementation. This goes back on that work? At least exposing this stuff
as public methods?
##########
File path:
hbase-common/src/main/java/org/apache/hadoop/hbase/ByteBufferKeyOnlyKeyValue.java
##########
@@ -292,4 +287,49 @@ public long heapSize() {
}
return ClassSize.align(FIXED_OVERHEAD);
}
+
+ @Override
+ public int getFamilyLengthPosition(int rowLen) {
+ return this.offset + Bytes.SIZEOF_SHORT + rowLen;
+ }
+
+ @Override
+ public int getFamilyInternalPosition(int familyLengthPosition) {
+ return familyLengthPosition + Bytes.SIZEOF_BYTE;
+ }
+
+ @Override
+ public int getQualifierInternalPosition(int famOffset, byte famLength) {
+ return famOffset + famLength;
+ }
+
+ private int getTimestampOffset(final int keylength) {
+ return this.offset + keylength - KeyValue.TIMESTAMP_TYPE_SIZE;
+ }
+
+ @Override
+ public long getTimestamp(int keyLength) {
+ int tsOffset = getTimestampOffset(keyLength);
+ return ByteBufferUtils.toLong(this.buf, tsOffset);
+ }
+
+ @Override
+ public byte getTypeByte(int keyLength) {
+ return ByteBufferUtils.toByte(this.buf, this.offset + keyLength - 1);
+ }
+
+ @Override
+ public int getKeyLength() {
+ return this.length;
+ }
+
+ @Override
+ public boolean equals(Object other) {
+ return super.equals(other);
+ }
+
+ @Override
+ public int hashCode() {
+ return super.hashCode();
+ }
}
Review comment:
Have you run w/ the jdk compile flags on to make sure all methods doing
compare get inlined? I found it informative... sometimes methods would not
inline or compile because too big.... Sometimes fixing this w/ refactor helped
improve perf.
What about that nice stack trace you showed in the issue where you showed
deep trace for hbase2 in compare but a shallow one for hbase1. As you suggested
in the JIRA, yes, a deep stack trace costs... trimming it would help perf too.
##########
File path:
hbase-common/src/main/java/org/apache/hadoop/hbase/CellComparatorImpl.java
##########
@@ -51,35 +53,38 @@
*/
public static final CellComparatorImpl COMPARATOR = new CellComparatorImpl();
+ private static final ContiguousCellFormatComparator contiguousCellComparator
=
Review comment:
Hmm... when would a comparator NOT be do left-to-right?
##########
File path: hbase-common/src/main/java/org/apache/hadoop/hbase/KeyValue.java
##########
@@ -76,7 +77,7 @@
* length and actual tag bytes length.
*/
@InterfaceAudience.Private
-public class KeyValue implements ExtendedCell, Cloneable {
+public class KeyValue implements ExtendedCell, ContiguousCellFormat, Cloneable
{
Review comment:
Could the methods be added to ExtendedCell or is that different concerns?
##########
File path:
hbase-common/src/main/java/org/apache/hadoop/hbase/CellComparatorImpl.java
##########
@@ -51,35 +53,38 @@
*/
public static final CellComparatorImpl COMPARATOR = new CellComparatorImpl();
+ private static final ContiguousCellFormatComparator contiguousCellComparator
=
Review comment:
Won't it always be a 'contiguous' left-to-right compare?
##########
File path:
hbase-common/src/main/java/org/apache/hadoop/hbase/ByteBufferKeyOnlyKeyValue.java
##########
@@ -292,4 +287,49 @@ public long heapSize() {
}
return ClassSize.align(FIXED_OVERHEAD);
}
+
+ @Override
+ public int getFamilyLengthPosition(int rowLen) {
+ return this.offset + Bytes.SIZEOF_SHORT + rowLen;
+ }
+
+ @Override
+ public int getFamilyInternalPosition(int familyLengthPosition) {
+ return familyLengthPosition + Bytes.SIZEOF_BYTE;
+ }
+
+ @Override
+ public int getQualifierInternalPosition(int famOffset, byte famLength) {
+ return famOffset + famLength;
+ }
+
+ private int getTimestampOffset(final int keylength) {
+ return this.offset + keylength - KeyValue.TIMESTAMP_TYPE_SIZE;
+ }
+
+ @Override
+ public long getTimestamp(int keyLength) {
+ int tsOffset = getTimestampOffset(keyLength);
+ return ByteBufferUtils.toLong(this.buf, tsOffset);
+ }
+
+ @Override
+ public byte getTypeByte(int keyLength) {
+ return ByteBufferUtils.toByte(this.buf, this.offset + keyLength - 1);
+ }
+
+ @Override
+ public int getKeyLength() {
+ return this.length;
+ }
+
+ @Override
+ public boolean equals(Object other) {
+ return super.equals(other);
Review comment:
Remove?
##########
File path:
hbase-common/src/main/java/org/apache/hadoop/hbase/ByteBufferKeyValue.java
##########
@@ -20,41 +20,128 @@
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
+import java.util.List;
+
import org.apache.hadoop.hbase.util.ByteBufferUtils;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.hbase.util.ClassSize;
import org.apache.yetus.audience.InterfaceAudience;
+
import
org.apache.hbase.thirdparty.com.google.common.annotations.VisibleForTesting;
/**
* This Cell is an implementation of {@link ByteBufferExtendedCell} where the
data resides in
* off heap/ on heap ByteBuffer
*/
@InterfaceAudience.Private
-public class ByteBufferKeyValue extends ByteBufferExtendedCell {
+public class ByteBufferKeyValue extends ByteBufferExtendedCell implements
ContiguousCellFormat {
Review comment:
I think that to introduce a new Interface here around Cells, you first
need to purge two old ones... smile. Ask @anoopsjohn
##########
File path:
hbase-common/src/main/java/org/apache/hadoop/hbase/ByteBufferKeyValue.java
##########
@@ -20,41 +20,128 @@
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
+import java.util.List;
+
import org.apache.hadoop.hbase.util.ByteBufferUtils;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.hbase.util.ClassSize;
import org.apache.yetus.audience.InterfaceAudience;
+
import
org.apache.hbase.thirdparty.com.google.common.annotations.VisibleForTesting;
/**
* This Cell is an implementation of {@link ByteBufferExtendedCell} where the
data resides in
* off heap/ on heap ByteBuffer
*/
@InterfaceAudience.Private
-public class ByteBufferKeyValue extends ByteBufferExtendedCell {
+public class ByteBufferKeyValue extends ByteBufferExtendedCell implements
ContiguousCellFormat {
- protected final ByteBuffer buf;
Review comment:
Why take away the final?
If for default constructor, pass nulls to the override?
##########
File path:
hbase-common/src/main/java/org/apache/hadoop/hbase/ByteBufferKeyOnlyKeyValue.java
##########
@@ -292,4 +287,49 @@ public long heapSize() {
}
return ClassSize.align(FIXED_OVERHEAD);
}
+
+ @Override
+ public int getFamilyLengthPosition(int rowLen) {
+ return this.offset + Bytes.SIZEOF_SHORT + rowLen;
+ }
+
+ @Override
+ public int getFamilyInternalPosition(int familyLengthPosition) {
+ return familyLengthPosition + Bytes.SIZEOF_BYTE;
+ }
+
+ @Override
+ public int getQualifierInternalPosition(int famOffset, byte famLength) {
+ return famOffset + famLength;
+ }
+
+ private int getTimestampOffset(final int keylength) {
+ return this.offset + keylength - KeyValue.TIMESTAMP_TYPE_SIZE;
+ }
+
+ @Override
+ public long getTimestamp(int keyLength) {
+ int tsOffset = getTimestampOffset(keyLength);
+ return ByteBufferUtils.toLong(this.buf, tsOffset);
+ }
+
+ @Override
+ public byte getTypeByte(int keyLength) {
+ return ByteBufferUtils.toByte(this.buf, this.offset + keyLength - 1);
+ }
+
+ @Override
+ public int getKeyLength() {
+ return this.length;
+ }
+
+ @Override
+ public boolean equals(Object other) {
+ return super.equals(other);
+ }
+
+ @Override
+ public int hashCode() {
Review comment:
Can remove these?
##########
File path:
hbase-common/src/main/java/org/apache/hadoop/hbase/CellComparatorImpl.java
##########
@@ -51,35 +53,38 @@
*/
public static final CellComparatorImpl COMPARATOR = new CellComparatorImpl();
+ private static final ContiguousCellFormatComparator contiguousCellComparator
=
+ new ContiguousCellFormatComparator(COMPARATOR);
+
@Override
public final int compare(final Cell a, final Cell b) {
return compare(a, b, false);
}
@Override
public int compare(final Cell a, final Cell b, boolean ignoreSequenceid) {
-
int diff = 0;
- // "Peel off" the most common path.
- if (a instanceof ByteBufferKeyValue && b instanceof ByteBufferKeyValue) {
- diff = BBKVComparator.compare((ByteBufferKeyValue)a,
(ByteBufferKeyValue)b, ignoreSequenceid);
- if (diff != 0) {
- return diff;
- }
+ // "Peeling off" the most common cases where the Cells backed by KV format
either onheap or
+ // offheap
+ if (a instanceof ContiguousCellFormat && b instanceof ContiguousCellFormat
Review comment:
I see. You want to use a comparator that has expectations about
internals... that there methods it can call to speed up the compare.
Man. We have too many if/else's in the path. if BB, if tags, if sequenceid,
if offheap.... if unsafe. If ByteBufferExtendedCell...
We don't yet have an implementation that is not contiguous?
##########
File path:
hbase-common/src/main/java/org/apache/hadoop/hbase/ByteBufferKeyOnlyKeyValue.java
##########
@@ -161,11 +161,11 @@ private int getTimestampOffset() {
@Override
public byte getTypeByte() {
- return ByteBufferUtils.toByte(this.buf, this.offset + this.length - 1);
+ return getTypeByte(this.length);
Review comment:
But maybe this.length doesn't change? Its the key length?
##########
File path:
hbase-common/src/main/java/org/apache/hadoop/hbase/ContiguousCellFormat.java
##########
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
contributor license
+ * agreements. See the NOTICE file distributed with this work for additional
information regarding
+ * copyright ownership. The ASF licenses this file to you under the Apache
License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the
License. You may obtain a
+ * copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless
required by applicable
+ * law or agreed to in writing, software distributed under the License is
distributed on an "AS IS"
+ * BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied. See the License
+ * for the specific language governing permissions and limitations under the
License.
+ */
+package org.apache.hadoop.hbase;
+
+import org.apache.yetus.audience.InterfaceAudience;
+import org.apache.yetus.audience.InterfaceStability;
+
+/**
+ * The interface represents any cell impl that is backed by contiguous layout
like {@link KeyValue},
+ * {@link ByteBufferKeyValue}. The APIs in this interface helps to easily
access the internals of
+ * the Cell structure so that we avoid fetching the offsets and lengths every
time from the internal
+ * backed memory structure. We can use this interface only if it follows the
standard KeyValue
+ * format. Helps in performance and this is jit friendly
+ */
[email protected]
[email protected]
+public interface ContiguousCellFormat {
+
+ /**
+ * Return the family length position/offset.
+ * @param rowLen - the rowlength of the cell
+ * @return the family's offset position
+ */
+ int getFamilyLengthPosition(int rowLen);
+
+ /**
+ * Return the family length. Use this along with {@link
#getFamilyLengthPosition(int)}
+ * @param famLengthPosition - the family offset
+ * @return the family's length
+ */
+ byte getFamilyLength(int famLengthPosition);
+
+ /**
+ * Return the family position/offset. This can be used along with
+ * the result of {@link #getFamilyLengthPosition(int)}
+ * @param familyLengthPosition - the family length position
+ * @return the family's position
+ */
+ int getFamilyInternalPosition(int familyLengthPosition);
+
+ /**
+ * Return the qualifier position/offset
+ * @param famOffset - the family offset
+ * @param famLength - the family length
+ * @return the qualifier offset/position.
+ */
+ int getQualifierInternalPosition(int famOffset, byte famLength);
+
+ /**
+ * Return the qualifier length
+ * @param keyLength - the key length
+ * @param rowLen - the row length
+ * @param famLength - the family length
+ * @return the qualifier length
+ */
+ int getQualifierLength(int keyLength, int rowLen, int famLength);
+
+ /**
+ * Return the time stamp. Use this along with {@link #getKeyLength()}
+ * @param keyLength - the key length
+ * @return the timestamp
+ */
+ long getTimestamp(int keyLength);
+
+ /**
+ * Return the type byte. Use this along with {@link #getKeyLength()}
+ * @param keyLength - the key length
+ * @return - the type byte
+ */
+ byte getTypeByte(int keyLength);
+
+ /**
+ * The keylength of the cell
+ * @return the key length
+ */
+ int getKeyLength();
+}
Review comment:
Could this not just be internal to the Cell implementation? Does it have
to be exposed like this in an Interface with special comparator?
It doesn't look like it. Has to be exposed so Comparator can make use of
these methods. I seen that this patch is just generalizing the behavior done
when we added BBKVComparator. Argh.
##########
File path:
hbase-common/src/main/java/org/apache/hadoop/hbase/ByteBufferKeyOnlyKeyValue.java
##########
@@ -292,4 +287,49 @@ public long heapSize() {
}
return ClassSize.align(FIXED_OVERHEAD);
}
+
+ @Override
+ public int getFamilyLengthPosition(int rowLen) {
Review comment:
For instance, what if a Cell implementation had data members that cached
all lengths... a column family length data member and a row length data member,
etc. These methods wouldn't make sense to it?
##########
File path:
hbase-common/src/main/java/org/apache/hadoop/hbase/ByteBufferKeyOnlyKeyValue.java
##########
@@ -292,4 +287,49 @@ public long heapSize() {
}
return ClassSize.align(FIXED_OVERHEAD);
}
+
+ @Override
+ public int getFamilyLengthPosition(int rowLen) {
Review comment:
What if this stuff just stayed internal? I think you said, I did the KV
one and here you are doing BB. Do we have to do more? There'd be duplication...
or we could call out to a utility class of statics so the offset math could be
shared....
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]