[
https://issues.apache.org/jira/browse/HBASE-8782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692214#comment-13692214
]
Hamed Madani commented on HBASE-8782:
-------------------------------------
Well, from what I understand, ByteBuffer.array() returns the buffer's entire
backing array, whereas getBytes() returns a new array holding only the useful
part of it. getBytes() extracts this smaller array by looking at the *position*
and *limit* of the ByteBuffer. With the binary protocol, readBinary() returns a
new ByteBuffer with a small backing array of its own (position = 0, size = the
size of our useful data). With framed transport, however, readBinary() returns
a ByteBuffer that wraps the original *trans_* array, only with new *position*
and *limit* values. So the backing arrays under framed transport are very
large and contain all the data buffered for the connection, not just the one
value we asked for.
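To make the difference concrete, here is a minimal sketch using only java.nio. It does not touch Thrift or HBase: the class name FramedBufferDemo, the helpers viewInto() and copyWindow(), and the sample frame contents are all hypothetical, and the "frame" merely imitates a large shared buffer of the kind described above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class FramedBufferDemo {
    // Imitates a framed read: a ByteBuffer that is only a window
    // (position/limit) over a larger shared array.
    static ByteBuffer viewInto(byte[] frame, int offset, int length) {
        return ByteBuffer.wrap(frame, offset, length);
    }

    // Copies only the bytes between position and limit. Reading through a
    // duplicate leaves the caller's position untouched.
    static byte[] copyWindow(ByteBuffer buf) {
        byte[] out = new byte[buf.remaining()];
        buf.duplicate().get(out);
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical frame holding several values back to back.
        byte[] frame = "tablename:family:qualifier".getBytes(StandardCharsets.US_ASCII);
        ByteBuffer family = viewInto(frame, 10, 6);

        // array() ignores position and limit and hands back the whole frame.
        System.out.println(new String(family.array(), StandardCharsets.US_ASCII));
        // prints "tablename:family:qualifier"

        // Copying the window yields just the value we wanted.
        System.out.println(new String(copyWindow(family), StandardCharsets.US_ASCII));
        // prints "family"
    }
}
```

This is the same symptom as in the bug report: calling .array() on a windowed buffer yields the surrounding bytes as well as the value.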
As for a solution, my first idea for avoiding the array copy was to modify
HTableInterface to accept a ByteBuffer as input and to handle the other cases
in checkAndPut() and checkAndDelete() separately. However, I can see that this
means adding to HTableInterface!
I found *org.apache.thrift.TBaseHelper.byteBufferToByteArray* to be a more
efficient function for this use case than Bytes.getBytes().
{code}
public static byte[] byteBufferToByteArray(ByteBuffer byteBuffer) {
  if (wrapsFullArray(byteBuffer)) {
    return byteBuffer.array();
  }
  byte[] target = new byte[byteBuffer.remaining()];
  byteBufferToByteArray(byteBuffer, target, 0);
  return target;
}

public static boolean wrapsFullArray(ByteBuffer byteBuffer) {
  return byteBuffer.hasArray()
    && byteBuffer.position() == 0
    && byteBuffer.arrayOffset() == 0
    && byteBuffer.remaining() == byteBuffer.capacity();
}

public static int byteBufferToByteArray(ByteBuffer byteBuffer, byte[] target,
    int offset) {
  int remaining = byteBuffer.remaining();
  System.arraycopy(byteBuffer.array(),
      byteBuffer.arrayOffset() + byteBuffer.position(),
      target,
      offset,
      remaining);
  return remaining;
}
{code}
The above function is more efficient because for the binary protocol it simply
returns the backing array via .array(), and for framed transport it uses
System.arraycopy rather than a for loop to copy the elements. It also avoids
byteBuffer.duplicate().
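As a rough illustration of the two paths, the sketch below re-implements the quoted logic locally so that it runs without the Thrift jar. The class name WrapsFullArrayDemo and the method name toByteArray() are hypothetical, and the sample arrays are made up; only the wrapsFullArray() condition and the System.arraycopy call mirror the code above.

```java
import java.nio.ByteBuffer;

final class WrapsFullArrayDemo {
    // Local copy of the quoted check: true only when the buffer covers
    // its backing array exactly.
    static boolean wrapsFullArray(ByteBuffer b) {
        return b.hasArray()
            && b.position() == 0
            && b.arrayOffset() == 0
            && b.remaining() == b.capacity();
    }

    // Local stand-in for the quoted byteBufferToByteArray(ByteBuffer).
    static byte[] toByteArray(ByteBuffer b) {
        if (wrapsFullArray(b)) {
            return b.array(); // no copy: caller shares the backing array
        }
        byte[] target = new byte[b.remaining()];
        System.arraycopy(b.array(), b.arrayOffset() + b.position(),
            target, 0, target.length);
        return target;
    }

    public static void main(String[] args) {
        byte[] exact = {1, 2, 3};
        byte[] frame = {9, 9, 1, 2, 3, 9};

        ByteBuffer whole = ByteBuffer.wrap(exact);        // covers the full array
        ByteBuffer window = ByteBuffer.wrap(frame, 2, 3); // window into a frame

        System.out.println(toByteArray(whole) == exact);  // prints "true"
        System.out.println(toByteArray(window) == frame); // prints "false"
        System.out.println(toByteArray(window).length);   // prints "3"
    }
}
```

One consequence worth keeping in mind: on the no-copy path the returned array is the buffer's own backing array, so later writes to either are visible through the other.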
If you also think this is a better alternative to getBytes(), I can make a new
patch that uses byteBufferToByteArray() instead of getBytes().
> Thrift2 can not parse values when using framed transport
> --------------------------------------------------------
>
> Key: HBASE-8782
> URL: https://issues.apache.org/jira/browse/HBASE-8782
> Project: HBase
> Issue Type: Bug
> Components: Thrift
> Affects Versions: 0.95.1
> Reporter: Hamed Madani
> Attachments: HBASE_8782.patch
>
>
> ThriftHBaseServiceHandler.java uses .array() on table names and values
> (family, qualifier in checkAndDelete, etc.), which resulted in incorrect
> values with framed transport. Replacing .array() with getBytes() fixed this
> problem. I've attached the patch.
> EDIT: updated the patch to cover checkAndPut() and checkAndDelete().