[ https://issues.apache.org/jira/browse/HBASE-18026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16009186#comment-16009186 ]
Anoop Sam John commented on HBASE-18026: ---------------------------------------- Well the concern was mainly abt the master branch patch. I have been working in the PB area in master branch. We have gone to the new version of PB there. Also when a request bytes been build up into a PB messages, we work with PB CIS. Latest versions of PB allow to declare the incoming request bytes as Immutable. If so , when we traverse over the bytes and make diff ByteString, PB wont do any copy and exactly refer to only those bytes for a ByteString (eg: Row PB object) Instead it will be a BoundedLiteralBS where we will have ref to byte[] and offset and length. In such cases, the above assumptions wont be correct. Well I realized now that the patch changed org.apache.hadoop.hbase.protobuf.ProtobufUtil but in master branch we use, org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil mainly (Specially for this Get req convert etc). But the change as such is dangerous. I did not check branch-1 based code path. When we make CIS from bytes, I dont think we declare it as Immutable there. Means when PB create ByteString, it will do a copy internally and as u said the byte [] will be 0 offset and required length only. Also {code} public static byte[] zeroCopyGetBytes(final ByteString buf) { if (buf instanceof LiteralByteString) { return ((LiteralByteString) buf).bytes; } throw new UnsupportedOperationException("Need a LiteralByteString, got a " + buf.getClass().getName()); } {code} So iff it is LiteralByteString the same underlying byte[] been returned. Here the offset will be 0 and length is correct. But any other type of ByteString , it throws Exception. May be on safe side we should just do toByteArray() for such cases as well? So functional wise the patch in any branch might not have been affected. In master the PB working was is diff as I said but the change is not in a used file. In older branches, the change might be ok as the PB usage is that way there. > ProtobufUtil seems to do extra array copying > -------------------------------------------- > > Key: HBASE-18026 > URL: https://issues.apache.org/jira/browse/HBASE-18026 > Project: HBase > Issue Type: Bug > Affects Versions: 2.0.0, 1.3.2 > Reporter: Vincent Poon > Assignee: Vincent Poon > Priority: Minor > Fix For: 2.0.0, 1.4.0, 1.2.6, 1.3.2, 1.1.11 > > Attachments: HBASE-18026.branch-1.v1.patch, > HBASE-18026.master.v1.patch > > > In ProtobufUtil, the protobuf fields are copied into an array using > toByteArray(). These are then passed into the KeyValue constructor which > does another copy. > It seems like we can avoid a copy here by using > HBaseZeroCopyByteString#zeroCopyGetBytes() ? -- This message was sent by Atlassian JIRA (v6.3.15#6346)