[ 
https://issues.apache.org/jira/browse/HBASE-28597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847206#comment-17847206
 ] 

Istvan Toth commented on HBASE-28597:
-------------------------------------

The fastest solution would be to simply take the CellBlock ByteBuffers from the 
protobuf responses and send them out directly in the HTTP body. This would 
require no memory copying, or indeed any processing at all, just straight DMA.
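
As a rough illustration of the pass-through idea, here is a minimal sketch 
using only java.nio. BufferForwarder and forward() are hypothetical names, not 
HBase or Jetty APIs, and the buffers merely stand in for CellBlocks, which the 
HBase client API does not expose today.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;
import java.util.List;

/** Hypothetical helper for illustration; not part of HBase or Jetty. */
final class BufferForwarder {
  private BufferForwarder() {}

  /**
   * Writes each buffer to the channel as-is and returns the number of bytes written.
   * duplicate() shares the underlying memory, so this method never copies the
   * payload into an intermediate byte[]; only position and limit are tracked
   * separately, which leaves the caller's buffers untouched.
   */
  static long forward(List<ByteBuffer> cellBlocks, WritableByteChannel out) throws IOException {
    long written = 0;
    for (ByteBuffer block : cellBlocks) {
      ByteBuffer view = block.duplicate();
      // A channel may accept fewer bytes than offered, so loop until drained.
      while (view.hasRemaining()) {
        written += out.write(view);
      }
    }
    return written;
  }
}
```

Whether any copy is avoided end to end depends on the channel: with direct 
buffers and a socket channel the JVM can hand the memory to the OS, while other 
channel implementations may still copy internally.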

The client would have no control over the encryption and codec, but we could 
pass those in headers, and the client needs to include the native HBase client 
library anyway for this to work.

HBase is not set up to make this feasible today. AFAICT it would need horrible 
reflection hacks and/or major additions to the HBase API, and I am not 
comfortable enough with the RPC internals to attempt it.

Once we are getting cells from the HBase API, the CellBlocks have already been 
decoded and copied to the heap, so much of the "damage" in memory and GC 
pressure is already done.

One way to mitigate that would be to use ByteBuffer-backed cells on the client 
side, but the client API does not support that. At first glance, 
ByteBuffer-backed cells seem to be generated mostly when reading HFiles and in 
the RPC write (Put) path. This looks more feasible than copying raw CellBlocks, 
but it would still be a very large change.

The next step where we can perhaps save cycles and GC pressure is encoding and 
sending the data via HTTP.
Even for the current protobuf implementation, if we could use a 
ByteBuffer-backed CodedOutputStream like the HBase RPC code does, and somehow 
get Jetty to send that ByteBuffer directly, then we may be able to save some 
overhead.
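
To make the shape of that idea concrete, here is a stdlib-only sketch. 
ByteBufferSink is a hypothetical class standing in for a ByteBuffer-backed 
CodedOutputStream; it is not the protobuf or HBase implementation, and how 
Jetty would accept the resulting buffer is left open.

```java
import java.io.OutputStream;
import java.nio.ByteBuffer;

/** Hypothetical sketch: an OutputStream that encodes straight into a caller-supplied ByteBuffer. */
final class ByteBufferSink extends OutputStream {
  private final ByteBuffer buf;

  ByteBufferSink(ByteBuffer buf) {
    this.buf = buf;
  }

  @Override
  public void write(int b) {
    // Throws BufferOverflowException if the backing buffer is full.
    buf.put((byte) b);
  }

  @Override
  public void write(byte[] src, int off, int len) {
    buf.put(src, off, len);
  }

  /** Returns a read-only view over the bytes written so far, sharing the same memory. */
  ByteBuffer written() {
    ByteBuffer view = buf.duplicate();
    view.flip(); // limit = bytes written, position = 0
    return view.asReadOnlyBuffer();
  }
}
```

If the backing buffer is direct, the view returned by written() is direct as 
well, so it could be passed to a channel write without first being copied into 
a heap array.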





> Support native Cell format in REST server and client
> ----------------------------------------------------
>
>                 Key: HBASE-28597
>                 URL: https://issues.apache.org/jira/browse/HBASE-28597
>             Project: HBase
>          Issue Type: Wish
>          Components: REST
>            Reporter: Istvan Toth
>            Priority: Major
>
> REST currently uses its own (outdated) CellSetModel format for transferring 
> cells.
> This is fine for XML and JSON, which are slow anyway and even slower at 
> handling byte arrays, and which are expected to be used where simple client 
> code that does not depend on the HBase Java libraries is more important than 
> raw performance.
> However, we perform the same marshalling and unmarshalling when we are using 
> protobuf, which doesn't really add value, but eats up resources.
> We could add a new encoding for Results which uses the native cell format, by 
> simply dumping the binary cell bytestreams into the REST response body.
> This should save a lot of resources on the server side, and would be at 
> least as fast on the client.
> As an additional advantage, the resulting Cells would be of native HBase Cell 
> type instead of the REST Cell type.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
