openinx commented on a change in pull request #301: HBASE-22547 Document for 
offheap read in HBase Book
URL: https://github.com/apache/hbase/pull/301#discussion_r294800257
 
 

 ##########
 File path: src/main/asciidoc/_chapters/offheap_read_write.adoc
 ##########
 @@ -0,0 +1,168 @@
+////
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+////
+
+[[offheap_read_write]]
+= RegionServer Offheap Read/Write Path
+:doctype: book
+:numbered:
+:toc: left
+:icons: font
+:experimental:
+
+[[regionserver.offheap.overview]]
+== Overview
+
+To reduce the impact of Java GC on P99/P999 RPC latency, HBase 2.x introduced an offheap read and write path. Cells are
+allocated from JVM offheap memory, which is not garbage collected by the JVM and must be deallocated explicitly by
+upstream callers. In the write path, the request packet received from the client is allocated offheap and retained
+until its cells are successfully written to the WAL and the Memstore. The in-memory data structure in the Memstore does
+not store the cell memory directly; it stores references to cells, which are encoded in multiple chunks in the MSLAB. This makes
+the offheap memory easier to manage. Similarly, in the read path, we try the cache first and, if the cache
+misses, go to the HFile and read the corresponding block. The whole workflow, from reading blocks to sending cells to the
+client, involves almost no on-heap memory allocation.
+
+image::offheap-overview.png[]
+
+
+[[regionserver.offheap.readpath]]
+== Offheap read-path
+In HBase 2.0.0, link:https://issues.apache.org/jira/browse/HBASE-11425[HBASE-11425] changed the HBase read path so it
+could hold read data off-heap (in the BucketCache), avoiding copying cached data onto the Java heap.
+This reduces GC pauses because less garbage is created and so there is less to clear. The off-heap read path performs
+similarly to, or better than, the on-heap LRU cache. This feature has been available since HBase 2.0.0.
+If the BucketCache is in `file` mode, fetching will always be slower than with the native on-heap LruBlockCache.
+Refer to the following blog posts for more details and test results on the off-heap read path:
+link:https://blogs.apache.org/hbase/entry/offheaping_the_read_path_in[Offheaping the Read Path in Apache HBase: Part 1 of 2]
+and link:https://blogs.apache.org/hbase/entry/offheap-read-path-in-production[Offheap Read-Path in Production - The Alibaba story].
+
+For an end-to-end off-heap read path, there must first be an off-heap backed <<offheap.blockcache>> (BC). Set
+`hbase.bucketcache.ioengine` to `offheap` in _hbase-site.xml_, and specify the total capacity of the BC with the
+`hbase.bucketcache.size` config. Remember to adjust the value of `HBASE_OFFHEAPSIZE` in
+_hbase-env.sh_, which sets the maximum off-heap memory allocation for the RegionServer Java process.
+This should be bigger than the off-heap BC size. Keep in mind that there is no default for `hbase.bucketcache.ioengine`,
+which means the BC is turned OFF by default (see <<direct.memory>>).
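+
+The BucketCache settings described above could be sketched in _hbase-site.xml_ as follows. The size value is purely
+illustrative (an assumption, read here as megabytes) and should be chosen for your own hardware; `HBASE_OFFHEAPSIZE` in
+_hbase-env.sh_ would then need a larger value, for example `5G` (also an assumption).
+
+[source,xml]
+----
+<!-- Back the BucketCache with off-heap memory; there is no default, so the BC is OFF unless this is set. -->
+<property>
+  <name>hbase.bucketcache.ioengine</name>
+  <value>offheap</value>
+</property>
+<!-- Total BucketCache capacity. 4096 is an illustrative value, not a recommendation. -->
+<property>
+  <name>hbase.bucketcache.size</name>
+  <value>4096</value>
+</property>
+----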
+
+The next thing to tune is the ByteBuffer pool on the RPC server side.
+Buffers from this pool are used to accumulate the cell bytes and create a result cell block to send back to the client.
+`hbase.ipc.server.reservoir.enabled` turns this pool ON or OFF. By default the pool is ON and available, and HBase creates off-heap ByteBuffers
+and pools them. Do not turn this OFF if you want end-to-end off-heaping in the read path.
+If the pool is turned off, the server creates temporary buffers on heap to accumulate the cell bytes and build the result cell block, which can increase GC pressure on a server with a heavy read load.
+You can tune both the number of buffers in the pool and the size of each ByteBuffer.
+Use the config `hbase.ipc.server.reservoir.initial.buffer.size` to tune the size of each buffer. The default is 64 KB for HBase 2.x, and it will change to 65 KB by default for HBase 3.x.
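+
+As a rough illustration of the two RPC pool settings named above, an _hbase-site.xml_ fragment might look like the
+following. The values simply spell out the HBase 2.x defaults stated in the text (64 KB written as 65536 bytes, on the
+assumption that the property is given in bytes); they are not tuning advice.
+
+[source,xml]
+----
+<!-- Keep the server-side ByteBuffer pool ON for an end-to-end off-heap read path. -->
+<property>
+  <name>hbase.ipc.server.reservoir.enabled</name>
+  <value>true</value>
+</property>
+<!-- Size of each pooled ByteBuffer: 64 KB, assumed here to be expressed in bytes. -->
+<property>
+  <name>hbase.ipc.server.reservoir.initial.buffer.size</name>
+  <value>65536</value>
+</property>
+----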
 
 Review comment:
   That's OK.

