openinx commented on a change in pull request #301: HBASE-22547 Document for offheap read in HBase Book
URL: https://github.com/apache/hbase/pull/301#discussion_r294823074
##########
File path: src/main/asciidoc/_chapters/offheap_read_write.adoc
##########
@@ -0,0 +1,178 @@
+////
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+////
+
+[[offheap_read_write]]
+= RegionServer Offheap Read/Write Path
+:doctype: book
+:numbered:
+:toc: left
+:icons: font
+:experimental:
+
+[[regionserver.offheap.overview]]
+== Overview
+
+For reducing the Java GC impact to P99/P999 RPC latency, HBase 2.x has made the offheap read and write path. The cells are
+allocated from JVM offheap memory area, which won’t be garbage collected by JVM and need to be deallocated explicitly by
+upstream callers. In the write path, the request packet received from client will be allocated offheap and retained
+until those cells are successfully written to the WAL and Memstore. The memory data structure in Memstore does
+not directly store the cell memory, but reference to cells which are encoded in multiple chunks in MSLAB, this is easier
+to manage the offheap memory. Similarly, in the read path, we’ll try to read the cache firstly, if the cache
+misses, go to the HFile and read the corresponding block. The workflow: from reading blocks to sending cells to
+client, it's basically not involved in on-heap memory allocations.
+
+image::offheap-overview.png[]
+
+
+[[regionserver.offheap.readpath]]
+== Offheap read-path
+In HBase-2.0.0, link:https://issues.apache.org/jira/browse/HBASE-11425[HBASE-11425] changed the HBase read path so it
+could hold the read-data off-heap (from BucketCache) avoiding copying of cached data on to the java heap.
+This reduces GC pauses given there is less garbage made and so less to clear. The off-heap read path has a performance
+that is similar/better to that of the on-heap LRU cache. This feature is available since HBase 2.0.0.
+If the BucketCache is in `file` mode, fetching will always be slower compared to the native on-heap LruBlockCache.
+Refer to below blogs for more details and test results on off heaped read path
+link:https://blogs.apache.org/hbase/entry/offheaping_the_read_path_in[Offheaping the Read Path in Apache HBase: Part 1 of 2]
+and link:https://blogs.apache.org/hbase/entry/offheap-read-path-in-production[Offheap Read-Path in Production - The Alibaba story]
+
+For an end-to-end off-heaped read-path, first of all there should be an off-heap backed <<offheap.blockcache>>(BC). Configure 'hbase.bucketcache.ioengine' to off-heap in
+_hbase-site.xml_. Also specify the total capacity of the BC using `hbase.bucketcache.size` config. Please remember to adjust value of 'HBASE_OFFHEAPSIZE' in
+_hbase-env.sh_. This is how we specify the max possible off-heap memory allocation for the RegionServer java process.
+This should be bigger than the off-heap BC size. Please keep in mind that there is no default for `hbase.bucketcache.ioengine`
+which means the BC is turned OFF by default (See <<direct.memory>>).
+
+Next thing to tune is the ByteBuffer pool on the RPC server side:
+
+NOTE: the config keys which starts with prefix hbase.ipc.server.reservoir are deprecated in HBase3.x. If you are still

Review comment:
> In all these new document, can we refer to new config at 1st place. Also say the old one and is deprecated in 3.0

@anoopsjohn, I've addressed your comment here.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
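
For readers of this thread, the off-heap BucketCache setup described in the quoted "Offheap read-path" paragraph could be sketched roughly as below. This is not part of the pull request. The sizes (a 4 GB cache and a 5 GB off-heap limit) are assumed values chosen only to show that `HBASE_OFFHEAPSIZE` should exceed the cache size, and the `offheap` spelling of the ioengine value is what I would expect rather than something the quoted text spells out.

```xml
<!-- hbase-site.xml: illustrative sketch, values are assumptions -->
<property>
  <name>hbase.bucketcache.ioengine</name>
  <!-- no default exists, so the BucketCache stays off unless this is set -->
  <value>offheap</value>
</property>
<property>
  <name>hbase.bucketcache.size</name>
  <!-- total BucketCache capacity, given here in megabytes (4 GB, assumed) -->
  <value>4096</value>
</property>
<!-- hbase-env.sh (a separate file) would then carry something like:
     export HBASE_OFFHEAPSIZE=5G
     i.e. larger than the 4 GB cache above, leaving room for other
     off-heap users such as the RPC ByteBuffer pool. -->
```

How much headroom to leave above the cache size depends on the workload, so the 1 GB margin here should be treated as a placeholder.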
