Hi All,
I have a question about how to determine RAM sizing for HBase, covering all the required parameters: (1) Master, (2) RegionServer, (3) Java heap, (4) Java client, etc. I have the configuration parameters below; can anyone help me size the heap and RAM for HBase?

Environment: CDH 5.8.0, HBase 1.2.1. We have Spark jobs that read data from a Kafka pipeline and insert it into HBase, so we have frequent read/write operations from 6 pipelines (6 Spark jobs) -- roughly *1 MB/s per pipeline*, and *each Spark job has a 15-minute window, so we get about 900 MB of data per pipeline per 15 minutes*. If you need more information, please let me know.

HBase tuning parameters:

hbase.hregion.memstore.flush.size = 128 MB
  Reference: http://hadoop-hbase.blogspot.in/2013/01/hbase-region-server-memory-sizing.html
  Comment: The memstore is flushed to disk if its size exceeds this number of bytes. The value is checked by a thread that runs at the frequency specified by hbase.server.thread.wakefrequency. The default is 128 MB.

hfile.block.cache.size = 0.4
  Comment: Percentage of the maximum heap (-Xmx) to allocate to the block cache used by HFile/StoreFile (the L1 cache). To disable, set this value to 0.

hbase.ipc.server.read.threadpool.size = 10
  Reference: https://www.cloudera.com/documentation/enterprise/properties/5-8-x/topics/cm_props_cdh580_hbase.html
  Comment: Read thread-pool size used by the RegionServer's HBase IPC server.

hbase.client.write.buffer = 2 MB
  Reference: https://www.cloudera.com/documentation/enterprise/properties/5-8-x/topics/cm_props_cdh580_hbase.html
  Comment: Default size of the HTable client write buffer in bytes. A bigger buffer takes more memory on both the client and server side, since the server instantiates the passed write buffer to process it, but a larger buffer size reduces the number of RPCs made.
  For an estimate of server-side memory used, evaluate hbase.client.write.buffer * hbase.regionserver.handler.count.

hbase.regionserver.handler.count = 30
  Comment: Default is 10.

hbase.hregion.max.filesize = 20 GB
  Comment: Default is 10 GB. Maximum HStoreFile size: if any one of a column family's HStoreFiles grows to exceed this value, the hosting HRegion is split in two.

hbase.client.scanner.caching = 100
  Comment: Number of rows fetched when calling next on a scanner, if it is not served from (local, client) memory. Higher caching values enable faster scanners but eat up more memory, and some calls of next may take longer when the cache is empty. Do not set this value such that the time between invocations is greater than the scanner timeout, i.e. hbase.client.scanner.timeout.period. This value is important if data in your HBase table is used without any row-key-based lookups, or when your queries perform wide range scans (wide row-key lookups).

hbase.client.scanner.timeout.period = 1 minute

dfs.client.read.shortcircuit = enabled
  Comment: Enables HDFS short-circuit reads, which allow a client colocated with the DataNode to read HDFS file blocks directly. This gives a performance boost to distributed clients that are aware of locality.

hbase_client_java_heapsize = 256 MB
  Comment: Maximum size in bytes for the Java process heap; passed to Java as -Xmx.

hbase.regionserver.global.memstore.size (formerly hbase.regionserver.global.memstore.upperLimit) = 0.4
  Comment: Maximum size of all memstores in a RegionServer before new updates are blocked and flushes are forced.

hbase.regionserver.global.memstore.size.lower.limit (formerly hbase.regionserver.global.memstore.lowerLimit) = 0.38
  Comment: When memstores are being forced to flush to make room in memory, keep flushing until this amount is reached.
  If this amount is equal to hbase.regionserver.global.memstore.upperLimit, the minimum possible flushing will occur when updates are blocked due to the memstore limit.

hbase.ipc.server.callqueue.read.ratio
  Comment: Seen in HBase version 0.9x but not found in 2.x.

Off-heap block cache / pending callQueue
  Comment: Seen in HBase version 0.9x but not found in 2.x.

1. HDFS block size: dfs.block.size / dfs.blocksize = 128 MB (default)

Thanks
Manjeet

--
luv all
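As a sanity check on the numbers above (6 pipelines, ~1 MB/s each, 15-minute Spark windows), the ingest math works out like this -- my own back-of-the-envelope arithmetic, not from any reference:

```python
# Ingest math for the workload described above:
# 6 Spark jobs, ~1 MB/s each, 15-minute window.
PIPELINES = 6
MB_PER_SEC_PER_PIPELINE = 1
WINDOW_SEC = 15 * 60  # 15-minute Spark window

mb_per_window_per_pipeline = MB_PER_SEC_PER_PIPELINE * WINDOW_SEC   # 900 MB
mb_per_window_total = mb_per_window_per_pipeline * PIPELINES        # 5400 MB across the cluster

# With hbase.hregion.memstore.flush.size = 128 MB, a single actively
# written region absorbing one pipeline would flush roughly this many
# times per window (assuming writes land on one region, worst case):
flushes_per_window = mb_per_window_per_pipeline / 128               # ~7 flushes

print(mb_per_window_per_pipeline, mb_per_window_total, round(flushes_per_window, 1))
```

So the 900 MB per window figure holds per pipeline, and the cluster as a whole absorbs about 5.4 GB per 15-minute window.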
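To see how the memstore and block-cache fractions above translate into actual bytes, here is a hypothetical heap budget. The 16 GB RegionServer heap is an assumed example value, not something from this configuration:

```python
# Hypothetical RegionServer heap budget (16 GB heap is an assumed example).
heap_mb = 16 * 1024

memstore_mb = heap_mb * 0.40     # hbase.regionserver.global.memstore.size
blockcache_mb = heap_mb * 0.40   # hfile.block.cache.size

# HBase requires the two fractions together to leave at least 20% of the
# heap for everything else (working memory, RPC buffers, GC headroom).
other_mb = heap_mb - memstore_mb - blockcache_mb

# Server-side write-buffer estimate from the table above:
# hbase.client.write.buffer * hbase.regionserver.handler.count
write_buffer_mb = 2 * 30         # 60 MB with the configured values

print(round(memstore_mb), round(blockcache_mb), round(other_mb), write_buffer_mb)
```

With 0.4 + 0.4 you are already at the maximum combined fraction HBase allows, so there is no room to raise either value without lowering the other.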
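There is also a commonly cited rule of thumb (it appears in the HBase reference guide) relating heap size, the global memstore fraction, and the flush size to how many actively written regions a RegionServer can sustain. The 16 GB heap and single column family here are assumed examples:

```python
# Rule of thumb for actively written regions per RegionServer:
#   regions ~= (heap * global memstore fraction) / (flush size * column families)
# Heap size and column-family count below are assumed example values.
heap_mb = 16 * 1024
memstore_fraction = 0.40   # hbase.regionserver.global.memstore.size
flush_size_mb = 128        # hbase.hregion.memstore.flush.size
column_families = 1

max_active_regions = (heap_mb * memstore_fraction) / (flush_size_mb * column_families)
print(int(max_active_regions))   # ~51 actively written regions
```

Regions beyond that count are fine if they are mostly idle; the formula only bounds regions taking concurrent writes before memstore pressure forces premature flushes.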