No. You need 0.96.x HBase at least.

St.Ack
On Mon, Jul 21, 2014 at 9:42 AM, Jane Tao <[email protected]> wrote:

> Hi Stack,
>
> Does what you suggested apply to HBase 0.94.6?
>
> Thanks,
> Jane
>
>
> On 7/18/2014 5:11 PM, Stack wrote:
>
>> On Fri, Jul 18, 2014 at 4:46 PM, Jane Tao <[email protected]> wrote:
>>
>>> Hi there,
>>>
>>> Our goal is to fully utilize the free RAM on each node/region server
>>> for HBase. At the same time, we do not want to incur too much pressure
>>> from GC (garbage collection). Based on Ted's suggestion, we are trying
>>> to use the bucket cache.
>>>
>>> However, we are not sure:
>>
>> Sorry. Config is a little complicated at the moment. It has had some
>> cleanup in trunk. Meantime...
>>
>>> - The relation between -XX:MaxDirectMemorySize and java heap size. Is
>>> MaxDirectMemorySize part of the java heap size?
>>
>> No. It is the maximum for how much the JVM should use OFFHEAP. Here is a
>> bit of a note I just added to the refguide:
>>
>> <para>The default maximum direct memory varies by JVM. Traditionally it
>> is 64M or some relation to allocated heap size (-Xmx) or no limit at all
>> (JDK7 apparently). HBase servers use direct memory, in particular
>> short-circuit reading; the hosted DFSClient will allocate direct memory
>> buffers. If you do offheap block caching, you'll be making use of direct
>> memory. Starting your JVM, make sure the
>> <varname>-XX:MaxDirectMemorySize</varname> setting in
>> <filename>conf/hbase-env.sh</filename> is set to some value that is
>> higher than what you have allocated to your offheap blockcache
>> (<varname>hbase.bucketcache.size</varname>). It should be larger than
>> your offheap block cache and then some for DFSClient usage (how much the
>> DFSClient uses is not easy to quantify; it is the number of open hfiles *
>> <varname>hbase.dfs.client.read.shortcircuit.buffer.size</varname>, where
>> hbase.dfs.client.read.shortcircuit.buffer.size is set to 128k in HBase --
>> see <filename>hbase-default.xml</filename> default configurations).
>> </para>
>>
>>> - The relation between -XX:MaxDirectMemorySize and hbase.bucketcache.size.
>>> Are they equal?
>>
>> -XX:MaxDirectMemorySize should be larger than hbase.bucketcache.size.
>> They should not be equal. See the note above for why.
>>
>>> - How to adjust hbase.bucketcache.percentage.in.combinedcache?
>>
>> Or just leave it as is. To adjust, just set it to other than the default,
>> which is 0.9 (0.9 of hbase.bucketcache.size). This configuration has been
>> removed from trunk because it is confusing.
>>
>>> Right now, we have the following configuration. Does it make sense?
>>>
>>> - java heap size of each hbase region server set to 12 GB
>>> - -XX:MaxDirectMemorySize set to 6GB
>>
>> Why not set it to 48G since you have the RAM?
>>
>>> - hbase-site.xml:
>>> <property>
>>>   <name>hbase.offheapcache.percentage</name>
>>>   <value>0</value>
>>> </property>
>>
>> This setting is not needed. 0 is the default.
>>
>>> <property>
>>>   <name>hbase.bucketcache.ioengine</name>
>>>   <value>offheap</value>
>>> </property>
>>> <property>
>>>   <name>hbase.bucketcache.percentage.in.combinedcache</name>
>>>   <value>0.8</value>
>>> </property>
>>
>> Or you could just undo this setting and go with the default, which is 0.9.
>>
>>> <property>
>>>   <name>hbase.bucketcache.size</name>
>>>   <value>6144</value>
>>> </property>
>>
>> Adjust this to be 40000? (smile)
>>
>> Let us know how it goes.
>>
>> What version of HBase are you running? Thanks.
>>
>> St.Ack
>>
>>> Thanks,
>>> Jane
>>>
>>>
>>> On 7/17/2014 3:05 PM, Ted Yu wrote:
>>>
>>>> Have you considered using BucketCache?
>>>>
>>>> Please read 9.6.4.1 under
>>>> http://hbase.apache.org/book.html#regionserver.arch
>>>>
>>>> Note: remember to verify the config values against the hbase release
>>>> you're using.
>>>>
>>>> Cheers
>>>>
>>>>
>>>> On Thu, Jul 17, 2014 at 2:53 PM, Jane Tao <[email protected]> wrote:
>>>>
>>>>> Hi Ted,
>>>>>
>>>>> In my case, there is a 6-node HBase cluster setup (running on Oracle
>>>>> BDA). Each node has plenty of RAM (64GB) and CPU cores. Several
>>>>> articles seem to suggest that it is not a good idea to allocate too
>>>>> much RAM to the region server's heap setting.
>>>>>
>>>>> If each region server has a 10GB heap and there is only one region
>>>>> server per node, then I have 10x6=60GB for the whole HBase cluster.
>>>>> This setting is good for ~100M rows but starts to incur lots of GC
>>>>> activity on region servers when loading billions of rows.
>>>>>
>>>>> Basically, I need a configuration that can fully utilize the free RAM
>>>>> on each node for HBase.
>>>>>
>>>>> Thanks,
>>>>> Jane
>>>>>
>>>>> On 7/16/2014 4:17 PM, Ted Yu wrote:
>>>>>
>>>>>> Jane:
>>>>>>
>>>>>> Can you briefly describe the use case where multiple region servers
>>>>>> are needed on the same host?
>>>>>>
>>>>>> Cheers
>>>>>>
>>>>>>
>>>>>> On Wed, Jul 16, 2014 at 3:14 PM, Dhaval Shah <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> It's certainly possible (at least with the command line) but
>>>>>>> probably very messy. You will need to have different ports,
>>>>>>> different log files, different pid files, possibly even different
>>>>>>> configs on the same machine.
>>>>>>>
>>>>>>>
>>>>>>> Regards,
>>>>>>> Dhaval
>>>>>>>
>>>>>>>
>>>>>>> ________________________________
>>>>>>> From: Jane Tao <[email protected]>
>>>>>>> To: [email protected]
>>>>>>> Sent: Wednesday, 16 July 2014 6:06 PM
>>>>>>> Subject: multiple region servers at one machine
>>>>>>>
>>>>>>>
>>>>>>> Hi there,
>>>>>>>
>>>>>>> Is it possible to run multiple region servers on one machine/node?
>>>>>>> If this is possible, how do we start multiple region servers with
>>>>>>> command lines or Cloudera Manager?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Jane
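Pulling Stack's suggestions in the thread above together, a rough sketch of the resulting configuration on one of the 64GB nodes could look like the following. The numbers (12GB heap, 40000MB bucket cache, 48GB direct memory) are taken from the discussion but are illustrative rather than tested recommendations, and using HBASE_REGIONSERVER_OPTS is only one way to pass the JVM flags; verify the property names against the HBase release (0.96.x or later) actually in use.

conf/hbase-env.sh (region server JVM options):

  # 12GB on-heap; MaxDirectMemorySize must exceed hbase.bucketcache.size
  # plus headroom for DFSClient short-circuit read buffers
  export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xmx12g -XX:MaxDirectMemorySize=48g"

conf/hbase-site.xml:

  <property>
    <name>hbase.bucketcache.ioengine</name>
    <value>offheap</value>
  </property>
  <property>
    <!-- offheap bucket cache size in MB; keep it well under MaxDirectMemorySize -->
    <name>hbase.bucketcache.size</name>
    <value>40000</value>
  </property>

Per the advice above, hbase.offheapcache.percentage and hbase.bucketcache.percentage.in.combinedcache are simply left at their defaults (0 and 0.9). For the DFSClient headroom, with the default 128k short-circuit buffer a region server holding, say, 10,000 open hfiles (a made-up figure) would want on the order of 10,000 * 128KB, roughly 1.3GB, of extra direct memory, which is comfortably covered by the gap left here between the 40000MB cache and the 48GB MaxDirectMemorySize limit.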

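On the original question of running more than one region server per machine, Dhaval's list of what has to differ per instance can be made concrete. The sketch below is one possible manual approach using a second config directory; the directory name conf.rs2, the port numbers, and the pid/log paths are invented for illustration, and a Cloudera Manager deployment would manage the daemons differently.

  # copy the config into a second directory for the extra instance
  cp -r conf conf.rs2

  # in conf.rs2/hbase-site.xml, give the instance its own ports, e.g.
  #   hbase.regionserver.port       60120   (default 60020)
  #   hbase.regionserver.info.port  60130   (default 60030)

  # in conf.rs2/hbase-env.sh, give it its own pid and log locations, e.g.
  #   export HBASE_PID_DIR=/var/run/hbase-rs2
  #   export HBASE_LOG_DIR=/var/log/hbase-rs2

  # start the second region server against the alternate config directory
  bin/hbase-daemon.sh --config conf.rs2 start regionserver

For test or pseudo-distributed setups, HBase also ships a bin/local-regionservers.sh helper that starts extra region servers with offset ports, though it is not meant for production use.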