Hi Stack,

Does what you suggested apply to HBase 0.94.6?

Thanks,
Jane

On 7/18/2014 5:11 PM, Stack wrote:
On Fri, Jul 18, 2014 at 4:46 PM, Jane Tao <[email protected]> wrote:

Hi there,

Our goal is to fully utilize the free RAM on each node/region server for
HBase. At the same time, we do not want to incur too much pressure from GC
(garbage collection). Based on Ted's suggestion, we are trying to use the
bucket cache.

However, we are not sure:

Sorry.  Config is a little complicated at the moment.  It has had some
cleanup in trunk.  Meantime...



- The relation between XX:MaxDirectMemorySize and the Java heap size. Is
MaxDirectMemorySize part of the Java heap?


No.  It is a cap on how much memory the JVM may allocate offheap.  Here is
a bit of a note I just added to the refguide:


                  <para>The default maximum direct memory varies by JVM.
                      Traditionally it is 64M, or some relation to the
                      allocated heap size (-Xmx), or no limit at all
                      (JDK7, apparently). HBase servers use direct memory;
                      in particular, with short-circuit reading enabled,
                      the hosted DFSClient will allocate direct memory
                      buffers. If you do offheap block caching, you'll be
                      making use of direct memory. When starting your JVM,
                      make sure the
                      <varname>-XX:MaxDirectMemorySize</varname> setting
                      in <filename>conf/hbase-env.sh</filename> is set to
                      some value that is higher than what you have
                      allocated to your offheap blockcache
                      (<varname>hbase.bucketcache.size</varname>). It
                      should be larger than your offheap block cache, and
                      then some for DFSClient usage (how much the
                      DFSClient uses is not easy to quantify; it is the
                      number of open hfiles *
                      <varname>hbase.dfs.client.read.shortcircuit.buffer.size</varname>,
                      where hbase.dfs.client.read.shortcircuit.buffer.size
                      is set to 128k in HBase -- see
                      <filename>hbase-default.xml</filename> default
                      configurations).</para>
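
As an illustration, a sketch of what that might look like in
conf/hbase-env.sh (the 7g ceiling below is an assumed example value, not a
recommendation -- size it to your own cache and nodes):

  # conf/hbase-env.sh -- example values only
  # Offheap ceiling for the region server JVM; must exceed
  # hbase.bucketcache.size plus whatever the DFSClient needs for
  # short-circuit read buffers.
  export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -XX:MaxDirectMemorySize=7g"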



- The relation between XX:MaxDirectMemorySize and hbase.bucketcache.size.
Are they equal?

XX:MaxDirectMemorySize should be larger than hbase.bucketcache.size.  They
should not be equal.  See note above for why.
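
As a rough worked example (the hfile count is assumed, just for
illustration): with hbase.bucketcache.size=6144 (6G) and, say, 1000 open
hfiles at the 128k short-circuit buffer size, the DFSClient needs about
1000 * 128k = ~125M of direct memory, so -XX:MaxDirectMemorySize=7g would
leave comfortable headroom above the 6G cache.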



- How to adjust hbase.bucketcache.percentage.in.combinedcache?


Or just leave it as is.  To adjust, set it to something other than the
default, which is 0.9 (i.e., 0.9 of hbase.bucketcache.size).  This
configuration has been removed from trunk because it is confusing.
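
If you do want to set it explicitly, it is just a property in
hbase-site.xml (0.9 shown here is the default, so this sketch changes
nothing):

   <property>
     <name>hbase.bucketcache.percentage.in.combinedcache</name>
     <value>0.9</value>
   </property>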



Right now, we have the following configuration. Does it make sense?

- Java heap size of each HBase region server: 12 GB
- -XX:MaxDirectMemorySize: 6 GB

Why not set it to 48G since you have the RAM?



- hbase-site.xml :
   <property>
     <name>hbase.offheapcache.percentage</name>
     <value>0</value>
   </property>

This setting is not needed.  0 is the default.


   <property>
     <name>hbase.bucketcache.ioengine</name>
     <value>offheap</value>
   </property>
   <property>
     <name>hbase.bucketcache.percentage.in.combinedcache</name>
     <value>0.8</value>
   </property>

Or you could just undo this setting and go with the default, which is 0.9.


   <property>
     <name>hbase.bucketcache.size</name>
     <value>6144</value>
   </property>


Adjust this to be 40000? (smile).
Let us know how it goes.
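
Putting the suggestions above together, a sketch of what the revised
settings might look like on a 64G node (values illustrative, not tuned):

  # conf/hbase-env.sh
  export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -XX:MaxDirectMemorySize=48g"

   <!-- hbase-site.xml: ~39G offheap bucket cache, under the 48G ceiling -->
   <property>
     <name>hbase.bucketcache.ioengine</name>
     <value>offheap</value>
   </property>
   <property>
     <name>hbase.bucketcache.size</name>
     <value>40000</value>
   </property>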

What version of HBase are you running?  Thanks.

St.Ack



Thanks,
Jane


On 7/17/2014 3:05 PM, Ted Yu wrote:

Have you considered using BucketCache ?

Please read 9.6.4.1 under
http://hbase.apache.org/book.html#regionserver.arch

Note: remember to verify the config values against the HBase release
you're using.

Cheers


On Thu, Jul 17, 2014 at 2:53 PM, Jane Tao <[email protected]> wrote:

Hi Ted,

In my case, there is a 6-node HBase cluster setup (running on Oracle BDA).
Each node has plenty of RAM (64GB) and CPU cores. Several articles seem to
suggest that it is not a good idea to allocate too much RAM to the region
server's heap setting.

If each region server has a 10GB heap and there is only one region server
per node, then I have 10x6=60GB for the whole HBase cluster. This setting
is good for ~100M rows but starts to incur lots of GC activity on the
region servers when loading billions of rows.

Basically, I need a configuration that can fully utilize the free RAM on
each node for HBase.

Thanks,
Jane
On 7/16/2014 4:17 PM, Ted Yu wrote:

Jane:
Can you briefly describe the use case where multiple region servers are
needed on the same host?

Cheers



On Wed, Jul 16, 2014 at 3:14 PM, Dhaval Shah <[email protected]>
wrote:

It's certainly possible (at least from the command line) but probably very
messy. You will need different ports, different log files, different pid
files, and possibly even different configs on the same machine.
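
For what it's worth, recent HBase tarballs ship a helper script for
exactly this (check that your version/distribution includes it; the
offsets below are just an example):

  # from the HBase install dir: start two extra region servers, each
  # using its own port/pid/log-file offset (here offsets 2 and 3)
  bin/local-regionservers.sh start 2 3
  # stop them again
  bin/local-regionservers.sh stop 2 3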


Regards,
Dhaval


________________________________
From: Jane Tao <[email protected]>
To: [email protected]
Sent: Wednesday, 16 July 2014 6:06 PM
Subject: multiple region servers at one machine


Hi there,

Is it possible to run multiple region servers on one machine/node? If
this is possible, how do I start multiple region servers from the command
line or Cloudera Manager?

Thanks,
Jane

