I noticed the Phoenix config parameters. Are the Phoenix jars in place? Can you capture a jstack of the master when this happens?
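For reference, a minimal sketch of both checks. The paths (/opt/hbase, /var/run/hbase) are taken from the hbase-env.sh quoted below; the jar name pattern and pid file name are assumptions, so adjust them to your layout:

```shell
# Sketch: verify a Phoenix server jar (required by the IndexLoadBalancer /
# IndexMasterObserver classes in hbase-site.xml) is on the master's
# classpath, then capture the master's stacks.
# The phoenix-*-server.jar pattern is an assumption for a Phoenix 4.x install.

phoenix_jar_present() {
  # Succeeds if a Phoenix server jar exists under the given lib directory.
  ls "$1"/phoenix-*-server.jar >/dev/null 2>&1
}

if phoenix_jar_present /opt/hbase/lib; then
  echo "Phoenix server jar found"
else
  echo "Phoenix server jar missing from /opt/hbase/lib"
fi

# While the master is spinning at 100% CPU, dump its stacks. The pid file
# path follows HBASE_PID_DIR=/var/run/hbase from the quoted hbase-env.sh;
# run jstack as the same user (hdfs) that owns the process:
# PID=$(cat /var/run/hbase/hbase-hdfs-master.pid)
# sudo -u hdfs jstack -l "$PID" > /tmp/hmaster.jstack
# If the JVM does not respond, add -F to force the dump.
```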
Cheers

> On Dec 16, 2015, at 7:46 PM, F21 <[email protected]> wrote:
>
> Background:
>
> I am prototyping an HBase cluster using Docker. Docker is 1.9.1 and is running
> in a Ubuntu 15.10 64-bit VM with access to 6 GB of RAM.
>
> Within Docker, I am running 1 ZooKeeper node and HDFS (2.7.1) in HA mode: 1 namenode,
> 1 standby namenode, 3 journal nodes, 2 ZooKeeper failover controllers
> (colocated with the namenodes) and 3 datanodes.
>
> For HBase, I am running 1.1.2 and have 2 masters and 2 region servers
> set up to use the HDFS cluster.
>
> All of the above are running Oracle Java 8.
>
> I am launching all my Docker containers using docker-compose. However, I have
> startup scripts in place to check that the HDFS cluster is up and safemode is
> off before launching the HBase servers.
>
> Problem:
> When launching the region servers and masters, they do not launch reliably.
> Often there will be one or more region servers and masters that do not
> launch properly. In those cases, the failed process uses 100% of the
> CPU core it is launched on and very little memory (about 20 MB). The
> process hangs and we need to terminate it forcefully.
>
> In the log files, we see that hbase-hdfs-master-hmaster2.log is empty and
> hbase-hdfs-master-hmaster2.out contains some information, but not much:
>
> Thu Dec 17 02:37:26 UTC 2015 Starting master on hmaster2
> core file size          (blocks, -c) unlimited
> data seg size           (kbytes, -d) unlimited
> scheduling priority             (-e) 0
> file size               (blocks, -f) unlimited
> pending signals                 (-i) 23668
> max locked memory       (kbytes, -l) 64
> max memory size         (kbytes, -m) unlimited
> open files                      (-n) 1048576
> pipe size            (512 bytes, -p) 8
> POSIX message queues     (bytes, -q) 819200
> real-time priority              (-r) 0
> stack size              (kbytes, -s) 8192
> cpu time               (seconds, -t) unlimited
> max user processes              (-u) 1048576
> virtual memory          (kbytes, -v) unlimited
> file locks                      (-x) unlimited
>
> This is the command we are using to launch the HBase process:
>
> sudo -u hdfs /opt/hbase/bin/hbase-daemon.sh --config /opt/hbase/conf start master
>
> The hbase-site.xml looks like this:
>
> <configuration>
>   <property>
>     <name>hbase.rootdir</name>
>     <value>hdfs://mycluster/hbase</value>
>   </property>
>   <property>
>     <name>zookeeper.znode.parent</name>
>     <value>/hbase</value>
>   </property>
>   <property>
>     <name>hbase.cluster.distributed</name>
>     <value>true</value>
>   </property>
>   <property>
>     <name>hbase.zookeeper.quorum</name>
>     <value>zk1</value>
>   </property>
>   <property>
>     <name>hbase.master.loadbalancer.class</name>
>     <value>org.apache.phoenix.hbase.index.balancer.IndexLoadBalancer</value>
>   </property>
>   <property>
>     <name>hbase.coprocessor.master.classes</name>
>     <value>org.apache.phoenix.hbase.index.master.IndexMasterObserver</value>
>   </property>
> </configuration>
>
> The hdfs-site.xml looks like this:
>
> <configuration>
>   <property>
>     <name>dfs.nameservices</name>
>     <value>mycluster</value>
>   </property>
>   <property>
>     <name>dfs.ha.namenodes.mycluster</name>
>     <value>nn1,nn2</value>
>   </property>
>   <property>
>     <name>dfs.namenode.rpc-address.mycluster.nn1</name>
>     <value>nn1:8020</value>
>   </property>
>   <property>
>     <name>dfs.namenode.rpc-address.mycluster.nn2</name>
>     <value>nn2:8020</value>
>   </property>
>   <property>
>     <name>dfs.namenode.http-address.mycluster.nn1</name>
>     <value>nn1:50070</value>
>   </property>
>   <property>
>     <name>dfs.namenode.http-address.mycluster.nn2</name>
>     <value>nn2:50070</value>
>   </property>
>   <property>
>     <name>dfs.client.failover.proxy.provider.mycluster</name>
>     <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
>   </property>
> </configuration>
>
> The core-site.xml looks like this:
>
> <configuration>
>   <property>
>     <name>fs.defaultFS</name>
>     <value>hdfs://mycluster</value>
>   </property>
> </configuration>
>
> And hbase-env.sh looks like this:
>
> # Set environment variables here.
>
> # This script sets variables multiple times over the course of starting an hbase process,
> # so try to keep things idempotent unless you want to take an even deeper look
> # into the startup scripts (bin/hbase, etc.)
>
> # The java implementation to use. Java 1.7+ required.
> # export JAVA_HOME=/usr/java/jdk1.6.0/
>
> # Extra Java CLASSPATH elements. Optional.
> # export HBASE_CLASSPATH=
>
> # The maximum amount of heap to use. Default is left to JVM default.
> # export HBASE_HEAPSIZE=1G
>
> # Uncomment below if you intend to use off heap cache. For example, to allocate 8G of
> # offheap, set the value to "8G".
> # export HBASE_OFFHEAPSIZE=1G
>
> # Extra Java runtime options.
> # Below are what we set by default. May only work with SUN JVM.
> # For more on why as well as other possible settings,
> # see http://wiki.apache.org/hadoop/PerformanceTuning
> export HBASE_OPTS="-XX:+UseConcMarkSweepGC"
>
> # Configure PermSize. Only needed in JDK7. You can safely remove it for JDK8+
> export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -XX:PermSize=128m -XX:MaxPermSize=128m"
> export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -XX:PermSize=128m -XX:MaxPermSize=128m"
>
> # Uncomment one of the below three options to enable java garbage collection logging for the server-side processes.
>
> # This enables basic gc logging to the .out file.
> # export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"
>
> # This enables basic gc logging to its own file.
> # If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
> # export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"
>
> # This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
> # If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
> # export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"
>
> # Uncomment one of the below three options to enable java garbage collection logging for the client processes.
>
> # This enables basic gc logging to the .out file.
> # export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"
>
> # This enables basic gc logging to its own file.
> # If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
> # export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"
>
> # This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
> # If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
> # export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"
>
> # See the package documentation for org.apache.hadoop.hbase.io.hfile for other configurations
> # needed setting up off-heap block caching.
>
> # Uncomment and adjust to enable JMX exporting
> # See jmxremote.password and jmxremote.access in $JRE_HOME/lib/management to configure remote password access.
> # More details at: http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html
> # NOTE: HBase provides an alternative JMX implementation to fix the random ports issue, please see JMX
> # section in HBase Reference Guide for instructions.
>
> # export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"
> # export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10101"
> # export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10102"
> # export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10103"
> # export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10104"
> # export HBASE_REST_OPTS="$HBASE_REST_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10105"
>
> # File naming hosts on which HRegionServers will run. $HBASE_HOME/conf/regionservers by default.
> # export HBASE_REGIONSERVERS=${HBASE_HOME}/conf/regionservers
>
> # Uncomment and adjust to keep all the Region Server pages mapped to be memory resident
> #HBASE_REGIONSERVER_MLOCK=true
> #HBASE_REGIONSERVER_UID="hbase"
>
> # File naming hosts on which backup HMaster will run. $HBASE_HOME/conf/backup-masters by default.
> # export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters
>
> # Extra ssh options. Empty by default.
> # export HBASE_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HBASE_CONF_DIR"
>
> # Where log files are stored. $HBASE_HOME/logs by default.
> # export HBASE_LOG_DIR=${HBASE_HOME}/logs
>
> # Enable remote JDWP debugging of major HBase processes. Meant for Core Developers
> # export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8070"
> # export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8071"
> # export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8072"
> # export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8073"
>
> # A string representing this instance of hbase. $USER by default.
> # export HBASE_IDENT_STRING=$USER
>
> # The scheduling priority for daemon processes. See 'man nice'.
> # export HBASE_NICENESS=10
>
> # The directory where pid files are stored. /tmp by default.
> # export HBASE_PID_DIR=/var/hadoop/pids
>
> # Seconds to sleep between slave commands. Unset by default. This
> # can be useful in large clusters, where, e.g., slave rsyncs can
> # otherwise arrive faster than the master can service them.
> # export HBASE_SLAVE_SLEEP=0.1
>
> # Tell HBase whether it should manage it's own instance of Zookeeper or not.
> # export HBASE_MANAGES_ZK=true
>
> # The default log rolling policy is RFA, where the log file is rolled as per the size defined for the
> # RFA appender. Please refer to the log4j.properties file to see more details on this appender.
> # In case one needs to do log rolling on a date change, one should set the environment property
> # HBASE_ROOT_LOGGER to "<DESIRED_LOG LEVEL>,DRFA".
> # For example:
> # HBASE_ROOT_LOGGER=INFO,DRFA
> # The reason for changing default to RFA is to avoid the boundary case of filling out disk space as
> # DRFA doesn't put any cap on the log size. Please refer to HBase-5655 for more context.
> export HBASE_LOG_DIR=/var/log/hbase
> export HBASE_PID_DIR=/var/run/hbase
> export JAVA_HOME=/usr/lib/jvm/java-8-oracle
>
> The server still has plenty of RAM available (1 GB).
>
> It's not clear what is causing this, as the logs are pretty sparse. Have any
> of you seen a problem like this before?
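On the startup ordering described in the quoted message (wait for HDFS to be up and out of safemode before starting HBase), here is a minimal sketch of such a gate. The hdfs://mycluster URI comes from the quoted core-site.xml; the function name and retry/sleep values are illustrative assumptions, not the poster's actual script:

```shell
# Illustrative pre-start gate: poll until HDFS answers and reports safemode
# OFF, then launch HBase. The function name and retry/sleep values are
# assumptions, not the poster's actual startup script.
wait_for_hdfs() {
  retries=$1
  i=0
  while [ "$i" -lt "$retries" ]; do
    # "dfsadmin -safemode get" prints "Safe mode is OFF" once the active
    # namenode is reachable and out of safemode.
    if hdfs dfsadmin -fs hdfs://mycluster -safemode get 2>/dev/null \
        | grep -q 'Safe mode is OFF'; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Usage in a container entrypoint (commented out here; the start command
# is the one from the original message):
# wait_for_hdfs 60 && sudo -u hdfs /opt/hbase/bin/hbase-daemon.sh \
#   --config /opt/hbase/conf start master
```

If the hang only happens when HBase starts before HDFS HA has settled, a gate like this should make the failure reproducible (or make it disappear), which narrows down where the master is spinning.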
