Type the following command:

which java

You should get a path, say <some-path>/bin/java

jstack is under <some-path>/bin/
Then issue the following command:

ps aux | grep aster

You should see the pid of the master process. Issue the following:

<some-path>/bin/jstack <pid>

Cheers

On Wed, Dec 16, 2015 at 7:55 PM, F21 <[email protected]> wrote:

> Yes, the phoenix jars are in place and everything (including phoenix)
> works properly if the hbase server launches correctly. I was seeing this
> problem before I added phoenix to the mix.
>
> Can you provide instructions on how to capture the jstack (I am not very
> familiar with java)?
>
> Cheers
>
>
> On 17/12/2015 2:53 PM, Ted Yu wrote:
>
>> I noticed Phoenix config parameters. Are the Phoenix jars in place?
>>
>> Can you capture a jstack of the master when this happens?
>>
>> Cheers
>>
>> On Dec 16, 2015, at 7:46 PM, F21 <[email protected]> wrote:
>>>
>>> Background:
>>>
>>> I am prototyping an HBase cluster using Docker. Docker is 1.9.1 and is
>>> running in an Ubuntu 15.10 64-bit VM with access to 6GB of RAM.
>>>
>>> Within Docker, I am running 1 Zookeeper node and HDFS (2.7.1) in HA
>>> mode (1 name node, 1 standby name node, 3 journal nodes, 2 zookeeper
>>> failover controllers (colocated with the namenodes) and 3 datanodes).
>>>
>>> In terms of HBase, I am running 1.1.2 and have 2 masters and 2 region
>>> servers set up to use the HDFS cluster.
>>>
>>> All of the above are running Oracle Java 8.
>>>
>>> I am launching all my Docker containers using docker-compose. However, I
>>> have startup scripts in place to check that the HDFS cluster is up and
>>> safemode is off before launching the HBase servers.
>>>
>>> Problem:
>>> When launching the regionservers and masters, they do not launch
>>> reliably. Often, one or more regionservers or masters do not launch
>>> properly. In those cases, the failed process uses 100% of the CPU core
>>> it is launched on and very little memory (about 20 MB). The process
>>> hangs and we need to forcefully terminate it.
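(The capture steps Ted describes at the top of the thread can be scripted roughly as follows. This is a sketch, not part of the original thread: the pid pattern, the output path, and the assumption that exactly one master process is running are all illustrative.)

```shell
# Sketch of the jstack capture steps above. Assumptions: jstack ships in
# the same bin/ directory as java, and one HMaster process is running.
JAVA_BIN="$(command -v java)"            # e.g. <some-path>/bin/java
JDK_BIN_DIR="$(dirname "$JAVA_BIN")"     # jstack lives alongside java
# '[a]ster' matches "master"/"HMaster" but not the grep process itself.
PID="$(ps aux | grep '[a]ster' | awk '{print $2}' | head -n 1)"
# Dump all thread stacks, running as the user that owns the process.
sudo -u hdfs "$JDK_BIN_DIR/jstack" "$PID" > /tmp/hmaster-jstack.txt
```

If the hung process does not respond to a plain dump, `jstack -F <pid>` forces one.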
>>>
>>> In the log files, we see that hbase-hdfs-master-hmaster2.log is empty
>>> and hbase-hdfs-master-hmaster2.out contains some information, but not much:
>>>
>>> Thu Dec 17 02:37:26 UTC 2015 Starting master on hmaster2
>>> core file size (blocks, -c) unlimited
>>> data seg size (kbytes, -d) unlimited
>>> scheduling priority (-e) 0
>>> file size (blocks, -f) unlimited
>>> pending signals (-i) 23668
>>> max locked memory (kbytes, -l) 64
>>> max memory size (kbytes, -m) unlimited
>>> open files (-n) 1048576
>>> pipe size (512 bytes, -p) 8
>>> POSIX message queues (bytes, -q) 819200
>>> real-time priority (-r) 0
>>> stack size (kbytes, -s) 8192
>>> cpu time (seconds, -t) unlimited
>>> max user processes (-u) 1048576
>>> virtual memory (kbytes, -v) unlimited
>>> file locks (-x) unlimited
>>>
>>> This is the command we are using to launch the hbase process:
>>>
>>> sudo -u hdfs /opt/hbase/bin/hbase-daemon.sh --config /opt/hbase/conf start master
>>>
>>> The hbase-site.xml looks like this:
>>>
>>> <configuration>
>>>   <property>
>>>     <name>hbase.rootdir</name>
>>>     <value>hdfs://mycluster/hbase</value>
>>>   </property>
>>>   <property>
>>>     <name>zookeeper.znode.parent</name>
>>>     <value>/hbase</value>
>>>   </property>
>>>   <property>
>>>     <name>hbase.cluster.distributed</name>
>>>     <value>true</value>
>>>   </property>
>>>   <property>
>>>     <name>hbase.zookeeper.quorum</name>
>>>     <value>zk1</value>
>>>   </property>
>>>   <property>
>>>     <name>hbase.master.loadbalancer.class</name>
>>>     <value>org.apache.phoenix.hbase.index.balancer.IndexLoadBalancer</value>
>>>   </property>
>>>   <property>
>>>     <name>hbase.coprocessor.master.classes</name>
>>>     <value>org.apache.phoenix.hbase.index.master.IndexMasterObserver</value>
>>>   </property>
>>> </configuration>
>>>
>>> The hdfs-site.xml looks like this:
>>>
>>> <configuration>
>>>   <property>
>>>     <name>dfs.nameservices</name>
>>>     <value>mycluster</value>
>>>   </property>
>>>   <property>
>>>     <name>dfs.ha.namenodes.mycluster</name>
>>>     <value>nn1,nn2</value>
>>>   </property>
>>>   <property>
>>>     <name>dfs.namenode.rpc-address.mycluster.nn1</name>
>>>     <value>nn1:8020</value>
>>>   </property>
>>>   <property>
>>>     <name>dfs.namenode.rpc-address.mycluster.nn2</name>
>>>     <value>nn2:8020</value>
>>>   </property>
>>>   <property>
>>>     <name>dfs.namenode.http-address.mycluster.nn1</name>
>>>     <value>nn1:50070</value>
>>>   </property>
>>>   <property>
>>>     <name>dfs.namenode.http-address.mycluster.nn2</name>
>>>     <value>nn2:50070</value>
>>>   </property>
>>>   <property>
>>>     <name>dfs.client.failover.proxy.provider.mycluster</name>
>>>     <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
>>>   </property>
>>> </configuration>
>>>
>>> The core-site.xml looks like this:
>>>
>>> <configuration>
>>>   <property>
>>>     <name>fs.defaultFS</name>
>>>     <value>hdfs://mycluster</value>
>>>   </property>
>>> </configuration>
>>>
>>> And hbase-env.sh looks like this:
>>>
>>> # Set environment variables here.
>>>
>>> # This script sets variables multiple times over the course of starting
>>> # an hbase process, so try to keep things idempotent unless you want to
>>> # take an even deeper look into the startup scripts (bin/hbase, etc.)
>>>
>>> # The java implementation to use. Java 1.7+ required.
>>> # export JAVA_HOME=/usr/java/jdk1.6.0/
>>>
>>> # Extra Java CLASSPATH elements. Optional.
>>> # export HBASE_CLASSPATH=
>>>
>>> # The maximum amount of heap to use. Default is left to JVM default.
>>> # export HBASE_HEAPSIZE=1G
>>>
>>> # Uncomment below if you intend to use off heap cache. For example, to
>>> # allocate 8G of offheap, set the value to "8G".
>>> # export HBASE_OFFHEAPSIZE=1G
>>>
>>> # Extra Java runtime options.
>>> # Below are what we set by default. May only work with SUN JVM.
>>> # For more on why as well as other possible settings,
>>> # see http://wiki.apache.org/hadoop/PerformanceTuning
>>> export HBASE_OPTS="-XX:+UseConcMarkSweepGC"
>>>
>>> # Configure PermSize. Only needed in JDK7. You can safely remove it for JDK8+
>>> export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -XX:PermSize=128m -XX:MaxPermSize=128m"
>>> export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -XX:PermSize=128m -XX:MaxPermSize=128m"
>>>
>>> # Uncomment one of the below three options to enable java garbage
>>> # collection logging for the server-side processes.
>>>
>>> # This enables basic gc logging to the .out file.
>>> # export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"
>>>
>>> # This enables basic gc logging to its own file.
>>> # If FILE-PATH is not replaced, the log file(.gc) would still be
>>> # generated in the HBASE_LOG_DIR.
>>> # export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"
>>>
>>> # This enables basic GC logging to its own file with automatic log
>>> # rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
>>> # If FILE-PATH is not replaced, the log file(.gc) would still be
>>> # generated in the HBASE_LOG_DIR.
>>> # export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"
>>>
>>> # Uncomment one of the below three options to enable java garbage
>>> # collection logging for the client processes.
>>>
>>> # This enables basic gc logging to the .out file.
>>> # export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"
>>>
>>> # This enables basic gc logging to its own file.
>>> # If FILE-PATH is not replaced, the log file(.gc) would still be
>>> # generated in the HBASE_LOG_DIR.
>>> # export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"
>>>
>>> # This enables basic GC logging to its own file with automatic log
>>> # rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
>>> # If FILE-PATH is not replaced, the log file(.gc) would still be
>>> # generated in the HBASE_LOG_DIR.
>>> # export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"
>>>
>>> # See the package documentation for org.apache.hadoop.hbase.io.hfile
>>> # for other configurations needed for setting up off-heap block caching.
>>>
>>> # Uncomment and adjust to enable JMX exporting
>>> # See jmxremote.password and jmxremote.access in $JRE_HOME/lib/management
>>> # to configure remote password access.
>>> # More details at:
>>> # http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html
>>> # NOTE: HBase provides an alternative JMX implementation to fix the
>>> # random ports issue; please see the JMX section in the HBase Reference
>>> # Guide for instructions.
>>>
>>> # export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"
>>> # export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10101"
>>> # export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10102"
>>> # export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10103"
>>> # export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10104"
>>> # export HBASE_REST_OPTS="$HBASE_REST_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10105"
>>>
>>> # File naming hosts on which HRegionServers will run.
>>> # $HBASE_HOME/conf/regionservers by default.
>>> # export HBASE_REGIONSERVERS=${HBASE_HOME}/conf/regionservers
>>>
>>> # Uncomment and adjust to keep all the Region Server pages mapped to be
>>> # memory resident
>>> #HBASE_REGIONSERVER_MLOCK=true
>>> #HBASE_REGIONSERVER_UID="hbase"
>>>
>>> # File naming hosts on which backup HMaster will run.
>>> # $HBASE_HOME/conf/backup-masters by default.
>>> # export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters
>>>
>>> # Extra ssh options. Empty by default.
>>> # export HBASE_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HBASE_CONF_DIR"
>>>
>>> # Where log files are stored. $HBASE_HOME/logs by default.
>>> # export HBASE_LOG_DIR=${HBASE_HOME}/logs
>>>
>>> # Enable remote JDWP debugging of major HBase processes. Meant for Core Developers
>>> # export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8070"
>>> # export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8071"
>>> # export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8072"
>>> # export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8073"
>>>
>>> # A string representing this instance of hbase. $USER by default.
>>> # export HBASE_IDENT_STRING=$USER
>>>
>>> # The scheduling priority for daemon processes. See 'man nice'.
>>> # export HBASE_NICENESS=10
>>>
>>> # The directory where pid files are stored. /tmp by default.
>>> # export HBASE_PID_DIR=/var/hadoop/pids
>>>
>>> # Seconds to sleep between slave commands. Unset by default. This
>>> # can be useful in large clusters, where, e.g., slave rsyncs can
>>> # otherwise arrive faster than the master can service them.
>>> # export HBASE_SLAVE_SLEEP=0.1
>>>
>>> # Tell HBase whether it should manage its own instance of Zookeeper or not.
>>> # export HBASE_MANAGES_ZK=true
>>>
>>> # The default log rolling policy is RFA, where the log file is rolled as
>>> # per the size defined for the RFA appender. Please refer to the
>>> # log4j.properties file to see more details on this appender.
>>> # In case one needs to do log rolling on a date change, one should set
>>> # the environment property HBASE_ROOT_LOGGER to "<DESIRED_LOG LEVEL>,DRFA".
>>> # For example:
>>> # HBASE_ROOT_LOGGER=INFO,DRFA
>>> # The reason for changing the default to RFA is to avoid the boundary
>>> # case of filling up disk space, as DRFA doesn't put any cap on the log
>>> # size. Please refer to HBASE-5655 for more context.
>>> export HBASE_LOG_DIR=/var/log/hbase
>>> export HBASE_PID_DIR=/var/run/hbase
>>> export JAVA_HOME=/usr/lib/jvm/java-8-oracle
>>>
>>> The server still has plenty of RAM available (1 GB).
>>>
>>> It's not clear what is causing this, as the logs are pretty sparse. Have
>>> any of you seen a problem like this before?
>>>
>>
>
