Another error after stopping the zkfc. Do I have to take the cluster down to format ZK?
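The ConnectionLoss in the output below means the client never established a ZooKeeper session (note the session id of 0x0): rhel6.local accepted the TCP connection and then immediately closed it, which usually points at a problem on that ZooKeeper server rather than at the formatZK step itself. Before retrying, it is worth confirming that each server in the connect string is up and serving. A minimal check, assuming nc is available on this host (hostnames and port are taken from the connect string in the log below):

for zk in rhel1.local rhel2.local rhel6.local; do
  # a healthy ZooKeeper server answers the "ruok" four-letter command with "imok"
  echo -n "$zk: "; echo ruok | nc -w 2 "$zk" 2181; echo
done

# "stat" additionally reports the server's mode (leader/follower/standalone) and client connections
echo stat | nc -w 2 rhel6.local 2181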
[root@rhel1 conf]# sudo service hadoop-hdfs-zkfc stop Stopping Hadoop zkfc: [ OK ] stopping zkfc [root@rhel1 conf]# sudo -u hdfs zkfc -formatZK sudo: zkfc: command not found [root@rhel1 conf]# hdfs zkfc -formatZK 2014-07-31 17:49:56,792 INFO [main] tools.DFSZKFailoverController (DFSZKFailoverController.java:<init>(140)) - Failover controller configured for NameNode NameNode at rhel1.local/10.120.5.203:8020 2014-07-31 17:49:57,002 INFO [main] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:zookeeper.version=3.4.3-cdh4.1.3--1, built on 01/27/2013 00:13 GMT 2014-07-31 17:49:57,003 INFO [main] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:host.name=rhel1.local 2014-07-31 17:49:57,003 INFO [main] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:java.version=1.7.0_60 2014-07-31 17:49:57,003 INFO [main] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:java.vendor=Oracle Corporation 2014-07-31 17:49:57,003 INFO [main] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:java.home=/usr/java/jdk1.7.0_60/jre 2014-07-31 17:49:57,003 INFO [main] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:java.class.path=/etc/hadoop/conf:/usr/lib/hadoop/lib/jersey-core-1.8.jar:/usr/lib/hadoop/lib/commons-cli-1.2.jar:/usr/lib/hadoop/lib/jaxb-impl-2.2.3-1.jar:/usr/lib/hadoop/lib/jackson-mapper-asl-1.8.8.jar:/usr/lib/hadoop/lib/jsch-0.1.42.jar:/usr/lib/hadoop/lib/asm-3.2.jar:/usr/lib/hadoop/lib/kfs-0.3.jar:/usr/lib/hadoop/lib/jsr305-1.3.9.jar:/usr/lib/hadoop/lib/commons-logging-1.1.1.jar:/usr/lib/hadoop/lib/jets3t-0.6.1.jar:/usr/lib/hadoop/lib/xmlenc-0.52.jar:/usr/lib/hadoop/lib/jetty-6.1.26.cloudera.2.jar:/usr/lib/hadoop/lib/jsp-api-2.1.jar:/usr/lib/hadoop/lib/commons-collections-3.2.1.jar:/usr/lib/hadoop/lib/jackson-core-asl-1.8.8.jar:/usr/lib/hadoop/lib/guava-11.0.2.jar:/usr/lib/hadoop/lib/zookeeper-3.4.3-cdh4.1.3.jar:/usr/lib/hadoop/lib/servlet-api-2.5.jar:/usr/lib/hadoop/lib/log4j-1.2.17.jar:/usr/lib/hadoop/lib/avro-1.7.1.cloudera.2.jar:/usr/lib/hadoop/lib/jersey-server-1.8.jar:/usr/lib/hadoop/lib/jasper-compiler-5.5.23.jar:/usr/lib/hadoop/lib/jackson-jaxrs-1.8.8.jar:/usr/lib/hadoop/lib/jline-0.9.94.jar:/usr/lib/hadoop/lib/commons-el-1.0.jar:/usr/lib/hadoop/lib/commons-io-2.1.jar:/usr/lib/hadoop/lib/commons-configuration-1.6.jar:/usr/lib/hadoop/lib/commons-net-3.1.jar:/usr/lib/hadoop/lib/jackson-xc-1.8.8.jar:/usr/lib/hadoop/lib/jasper-runtime-5.5.23.jar:/usr/lib/hadoop/lib/commons-digester-1.8.jar:/usr/lib/hadoop/lib/slf4j-api-1.6.1.jar:/usr/lib/hadoop/lib/commons-beanutils-core-1.8.0.jar:/usr/lib/hadoop/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop/lib/activation-1.1.jar:/usr/lib/hadoop/lib/jetty-util-6.1.26.cloudera.2.jar:/usr/lib/hadoop/lib/commons-beanutils-1.7.0.jar:/usr/lib/hadoop/lib/commons-lang-2.5.jar:/usr/lib/hadoop/lib/snappy-java-1.0.4.1.jar:/usr/lib/hadoop/lib/junit-4.8.2.jar:/usr/lib/hadoop/lib/stax-api-1.0.1.jar:/usr/lib/hadoop/lib/commons-math-2.1.jar:/usr/lib/hadoop/lib/mockito-all-1.8.5.jar:/usr/lib/hadoop/lib/jettison-1.1.jar:/usr/lib/hadoop/lib/jaxb-api-2.2.2.jar:/usr/lib/hadoop/lib/paranamer-2.3.jar:/usr/lib/hadoop/lib/protobuf-java-2.4.0a.jar:/usr/lib/hadoop/lib/jersey-json-1.8.jar:/usr/lib/hadoop/lib/slf4j-log4j12-1.6.1.jar:/usr/lib/hadoop/lib/commons-codec-1.4.jar:/usr/lib/hadoop/.//hadoop-annotations-2.0.0-cdh4.1.3.jar:/usr/lib/hadoop/.//hadoop-common.jar:/usr/lib/hadoop/.//hadoop-common-2.0.0-cdh4.1.3-tests.jar:/usr/lib/hadoop/.//hadoop-common-2.0.0-
cdh4.1.3.jar:/usr/lib/hadoop/.//hadoop-annotations.jar:/usr/lib/hadoop/.//hadoop-auth-2.0.0-cdh4.1.3.jar:/usr/lib/hadoop/.//hadoop-auth.jar:/usr/lib/hadoop-hdfs/./:/usr/lib/hadoop-hdfs/lib/jersey-core-1.8.jar:/usr/lib/hadoop-hdfs/lib/commons-cli-1.2.jar:/usr/lib/hadoop-hdfs/lib/jackson-mapper-asl-1.8.8.jar:/usr/lib/hadoop-hdfs/lib/asm-3.2.jar:/usr/lib/hadoop-hdfs/lib/jsr305-1.3.9.jar:/usr/lib/hadoop-hdfs/lib/commons-logging-1.1.1.jar:/usr/lib/hadoop-hdfs/lib/xmlenc-0.52.jar:/usr/lib/hadoop-hdfs/lib/jetty-6.1.26.cloudera.2.jar:/usr/lib/hadoop-hdfs/lib/jsp-api-2.1.jar:/usr/lib/hadoop-hdfs/lib/jackson-core-asl-1.8.8.jar:/usr/lib/hadoop-hdfs/lib/guava-11.0.2.jar:/usr/lib/hadoop-hdfs/lib/zookeeper-3.4.3-cdh4.1.3.jar:/usr/lib/hadoop-hdfs/lib/servlet-api-2.5.jar:/usr/lib/hadoop-hdfs/lib/log4j-1.2.17.jar:/usr/lib/hadoop-hdfs/lib/jersey-server-1.8.jar:/usr/lib/hadoop-hdfs/lib/jline-0.9.94.jar:/usr/lib/hadoop-hdfs/lib/commons-el-1.0.jar:/usr/lib/hadoop-hdfs/lib/commons-io-2.1.jar:/usr/lib/hadoop-hdfs/lib/commons-daemon-1.0.3.jar:/usr/lib/hadoop-hdfs/lib/jasper-runtime-5.5.23.jar:/usr/lib/hadoop-hdfs/lib/jetty-util-6.1.26.cloudera.2.jar:/usr/lib/hadoop-hdfs/lib/commons-lang-2.5.jar:/usr/lib/hadoop-hdfs/lib/protobuf-java-2.4.0a.jar:/usr/lib/hadoop-hdfs/lib/commons-codec-1.4.jar:/usr/lib/hadoop-hdfs/.//hadoop-hdfs-2.0.0-cdh4.1.3.jar:/usr/lib/hadoop-hdfs/.//hadoop-hdfs-2.0.0-cdh4.1.3-tests.jar:/usr/lib/hadoop-hdfs/.//hadoop-hdfs.jar:/usr/lib/hadoop-yarn/lib/jersey-core-1.8.jar:/usr/lib/hadoop-yarn/lib/jackson-mapper-asl-1.8.8.jar:/usr/lib/hadoop-yarn/lib/asm-3.2.jar:/usr/lib/hadoop-yarn/lib/netty-3.2.4.Final.jar:/usr/lib/hadoop-yarn/lib/jackson-core-asl-1.8.8.jar:/usr/lib/hadoop-yarn/lib/jersey-guice-1.8.jar:/usr/lib/hadoop-yarn/lib/log4j-1.2.17.jar:/usr/lib/hadoop-yarn/lib/avro-1.7.1.cloudera.2.jar:/usr/lib/hadoop-yarn/lib/jersey-server-1.8.jar:/usr/lib/hadoop-yarn/lib/guice-3.0.jar:/usr/lib/hadoop-yarn/lib/commons-io-2.1.jar:/usr/lib/hadoop-yarn/lib/aopalliance-1.0.jar:/usr/lib/hadoop-yarn/lib/javax.inject-1.jar:/usr/lib/hadoop-yarn/lib/snappy-java-1.0.4.1.jar:/usr/lib/hadoop-yarn/lib/guice-servlet-3.0.jar:/usr/lib/hadoop-yarn/lib/paranamer-2.3.jar:/usr/lib/hadoop-yarn/lib/protobuf-java-2.4.0a.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-applications-distributedshell.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-api-2.0.0-cdh4.1.3.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-site-2.0.0-cdh4.1.3.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-common-2.0.0-cdh4.1.3.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-api.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-site.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-nodemanager.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-tests.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-tests-2.0.0-cdh4.1.3.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-common.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-applications-distributedshell-2.0.0-cdh4.1.3.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-web-proxy-2.0.0-cdh4.1.3.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-tests-2.0.0-cdh4.1.3-tests.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-resourcemanager.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-web-proxy.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-resourcemanager-2.0.0-cdh4.1.3.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-common-2.0.0-cdh4.1.3.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-nodemanager-2.0.0-cdh4.1.3.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-common.jar:/usr/lib/hadoop-0.20-mapreduce/./:/usr/lib/hadoop-0.20-mapreduce/lib/jersey-core-1.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/ant-contrib-1
.0b3.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jaxb-impl-2.2.3-1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jackson-mapper-asl-1.8.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jsch-0.1.42.jar:/usr/lib/hadoop-0.20-mapreduce/lib/asm-3.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/kfs-0.3.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jsr305-1.3.9.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-logging-1.1.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jetty-6.1.26.cloudera.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jsp-api-2.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-collections-3.2.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jackson-core-asl-1.8.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/guava-11.0.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/servlet-api-2.5.jar:/usr/lib/hadoop-0.20-mapreduce/lib/log4j-1.2.17.jar:/usr/lib/hadoop-0.20-mapreduce/lib/avro-1.7.1.cloudera.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jersey-server-1.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jasper-compiler-5.5.23.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jackson-jaxrs-1.8.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-io-2.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-configuration-1.6.jar:/usr/lib/hadoop-0.20-mapreduce/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-net-3.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jackson-xc-1.8.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jasper-runtime-5.5.23.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-digester-1.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/slf4j-api-1.6.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-beanutils-core-1.8.0.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/activation-1.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jetty-util-6.1.26.cloudera.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-beanutils-1.7.0.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-lang-2.5.jar:/usr/lib/hadoop-0.20-mapreduce/lib/avro-compiler-1.7.1.cloudera.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/snappy-java-1.0.4.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/junit-4.8.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/stax-api-1.0.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-math-2.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/mockito-all-1.8.5.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jettison-1.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jaxb-api-2.2.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20-mapreduce/lib/paranamer-2.3.jar:/usr/lib/hadoop-0.20-mapreduce/lib/protobuf-java-2.4.0a.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jersey-json-1.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-codec-1.4.jar:/usr/lib/hadoop-0.20-mapreduce/lib/hadoop-fairscheduler-2.0.0-mr1-cdh4.1.3.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-examples-2.0.0-mr1-cdh4.1.3.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-2.0.0-mr1-cdh4.1.3-ant.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-ant-2.0.0-mr1-cdh4.1.3.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-ant.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-2.0.0-mr1-cdh4.1.3-core.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-test.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-tools.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-core-2.0.0-mr1-cdh4.1.3.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-2.0.0-mr1-cdh4.1.3-examples
.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-test-2.0.0-mr1-cdh4.1.3.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-tools-2.0.0-mr1-cdh4.1.3.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-examples.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-2.0.0-mr1-cdh4.1.3-tools.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-2.0.0-mr1-cdh4.1.3-test.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-core.jar 2014-07-31 17:49:57,004 INFO [main] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:java.library.path=//usr/lib/hadoop/lib/native 2014-07-31 17:49:57,004 INFO [main] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:java.io.tmpdir=/tmp 2014-07-31 17:49:57,004 INFO [main] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:java.compiler=<NA> 2014-07-31 17:49:57,004 INFO [main] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:os.name=Linux 2014-07-31 17:49:57,005 INFO [main] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:os.arch=amd64 2014-07-31 17:49:57,005 INFO [main] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:os.version=2.6.32-358.el6.x86_64 2014-07-31 17:49:57,005 INFO [main] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:user.name=root 2014-07-31 17:49:57,005 INFO [main] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:user.home=/root 2014-07-31 17:49:57,005 INFO [main] zookeeper.ZooKeeper (Environment.java:logEnv(100)) - Client environment:user.dir=/etc/hbase/conf.golden_apple 2014-07-31 17:49:57,015 INFO [main] zookeeper.ZooKeeper (ZooKeeper.java:<init>(433)) - Initiating client connection, connectString=rhel1.local:2181,rhel6.local:2181,rhel2.local:2181 sessionTimeout=5000 watcher=null 2014-07-31 17:49:57,040 INFO [main-SendThread(rhel6.local:2181)] zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(958)) - Opening socket connection to server rhel6.local/10.120.5.247:2181. 
Will not attempt to authenticate using SASL (unknown error)
2014-07-31 17:49:57,047 INFO [main-SendThread(rhel6.local:2181)] zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(850)) - Socket connection established to rhel6.local/10.120.5.247:2181, initiating session
2014-07-31 17:49:57,050 INFO [main-SendThread(rhel6.local:2181)] zookeeper.ClientCnxn (ClientCnxn.java:run(1065)) - Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
2014-07-31 17:49:57,989 INFO [main] zookeeper.ZooKeeper (ZooKeeper.java:close(679)) - Session: 0x0 closed
2014-07-31 17:49:57,989 INFO [main-EventThread] zookeeper.ClientCnxn (ClientCnxn.java:run(511)) - EventThread shut down
Exception in thread "main" java.io.IOException: Couldn't determine existence of znode '/hadoop-ha/golden-apple'
        at org.apache.hadoop.ha.ActiveStandbyElector.parentZNodeExists(ActiveStandbyElector.java:263)
        at org.apache.hadoop.ha.ZKFailoverController.formatZK(ZKFailoverController.java:258)
        at org.apache.hadoop.ha.ZKFailoverController.doRun(ZKFailoverController.java:197)
        at org.apache.hadoop.ha.ZKFailoverController.access$000(ZKFailoverController.java:58)
        at org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:165)
        at org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:161)
        at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:452)
        at org.apache.hadoop.ha.ZKFailoverController.run(ZKFailoverController.java:161)
        at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:175)
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hadoop-ha/golden-apple
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021)
        at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1049)
        at org.apache.hadoop.ha.ActiveStandbyElector.parentZNodeExists(ActiveStandbyElector.java:261)
        ... 8 more

On Thu, Jul 31, 2014 at 5:56 PM, Colin Kincaid Williams <[email protected]> wrote:
> On 3) Run hdfs zkfc -formatZK in my test environment, I get a Warning then an error
>
> WARNING: Before proceeding, ensure that all HDFS services and failover controllers are stopped!
> > > the complete output: > > sudo hdfs zkfc -formatZK > 2014-07-31 17:43:07,952 INFO [main] tools.DFSZKFailoverController > (DFSZKFailoverController.java:<init>(140)) - Failover controller configured > for NameNode NameNode at rhel1.local/10.120.5.203:8020 > 2014-07-31 17:43:08,128 INFO [main] zookeeper.ZooKeeper > (Environment.java:logEnv(100)) - Client > environment:zookeeper.version=3.4.3-cdh4.1.3--1, built on 01/27/2013 00:13 > GMT > 2014-07-31 17:43:08,129 INFO [main] zookeeper.ZooKeeper > (Environment.java:logEnv(100)) - Client environment:host.name=rhel1.local > 2014-07-31 17:43:08,129 INFO [main] zookeeper.ZooKeeper > (Environment.java:logEnv(100)) - Client environment:java.version=1.7.0_60 > 2014-07-31 17:43:08,129 INFO [main] zookeeper.ZooKeeper > (Environment.java:logEnv(100)) - Client environment:java.vendor=Oracle > Corporation > 2014-07-31 17:43:08,129 INFO [main] zookeeper.ZooKeeper > (Environment.java:logEnv(100)) - Client > environment:java.home=/usr/java/jdk1.7.0_60/jre > 2014-07-31 17:43:08,129 INFO [main] zookeeper.ZooKeeper > (Environment.java:logEnv(100)) - Client > environment:java.class.path=/etc/hadoop/conf:/usr/lib/hadoop/lib/jersey-core-1.8.jar:/usr/lib/hadoop/lib/commons-cli-1.2.jar:/usr/lib/hadoop/lib/jaxb-impl-2.2.3-1.jar:/usr/lib/hadoop/lib/jackson-mapper-asl-1.8.8.jar:/usr/lib/hadoop/lib/jsch-0.1.42.jar:/usr/lib/hadoop/lib/asm-3.2.jar:/usr/lib/hadoop/lib/kfs-0.3.jar:/usr/lib/hadoop/lib/jsr305-1.3.9.jar:/usr/lib/hadoop/lib/commons-logging-1.1.1.jar:/usr/lib/hadoop/lib/jets3t-0.6.1.jar:/usr/lib/hadoop/lib/xmlenc-0.52.jar:/usr/lib/hadoop/lib/jetty-6.1.26.cloudera.2.jar:/usr/lib/hadoop/lib/jsp-api-2.1.jar:/usr/lib/hadoop/lib/commons-collections-3.2.1.jar:/usr/lib/hadoop/lib/jackson-core-asl-1.8.8.jar:/usr/lib/hadoop/lib/guava-11.0.2.jar:/usr/lib/hadoop/lib/zookeeper-3.4.3-cdh4.1.3.jar:/usr/lib/hadoop/lib/servlet-api-2.5.jar:/usr/lib/hadoop/lib/log4j-1.2.17.jar:/usr/lib/hadoop/lib/avro-1.7.1.cloudera.2.jar:/usr/lib/hadoop/lib/jersey-server-1.8.jar:/usr/lib/hadoop/lib/jasper-compiler-5.5.23.jar:/usr/lib/hadoop/lib/jackson-jaxrs-1.8.8.jar:/usr/lib/hadoop/lib/jline-0.9.94.jar:/usr/lib/hadoop/lib/commons-el-1.0.jar:/usr/lib/hadoop/lib/commons-io-2.1.jar:/usr/lib/hadoop/lib/commons-configuration-1.6.jar:/usr/lib/hadoop/lib/commons-net-3.1.jar:/usr/lib/hadoop/lib/jackson-xc-1.8.8.jar:/usr/lib/hadoop/lib/jasper-runtime-5.5.23.jar:/usr/lib/hadoop/lib/commons-digester-1.8.jar:/usr/lib/hadoop/lib/slf4j-api-1.6.1.jar:/usr/lib/hadoop/lib/commons-beanutils-core-1.8.0.jar:/usr/lib/hadoop/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop/lib/activation-1.1.jar:/usr/lib/hadoop/lib/jetty-util-6.1.26.cloudera.2.jar:/usr/lib/hadoop/lib/commons-beanutils-1.7.0.jar:/usr/lib/hadoop/lib/commons-lang-2.5.jar:/usr/lib/hadoop/lib/snappy-java-1.0.4.1.jar:/usr/lib/hadoop/lib/junit-4.8.2.jar:/usr/lib/hadoop/lib/stax-api-1.0.1.jar:/usr/lib/hadoop/lib/commons-math-2.1.jar:/usr/lib/hadoop/lib/mockito-all-1.8.5.jar:/usr/lib/hadoop/lib/jettison-1.1.jar:/usr/lib/hadoop/lib/jaxb-api-2.2.2.jar:/usr/lib/hadoop/lib/paranamer-2.3.jar:/usr/lib/hadoop/lib/protobuf-java-2.4.0a.jar:/usr/lib/hadoop/lib/jersey-json-1.8.jar:/usr/lib/hadoop/lib/slf4j-log4j12-1.6.1.jar:/usr/lib/hadoop/lib/commons-codec-1.4.jar:/usr/lib/hadoop/.//hadoop-annotations-2.0.0-cdh4.1.3.jar:/usr/lib/hadoop/.//hadoop-common.jar:/usr/lib/hadoop/.//hadoop-common-2.0.0-cdh4.1.3-tests.jar:/usr/lib/hadoop/.//hadoop-common-2.0.0-cdh4.1.3.jar:/usr/lib/hadoop/.//hadoop-annotations.jar:/usr/lib/hadoop/.//hadoop-auth-2.0.0-cdh4.1.3.jar:/usr/lib/had
oop/.//hadoop-auth.jar:/usr/lib/hadoop-hdfs/./:/usr/lib/hadoop-hdfs/lib/jersey-core-1.8.jar:/usr/lib/hadoop-hdfs/lib/commons-cli-1.2.jar:/usr/lib/hadoop-hdfs/lib/jackson-mapper-asl-1.8.8.jar:/usr/lib/hadoop-hdfs/lib/asm-3.2.jar:/usr/lib/hadoop-hdfs/lib/jsr305-1.3.9.jar:/usr/lib/hadoop-hdfs/lib/commons-logging-1.1.1.jar:/usr/lib/hadoop-hdfs/lib/xmlenc-0.52.jar:/usr/lib/hadoop-hdfs/lib/jetty-6.1.26.cloudera.2.jar:/usr/lib/hadoop-hdfs/lib/jsp-api-2.1.jar:/usr/lib/hadoop-hdfs/lib/jackson-core-asl-1.8.8.jar:/usr/lib/hadoop-hdfs/lib/guava-11.0.2.jar:/usr/lib/hadoop-hdfs/lib/zookeeper-3.4.3-cdh4.1.3.jar:/usr/lib/hadoop-hdfs/lib/servlet-api-2.5.jar:/usr/lib/hadoop-hdfs/lib/log4j-1.2.17.jar:/usr/lib/hadoop-hdfs/lib/jersey-server-1.8.jar:/usr/lib/hadoop-hdfs/lib/jline-0.9.94.jar:/usr/lib/hadoop-hdfs/lib/commons-el-1.0.jar:/usr/lib/hadoop-hdfs/lib/commons-io-2.1.jar:/usr/lib/hadoop-hdfs/lib/commons-daemon-1.0.3.jar:/usr/lib/hadoop-hdfs/lib/jasper-runtime-5.5.23.jar:/usr/lib/hadoop-hdfs/lib/jetty-util-6.1.26.cloudera.2.jar:/usr/lib/hadoop-hdfs/lib/commons-lang-2.5.jar:/usr/lib/hadoop-hdfs/lib/protobuf-java-2.4.0a.jar:/usr/lib/hadoop-hdfs/lib/commons-codec-1.4.jar:/usr/lib/hadoop-hdfs/.//hadoop-hdfs-2.0.0-cdh4.1.3.jar:/usr/lib/hadoop-hdfs/.//hadoop-hdfs-2.0.0-cdh4.1.3-tests.jar:/usr/lib/hadoop-hdfs/.//hadoop-hdfs.jar:/usr/lib/hadoop-yarn/lib/jersey-core-1.8.jar:/usr/lib/hadoop-yarn/lib/jackson-mapper-asl-1.8.8.jar:/usr/lib/hadoop-yarn/lib/asm-3.2.jar:/usr/lib/hadoop-yarn/lib/netty-3.2.4.Final.jar:/usr/lib/hadoop-yarn/lib/jackson-core-asl-1.8.8.jar:/usr/lib/hadoop-yarn/lib/jersey-guice-1.8.jar:/usr/lib/hadoop-yarn/lib/log4j-1.2.17.jar:/usr/lib/hadoop-yarn/lib/avro-1.7.1.cloudera.2.jar:/usr/lib/hadoop-yarn/lib/jersey-server-1.8.jar:/usr/lib/hadoop-yarn/lib/guice-3.0.jar:/usr/lib/hadoop-yarn/lib/commons-io-2.1.jar:/usr/lib/hadoop-yarn/lib/aopalliance-1.0.jar:/usr/lib/hadoop-yarn/lib/javax.inject-1.jar:/usr/lib/hadoop-yarn/lib/snappy-java-1.0.4.1.jar:/usr/lib/hadoop-yarn/lib/guice-servlet-3.0.jar:/usr/lib/hadoop-yarn/lib/paranamer-2.3.jar:/usr/lib/hadoop-yarn/lib/protobuf-java-2.4.0a.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-applications-distributedshell.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-api-2.0.0-cdh4.1.3.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-site-2.0.0-cdh4.1.3.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-common-2.0.0-cdh4.1.3.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-api.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-site.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-nodemanager.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-tests.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-tests-2.0.0-cdh4.1.3.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-common.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-applications-distributedshell-2.0.0-cdh4.1.3.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-web-proxy-2.0.0-cdh4.1.3.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-tests-2.0.0-cdh4.1.3-tests.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-resourcemanager.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-web-proxy.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-resourcemanager-2.0.0-cdh4.1.3.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-common-2.0.0-cdh4.1.3.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-server-nodemanager-2.0.0-cdh4.1.3.jar:/usr/lib/hadoop-yarn/.//hadoop-yarn-common.jar:/usr/lib/hadoop-0.20-mapreduce/./:/usr/lib/hadoop-0.20-mapreduce/lib/jersey-core-1.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/aspectjrt-1.6.5.ja
r:/usr/lib/hadoop-0.20-mapreduce/lib/jaxb-impl-2.2.3-1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jackson-mapper-asl-1.8.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jsch-0.1.42.jar:/usr/lib/hadoop-0.20-mapreduce/lib/asm-3.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/kfs-0.3.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jsr305-1.3.9.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-logging-1.1.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jetty-6.1.26.cloudera.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jsp-api-2.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-collections-3.2.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jackson-core-asl-1.8.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/guava-11.0.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/servlet-api-2.5.jar:/usr/lib/hadoop-0.20-mapreduce/lib/log4j-1.2.17.jar:/usr/lib/hadoop-0.20-mapreduce/lib/avro-1.7.1.cloudera.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jersey-server-1.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jasper-compiler-5.5.23.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jackson-jaxrs-1.8.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-io-2.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-configuration-1.6.jar:/usr/lib/hadoop-0.20-mapreduce/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-net-3.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jackson-xc-1.8.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jasper-runtime-5.5.23.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-digester-1.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/slf4j-api-1.6.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-beanutils-core-1.8.0.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/activation-1.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jetty-util-6.1.26.cloudera.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-beanutils-1.7.0.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-lang-2.5.jar:/usr/lib/hadoop-0.20-mapreduce/lib/avro-compiler-1.7.1.cloudera.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/snappy-java-1.0.4.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/junit-4.8.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/stax-api-1.0.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-math-2.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/mockito-all-1.8.5.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jettison-1.1.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jaxb-api-2.2.2.jar:/usr/lib/hadoop-0.20-mapreduce/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20-mapreduce/lib/paranamer-2.3.jar:/usr/lib/hadoop-0.20-mapreduce/lib/protobuf-java-2.4.0a.jar:/usr/lib/hadoop-0.20-mapreduce/lib/jersey-json-1.8.jar:/usr/lib/hadoop-0.20-mapreduce/lib/commons-codec-1.4.jar:/usr/lib/hadoop-0.20-mapreduce/lib/hadoop-fairscheduler-2.0.0-mr1-cdh4.1.3.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-examples-2.0.0-mr1-cdh4.1.3.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-2.0.0-mr1-cdh4.1.3-ant.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-ant-2.0.0-mr1-cdh4.1.3.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-ant.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-2.0.0-mr1-cdh4.1.3-core.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-test.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-tools.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-core-2.0.0-mr1-cdh4.1.3.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-2.0.0-mr1-cdh4.1.3-examples.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-test-2.0.0-mr1-cdh4.1.3.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-to
ols-2.0.0-mr1-cdh4.1.3.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-examples.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-2.0.0-mr1-cdh4.1.3-tools.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-2.0.0-mr1-cdh4.1.3-test.jar:/usr/lib/hadoop-0.20-mapreduce/.//hadoop-core.jar > 2014-07-31 17:43:08,130 INFO [main] zookeeper.ZooKeeper > (Environment.java:logEnv(100)) - Client > environment:java.library.path=//usr/lib/hadoop/lib/native > 2014-07-31 17:43:08,138 INFO [main] zookeeper.ZooKeeper > (Environment.java:logEnv(100)) - Client environment:java.io.tmpdir=/tmp > 2014-07-31 17:43:08,138 INFO [main] zookeeper.ZooKeeper > (Environment.java:logEnv(100)) - Client environment:java.compiler=<NA> > 2014-07-31 17:43:08,138 INFO [main] zookeeper.ZooKeeper > (Environment.java:logEnv(100)) - Client environment:os.name=Linux > 2014-07-31 17:43:08,138 INFO [main] zookeeper.ZooKeeper > (Environment.java:logEnv(100)) - Client environment:os.arch=amd64 > 2014-07-31 17:43:08,138 INFO [main] zookeeper.ZooKeeper > (Environment.java:logEnv(100)) - Client > environment:os.version=2.6.32-358.el6.x86_64 > 2014-07-31 17:43:08,138 INFO [main] zookeeper.ZooKeeper > (Environment.java:logEnv(100)) - Client environment:user.name=root > 2014-07-31 17:43:08,138 INFO [main] zookeeper.ZooKeeper > (Environment.java:logEnv(100)) - Client environment:user.home=/root > 2014-07-31 17:43:08,139 INFO [main] zookeeper.ZooKeeper > (Environment.java:logEnv(100)) - Client > environment:user.dir=/etc/hbase/conf.golden_apple > 2014-07-31 17:43:08,149 INFO [main] zookeeper.ZooKeeper > (ZooKeeper.java:<init>(433)) - Initiating client connection, > connectString=rhel1.local:2181,rhel6.local:2181,rhel2.local:2181 > sessionTimeout=5000 watcher=null > 2014-07-31 17:43:08,170 INFO [main-SendThread(rhel2.local:2181)] > zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(958)) - Opening > socket connection to server rhel2.local/10.120.5.25:2181. Will not > attempt to authenticate using SASL (unknown error) > 2014-07-31 17:43:08,184 INFO [main-SendThread(rhel2.local:2181)] > zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(850)) - Socket > connection established to rhel2.local/10.120.5.25:2181, initiating session > 2014-07-31 17:43:08,262 INFO [main-SendThread(rhel2.local:2181)] > zookeeper.ClientCnxn (ClientCnxn.java:onConnected(1187)) - Session > establishment complete on server rhel2.local/10.120.5.25:2181, sessionid > = 0x3478900fbb40019, negotiated timeout = 5000 > =============================================== > The configured parent znode /hadoop-ha/golden-apple already exists. > Are you sure you want to clear all failover information from > ZooKeeper? > WARNING: Before proceeding, ensure that all HDFS services and > failover controllers are stopped! > =============================================== > Proceed formatting /hadoop-ha/golden-apple? (Y or N) 2014-07-31 > 17:43:08,268 INFO [main-EventThread] ha.ActiveStandbyElector > (ActiveStandbyElector.java:processWatchEvent(538)) - Session connected. > Y > 2014-07-31 17:43:45,025 INFO [main] ha.ActiveStandbyElector > (ActiveStandbyElector.java:clearParentZNode(314)) - Recursively deleting > /hadoop-ha/golden-apple from ZK... 
> 2014-07-31 17:43:45,098 ERROR [main] ha.ZKFailoverController (ZKFailoverController.java:formatZK(266)) - Unable to clear zk parent znode
> java.io.IOException: Couldn't clear parent znode /hadoop-ha/golden-apple
>         at org.apache.hadoop.ha.ActiveStandbyElector.clearParentZNode(ActiveStandbyElector.java:324)
>         at org.apache.hadoop.ha.ZKFailoverController.formatZK(ZKFailoverController.java:264)
>         at org.apache.hadoop.ha.ZKFailoverController.doRun(ZKFailoverController.java:197)
>         at org.apache.hadoop.ha.ZKFailoverController.access$000(ZKFailoverController.java:58)
>         at org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:165)
>         at org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:161)
>         at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:452)
>         at org.apache.hadoop.ha.ZKFailoverController.run(ZKFailoverController.java:161)
>         at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:175)
> Caused by: org.apache.zookeeper.KeeperException$NotEmptyException: KeeperErrorCode = Directory not empty for /hadoop-ha/golden-apple
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:125)
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>         at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:868)
>         at org.apache.zookeeper.ZKUtil.deleteRecursive(ZKUtil.java:54)
>         at org.apache.hadoop.ha.ActiveStandbyElector$1.run(ActiveStandbyElector.java:319)
>         at org.apache.hadoop.ha.ActiveStandbyElector$1.run(ActiveStandbyElector.java:316)
>         at org.apache.hadoop.ha.ActiveStandbyElector.zkDoWithRetries(ActiveStandbyElector.java:934)
>         at org.apache.hadoop.ha.ActiveStandbyElector.clearParentZNode(ActiveStandbyElector.java:316)
>         ... 8 more
> 2014-07-31 17:43:45,119 INFO [main-EventThread] zookeeper.ClientCnxn (ClientCnxn.java:run(511)) - EventThread shut down
> 2014-07-31 17:43:45,119 INFO [main] zookeeper.ZooKeeper (ZooKeeper.java:close(679)) - Session: 0x3478900fbb40019 closed
>
>
>
> On Thu, Jul 31, 2014 at 1:56 PM, Colin Kincaid Williams <[email protected]> wrote:
>
>> Thanks! I will give this a shot.
>>
>>
>>
>> On Thu, Jul 31, 2014 at 1:12 PM, Bryan Beaudreault <[email protected]> wrote:
>>
>>> We've done this a number of times without issue. Here's the general flow:
>>>
>>> 1) Shutdown namenode and zkfc on SNN
>>> 2) Stop zkfc on ANN (ANN will remain active because there is no other zkfc instance running to fail over to)
>>> 3) Run hdfs zkfc -formatZK on ANN
>>> 4) Start zkfc on ANN (will sync up with ANN and write state to zk)
>>> 5) Push new configs to the new SNN, bootstrap namenode there
>>> 6) Start namenode and zkfc on SNN
>>> 7) Push updated configs to all other hdfs services (datanodes, etc)
>>> 8) Restart hbasemaster if you are running hbase, jobtracker for MR
>>> 9) Rolling restart datanodes
>>> 10) Done
>>>
>>> You'll have to handle any other consumers of DFSClient, like your own code or other apache projects.
>>>
>>>
>>>
>>> On Thu, Jul 31, 2014 at 3:35 PM, Colin Kincaid Williams <[email protected]> wrote:
>>>
>>>> Hi Jing,
>>>>
>>>> Thanks for the response. I will try this out, and file an Apache jira.
>>>>
>>>> Best,
>>>>
>>>> Colin Williams
>>>>
>>>>
>>>> On Thu, Jul 31, 2014 at 11:19 AM, Jing Zhao <[email protected]> wrote:
>>>>
>>>>> Hi Colin,
>>>>>
>>>>> I guess currently we may have to restart almost all the daemons/services in order to swap out a standby NameNode (SBN):
>>>>>
>>>>> 1. The current active NameNode (ANN) needs to know the new SBN, since in the current implementation the SBN periodically sends a rollEditLog RPC request to the ANN (thus if a NN failover happens later, the original ANN needs to send this RPC to the correct NN).
>>>>> 2. Looks like the DataNode currently cannot do a real refresh of its NN list. Look at the code in BPOfferService:
>>>>>
>>>>> void refreshNNList(ArrayList<InetSocketAddress> addrs) throws IOException {
>>>>>   Set<InetSocketAddress> oldAddrs = Sets.newHashSet();
>>>>>   for (BPServiceActor actor : bpServices) {
>>>>>     oldAddrs.add(actor.getNNSocketAddress());
>>>>>   }
>>>>>   Set<InetSocketAddress> newAddrs = Sets.newHashSet(addrs);
>>>>>
>>>>>   if (!Sets.symmetricDifference(oldAddrs, newAddrs).isEmpty()) {
>>>>>     // Keep things simple for now -- we can implement this at a later date.
>>>>>     throw new IOException(
>>>>>         "HA does not currently support adding a new standby to a running DN. " +
>>>>>         "Please do a rolling restart of DNs to reconfigure the list of NNs.");
>>>>>   }
>>>>> }
>>>>>
>>>>> 3. If you're using automatic failover, you also need to update the configuration of the ZKFC on the current ANN machine, since the ZKFC does graceful fencing by sending an RPC to the other NN.
>>>>> 4. Looks like we do not need to restart JournalNodes for the new SBN, but I have not tried this before.
>>>>>
>>>>> Thus in general we may still have to restart all the services (except JNs) and update their configurations. But this can probably be done as a rolling restart:
>>>>> 1. Shut down the old SBN, bootstrap the new SBN, and start the new SBN.
>>>>> 2. Keep the ANN and its corresponding ZKFC running, and do a rolling restart of all the DNs to update their configurations.
>>>>> 3. After restarting all the DNs, stop the ANN and its ZKFC, and update their configuration. The new SBN should become active.
>>>>>
>>>>> I have not tried the above steps, so please let me know if this works or not. And I think we should also document the correct steps in Apache. Could you please file an Apache jira?
>>>>>
>>>>> Thanks,
>>>>> -Jing
>>>>>
>>>>>
>>>>> On Thu, Jul 31, 2014 at 9:37 AM, Colin Kincaid Williams <[email protected]> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I'm trying to swap out a standby NameNode in a QJM / HA configuration. I believe the steps to achieve this would be something similar to:
>>>>>>
>>>>>> Use the Bootstrap standby command to prep the replacement standby, or rsync if the command fails.
>>>>>>
>>>>>> Somehow update the datanodes, so they push the heartbeat / journal to the new standby.
>>>>>>
>>>>>> Update the xml configuration on all nodes to reflect the replacement standby.
>>>>>>
>>>>>> Start the replacement standby.
>>>>>>
>>>>>> Use some hadoop command to refresh the datanodes to the new NameNode configuration.
>>>>>>
>>>>>> I am not sure how to deal with the Journal switch, or if I am going about this the right way. Can anybody give me some suggestions here?
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Colin Williams
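For reference, a rough command-level sketch of the swap flow Bryan outlines above (his steps 1-9). It assumes the CDH4-style packaged init scripts already used earlier in this thread (hadoop-hdfs-zkfc, plus the matching hadoop-hdfs-namenode and hadoop-hdfs-datanode scripts); the exact service names, users, and how configs get pushed are assumptions to adapt to your environment. Note that in this flow the active namenode keeps serving throughout; only the failover controllers and the standby-side namenode are down while ZooKeeper is reformatted.

# 1) on the old standby (SNN): stop its failover controller and namenode
sudo service hadoop-hdfs-zkfc stop
sudo service hadoop-hdfs-namenode stop

# 2-4) on the active namenode (ANN): stop its zkfc, reformat the HA state in ZooKeeper,
#      then start the zkfc again so it writes the ANN's active state back to ZK
sudo service hadoop-hdfs-zkfc stop
sudo -u hdfs hdfs zkfc -formatZK
sudo service hadoop-hdfs-zkfc start

# 5-6) on the replacement standby: push the updated hdfs-site.xml first, then bootstrap
#      from the ANN and start the daemons
sudo -u hdfs hdfs namenode -bootstrapStandby
sudo service hadoop-hdfs-namenode start
sudo service hadoop-hdfs-zkfc start

# 7-9) push the updated configs to the remaining nodes, restart the HBase master /
#      jobtracker if present, then rolling-restart the datanodes one at a time
sudo service hadoop-hdfs-datanode restart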

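On the NotEmptyException from the earlier attempt ("Directory not empty for /hadoop-ha/golden-apple"): the recursive delete can fail that way if a failover controller somewhere is still connected and re-creating children under the parent znode while formatZK is clearing it, which is why the tool warns to stop all HDFS services and failover controllers first. A sketch of how to inspect the znode, and clear it by hand once every zkfc is confirmed stopped, using the ZooKeeper CLI (zookeeper-client is the CDH wrapper around zkCli.sh; the child names mentioned are the usual ones, not taken from this thread):

zookeeper-client -server rhel1.local:2181
# inside the CLI prompt:
ls /hadoop-ha/golden-apple     # normally shows ActiveBreadcrumb, plus ActiveStandbyElectorLock while a zkfc holds the lock
rmr /hadoop-ha/golden-apple    # recursive delete (ZooKeeper 3.4 syntax); only with every zkfc stopped
quit

After the znode is gone, a fresh "hdfs zkfc -formatZK" should recreate /hadoop-ha/golden-apple cleanly before the failover controllers are started again.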