My RS finally started without the "strange ZK error", but regions are still not moving...
Here is the new sample from the RS log: http://pastebin.com/raw.php?i=QJxs4chE

I can't see anything strange in the ZK logs, just the usual connect/disconnect requests.

When should ZK nodes move from M_SERVER_SHUTDOWN to M_ZK_REGION_OFFLINE? Is this new behavior on the Master's side, meaning I should upgrade HMaster before the RS? (I forgot to mention that I was testing a rolling-upgrade scenario.)

On Sat, Jul 13, 2013 at 6:52 AM, Ted Yu <[email protected]> wrote:

> w.r.t. the strange error mentioned at the bottom of the email, it came
> from connectionEvent():
>
>     if (this.recoverableZooKeeper == null) {
>       LOG.error("ZK is null on connection event -- see stack trace " +
>         "for the stack trace when constructor was called on this zkw",
>         this.constructorCaller);
>       throw new NullPointerException("ZK is null");
>     }
>
> this.constructorCaller was filled out in the constructor.
> The error indicated that the following call wasn't successful (line 153 in
> ZooKeeperWatcher ctor)
>
>     this.recoverableZooKeeper = ZKUtil.connect(conf, quorum, this,
>         descriptor);
>
> Can you check more of the RS log ?
>
> zookeeper log may reveal something as well.
>
> Cheers
>
> On Fri, Jul 12, 2013 at 3:32 PM, Adrien Mogenet <[email protected]> wrote:
>
> > Hi there,
> >
> > I'm trying to upgrade from 0.94.6 (distributed mode) to 0.94.8 and I'm
> > seeing strange WARN messages, leading to a region-less regionserver once
> > updated.
> >
> > Here is the kind of lines I can find:
> >
> > > WARN org.apache.hadoop.hbase.zookeeper.ZKAssign:
> > > regionserver:60020-0x23d207e751d20c4 Attempt to transition the unassigned
> > > node for 9aeb2d2c3e878ee50ad4806dd3488c15 from M_ZK_REGION_OFFLINE to
> > > RS_ZK_REGION_OPENING failed, the node existed but was in the state
> > > M_SERVER_SHUTDOWN set by the server my-server.org,60020,1373289114184
> >
> > I've uploaded a longer extract including DEBUG traces to Pastebin:
> > http://pastebin.com/raw.php?i=Me2esbPF
> >
> > I proceeded as usual: stopping the RS, updating HBase binaries and
> > libraries, then starting the RS... When digging into the log file, I can
> > read one strange ZK-related error ("ZKW CONSTRUCTOR STACK TRACE FOR
> > DEBUGGING"), see complete trace here:
> > http://pastebin.com/raw.php?i=7wy0wdNq
> >
> > Any idea?
> >
> > --
> > Adrien Mogenet
> > http://www.borntosegfault.com

--
Adrien Mogenet
http://www.borntosegfault.com
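
P.S. To make the mechanism in Ted's reply concrete: the watcher records a stack trace when it is constructed and prints it later if the ZK handle turns out to be null on a connection event. Below is a minimal, self-contained sketch of that pattern. It is not the HBase source: the class `Watcher`, the `connectSucceeds` flag, and the helper `eventFails` are hypothetical names made up for illustration, and the `connection` field merely stands in for `recoverableZooKeeper`.

```java
public class ConstructorCallerSketch {

    /** Hypothetical stand-in for a ZK watcher; NOT the HBase ZooKeeperWatcher. */
    static class Watcher {
        // Stack trace captured at construction time, kept for later diagnostics.
        private final Exception constructorCaller;
        // Stand-in for recoverableZooKeeper; null simulates a failed connect.
        private final Object connection;

        Watcher(boolean connectSucceeds) {
            // Creating an Exception records the current stack without throwing it.
            this.constructorCaller =
                new Exception("WATCHER CONSTRUCTOR STACK TRACE FOR DEBUGGING");
            this.connection = connectSucceeds ? new Object() : null;
        }

        /** Mirrors the null check quoted in the thread. */
        void connectionEvent() {
            if (connection == null) {
                System.err.println("connection is null on connection event; "
                    + "the watcher was constructed here:");
                constructorCaller.printStackTrace();
                throw new NullPointerException("ZK is null");
            }
        }
    }

    /** Returns true when the connection event fails with the NullPointerException. */
    static boolean eventFails(boolean connectSucceeds) {
        try {
            new Watcher(connectSucceeds).connectionEvent();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("connect ok     -> event fails? " + eventFails(true));
        System.out.println("connect failed -> event fails? " + eventFails(false));
    }
}
```

The point of the sketch is that the "CONSTRUCTOR STACK TRACE FOR DEBUGGING" entry in the log is not an error raised at construction time; it only identifies which watcher instance later hit the null check.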
