Can I simply remove the orphaned znodes under the /<CLUSTER_NAME>/INSTANCES path?
Varun

On Tue, Aug 19, 2014 at 1:54 PM, Varun Sharma <[email protected]> wrote:

> Another issue I have now is that I ended up registering the participants
> as <host>:<port> - this causes exceptions related to MBean (because it
> does not like colon separators). I don't know if that is interfering with
> normal controller operation. I restarted the instances replacing the : with
> a , but those old names are still stuck in the INSTANCES znode. How can I get
> rid of these - helix-admin seems to be replacing the ":" in the node name
> with an underscore "_" and can't delete the node.
>
> This is still causing MBean related exceptions in the log trace.
>
> Varun
>
>
> On Tue, Aug 19, 2014 at 12:18 PM, Zhen Zhang <[email protected]> wrote:
>
>> Sure. Will add it.
>>
>> From: kishore g <[email protected]>
>> Reply-To: "[email protected]" <[email protected]>
>> Date: Tuesday, August 19, 2014 12:14 PM
>> To: "[email protected]" <[email protected]>
>> Subject: Re: Error on participant while joining cluster
>>
>> Thanks Jason. We need to add this to the documentation. I could not
>> find the way to enable auto-join from the docs. Should we add this to the
>> admin interface documentation?
>>
>> On Tue, Aug 19, 2014 at 12:06 PM, Zhen Zhang <[email protected]> wrote:
>>
>>> Hi Varun, you need to either add the participant to the cluster before
>>> starting it, or enable the participant auto-join config:
>>>
>>> add participant to cluster:
>>> ./helix-admin.sh --zkSvr <ZookeeperServerAddress, e.g. localhost:2181>
>>> --addNode <clusterName, e.g. terrapin> <instanceId, e.g.
>>> hdfsterrapin-a-datanode-531b2679_9090>
>>>
>>> or, enable auto-join config:
>>> ./helix-admin.sh --zkSvr <ZookeeperServerAddress> --setConfig CLUSTER
>>> <clusterName> allowParticipantAutoJoin=true
>>>
>>> Thanks,
>>> Jason
>>>
>>>
>>> From: Varun Sharma <[email protected]>
>>> Reply-To: "[email protected]" <[email protected]>
>>> Date: Tuesday, August 19, 2014 11:47 AM
>>> To: "[email protected]" <[email protected]>
>>> Subject: Error on participant while joining cluster
>>>
>>> I am getting the following error while trying to join a cluster as a
>>> participant. The cluster is set up and a controller has already connected to
>>> it. Can someone help out as to why this is happening?
>>>
>>>
>>> 2014-08-19 18:41:36,843 [main] (ZKHelixManager.java:727) INFO
>>> Handling new session, session id: 147a7beb2dd63f4, instance:
>>> hdfsterrapin-a-datanode-531b2679:9090, instanceTye: PARTICIPANT, cluster:
>>> terrapin, zkconnection: State:CONNECTED Timeout:30000
>>> sessionid:0x147a7beb2dd63f4 local:/10.65.145.80:43854
>>> remoteserver:terrapinzk001a/10.115.59.31:2181 lastZxid:0 xid:1 sent:1
>>> recv:1 queuedpkts:0 pendingresp:0 queuedevents:0
>>>
>>> 2014-08-19 18:41:36,843 [main] (ParticipantHealthReportTask.java:67)
>>> WARN ParticipantHealthReportTimerTask already stopped
>>>
>>> 2014-08-19 18:41:36,914 [main] (ParticipantManagerHelper.java:101) INFO
>>> instance: hdfsterrapin-a-datanode-531b2679:9090 auto-joining terrapin is
>>> false
>>>
>>> *2014-08-19 18:41:36,917 [main] (ZKUtil.java:95) INFO Invalid instance
>>> setup, missing znode path:
>>> /terrapin/CONFIGS/PARTICIPANT/hdfsterrapin-a-datanode-531b2679:9090*
>>>
>>> *2014-08-19 18:41:36,918 [main] (ZKUtil.java:95) INFO Invalid instance
>>> setup, missing znode path:
>>> /terrapin/INSTANCES/hdfsterrapin-a-datanode-531b2679:9090/MESSAGES*
>>>
>>> *2014-08-19 18:41:36,918 [main] (ZKUtil.java:95) INFO Invalid instance
>>> setup, missing znode path:
>>> /terrapin/INSTANCES/hdfsterrapin-a-datanode-531b2679:9090/CURRENTSTATES*
>>>
>>> *2014-08-19 18:41:36,919 [main] (ZKUtil.java:95) INFO Invalid instance
>>> setup, missing znode path:
>>> /terrapin/INSTANCES/hdfsterrapin-a-datanode-531b2679:9090/STATUSUPDATES*
>>>
>>> *2014-08-19 18:41:36,920 [main] (ZKUtil.java:95) INFO Invalid instance
>>> setup, missing znode path:
>>> /terrapin/INSTANCES/hdfsterrapin-a-datanode-531b2679:9090/ERRORS*
>>>
>>> *2014-08-19 18:41:36,920 [main] (ZKHelixManager.java:496) ERROR fail to
>>> createClient.*
>>>
>>> *org.apache.helix.HelixException: Initial cluster structure is not set
>>> up for instance: hdfsterrapin-a-datanode-531b2679:9090, instanceType:
>>> PARTICIPANT*
>>>
>>> at org.apache.helix.manager.zk.ParticipantManagerHelper.joinCluster(ParticipantManagerHelper.java:108)
>>> at org.apache.helix.manager.zk.ZKHelixManager.handleNewSessionAsParticipant(ZKHelixManager.java:869)
>>> at org.apache.helix.manager.zk.ZKHelixManager.handleNewSession(ZKHelixManager.java:838)
>>> at org.apache.helix.manager.zk.ZKHelixManager.createClient(ZKHelixManager.java:493)
>>> at org.apache.helix.manager.zk.ZKHelixManager.connect(ZKHelixManager.java:519)
>>> at com.pinterest.terrapin.server.TerrapinServerHandler.start(TerrapinServerHandler.java:84)
>>> at com.pinterest.terrapin.server.TerrapinServerMain.main(TerrapinServerMain.java:31)
>>>
>>> 2014-08-19 18:41:36,921 [main] (ZKHelixManager.java:522) ERROR fail to
>>> connect hdfsterrapin-a-datanode-531b2679:9090
>>>
>>> org.apache.helix.HelixException: Initial cluster structure is not set up
>>> for instance: hdfsterrapin-a-datanode-531b2679:9090, instanceType:
>>> PARTICIPANT
>>>
>>> at org.apache.helix.manager.zk.ParticipantManagerHelper.joinCluster(ParticipantManagerHelper.java:108)
>>> at org.apache.helix.manager.zk.ZKHelixManager.handleNewSessionAsParticipant(ZKHelixManager.java:869)
>>> at org.apache.helix.manager.zk.ZKHelixManager.handleNewSession(ZKHelixManager.java:838)
>>> at org.apache.helix.manager.zk.ZKHelixManager.createClient(ZKHelixManager.java:493)
>>> at org.apache.helix.manager.zk.ZKHelixManager.connect(ZKHelixManager.java:519)
>>> at com.pinterest.terrapin.server.TerrapinServerHandler.start(TerrapinServerHandler.java:84)
>>> at com.pinterest.terrapin.server.TerrapinServerMain.main(TerrapinServerMain.java:31)
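
On the <host>:<port> naming problem mentioned in this thread: one way to avoid the colon is to build instance IDs as <host>_<port>, matching the example ID in Jason's --addNode command. Below is a minimal sketch of that idea. The helper names make_instance_id and normalize_instance_id are hypothetical and not part of Helix, and the ZooKeeper address and cluster name are just the example values from the thread. The script only prints the helix-admin command; it does not run it.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Helix) for building instance IDs
# that avoid the ":" separator.

# Join host and port with "_", e.g. myhost 9090 -> myhost_9090.
make_instance_id() {
  printf '%s_%s\n' "$1" "$2"
}

# Rewrite an existing "host:port" name to the underscore form.
normalize_instance_id() {
  printf '%s\n' "$1" | tr ':' '_'
}

# Example: print (rather than run) the addNode command from the thread.
# The ZooKeeper address and cluster name are placeholders.
id=$(make_instance_id hdfsterrapin-a-datanode-531b2679 9090)
echo "./helix-admin.sh --zkSvr localhost:2181 --addNode terrapin $id"
```

Registering participants under the underscore form from the start should keep the participant name and the name helix-admin operates on identical, so the mismatch described above does not arise.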
