Re: Error on participant while joining cluster

Varun Sharma Tue, 19 Aug 2014 13:59:31 -0700

Can I simply remove the orphaned znodes under the /<CLUSTER_NAME>/INSTANCES
path?


Varun


On Tue, Aug 19, 2014 at 1:54 PM, Varun Sharma <[email protected]> wrote:

> Another issue I have now is that I ended up registering the participants
> as <host>:<port>, which causes MBean-related exceptions (because it
> does not like colon separators). I don't know if that is interfering with
> normal controller operation. I restarted the instances, replacing the : with
> a , but those old names are still stuck in the INSTANCES znode. How can I get
> rid of them? helix-admin seems to be replacing the ":" in the node name
> with an underscore "_" and can't delete the node.
>
> This is still causing MBean-related exceptions in the log trace.
>
> Varun
>
>
> On Tue, Aug 19, 2014 at 12:18 PM, Zhen Zhang <[email protected]> wrote:
>
>>  sure. Will add it.
>>
>>   From: kishore g <[email protected]>
>> Reply-To: "[email protected]" <[email protected]>
>> Date: Tuesday, August 19, 2014 12:14 PM
>> To: "[email protected]" <[email protected]>
>> Subject: Re: Error on participant while joining cluster
>>
>>   Thanks Jason. We need to add this to the documentation. I could not
>> find how to enable auto-join in the docs. Should we add this to the admin
>> interface documentation?
>>
>> On Tue, Aug 19, 2014 at 12:06 PM, Zhen Zhang <[email protected]> wrote:
>>
>>>  Hi Varun, you need to either add the participant to the cluster before
>>> starting it, or enable the participant auto-join config:
>>>
>>>  add participant to cluster:
>>>  ./helix-admin.sh --zkSvr <ZookeeperServerAddress, e.g. localhost:2181>
>>> --addNode <clusterName, e.g. terrapin> <instanceId, e.g.
>>> hdfsterrapin-a-datanode-531b2679_9090>
>>>
>>>  or, enable auto-join config:
>>> ./helix-admin.sh --zkSvr <ZookeeperServerAddress> --setConfig CLUSTER
>>> <clusterName> allowParticipantAutoJoin=true
>>>
>>>  Thanks,
>>> Jason
>>>
>>>
>>>   From: Varun Sharma <[email protected]>
>>> Reply-To: "[email protected]" <[email protected]>
>>> Date: Tuesday, August 19, 2014 11:47 AM
>>> To: "[email protected]" <[email protected]>
>>> Subject: Error on participant while joining cluster
>>>
>>>   I am getting the following error while trying to join a cluster as a
>>> participant. The cluster is set up and a controller has already connected to
>>> it. Can someone help explain why this is happening?
>>>
>>>
>>>  2014-08-19 18:41:36,843 [main] (ZKHelixManager.java:727) INFO
>>> Handling new session, session id: 147a7beb2dd63f4, instance:
>>> hdfsterrapin-a-datanode-531b2679:9090, instanceTye: PARTICIPANT, cluster:
>>> terrapin, zkconnection: State:CONNECTED Timeout:30000
>>> sessionid:0x147a7beb2dd63f4 local:/10.65.145.80:43854
>>> remoteserver:terrapinzk001a/10.115.59.31:2181 lastZxid:0 xid:1 sent:1
>>> recv:1 queuedpkts:0 pendingresp:0 queuedevents:0
>>>
>>> 2014-08-19 18:41:36,843 [main] (ParticipantHealthReportTask.java:67)
>>> WARN  ParticipantHealthReportTimerTask already stopped
>>>
>>> 2014-08-19 18:41:36,914 [main] (ParticipantManagerHelper.java:101) INFO
>>> instance: hdfsterrapin-a-datanode-531b2679:9090 auto-joining terrapin is
>>> false
>>>
>>> *2014-08-19 18:41:36,917 [main] (ZKUtil.java:95) INFO  Invalid instance
>>> setup, missing znode path:
>>> /terrapin/CONFIGS/PARTICIPANT/hdfsterrapin-a-datanode-531b2679:9090*
>>>
>>> *2014-08-19 18:41:36,918 [main] (ZKUtil.java:95) INFO  Invalid instance
>>> setup, missing znode path:
>>> /terrapin/INSTANCES/hdfsterrapin-a-datanode-531b2679:9090/MESSAGES*
>>>
>>> *2014-08-19 18:41:36,918 [main] (ZKUtil.java:95) INFO  Invalid instance
>>> setup, missing znode path:
>>> /terrapin/INSTANCES/hdfsterrapin-a-datanode-531b2679:9090/CURRENTSTATES*
>>>
>>> *2014-08-19 18:41:36,919 [main] (ZKUtil.java:95) INFO  Invalid instance
>>> setup, missing znode path:
>>> /terrapin/INSTANCES/hdfsterrapin-a-datanode-531b2679:9090/STATUSUPDATES*
>>>
>>> *2014-08-19 18:41:36,920 [main] (ZKUtil.java:95) INFO  Invalid instance
>>> setup, missing znode path:
>>> /terrapin/INSTANCES/hdfsterrapin-a-datanode-531b2679:9090/ERRORS*
>>>
>>> *2014-08-19 18:41:36,920 [main] (ZKHelixManager.java:496) ERROR fail to
>>> createClient.*
>>>
>>> *org.apache.helix.HelixException: Initial cluster structure is not set
>>> up for instance: hdfsterrapin-a-datanode-531b2679:9090, instanceType:
>>> PARTICIPANT*
>>>
>>> at
>>> org.apache.helix.manager.zk.ParticipantManagerHelper.joinCluster(ParticipantManagerHelper.java:108)
>>>
>>> at
>>> org.apache.helix.manager.zk.ZKHelixManager.handleNewSessionAsParticipant(ZKHelixManager.java:869)
>>>
>>> at
>>> org.apache.helix.manager.zk.ZKHelixManager.handleNewSession(ZKHelixManager.java:838)
>>>
>>> at
>>> org.apache.helix.manager.zk.ZKHelixManager.createClient(ZKHelixManager.java:493)
>>>
>>> at
>>> org.apache.helix.manager.zk.ZKHelixManager.connect(ZKHelixManager.java:519)
>>>
>>> at
>>> com.pinterest.terrapin.server.TerrapinServerHandler.start(TerrapinServerHandler.java:84)
>>>
>>> at
>>> com.pinterest.terrapin.server.TerrapinServerMain.main(TerrapinServerMain.java:31)
>>>
>>> 2014-08-19 18:41:36,921 [main] (ZKHelixManager.java:522) ERROR fail to
>>> connect hdfsterrapin-a-datanode-531b2679:9090
>>>
>>> org.apache.helix.HelixException: Initial cluster structure is not set up
>>> for instance: hdfsterrapin-a-datanode-531b2679:9090, instanceType:
>>> PARTICIPANT
>>>
>>> at
>>> org.apache.helix.manager.zk.ParticipantManagerHelper.joinCluster(ParticipantManagerHelper.java:108)
>>>
>>> at
>>> org.apache.helix.manager.zk.ZKHelixManager.handleNewSessionAsParticipant(ZKHelixManager.java:869)
>>>
>>> at
>>> org.apache.helix.manager.zk.ZKHelixManager.handleNewSession(ZKHelixManager.java:838)
>>>
>>> at
>>> org.apache.helix.manager.zk.ZKHelixManager.createClient(ZKHelixManager.java:493)
>>>
>>> at
>>> org.apache.helix.manager.zk.ZKHelixManager.connect(ZKHelixManager.java:519)
>>>
>>> at
>>> com.pinterest.terrapin.server.TerrapinServerHandler.start(TerrapinServerHandler.java:84)
>>>
>>> at
>>> com.pinterest.terrapin.server.TerrapinServerMain.main(TerrapinServerMain.java:31)
>>>
>>
>>
>
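
For reference, the znode layout named in the log above and the colon-free
instance ID used in the --addNode example can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the function
names (instance_id, required_znodes, colon_named_instances) are hypothetical,
the path list contains only the znodes the log reports as missing (Helix may
expect others), and nothing here talks to ZooKeeper or calls any Helix API.

```python
# Sub-znodes the log reports as missing under /<cluster>/INSTANCES/<instance>.
# Assumption: this list is taken from the log only and may be incomplete.
INSTANCE_SUBPATHS = ("MESSAGES", "CURRENTSTATES", "STATUSUPDATES", "ERRORS")


def instance_id(host, port):
    """Build a colon-free ID in the <host>_<port> form of the --addNode example."""
    return f"{host}_{port}"


def required_znodes(cluster, instance):
    """List the znode paths the log checks for a participant (hypothetical helper)."""
    paths = [f"/{cluster}/CONFIGS/PARTICIPANT/{instance}"]
    paths += [f"/{cluster}/INSTANCES/{instance}/{sub}" for sub in INSTANCE_SUBPATHS]
    return paths


def colon_named_instances(children):
    """Pick out child names containing ':' from a listing of /<cluster>/INSTANCES."""
    return [name for name in children if ":" in name]


if __name__ == "__main__":
    node = instance_id("hdfsterrapin-a-datanode-531b2679", 9090)
    for path in required_znodes("terrapin", node):
        print(path)
```

Given a listing of the children of /<CLUSTER_NAME>/INSTANCES obtained by any
ZooKeeper client, colon_named_instances would flag the old <host>:<port>
entries as candidates for cleanup. Whether removing them by hand is safe is the
open question in this thread, so the sketch deliberately stops short of
deleting anything.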
