How to handle Node does not exist error?

2010-08-11 Thread Dr Hao He
Hi all, I have a 3-host cluster running ZooKeeper 3.2.2. On one of the hosts, there are a number of nodes that I can get and ls using zkCli.sh. However, when I try to delete any of them, I get a "Node does not exist" error. Those nodes do not exist on the other two hosts. Any idea how

Re: How to handle Node does not exist error?

2010-08-11 Thread Dr Hao He
Hi Ted, thanks for the reply. Here is what I did: [zk: localhost:2181(CONNECTED) 0] ls /xpe/queues/3bd7851e79381ef4bfd1a5857b5e34c04e5159e5/msgs/msg002948 [] [zk: localhost:2181(CONNECTED) 1] ls /xpe/queues/3bd7851e79381ef4bfd1a5857b5e34c04e5159e5/msgs [msg002807,

Re: How to handle Node does not exist error?

2010-08-11 Thread Ted Dunning
What do your nodes have in their logs during startup? Are you sure you have them configured correctly? Are the files ephemeral? Could they have disappeared on their own? Sent from my iPhone On Aug 11, 2010, at 12:10 AM, Dr Hao He h...@softtouchit.com wrote: hi, Ted, Thanks for the

Re: How to handle Node does not exist error?

2010-08-11 Thread Mahadev Konar
Hi Dr Hao, Can you please post the configuration of all three ZooKeeper servers? I suspect the servers might be misconfigured and might not belong to the same ensemble. Just to be clear: /xpe/queues/3bd7851e79381ef4bfd1a5857b5e34c04e5159e5/msgs/msg002807 And other such nodes exist on

Re: Sequence Number Generation With Zookeeper

2010-08-11 Thread Adam Rosien
What happens during a network partition and different clients are incrementing different counters, and then the partition goes away? Won't (potentially) the same sequence value be given out to two clients? .. Adam On Thu, Aug 5, 2010 at 5:38 PM, Jonathan Holloway jonathan.hollo...@gmail.com

zookeeper seems to hang

2010-08-11 Thread Ted Yu
Hi, Using HBase 0.20.6 (with HBASE-2473), we encountered a situation where a RegionServer process was shutting down and seemed to hang. Here is the bottom of the region server log: http://pastebin.com/YYawJ4jA zookeeper-3.2.2 is used. Comments are welcome. Here is the relevant portion from jstack - I

Re: Sequence Number Generation With Zookeeper

2010-08-11 Thread Ted Dunning
Can't happen. In a network partition, the side without a quorum can't update the file version. On Wed, Aug 11, 2010 at 3:11 PM, Adam Rosien a...@rosien.net wrote: What happens during a network partition and different clients are incrementing different counters, and then the partition goes
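The majority rule behind this reply can be illustrated with a toy model. This is only a sketch, not ZooKeeper code: `has_quorum` and `VersionedCounter` are hypothetical names invented for the example, and the counter simply stands in for a znode whose version is bumped on each committed write.

```python
# Illustrative sketch only: has_quorum and VersionedCounter are hypothetical
# names, not part of the ZooKeeper API.

def has_quorum(reachable: int, ensemble_size: int) -> bool:
    """A side of a partition can commit writes only with a strict majority."""
    return reachable > ensemble_size // 2


class VersionedCounter:
    """Toy stand-in for a znode whose version is bumped on every committed write."""

    def __init__(self, ensemble_size: int) -> None:
        self.ensemble_size = ensemble_size
        self.version = 0

    def next_sequence(self, reachable: int) -> int:
        # Writes from the minority side are rejected, so no value is handed out there.
        if not has_quorum(reachable, self.ensemble_size):
            raise RuntimeError("no quorum: update rejected")
        self.version += 1
        return self.version
```

Because a strict majority is required, the two sides of a partition can never both satisfy `has_quorum`, so only one side can ever hand out new sequence values.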

Clarification on async calls in a cluster

2010-08-11 Thread Jordan Zimmerman
If I use an async version of a call in a cluster (ensemble) what happens if the server I'm connected to goes down? Does ZK transparently resubmit the call to the next server in the cluster and call my async callback or is there something I need to do? The docs aren't clear on this and searching

Re: Sequence Number Generation With Zookeeper

2010-08-11 Thread Adam Rosien
Ah thanks, I forgot the majority-commit property because I also forgot that all servers know what the cluster should look like, rather than act adaptively (which wouldn't make sense after all). .. Adam On Wed, Aug 11, 2010 at 3:23 PM, Ted Dunning ted.dunn...@gmail.com wrote: Can't happen. In

Re: Clarification on async calls in a cluster

2010-08-11 Thread Patrick Hunt
On 08/11/2010 03:25 PM, Jordan Zimmerman wrote: If I use an async version of a call in a cluster (ensemble) what happens if the server I'm connected to goes down? Does ZK transparently resubmit the call to the next server in the cluster and call my async callback or is there something I need to

Backing up zk data files

2010-08-11 Thread Adam Rosien
http://hadoop.apache.org/zookeeper/docs/r3.3.1/zookeeperAdmin.html#sc_dataFileManagement says that one can copy the contents of the data directory and use it on another machine. The example states the other instance is not in the server list; what would happen if one did copy it to an offline

Re: How to handle Node does not exist error?

2010-08-11 Thread Ted Dunning
Try running the server in non-embedded mode. Also, you are assuming that you know everything about how to configure the quorumPeer. That is going to change and your code will break at that time. If you use a non-embedded cluster, this won't be a problem and you will be able to upgrade ZK