So vmguest85 was restarted, but gen-app02 hasn't told it that there are 2 other nodes that are down?
Which one is the seed node?

On Mon, Nov 23, 2009 at 6:38 PM, B. Todd Burruss <[email protected]> wrote:
> i'm observing the following on a cluster that started with 4 nodes. i have
> been killing and restarting the various nodes as i test cassandra and now
> i'm seeing a lot of NotFoundException exceptions in the client because what
> i believe is ring state out of sync between the two nodes that are still up
> and available. The first ring state shown below reflects the current state
> of the cluster. Also I have seen similar issues when one of the nodes
> thinks another node is still available when in fact it has been killed. it
> seems to be related to bringing up, killing nodes too fast and not letting
> them figure out when a node is "dead". in this case i see TimedOutException
> related to NIO SocketChannel class.
>
> thx!
>
> [cassandra.883477]$ bin/nodeprobe -host gen-app02.dev.real.com -port 8080 ring
> Address       Status     Load          Range                                      Ring
>                                        144038903974614862325597275257769797985
> 172.27.128.186Down       22.17 MB      31124469348629903091013930339840898757     |<--|
> 172.27.128.23 Down       22.17 MB      64378740291415296162944450043143967518     |   |
> 172.27.128.22 Up         22.17 MB      121134220722269938669001112695509564769    |   |
> 172.27.128.185Up         14.69 MB      144038903974614862325597275257769797985    |-->|
>
> [cassandra.883477]$ bin/nodeprobe -host vmguest85.prognet.com -port 8080 ring
> Address       Status     Load          Range                                      Ring
>                                        144038903974614862325597275257769797985
> 172.27.128.22 Up         22.17 MB      121134220722269938669001112695509564769    |<--|
> 172.27.128.185Up         14.69 MB      144038903974614862325597275257769797985    |-->|
> [cassandra.883477]$
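
For what it's worth, a quick way to see exactly where the two views disagree is to diff them. Below is a minimal sketch in Python, assuming you have saved the text printed by the ring command on each host. The names `parse_ring` and `diff_views` are made up for illustration and are not part of Cassandra, and the parsing only assumes the row layout shown in the output above (an IPv4 address followed by Up or Down, possibly with no space between them).

```python
import re

# Matches a node row such as "172.27.128.186Down 22.17 MB ...".
# Leading "> " quote markers and whitespace are skipped, and the address
# and status may run together, as they do in the listing above.
ROW = re.compile(r"^[>\s]*(\d{1,3}(?:\.\d{1,3}){3})\s*(Up|Down)\b")


def parse_ring(text):
    """Return {address: status} for every node row found in a ring listing."""
    view = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            view[match.group(1)] = match.group(2)
    return view


def diff_views(a, b):
    """Return {address: (status_in_a, status_in_b)} where the views differ.

    A node that is missing from one view is reported as None on that side.
    """
    diff = {}
    for addr in sorted(set(a) | set(b)):
        if a.get(addr) != b.get(addr):
            diff[addr] = (a.get(addr), b.get(addr))
    return diff
```

Feeding it the two listings above should report 172.27.128.186 and 172.27.128.23 as Down on gen-app02 and absent on vmguest85, which is the mismatch I'm asking about.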
