So vmguest85 was restarted, but gen-app02 hasn't told it that there
are 2 other nodes that are down?

Which one is the seed node?

On Mon, Nov 23, 2009 at 6:38 PM, B. Todd Burruss <[email protected]> wrote:
> I'm observing the following on a cluster that started with 4 nodes.  I have
> been killing and restarting the various nodes as I test Cassandra, and now
> I'm seeing a lot of NotFoundException errors in the client because of what
> I believe is ring state out of sync between the two nodes that are still up
> and available.  The first ring state shown below reflects the current state
> of the cluster.  I have also seen similar issues when one of the nodes
> thinks another node is still available when in fact it has been killed.  It
> seems to be related to bringing up and killing nodes too fast, without
> letting them figure out when a node is "dead".  In this case I see
> TimedOutException related to the NIO SocketChannel class.
>
> thx!
>
> [cassandra.883477]$ bin/nodeprobe -host gen-app02.dev.real.com -port 8080 ring
> Address        Status     Load          Range                                      Ring
>                                         144038903974614862325597275257769797985
> 172.27.128.186 Down       22.17 MB      31124469348629903091013930339840898757     |<--|
> 172.27.128.23  Down       22.17 MB      64378740291415296162944450043143967518     |   |
> 172.27.128.22  Up         22.17 MB      121134220722269938669001112695509564769    |   |
> 172.27.128.185 Up         14.69 MB      144038903974614862325597275257769797985    |-->|
>
> [cassandra.883477]$ bin/nodeprobe -host vmguest85.prognet.com -port 8080 ring
> Address        Status     Load          Range                                      Ring
>                                         144038903974614862325597275257769797985
> 172.27.128.22  Up         22.17 MB      121134220722269938669001112695509564769    |<--|
> 172.27.128.185 Up         14.69 MB      144038903974614862325597275257769797985    |-->|
> [cassandra.883477]$
>
>
>
