Ufff I almost won the Charles Darwin prize now. Thanks! []s
2013/8/22 Hiller, Dean <[email protected]>

> Isn't this the log file from 10.0.0.146??? And this 10.0.0.146 sees that
> 10.0.0.111 is up, then sees it dead and in the log we can see it bind with
> this line
>
> INFO 12:16:23,108 Binding thrift service to ip-10-0-0-146.ec2.internal/10.0.0.146:9160
>
> What is the log file look like on 10.0.0.111?
>
> Thanks,
> Dean
>
> From: Marcelo Elias Del Valle <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Thursday, August 22, 2013 9:19 AM
> To: "[email protected]" <[email protected]>
> Subject: node dead after restart
>
> Hello,
>
> I am having a problem with a node in a test environment I have at
> amazon. I am using cassandra 1.2.3 in Amazon EC2. Here is my nodetool ring
> output:
>
> $ nodetool ring
> Note: Ownership information does not include topology; for complete
> information, specify a keyspace
>
> Datacenter: us-east
> ==========
> Address     Rack  Status  State   Load      Owns    Token
>                                                     113427455640312821154458202479064646084
> 10.0.0.76   1b    Up      Normal  31.34 MB  33.33%  1808575600
> 10.0.0.146  1b    Up      Normal  34.24 MB  33.33%  56713727820156410577229101240436610842
> 10.0.0.111  1b    Down    Normal  21.19 MB  33.33%  113427455640312821154458202479064646084
>
> I logged in 10.0.0.111 machine and restarted cassandra, while looking
> at the log. Gossip protocol is still up, but the node starts and goes down
> just after it. Here is what I see in the logs:
>
> sudo tail /var/log/cassandra/output.log
> INFO 12:16:23,084 Node /10.0.0.111 has restarted, now UP
> INFO 12:16:23,095 InetAddress /10.0.0.111 is now UP
> INFO 12:16:23,097 Node /10.0.0.111 state jump to normal
> INFO 12:16:23,105 Not starting native transport as requested. Use JMX (StorageService->startNativeTransport()) to start it
> INFO 12:16:23,108 Binding thrift service to ip-10-0-0-146.ec2.internal/10.0.0.146:9160
> INFO 12:16:23,137 Using TFramedTransport with a max frame size of 15728640 bytes.
> INFO 12:16:23,143 Using synchronous/threadpool thrift server on ip-10-0-0-146.ec2.internal : 9160
> INFO 12:16:23,143 Listening for thrift clients...
> INFO 12:16:30,063 Saved local counter id: 76c1a930-a866-11e2-a3bd-831b111cd74c
> INFO 12:16:32,860 InetAddress /10.0.0.111 is now dead.
>
> I am having no clue of what is wrong. Any hint of what could I do to
> look for the problem?
>
> Best regards,
> --
> Marcelo Elias Del Valle
> http://mvalle.com - @mvallebr
>

--
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr
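
For anyone who hits the same confusion: the check Dean did by eye (reading the "Binding thrift service to" line to see which host a log really came from) can be scripted. This is only a minimal sketch, assuming the log line format quoted in this thread. The names `bound_address` and `log_belongs_to` are made up for illustration and are not part of Cassandra or nodetool.

```python
import re

# Assumed format, taken from the log quoted in this thread:
#   "Binding thrift service to <hostname>/<ip>:<port>"
# The optional whitespace after "/" tolerates mail-client line wrapping.
BIND_RE = re.compile(
    r"Binding thrift service to\s+\S*/\s*(\d+\.\d+\.\d+\.\d+):(\d+)"
)


def bound_address(log_text):
    """Return (ip, port) from the first thrift bind line, or None if absent."""
    match = BIND_RE.search(log_text)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def log_belongs_to(log_text, expected_ip):
    """True if the log's thrift bind address matches the node we expect."""
    addr = bound_address(log_text)
    return addr is not None and addr[0] == expected_ip


# Sample line copied from the log in this thread.
sample = (
    "INFO 12:16:23,108 Binding thrift service to "
    "ip-10-0-0-146.ec2.internal/10.0.0.146:9160\n"
)

if __name__ == "__main__":
    print(bound_address(sample))
    print(log_belongs_to(sample, "10.0.0.111"))
```

Feeding it the contents of /var/log/cassandra/output.log together with the IP of the node you believe you are logged into would flag a mismatch like the one in this thread, where the log was from 10.0.0.146 rather than 10.0.0.111.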
