Does nodetool stopdaemon implicitly drain too, or should we invoke drain and then stopdaemon?

On Mon, Oct 16, 2017 at 4:54 AM, Simon Fontana Oscarsson <simon.fontana.oscars...@ericsson.com> wrote:

> Looking at the code in trunk, the stopdaemon command invokes the
> CassandraDaemon.stop() function, which does a graceful shutdown by stopping
> jmxServer and drains the node via the shutdown hook.
>
> /Simon
>
> On 2017-10-13 20:42, Javier Canillas wrote:
>
> As far as I know, nodetool stopdaemon does a "kill -9".
>
> Or did it change?
>
> 2017-10-12 23:49 GMT-03:00 Anshu Vajpayee <anshu.vajpa...@gmail.com>:
>
>> Why are you killing when we have nodetool stopdaemon?
>>
>> On Fri, Oct 13, 2017 at 1:49 AM, Javier Canillas <javier.canil...@gmail.com> wrote:
>>
>>> That's what I thought.
>>>
>>> Thanks!
>>>
>>> 2017-10-12 14:26 GMT-03:00 Hannu Kröger <hkro...@gmail.com>:
>>>
>>>> Hi,
>>>>
>>>> Drain should be enough. It stops accepting writes and after that
>>>> cassandra can be safely shut down.
>>>>
>>>> Hannu
>>>>
>>>> On 12 October 2017 at 20:24:41, Javier Canillas (javier.canil...@gmail.com) wrote:
>>>>
>>>> Hello everyone,
>>>>
>>>> I have been working with Cassandra for some time, but every time I need to
>>>> shut down a node (for any reason, like upgrading the version or moving the
>>>> instance to another host) I see several errors in the client applications
>>>> (yes, I'm using the official java driver).
>>>>
>>>> By the way, I'm starting C* as a stand-alone process
>>>> <https://docs.datastax.com/en/cassandra/3.0/cassandra/initialize/referenceStartCprocess.html?hl=start>,
>>>> and the C* version is 3.11.0.
>>>>
>>>> The way I have implemented the shutdown process is something like the
>>>> following:
>>>>
>>>> # Drain all information from commitlog into sstables
>>>> bin/nodetool drain
>>>>
>>>> cassandra_pid=`ps -ef|grep "java.*apache-cassandra"|grep -v "grep"|awk '{print $2}'`
>>>> if [ ! -z "$cassandra_pid" ] && [ "$cassandra_pid" -ne "1" ]; then
>>>>     echo "Asking Cassandra to shutdown (nodetool drain doesn't stop cassandra)"
>>>>     kill $cassandra_pid
>>>>
>>>>     echo -n "+ Checking it is down. "
>>>>     counter=10
>>>>     # Keep polling while the process is still alive and retries remain
>>>>     while [ "$counter" -ne 0 ] && kill -0 $cassandra_pid > /dev/null 2>&1
>>>>     do
>>>>         echo -n ". "
>>>>         ((counter--))
>>>>         sleep 1s
>>>>     done
>>>>     echo ""
>>>>     if ! kill -0 $cassandra_pid > /dev/null 2>&1; then
>>>>         echo "+ Its down."
>>>>     else
>>>>         echo "- Killing Cassandra."
>>>>         kill -9 $cassandra_pid
>>>>     fi
>>>> else
>>>>     echo "Care there was a problem finding Cassandra PID"
>>>> fi
>>>>
>>>> Should I add the following lines at the beginning?
>>>>
>>>> echo "shutting down cassandra gracefully with: nodetool disablegossip"
>>>> $CASSANDRA_HOME/$CASSANDRA_APP/bin/nodetool disablegossip
>>>> echo "shutting down cassandra gracefully with: nodetool disablebinary"
>>>> $CASSANDRA_HOME/$CASSANDRA_APP/bin/nodetool disablebinary
>>>> echo "shutting down cassandra gracefully with: nodetool disablethrift"
>>>> $CASSANDRA_HOME/$CASSANDRA_APP/bin/nodetool disablethrift
>>>>
>>>> The shutdown log is the following:
>>>>
>>>> WARN [RMI TCP Connection(10)-127.0.0.1] 2017-10-12 14:20:52,343 StorageService.java:321 - Stopping gossip by operator request
>>>> INFO [RMI TCP Connection(10)-127.0.0.1] 2017-10-12 14:20:52,344 Gossiper.java:1532 - Announcing shutdown
>>>> INFO [RMI TCP Connection(10)-127.0.0.1] 2017-10-12 14:20:52,355 StorageService.java:2268 - Node /10.254.169.36 state jump to shutdown
>>>> INFO [RMI TCP Connection(12)-127.0.0.1] 2017-10-12 14:20:56,141 Server.java:176 - Stop listening for CQL clients
>>>> INFO [RMI TCP Connection(16)-127.0.0.1] 2017-10-12 14:20:59,472 StorageService.java:1442 - DRAINING: starting drain process
>>>> INFO [RMI TCP Connection(16)-127.0.0.1] 2017-10-12 14:20:59,474 HintsService.java:220 - Paused hints dispatch
>>>> INFO [RMI TCP Connection(16)-127.0.0.1] 2017-10-12 14:20:59,477 Gossiper.java:1532 - Announcing shutdown
>>>> INFO [RMI TCP Connection(16)-127.0.0.1] 2017-10-12 14:20:59,480 StorageService.java:2268 - Node /127.0.0.1 state jump to shutdown
>>>> INFO [RMI TCP Connection(16)-127.0.0.1] 2017-10-12 14:21:01,483 MessagingService.java:984 - Waiting for messaging service to quiesce
>>>> INFO [ACCEPT-/192.168.6.174] 2017-10-12 14:21:01,485 MessagingService.java:1338 - MessagingService has terminated the accept() thread
>>>> INFO [RMI TCP Connection(16)-127.0.0.1] 2017-10-12 14:21:02,095 HintsService.java:220 - Paused hints dispatch
>>>> INFO [RMI TCP Connection(16)-127.0.0.1] 2017-10-12 14:21:02,111 StorageService.java:1442 - DRAINED
>>>>
>>>> Disabling gossip seemed a good idea, but judging from the logs, gossip may
>>>> be what gracefully tells the other nodes that this node is going down, so
>>>> I don't know whether it's a good or a bad idea.
>>>>
>>>> Disabling the Thrift and binary protocols should only prevent new
>>>> connections; the ones already established and running should be allowed
>>>> to finish.
>>>>
>>>> Any thoughts or comments?
>>>>
>>>> Thanks
>>>>
>>>> Javier.
>>>>
>>
>> --
>> Regards,
>> Anshu
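
For reference, the drain-then-stop sequence discussed in this thread could be sketched roughly as below. This is only a sketch under assumptions, not a tested procedure: the `NODETOOL` variable, the function names `wait_for_exit` and `stop_cassandra`, and the 10-second default timeout are all made up for illustration. It keeps the approach from the original script (drain, plain `kill`, then `kill -9` as a last resort). If stopdaemon does drain through the shutdown hook, as Simon reads the trunk code, then the drain and `kill` steps could perhaps be replaced by `nodetool stopdaemon`, but I have not verified that on 3.11.0.

```shell
#!/usr/bin/env bash
# Sketch only. NODETOOL and the default timeout are assumptions for
# illustration; adjust them to the actual installation.
NODETOOL="${NODETOOL:-bin/nodetool}"

# Wait up to $2 seconds (default 10) for process $1 to exit.
# Returns 0 once the process is gone, 1 if it is still alive at the timeout.
wait_for_exit() {
    local pid="$1" remaining="${2:-10}"
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$remaining" -le 0 ]; then
            return 1
        fi
        remaining=$((remaining - 1))
        sleep 1
    done
    return 0
}

# Drain the node, send SIGTERM, and escalate to SIGKILL only if the
# process is still running after the timeout.
stop_cassandra() {
    local pid="$1" timeout="${2:-10}"
    "$NODETOOL" drain || echo "drain failed; continuing with shutdown" >&2
    kill "$pid" 2>/dev/null
    if wait_for_exit "$pid" "$timeout"; then
        echo "Cassandra is down."
    else
        echo "Still running after ${timeout}s; sending SIGKILL." >&2
        kill -9 "$pid"
    fi
}
```

With the PID lookup from the original script, this would be called as `stop_cassandra "$cassandra_pid" 10`. Polling with `kill -0` only checks that the process exists; it sends no signal, so the loop itself never affects Cassandra.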