Have you checked the system log for GC messages on the node that’s going down?
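
As a rough way to check this, the sketch below scans log lines for the FailureDetector "local pause" warning quoted in this thread and converts the two numbers to seconds. It assumes those values are nanoseconds, in which case 7413014745 > 5000000000 means a pause of about 7.4 s against a 5 s threshold. The GCInspector pattern ("... GC in <N>ms") is an assumption about the log wording and may need adjusting for your Cassandra version; `find_pauses` is just an illustrative helper name.

```python
import re

# Pattern taken from the WARN line in the quoted debug.log.
# Assumption: both numbers are nanoseconds.
PAUSE_RE = re.compile(r"local pause of (\d+) > (\d+)")

# Assumed GCInspector wording ("... GC in 1234ms"); adjust if your
# version logs GC pauses differently.
GC_RE = re.compile(r"GCInspector.*?GC in (\d+)ms")


def find_pauses(lines):
    """Return (kind, seconds, line) for each pause-related log line."""
    hits = []
    for line in lines:
        m = PAUSE_RE.search(line)
        if m:
            # Convert the reported pause from nanoseconds to seconds.
            hits.append(("local_pause", int(m.group(1)) / 1e9, line.strip()))
            continue
        m = GC_RE.search(line)
        if m:
            # Convert the reported GC duration from milliseconds to seconds.
            hits.append(("gc", int(m.group(1)) / 1e3, line.strip()))
    return hits


sample = [
    "WARN [GossipTasks:1] 2019-05-30 14:39:32,572 FailureDetector.java:288 - "
    "Not marking nodes down due to local pause of 7413014745 > 5000000000",
    "INFO [Service Thread] 2019-05-30 14:39:32,578 StatusLogger.java:101 - "
    "system.batchlog 0,0",
]

for kind, seconds, _ in find_pauses(sample):
    print(f"{kind}: {seconds:.2f}s")
```

To run it against a real log, pass an open file object to `find_pauses` instead of `sample`. Long or frequent pauses around the time the node drops would point toward GC pressure.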

On Thu, May 30, 2019 at 1:53 PM Kunal <[email protected]> wrote:

> Hi All,
>
> I am facing a situation in my 3-node Cassandra cluster wherein one of the
> Cassandra nodes goes down after around 5-10 minutes.
>
> The messages below are seen in debug.log of the node that is going down:
>
> ===
> INFO [ScheduledTasks:1] 2019-05-30 14:39:25,179 StatusLogger.java:101 - system_schema.views 2,16
> INFO [Service Thread] 2019-05-30 14:39:25,182 StatusLogger.java:101 - system.schema_keyspaces 0,0
> INFO [ScheduledTasks:1] 2019-05-30 14:39:25,182 StatusLogger.java:101 - system_schema.functions 2,16
> INFO [Service Thread] 2019-05-30 14:39:32,569 StatusLogger.java:101 - system.sstable_activity 280,10053
> WARN [GossipTasks:1] 2019-05-30 14:39:32,572 FailureDetector.java:288 - Not marking nodes down due to local pause of 7413014745 > 5000000000
> DEBUG [GossipTasks:1] 2019-05-30 14:39:32,578 FailureDetector.java:294 - Still not marking nodes down due to local pause
> INFO [ScheduledTasks:1] 2019-05-30 14:39:32,577 StatusLogger.java:101 - virtuoranc.pmcollectionstatus 0,0
> INFO [Service Thread] 2019-05-30 14:39:32,578 StatusLogger.java:101 - system.batchlog 0,0
> INFO [ScheduledTasks:1] 2019-05-30 14:39:32,579 StatusLogger.java:101 - virtuoranc.snmp_trapdestination 0,0
> INFO [Service Thread] 2019-05-30 14:39:32,579 StatusLogger.java:101 - system.schema_columns 0,0
> INFO [ScheduledTasks:1] 2019-05-30 14:39:32,579 StatusLogger.java:101 - virtuoranc.auditlog 0,0
> INFO [Service Thread] 2019-05-30 14:39:32,580 StatusLogger.java:101 - system.hints 0,0
> INFO [ScheduledTasks:1] 2019-05-30 14:39:32,580 StatusLogger.java:101 - virtuoranc.jobproperties 0,0
> INFO [Service Thread] 2019-05-30 14:39:32,580 StatusLogger.java:101 - system.IndexInfo 0,0
> =====
>
> We tried to clean this node, started it, and ran nodetool repair -full
> as well, but it went down in between. Also, nodetool commands start taking
> too much time to give output once it has been 3-4 minutes since Cassandra
> startup. At one point nodetool gives the error below:
>
> nodetool tpstats
> nodetool: Failed to connect to '127.0.0.1:7199' - SocketTimeoutException: 'Read timed out'.
>
> Can you please let me know what is happening with this node? Any help is
> appreciated.
>
> Regards,
> Kunal Vaid

-- 
www.vorstella.com
408 691 8402
