Hello!

We have some strange troubles with cassandra startup. Cluster consists of 4 
nodes. 32 Gb RAM per node, each node has about 30Gb of data, 8 CPU.

root@vega010:~# nodetool version ReleaseVersion: 2.2.1

So, before stop (using disablethrift, drain):

nodetool tpstats: Read Stage 0 0 3093579 0 0

Just after start in logs:

INFO [main] http://airmail.calendar/2015-11-25%2013:22:04%20GMT+3 
YamlConfigurationLoader.java:92 - Loading settings from 
file:/etc/cassandra/cassandra.yaml 
. . . skipped . . .
INFO [main] http://airmail.calendar/2015-11-25%2013:22:21%20GMT+3 
CommitLog.java:168 - Replaying 
/var/lib/cassandra/commitlog/CommitLog–5–1448388020045.log, 
/var/lib/cassandra/commitlog/CommitLog–5–1448388020046.log, /var/lib/cassand
. . .skipped. . .
INFO [main] http://airmail.calendar/2015-11-25%2013:23:44%20GMT+3 
CommitLog.java:170 - Log replay complete, 1047857 replayed mutations
. . . skipped .. .
INFO [CompactionExecutor:4] 
http://airmail.calendar/2015-11-25%2013:23:45%20GMT+3 CompactionTask.java:142 - 
Compacting (cf08d1d0–93ba–11e5-b9f0–7be7ca1986fb) 
[/var/lib/cassandra/data/system/compaction_history-b4dbb7b4dc493fb5b3bfce6e434832ca/la–3479-big-Data.db:level=0,
 
/var/lib/cassandra/data/system/compaction_history-b4dbb7b4dc493fb5b3bfce6e434832ca/la–3474-big-Data.db:level=0,
 /var/lib/cassandra/data/system/compaction_history-b4db
. . . skipped. . .
INFO [HANDSHAKE-/10.50.2.60] 
http://airmail.calendar/2015-11-25%2013:23:46%20GMT+3 
OutboundTcpConnection.java:494 - Handshaking version with /10.50.2.60
INFO [GossipStage:1] http://airmail.calendar/2015-11-25%2013:23:46%20GMT+3 
Gossiper.java:1003 - Node /10.50.2.66 has restarted, now UP
WARN [GossipTasks:1] http://airmail.calendar/2015-11-25%2013:23:46%20GMT+3 
FailureDetector.java:243 - Not marking nodes down due to local pause of 
101075806441 > 5000000000
INFO [GossipStage:1] http://airmail.calendar/2015-11-25%2013:23:46%20GMT+3 
StorageService.java:1869 - Node /10.50.2.66 state jump to normal
INFO [HANDSHAKE-/10.50.2.60] 
http://airmail.calendar/2015-11-25%2013:23:46%20GMT+3 
OutboundTcpConnection.java:494 - Handshaking version with /10.50.2.60
INFO [GossipStage:1] http://airmail.calendar/2015-11-25%2013:23:46%20GMT+3 
Gossiper.java:1003 - Node /10.50.2.60 has restarted, now UP
INFO [GossipStage:1] http://airmail.calendar/2015-11-25%2013:23:46%20GMT+3 
StorageService.java:1869 - Node /10.50.2.60 state jump to normal
INFO [GossipStage:1] http://airmail.calendar/2015-11-25%2013:23:46%20GMT+3 
Gossiper.java:1003 - Node /10.50.2.57 has restarted, now UP
INFO [HANDSHAKE-/10.50.2.66] 
http://airmail.calendar/2015-11-25%2013:23:46%20GMT+3 
OutboundTcpConnection.java:494 - Handshaking version with /10.50.2.66
INFO [HANDSHAKE-/10.50.2.57] 
http://airmail.calendar/2015-11-25%2013:23:46%20GMT+3 
OutboundTcpConnection.java:494 - Handshaking version with /10.50.2.57
INFO [GossipStage:1] http://airmail.calendar/2015-11-25%2013:23:46%20GMT+3 
StorageService.java:1869 - Node /10.50.2.57 state jump to normal
INFO [SharedPool-Worker–20] 
http://airmail.calendar/2015-11-25%2013:23:46%20GMT+3 Gossiper.java:970 - 
InetAddress /10.50.2.60 is now UP
INFO [main] http://airmail.calendar/2015-11-25%2013:23:46%20GMT+3 
ColumnFamilyStore.java:743 - Completed loading (557 ms; 7022 shards) counter 
cache for SourcesAggregatedEventsV2.StoryReadingTimeSumPerDay_UTC_P_7
INFO [HANDSHAKE-/10.50.2.66] 
http://airmail.calendar/2015-11-25%2013:23:46%20GMT+3 
OutboundTcpConnection.java:494 - Handshaking version with /10.50.2.66
INFO [main] http://airmail.calendar/2015-11-25%2013:23:46%20GMT+3 
AutoSavingCache.java:146 - reading saved cache 
/var/lib/cassandra/saved_caches/SourcesAggregatedEventsV2-StoryReadingTimeSumPerDay_UTC_N_2-f318e310735f11e5b9599b83dc51d0b0-CounterCache-c.db
INFO [SharedPool-Worker–13] 
http://airmail.calendar/2015-11-25%2013:23:46%20GMT+3 Gossiper.java:970 - 
InetAddress /10.50.2.57 is now UP INFO [SharedPool-Worker–3] 
http://airmail.calendar/2015-11-25%2013:23:46%20GMT+3 Gossiper.java:970 - 
InetAddress /10.50.2.57 is now UP
INFO [SharedPool-Worker–16] 
http://airmail.calendar/2015-11-25%2013:23:46%20GMT+3 Gossiper.java:970 - 
InetAddress /10.50.2.57 is now UP INFO [GossipStage:1] 
http://airmail.calendar/2015-11-25%2013:23:46%20GMT+3 StorageService.java:1869 
- Node /10.50.2.60 state jump to normal
INFO [SharedPool-Worker–4] 
http://airmail.calendar/2015-11-25%2013:23:46%20GMT+3 Gossiper.java:970 - 
InetAddress /10.50.2.66 is now UP INFO [SharedPool-Worker–20] 
http://airmail.calendar/2015-11-25%2013:23:46%20GMT+3 Gossiper.java:970 - 
InetAddress /10.50.2.57 is now UP
INFO [SharedPool-Worker–1] 
http://airmail.calendar/2015-11-25%2013:23:46%20GMT+3 Gossiper.java:970 - 
InetAddress /10.50.2.66 is now UP INFO [SharedPool-Worker–5] 
http://airmail.calendar/2015-11-25%2013:23:46%20GMT+3 Gossiper.java:970 - 
InetAddress /10.50.2.66 is now UP
INFO [SharedPool-Worker–2] 
http://airmail.calendar/2015-11-25%2013:23:46%20GMT+3 Gossiper.java:970 - 
InetAddress /10.50.2.66 is now UP INFO [GossipStage:1] 
http://airmail.calendar/2015-11-25%2013:23:46%20GMT+3 StorageService.java:1869 
- Node /10.50.2.57 state jump to normal
INFO [ScheduledTasks:1] http://airmail.calendar/2015-11-25%2013:25:55%20GMT+3 
StatusLogger.java:51 - Pool Name Active Pending Completed Blocked All Time 
Blocked
INFO [ScheduledTasks:1] http://airmail.calendar/2015-11-25%2013:25:55%20GMT+3 
StatusLogger.java:55 - ReadStage 32 2753202 69509 0 0
INFO [ScheduledTasks:1] http://airmail.calendar/2015-11-25%2013:25:55%20GMT+3 
StatusLogger.java:55 - MutationStage 32 602 9197964 0 0

So, just after start it has 2753202 pending readstage tasks. And it takes about 
11 hours to complete them all.

So, what could be the reason? 

-- 
Vasiliy I Ozerov
Sent with Airmail

Reply via email to