[
https://issues.apache.org/jira/browse/CASSANDRA-10515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jeff Griffith updated CASSANDRA-10515:
--------------------------------------
Description:
After upgrading from Cassandra 2.0.x to 2.1.10, we began seeing problems where
some nodes exceed the 12G commit log maximum we configured and grow as high as
65G or more. Once a node reaches this state, "nodetool compactionstats" hangs.
Eventually C* restarts without errors, the cleanup occurs, and the commit logs
shrink back down again.
{code}
[email protected]:~$ ndc
pending tasks: 2185
compaction type  keyspace  table  completed    total         unit   progress
Compaction       SyncCore  *cf1*  61251208033  170643574558  bytes  35.89%
Compaction       SyncCore  *cf2*  19262483904  19266079916   bytes  99.98%
Compaction       SyncCore  *cf3*  6592197093   6592316682    bytes  100.00%
Compaction       SyncCore  *cf4*  3411039555   3411039557    bytes  100.00%
Compaction       SyncCore  *cf5*  2879241009   2879487621    bytes  99.99%
Compaction       SyncCore  *cf6*  21252493623  21252635196   bytes  100.00%
Compaction       SyncCore  *cf7*  81009853587  81009854438   bytes  100.00%
Compaction       SyncCore  *cf8*  3005734580   3005768582    bytes  100.00%
Active compaction remaining time : n/a
{code}
I was also running "nodetool tpstats" periodically; the calls worked but were
not logged in system.log on the StatusLogger thread until after the compaction
started working again.
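For anyone trying to catch a node in this state, here is a minimal sketch of a
check that compares the on-disk commit log size against the configured cap. It
is not part of Cassandra. The directory path is an assumption (adjust it to the
commitlog_directory in your cassandra.yaml), the 12G limit is simply the value
we configured, and the function names are hypothetical.

```python
import os


def dir_size_bytes(path):
    """Return the total size in bytes of all files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                # A segment can disappear between listing and stat; skip it.
                pass
    return total


def commitlog_over_limit(path, limit_bytes):
    """Return (size_in_bytes, True if the size exceeds limit_bytes)."""
    size = dir_size_bytes(path)
    return size, size > limit_bytes


if __name__ == "__main__":
    # Assumed location; use the commitlog_directory from your cassandra.yaml.
    commitlog_dir = "/var/lib/cassandra/commitlog"
    # The 12G cap from this report, expressed in bytes.
    limit = 12 * 1024 ** 3
    size, over = commitlog_over_limit(commitlog_dir, limit)
    print("commit log size: %d bytes, over limit: %s" % (size, over))
```

Run from cron or a monitoring agent, this only reports that the cap has been
exceeded; it says nothing about why the segments are not being released.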
was:
After upgrading from Cassandra 2.0.x to 2.1.10, we began seeing problems where
some nodes exceed the 12G commit log maximum we configured and grow as high as
65G or more. Once a node reaches this state, "nodetool compactionstats" hangs.
I watched the recovery live when compactions began happening again: "nodetool
compactionstats" suddenly completed and showed the outstanding jobs, most of
them at 100% completion:
{code}
[email protected]:~$ ndc
pending tasks: 2185
compaction type  keyspace  table  completed    total         unit   progress
Compaction       SyncCore  *cf1*  61251208033  170643574558  bytes  35.89%
Compaction       SyncCore  *cf2*  19262483904  19266079916   bytes  99.98%
Compaction       SyncCore  *cf3*  6592197093   6592316682    bytes  100.00%
Compaction       SyncCore  *cf4*  3411039555   3411039557    bytes  100.00%
Compaction       SyncCore  *cf5*  2879241009   2879487621    bytes  99.99%
Compaction       SyncCore  *cf6*  21252493623  21252635196   bytes  100.00%
Compaction       SyncCore  *cf7*  81009853587  81009854438   bytes  100.00%
Compaction       SyncCore  *cf8*  3005734580   3005768582    bytes  100.00%
Active compaction remaining time : n/a
{code}
I was also running "nodetool tpstats" periodically; the calls worked but were
not logged in system.log on the StatusLogger thread until after the compaction
started working again.
> Commit logs back up with move to 2.1.10
> ---------------------------------------
>
> Key: CASSANDRA-10515
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10515
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Environment: redhat 6.5, cassandra 2.1.10
> Reporter: Jeff Griffith
> Priority: Critical
> Attachments: CommitLogProblem.jpg
>
>
> After upgrading from Cassandra 2.0.x to 2.1.10, we began seeing problems
> where some nodes exceed the 12G commit log maximum we configured and grow as
> high as 65G or more. Once a node reaches this state, "nodetool
> compactionstats" hangs. Eventually C* restarts without errors, the cleanup
> occurs, and the commit logs shrink back down again.
> {code}
> [email protected]:~$ ndc
> pending tasks: 2185
> compaction type  keyspace  table  completed    total         unit   progress
> Compaction       SyncCore  *cf1*  61251208033  170643574558  bytes  35.89%
> Compaction       SyncCore  *cf2*  19262483904  19266079916   bytes  99.98%
> Compaction       SyncCore  *cf3*  6592197093   6592316682    bytes  100.00%
> Compaction       SyncCore  *cf4*  3411039555   3411039557    bytes  100.00%
> Compaction       SyncCore  *cf5*  2879241009   2879487621    bytes  99.99%
> Compaction       SyncCore  *cf6*  21252493623  21252635196   bytes  100.00%
> Compaction       SyncCore  *cf7*  81009853587  81009854438   bytes  100.00%
> Compaction       SyncCore  *cf8*  3005734580   3005768582    bytes  100.00%
> Active compaction remaining time : n/a
> {code}
> I was also running "nodetool tpstats" periodically; the calls worked but were
> not logged in system.log on the StatusLogger thread until after the
> compaction started working again.