[ https://issues.apache.org/jira/browse/CASSANDRA-11432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15257121#comment-15257121 ]
Dikang Gu commented on CASSANDRA-11432:
---------------------------------------
[~iamaleksey], yes, I'm trying to figure out why the repair is causing
problems. Here is what I observed:
1. Repair generates thousands of small sstables within seconds for compaction
to work through; the LCS state looks like this (a minimal monitoring sketch
follows the log excerpt below):
SSTables in each level: [966/4, 20/10, 152/100, 33, 0, 0, 0, 0, 0]
2. Dropped messages in the log:
2016-04-25_21:35:51.21671 INFO 21:35:51 [ScheduledTasks:1]: MUTATION messages were dropped in last 5000 ms: 0 for internal timeout and 358 for cross node timeout
2016-04-25_21:35:51.21674 INFO 21:35:51 [ScheduledTasks:1]: READ messages were dropped in last 5000 ms: 0 for internal timeout and 90 for cross node timeout
2016-04-25_21:35:51.21674 INFO 21:35:51 [ScheduledTasks:1]: COUNTER_MUTATION messages were dropped in last 5000 ms: 0 for internal timeout and 21 for cross node timeout
2016-04-25_21:35:51.21674 INFO 21:35:51 [ScheduledTasks:1]: Pool Name                    Active   Pending      Completed   Blocked  All Time Blocked
2016-04-25_21:35:51.21798 INFO 21:35:51 [ScheduledTasks:1]: MutationStage                     0         0     1009884950         0                 0
2016-04-25_21:35:51.21810 INFO 21:35:51 [ScheduledTasks:1]: ReadStage                         0         0      347247977         0                 0
2016-04-25_21:35:51.21828 INFO 21:35:51 [ScheduledTasks:1]: RequestResponseStage              0         0     1070811306         0                 0
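(As an aside, here is a minimal sketch of one way to watch those per-level SSTable counts while repair runs; the keyspace/table names are placeholders, and it just polls `nodetool cfstats` for the same "SSTables in each level" line quoted above, where "966/4" means 966 SSTables sitting in a level that expects at most 4.)
{code:python}
import re
import subprocess
import time

# Placeholder names; substitute the counter keyspace/table under repair.
KEYSPACE, TABLE = "app", "counts"

# Matches the LCS summary line printed by `nodetool cfstats`, e.g.
#   SSTables in each level: [966/4, 20/10, 152/100, 33, 0, 0, 0, 0, 0]
LEVEL_RE = re.compile(r"SSTables in each level:\s*\[([^\]]*)\]")

def sstables_per_level():
    out = subprocess.run(
        ["nodetool", "cfstats", "{}.{}".format(KEYSPACE, TABLE)],
        capture_output=True, text=True, check=True).stdout
    match = LEVEL_RE.search(out)
    return match.group(1).split(", ") if match else None

if __name__ == "__main__":
    # Print the level counts every 5 seconds while repair is running.
    while True:
        print(time.strftime("%H:%M:%S"), sstables_per_level())
        time.sleep(5)
{code}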
Do you have any advice about which part of the code I should look at?
Thanks!
> Counter values become under-counted when running repair.
> --------------------------------------------------------
>
> Key: CASSANDRA-11432
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11432
> Project: Cassandra
> Issue Type: Bug
> Reporter: Dikang Gu
> Assignee: Aleksey Yeschenko
>
> We are experimenting with counters in Cassandra 2.2.5. Our setup is 6 nodes
> across three regions, with a replication factor of 2 in each region, so each
> node holds a full copy of the data.
> We write to the cluster with CL = 2 and read with CL = 1.
> We are doing 30k counter increments/decrements per second per node, and at
> the same time we are double-writing to our MySQL tier, so that we can measure
> the accuracy of the C* counters against MySQL.
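> For illustration, a rough sketch of this write/read pattern with the DataStax
> Python driver (contact point, keyspace, table, and key names below are
> placeholders; the MySQL double-write is only indicated as a comment):
> {code:python}
> from cassandra import ConsistencyLevel
> from cassandra.cluster import Cluster
>
> # Placeholder contact point; assumed schema:
> #   CREATE TABLE counts.page_hits (page_id text PRIMARY KEY, hits counter);
> cluster = Cluster(["10.0.0.1"])
> session = cluster.connect("counts")
>
> # Counter deltas are written at CL = TWO ...
> incr = session.prepare("UPDATE page_hits SET hits = hits + ? WHERE page_id = ?")
> incr.consistency_level = ConsistencyLevel.TWO
>
> # ... and read back at CL = ONE.
> read = session.prepare("SELECT hits FROM page_hits WHERE page_id = ?")
> read.consistency_level = ConsistencyLevel.ONE
>
> session.execute(incr, (1, "home"))
> # The same +1 delta is also applied to the MySQL tier, so the two totals
> # can be compared later.
>
> rows = session.execute(read, ("home",))
> print(rows[0].hits)
> {code}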
> The experiment results were great at the beginning: the counter values in C*
> and MySQL were very close, with a difference of less than 0.1%.
> But once we started running repair on one node, the counter values in C*
> became much smaller than the values in MySQL, and the difference grew to more
> than 1%.
> My question: is it a known problem that counter values become under-counted
> while repair is running? Should we avoid running repair on counter tables?
> Thanks.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)