[
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15207380#comment-15207380
]
Ruoran Wang commented on CASSANDRA-9935:
----------------------------------------
Yes, I am able to reproduce this with a new keyspace.
{noformat}
CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;
CREATE TABLE test.ui_by_modification (
    bucket int,
    modified_hour timestamp,
    user_id bigint,
    challenge_id uuid,
    created timestamp,
    creator_user_id bigint,
    type int,
    PRIMARY KEY ((bucket, modified_hour), user_id, challenge_id)
) WITH CLUSTERING ORDER BY (user_id ASC, challenge_id ASC)
    AND bloom_filter_fp_chance = 0.1
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 604800
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';
{noformat}
Then I generate data using:
{noformat}
Long creatorId = (long) random.nextInt(100000000);
UUID uuid = UUID_GENERATOR.generate();
int type = random.nextInt(10);
getIdCache().put(creatorId, uuid);
Date date = DateTime.now(DateTimeZone.UTC).toDate();
try {
    runQuery(
        "insert into test.ui_by_modification(bucket, modified_hour, "
            + "user_id, challenge_id, created, creator_user_id, type) "
            + "VALUES (?, ?, ?, ?, ?, ?, ?)",
        new Random().nextInt(1024), date, creatorId,
        UUID_GENERATOR.generate(), date, creatorId, type
    );
} catch (Exception e) {
    log.error("error", e);
}
{noformat}
I run ~200 inserts per second. I then started the first round of incremental
repairs, repair -pr -par --in-local-dc -inc -- test, on the 6 nodes in the
cluster. I waited ~1.5 hours, ran the same incremental repair again, and got
the same error.
I think there is a correlation between the composite partition key and this
error.
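For a rough sense of the data volume involved, here is a minimal sketch of the arithmetic behind the numbers above. It assumes the insert rate stays at a steady 200 per second for the whole 1.5 hours and that the bucket values are spread uniformly over 0..1023; the class and method names are hypothetical and not part of the reproduction code.

```java
public class ReproLoadEstimate {
    // Total inserts issued at a steady rate over the given number of seconds.
    static long totalInserts(long insertsPerSecond, long seconds) {
        return insertsPerSecond * seconds;
    }

    // Average number of inserts per bucket value, assuming a uniform spread.
    static double insertsPerBucket(long totalInserts, int bucketCount) {
        return (double) totalInserts / bucketCount;
    }

    public static void main(String[] args) {
        long seconds = 90 * 60; // ~1.5 hours between the two repair rounds
        long total = totalInserts(200, seconds);
        System.out.println("inserts between repairs: " + total);
        System.out.println("average per bucket value: " + insertsPerBucket(total, 1024));
    }
}
```

Under these assumptions, roughly a million rows are written between the two repair rounds. The table's default_time_to_live of 604800 seconds is 7 days, so none of them should have expired by the time of the second repair.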
> Repair fails with RuntimeException
> ----------------------------------
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
> Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
> Reporter: mlowicki
> Assignee: Yuki Morishita
> Fix For: 2.1.x
>
> Attachments: db1.sync.lati.osa.cassandra.log,
> db5.sync.lati.osa.cassandra.log, system.log.10.210.3.117,
> system.log.10.210.3.221, system.log.10.210.3.230
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702). After upgrading
> to 2.1.8 repair became faster, but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde
> for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde
> for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde
> for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde
> for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde
> for range (6084602890817326921,6088328703025510057] finished
> [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde
> for range (-781874602493000830,-781745173070807746] finished
> [2015-07-29 20:44:03,957] Repair command #4 finished
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
> {code}
> After running:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc sync
> {code}
> The last repair-related records in the logs are:
> {code}
> INFO [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 -
> Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range
> (-7695808664784761779,-7693529816291585568] finished
> INFO [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 -
> Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range
> (8063716953988492222,8065203836608925992] finished
> INFO [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 -
> Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range
> (-5474076923322749342,-5468600594078911162] finished
> INFO [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 -
> Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range
> (-8631877858109464676,-8624040066373718932] finished
> INFO [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 -
> Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range
> (-5372806541854279315,-5369354119480076785] finished
> INFO [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 -
> Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range
> (8166489034383821955,8168408930184216281] finished
> INFO [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 -
> Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range
> (6084602890817326921,6088328703025510057] finished
> INFO [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 -
> Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range
> (-781874602493000830,-781745173070807746] finished
> {code}
> but slightly earlier I see (at least twice in the attached log):
> {code}
> ERROR [Thread-173887] 2015-07-29 20:44:03,853 StorageService.java:2959 -
> Repair session 1b07ea50-3608-11e5-a93e-4963524a8bde for range
> (5765414319217852786,5781018794516851576] failed with error
> org.apache.cassandra.exceptions.RepairException: [repair
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2,
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> java.util.concurrent.ExecutionException: java.lang.RuntimeException:
> org.apache.cassandra.exceptions.RepairException: [repair
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2,
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> [na:1.7.0_80]
> at java.util.concurrent.FutureTask.get(FutureTask.java:188)
> [na:1.7.0_80]
> at
> org.apache.cassandra.service.StorageService$4.runMayThrow(StorageService.java:2950)
> ~[apache-cassandra-2.1.8.jar:2.1.8]
> at
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> [apache-cassandra-2.1.8.jar:2.1.8]
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> [na:1.7.0_80]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> [na:1.7.0_80]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
> Caused by: java.lang.RuntimeException:
> org.apache.cassandra.exceptions.RepairException: [repair
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2,
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> at com.google.common.base.Throwables.propagate(Throwables.java:160)
> ~[guava-16.0.jar:na]
> at
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32)
> [apache-cassandra-2.1.8.jar:2.1.8]
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> [na:1.7.0_80]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> [na:1.7.0_80]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> ~[na:1.7.0_80]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> ~[na:1.7.0_80]
> ... 1 common frames omitted
> Caused by: org.apache.cassandra.exceptions.RepairException: [repair
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2,
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> at
> org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:166)
> ~[apache-cassandra-2.1.8.jar:2.1.8]
> at
> org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:406)
> ~[apache-cassandra-2.1.8.jar:2.1.8]
> at
> org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:134)
> ~[apache-cassandra-2.1.8.jar:2.1.8]
> at
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62)
> ~[apache-cassandra-2.1.8.jar:2.1.8]
> ... 3 common frames omitted
> INFO [Thread-173887] 2015-07-29 20:44:03,854 StorageService.java:2952 -
> Repair session 846d9300-3608-11e5-a93e-4963524a8bde for range
> (-6705935742755245856,-6704072966568763453] finished
> {code}