[
https://issues.apache.org/jira/browse/CASSANDRA-18675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17743987#comment-17743987
]
David Capwell commented on CASSANDRA-18675:
-------------------------------------------
This is what I think is happening so far:
{code}
// Values observed while debugging: txnId.epoch() == 4, executeAt.epoch() == 5
topologies = node.topology().withUnsyncedEpochs(route, txnId.epoch(), executeAt.epoch());
{code}
Here, epoch 4 knows about the keyspace but epoch 5 does not; in other words, we
are executing the txn in a topology that would have rejected it in the first
place...
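
To make the suspected failure mode concrete, here is a minimal, self-contained sketch. It does not use Accord's real types: `EpochRangeSketch`, `KeyspaceRange`, `covers` and `epochsMissing` are hypothetical names invented for illustration, and the only thing modelled is the assumption that a range carries its keyspace, so dropping the keyspace removes the range from the next epoch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical model for illustration only; these are NOT Accord's classes.
final class EpochRangeSketch {
    // A token range qualified by its keyspace, as the ticket describes.
    static final class KeyspaceRange {
        final String keyspace;
        final long start; // exclusive
        final long end;   // inclusive

        KeyspaceRange(String keyspace, long start, long end) {
            this.keyspace = keyspace;
            this.start = start;
            this.end = end;
        }

        boolean contains(String ks, long token) {
            return keyspace.equals(ks) && token > start && token <= end;
        }
    }

    // epoch -> ranges known to the topology in that epoch
    private final TreeMap<Long, List<KeyspaceRange>> byEpoch = new TreeMap<>();

    void addEpoch(long epoch, List<KeyspaceRange> ranges) {
        byEpoch.put(epoch, List.copyOf(ranges));
    }

    // True if the topology for this epoch has a range containing the key.
    boolean covers(long epoch, String ks, long token) {
        List<KeyspaceRange> ranges = byEpoch.get(epoch);
        if (ranges == null) return false;
        for (KeyspaceRange r : ranges)
            if (r.contains(ks, token)) return true;
        return false;
    }

    // Epochs in [minEpoch, maxEpoch] whose topology does not cover the key.
    List<Long> epochsMissing(long minEpoch, long maxEpoch, String ks, long token) {
        List<Long> missing = new ArrayList<>();
        for (long e = minEpoch; e <= maxEpoch; e++)
            if (!covers(e, ks, token)) missing.add(e);
        return missing;
    }

    // Epoch 4 has the keyspace; epoch 5 is the topology after DROP KEYSPACE.
    static EpochRangeSketch example() {
        EpochRangeSketch t = new EpochRangeSketch();
        t.addEpoch(4, List.of(new KeyspaceRange("ks1", 0, 100)));
        t.addEpoch(5, List.of());
        return t;
    }

    public static void main(String[] args) {
        // A txn created in epoch 4 that executes in epoch 5 touches a key
        // that epoch 5 no longer owns.
        System.out.println(example().epochsMissing(4, 5, "ks1", 42)); // prints [5]
    }
}
```

In this toy model, a txn whose txnId epoch is 4 and whose executeAt epoch is 5 spans an epoch that has no range for its key, which is the situation described above.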
> (Accord): C* stores keyspace in Range which will cause ranges to be removed
> from Accord when DROP KEYSPACE is performed
> -----------------------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-18675
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18675
> Project: Cassandra
> Issue Type: Bug
> Components: Accord
> Reporter: David Capwell
> Assignee: David Capwell
> Priority: Normal
> Fix For: 5.x
>
>
> When translating the C* model to Accord we store the keyspace in the Range;
> this has the side effect that DROP KEYSPACE causes Accord to see ranges
> removed between epoch versions. When this is enabled in BurnTest, there are
> cases where we stop making progress and can loop forever (until we OOM).
> Example
> {code}
> Failed on seed -1330125844737109546
> accord.burn.SimulationException: Failed on seed -1330125844737109546
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> at java.util.HashMap.newNode(HashMap.java:1774)
> at java.util.HashMap.putVal(HashMap.java:632)
> at java.util.HashMap.put(HashMap.java:613)
> at accord.impl.InMemoryCommandStore$InMemorySafeStore.addCommandInternal(InMemoryCommandStore.java:582)
> at accord.impl.InMemoryCommandStore$InMemorySafeStore.addCommandInternal(InMemoryCommandStore.java:557)
> at accord.impl.AbstractSafeCommandStore$$Lambda$516/1241529534.accept(Unknown Source)
> at accord.impl.AbstractSafeCommandStore.getIfLoaded(AbstractSafeCommandStore.java:84)
> at accord.impl.AbstractSafeCommandStore.getInternalIfLoadedAndInitialised(AbstractSafeCommandStore.java:91)
> at accord.local.SafeCommandStore.ifLoadedAndInitialised(SafeCommandStore.java:194)
> at accord.local.Commands.lambda$updateWaitingOn$3(Commands.java:687)
> at accord.local.Commands$$Lambda$514/809300666.accept(Unknown Source)
> at accord.utils.SimpleBitSet.reverseForEach(SimpleBitSet.java:358)
> at accord.local.Command$WaitingOn$Update.forEachWaitingOnCommit(Command.java:1280)
> at accord.local.Commands.updateWaitingOn(Commands.java:685)
> at accord.local.Commands.initialiseWaitingOn(Commands.java:675)
> at accord.local.Commands.apply(Commands.java:481)
> at accord.messages.Apply.apply(Apply.java:121)
> at accord.messages.Apply.apply(Apply.java:34)
> {code}