[
https://issues.apache.org/jira/browse/CASSANDRA-7390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Joshua McKenzie updated CASSANDRA-7390:
---------------------------------------
Attachment: 7390_lockpendingcalc.txt
7390_conservative_v1.txt
Looks like I opened Pandora's Box on this one. A couple patches:
7390_conservative_v1.txt: cleans up a few underlying issues causing failures in
MoveTest. The logic on PendingRangeCalculatorService.blockUntilFinished()
relied on ThreadPoolExecutor.getActiveCount() to determine the # of running
threads - checking the documentation shows:
{code:title=getActiveCount}
/**
* Returns the approximate number of threads that are actively
* executing tasks.
*
* @return the number of threads
*/
public int getActiveCount() {...}
{code}
Changing that to an AtomicBoolean sentinel and getting rid of the now
unnecessary DiscardPolicy for extra threads took care of one race.
Also cleaned up the inter-test StorageService / TokenMetadata reset logic in
MoveTest.java, added some missing fields to TokenMetadata.resetUnsafe(), and
added another blockUntilFinished between a couple of back-to-back onChange
calls in testSimultaneousMove() that would infrequently race on
PendingRangeCalculatorService.update() calls.
With the above changes, unit test failure rate drops from 1 out of 5 runs to 1
out of 100 on average.
I'm a little wary of what that back-to-back onChange call seems to imply about
the design of the PendingRangeCalculatorService's queue'ing model - it looks
like we can have a race where we hit the PendingRangeCalculatorService for an
update while another one is already running, drop the new update() request with
the expectation that it'll be caught by the PendingRangeTask that's in-flight,
and have our logical update missed until the next update() call. This may just
be an artifact of unit-tests hitting onChange back-to-back being
non-representative of real use-cases timing.
7390_lockpendingcalc.txt: As to the final 1% failure - from my initial
inspection of TokenMetadata code and flow, it looks like we're pulling out
values from internal data structures in multiple discrete atomic steps in
calculatePendingRanges. This leaves scheduling holes where other threads
(variety of onChange paths, some nodetool commands, and brute-force unit-tests)
can hop in, grab the write lock, and make changes to TokenMetadata while our
pending calc is active:
{code:title=2 discrete read locks example}
BiMultiValMap<Token, InetAddress> bootstrapTokens = tm.getBootstrapTokens();
// No lock held here
Set<InetAddress> leavingEndpoints = tm.getLeavingEndpoints();
{code}
This patch migrates the pending range calculation inside TokenMetadata while
holding a readlock for its entirety and addresses the potential race that
occurs there (0 failures in > 1200 runs) - it also holds the readlock on
TokenMetadata for a pretty long chunk of time (8-9000 usec on a trivial
unit-test case on a laptop). That being said, reads will still operate during
these time windows and the infrequency of writes blocking / showing this race
indicates to me that we shouldn't have serious contention with this change.
Attaching a separate patch for moving pending range calc to TokenMetadata as
that's more invasive and might just be a race we're comfortable accepting to
keep contention down in the structure.
> MoveTest fails intermittently
> -----------------------------
>
> Key: CASSANDRA-7390
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7390
> Project: Cassandra
> Issue Type: Bug
> Reporter: Joshua McKenzie
> Assignee: Joshua McKenzie
> Priority: Minor
> Fix For: 3.0
>
> Attachments: 7390_conservative_v1.txt, 7390_lockpendingcalc.txt
>
>
> Reproduce with:
> for i in `seq 1 50`; do runTest MoveTest >> moveTestFailures.txt;done
> Looks to fail roughly once out of every 5 runs or so
> {code:title=failure}
> [junit] Testcase:
> newTestWriteEndpointsDuringMove(org.apache.cassandra.service.MoveTest):
> FAILED
> [junit] mismatched number of moved token expected:<0> but was:<1>
> [junit] junit.framework.AssertionFailedError: mismatched number of moved
> token expected:<0> but was:<1>
> [junit] at
> org.apache.cassandra.service.MoveTest.newTestWriteEndpointsDuringMove(MoveTest.java:140)
> {code}
> Edit: had this as Windows-only - turns out it's failing on on *nix as well
--
This message was sent by Atlassian JIRA
(v6.2#6252)