[
https://issues.apache.org/jira/browse/CASSANDRA-8429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14251911#comment-14251911
]
Benedict commented on CASSANDRA-8429:
-------------------------------------
Apologies for the delay on this one. I've had a bit of fun digging through it
and, in the end, cleaning up some of the inadequate decisions I made first time
around so that this code is hopefully easier for everyone to read and more
obviously correct (I was having trouble convincing myself prior).
Mostly I've been trying to figure out what on earth was changed for this patch
to introduce an infrequent but persistent problem with reference counts.
However after reslicing the code I have tracked it down to an unrelated already
present bug that I will file and fix separately, that was clearly just being
made more likely with the change, timing-wise. The positive upshot, though, is
that I have made the code this patch is based on quite a bit easier to read,
and I hope more clearly correct as a result. I've uploaded
[here|https://github.com/belliottsmith/cassandra/commits/8429].
The patch is split into 5 commits:
# Improves legibility by removing Pair and replacing with a bespoke class
# Merges all of the new methods introduced in this (and related) patches to an
enum flag provided to a single method (in each necessary class). Also fixes a
minor memory leak introduced here for compression metadata.
# Fixes an unrelated bug that was somehow being exercised here and preventing a
clean run of unit tests
# Refactors the abort() and finish() methods in SSTableRewriter to make them
more legible and also make the cleanup more obviously correct. This is the only
complex review to do, though it's not too bad:
#* The finishedEarly and finishedWriters collections are merged, and a special
discard collection is introduced to take the place of the former _only_ when
the sstable has a replacement. On closing a writer we immediately remove it
from finishedEarly and move its most recent reader to discard.
#* The cleanup of discardable readers has been split out into a shared method
used by both abort() and finish()
#* finish() and finishAndThrow() have been merged so that they're obviously the
same
#* Instead of managing the Iterator ourselves, I've wrapped it in an Iterator
that removes as we progress, so the complex methods have less boilerplate
# Removes the use of sharesBfWith() as setReplacedBy() is both more correct and
easier to reason about (sharesBfWith could result in a leak or a memory bug if
sstables are released out of order), also fixes the deletion logic on cleanup
that may explain why this wasn't used previously
We can probably split some of these commits into another ticket if you prefer,
but I'd feel most comfortable committing all of them together. I've filed
separate tickets for all of the other issues I came across while reviewing that
are easily delayed or done separately.
> Some keys unreadable during compaction
> --------------------------------------
>
> Key: CASSANDRA-8429
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8429
> Project: Cassandra
> Issue Type: Bug
> Environment: Ubuntu 14.04
> Reporter: Ariel Weisberg
> Assignee: Marcus Eriksson
> Fix For: 2.1.3
>
> Attachments: cluster.conf, run_stress.sh
>
>
> Starts as part of merge commit 25be46497a8df46f05ffa102bc645bfd684ea48a
> Stress will say that a key wasn't validated because it isn't returned even
> though it's loaded. The key will eventually appear and can be queried using
> cqlsh.
> Reproduce with
> #!/bin/sh
> ROWCOUNT=10000000
> SCHEMA='-col n=fixed(1) -schema
> compaction(strategy=LeveledCompactionStrategy) compression=LZ4Compressor'
> ./cassandra-stress write n=$ROWCOUNT -node xh61 -pop seq=1..$ROWCOUNT no-wrap
> -rate threads=25 $SCHEMA
> ./cassandra-stress mixed "ratio(read=2)" n=100000000 -node xh61 -pop
> "dist=extreme(1..$ROWCOUNT,0.6)" -rate threads=25 $SCHEMA
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)