[
https://issues.apache.org/jira/browse/CASSANDRA-9591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617156#comment-14617156
]
Benedict commented on CASSANDRA-9591:
-------------------------------------
Thanks. Committed.
Could I ask that in future when merging up the branches, you leave the full
merge commits in place? i.e. when constructing your branches, start with 2.0
(or whatever the lowest affected branch is), merge that into 2.1 (without
--no-commit or --squash), and continue this all the way up to trunk. This is
how patches must be committed into mainline, and helps committers with conflict
resolutions when applying your patches.
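
A minimal sketch of that merge-up flow, run in a throwaway repository so it
can be tried safely. The topic branch names (9591-2.0, 9591-2.1, 9591-trunk),
the file names and the file contents are assumptions for illustration only;
the cassandra-2.0, cassandra-2.1 and trunk branches here are local stand-ins,
not the real mainline.

```shell
#!/bin/sh
set -eu

# Throwaway repository standing in for the real tree (demo assumption).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.invalid"

# Stand-in mainline branches that have diverged from one another.
echo base > Scrubber.txt
git add Scrubber.txt
git commit -q -m "base"
git branch -M cassandra-2.0
git checkout -q -b cassandra-2.1
echo "2.1 only" > news-2.1.txt
git add news-2.1.txt
git commit -q -m "2.1-only change"
git checkout -q -b trunk
echo "trunk only" > news-trunk.txt
git add news-trunk.txt
git commit -q -m "trunk-only change"

# 1. Commit the fix on the lowest affected branch (topic names are hypothetical).
git checkout -q -b 9591-2.0 cassandra-2.0
echo fix >> Scrubber.txt
git commit -q -am "Scrub sstables even when -Index.db is missing"

# 2. Merge upward one branch at a time, keeping the full merge commits
#    (no --squash, no --no-commit).
git checkout -q -b 9591-2.1 cassandra-2.1
git merge -q --no-edit 9591-2.0
git checkout -q -b 9591-trunk trunk
git merge -q --no-edit 9591-2.1

# Count the merge commits reachable from the trunk topic branch.
merges=$(git rev-list --merges --count 9591-trunk)
echo "merge commits on 9591-trunk: $merges"
```

Each topic branch then contains the one below it as a real merge parent, so
any conflict resolution made along the way is recorded in the history rather
than flattened into a fresh patch.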
AFAICT you have created completely distinct branches, each sourced from its
mainline branch. This makes it a lot more labour-intensive for me to merge
them upwards myself when (inevitably) one of the source branches diverges, as
I have to construct my own version of each, rebase them, and perform diffs and
patch applications. If the merge commits were in place, I could (in theory) use
git rerere to resolve conflicts without any manual intervention, ensuring we do
not break upstream due to my misapplication of your changes.
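
For readers unfamiliar with git rerere, here is a minimal sketch of its
record-and-replay behaviour in a throwaway repository. The branch names (main,
topic) and the file are hypothetical, and the sketch only shows a resolution
being recorded once and reused on a repeated merge; it does not show how
recorded resolutions would be derived from a contributor's existing merge
commits.

```shell
#!/bin/sh
set -eu

# Throwaway repository (demo assumption).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.invalid"
git config rerere.enabled true   # record and reuse conflict resolutions

echo base > f.txt
git add f.txt
git commit -q -m "base"
git branch -M main

# Two branches change the same line, so merging them conflicts.
git checkout -q -b topic
echo topic > f.txt
git commit -q -am "topic change"
git checkout -q main
echo main > f.txt
git commit -q -am "main change"

# First merge: conflict, resolved by hand once. rerere records the
# conflict at merge time and the resolution at commit time.
git merge topic >/dev/null 2>&1 || true
echo merged > f.txt
git add f.txt
git commit -q -m "Merge topic"

# Undo the merge and repeat it: rerere reapplies the recorded resolution
# to the working tree file without manual editing.
git reset -q --hard HEAD~1
git merge topic >/dev/null 2>&1 || true
cat f.txt
```

The repeated merge still stops for review, since the reused resolution is
written to the working tree but not staged unless rerere.autoupdate is set.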
> Scrub (recover) sstables even when -Index.db is missing
> -------------------------------------------------------
>
> Key: CASSANDRA-9591
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9591
> Project: Cassandra
> Issue Type: Improvement
> Reporter: mck
> Assignee: mck
> Labels: benedict-to-commit, sstablescrub
> Fix For: 2.0.x
>
> Attachments: 9591-2.0.txt, 9591-2.1.txt
>
>
> Today SSTableReader needs at minimum 3 files to load an sstable:
> - -Data.db
> - -CompressionInfo.db
> - -Index.db
> But during the scrub process the -Index.db file isn't actually necessary,
> unless there's corruption in the -Data.db and we want to be able to skip over
> corrupted rows. There is still a fair chance that nothing is wrong with the
> -Data.db file and only the -Index.db file is missing; this patch addresses
> that situation.
> So the following patch makes it possible for the StandaloneScrubber
> (sstablescrub) to recover sstables despite missing -Index.db files.
> This can happen after a catastrophic incident in which data directories have
> been lost, corrupted, or wiped and the backup is not healthy. I'm aware that
> normally one depends on replicas or snapshots to avoid such situations, but
> such catastrophic incidents do occur in the wild.
> I have not tested this patch against normal C* operations and all the other
> (more critical) ways SSTableReader is used. I'll happily do that and add the
> needed unit tests if people see merit in accepting the patch.
> Otherwise the patch can live with the issue, in case anyone else needs it.
> There's also a Cassandra distribution bundled with the patch
> [here|https://github.com/michaelsembwever/cassandra/releases/download/2.0.15-recover-sstables-without-indexdb/apache-cassandra-2.0.15-recover-sstables-without-indexdb.tar.gz]
> to make life a little easier for anyone finding themselves in such a bad
> situation.