[
https://issues.apache.org/jira/browse/CASSANDRA-9591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14598381#comment-14598381
]
mck edited comment on CASSANDRA-9591 at 6/23/15 9:21 PM:
---------------------------------------------------------
bq. If it's impossible to have those values wired up…
It's possible to "wire" these values up with a poor man's approach: a
complete pass through the data file. That's pretty wasteful, but we're only
talking about the edge case of the StandaloneScrubber here.
e.g.
{code}
private void findFirstAndLast() throws IOException
{
    // we have no primary index. take a pass through the data file to assign
    // first and last. costly, but only for StandaloneScrubber
    try (RandomAccessReader dataFile = openDataReader())
    {
        while (!dataFile.isEOF())
        {
            DecoratedKey decoratedKey =
                partitioner.decorateKey(ByteBufferUtil.readWithShortLength(dataFile));
            if (first == null)
                first = decoratedKey;
            last = decoratedKey;
            // drain the row's atoms so the reader ends up at the next row key
            SSTableIdentityIterator atoms = new SSTableIdentityIterator(this,
                                                                        dataFile, decoratedKey, false);
            while (atoms.hasNext())
                atoms.next();
        }
    }
    first = getMinimalKey(first);
    last = getMinimalKey(last);
}
{code}
Would you rather see the flag passed into {{updateLiveSet()}}?
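For anyone who wants to try the single-pass idea outside of Cassandra, here is a minimal standalone sketch. It does not use the real sstable format or any Cassandra classes: the record layout (unsigned short key length, key bytes, int body length, body bytes) and the names {{FirstLastScan}}, {{Bounds}} and {{writeRecord}} are all assumptions made up for illustration. It only shows the same shape as the method above: read each key, remember the first and the most recent one, and skip over the row body to reach the next key.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical toy example; NOT the Cassandra on-disk format.
public class FirstLastScan
{
    /** First and last keys seen in the scan; both are null for empty input. */
    public static final class Bounds
    {
        public final String first;
        public final String last;

        Bounds(String first, String last)
        {
            this.first = first;
            this.last = last;
        }
    }

    /** Writes one record in the assumed toy layout: [u16 keyLen][key][i32 bodyLen][body]. */
    public static void writeRecord(DataOutputStream out, String key, byte[] body) throws IOException
    {
        byte[] k = key.getBytes(StandardCharsets.UTF_8);
        out.writeShort(k.length);
        out.write(k);
        out.writeInt(body.length);
        out.write(body);
    }

    /** One full pass over the data: no index is consulted, every record is visited. */
    public static Bounds findFirstAndLast(byte[] data) throws IOException
    {
        String first = null;
        String last = null;
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data)))
        {
            while (in.available() > 0)
            {
                // read the length-prefixed key
                byte[] key = new byte[in.readUnsignedShort()];
                in.readFully(key);
                String k = new String(key, StandardCharsets.UTF_8);
                if (first == null)
                    first = k;
                last = k;

                // skip the body so the stream is positioned at the next key
                int bodyLen = in.readInt();
                if (bodyLen < 0 || in.skipBytes(bodyLen) != bodyLen)
                    throw new EOFException("truncated record body for key " + k);
            }
        }
        return new Bounds(first, last);
    }
}
```

Note that the sketch returns null bounds for an empty input rather than passing them on, which is a choice made here and not something the patch above does.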
> Scrub (recover) sstables even when -Index.db is missing
> -------------------------------------------------------
>
> Key: CASSANDRA-9591
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9591
> Project: Cassandra
> Issue Type: Improvement
> Reporter: mck
> Assignee: mck
> Labels: sstablescrub
> Fix For: 2.0.x
>
> Attachments: 9591-2.0.txt, 9591-2.1.txt
>
>
> Today SSTableReader needs at minimum 3 files to load an sstable:
> - -Data.db
> - -CompressionInfo.db
> - -Index.db
> But during the scrub process the -Index.db file isn't actually necessary,
> unless there's corruption in the -Data.db and we want to be able to skip over
> corrupted rows. Given that there is still a fair chance that there's nothing
> wrong with the -Data.db file and we're just missing the -Index.db file this
> patch addresses that situation.
> So the following patch makes it possible for the StandaloneScrubber
> (sstablescrub) to recover sstables despite missing -Index.db files.
> This can happen after a catastrophic incident where data directories have been
> lost and/or corrupted, or wiped while the backup is not healthy. I'm aware that
> normally one depends on replicas or snapshots to avoid such situations, but
> such catastrophic incidents do occur in the wild.
> I have not tested this patch against normal c* operations and all the other
> (more critical) ways SSTableReader is used. I'll happily do that and add the
> needed unit tests if people see merit in accepting the patch.
> Otherwise the patch can live with the issue, in case anyone else needs it.
> There's also a cassandra distribution bundled with the patch
> [here|https://github.com/michaelsembwever/cassandra/releases/download/2.0.15-recover-sstables-without-indexdb/apache-cassandra-2.0.15-recover-sstables-without-indexdb.tar.gz]
> to make life a little easier for anyone finding themselves in such a bad
> situation.