[
https://issues.apache.org/jira/browse/CASSANDRA-9742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617790#comment-14617790
]
Jeff Jirsa edited comment on CASSANDRA-9742 at 7/8/15 1:17 AM:
---------------------------------------------------------------
Operator perspective, fwiw: I already have repair schedules. I already know
what needs to be repaired and what doesn't. What I didn't have, previously, was
a way to validate the files on disk actually matched what I believed they
matched, short of running scrub.
`verify` was very literally `read only scrub` - when I wrote 5791, I followed
the scrub code path very closely, because that was the use case I was worried
about when I wrote it (the concern was bit level corruption due to failing
HDD/RAID controller - scrub would do the job, but it's a heavy hammer hitting a
tiny nail). The notion of "verify this node has all the data" was already
covered by repair, so I never even considered having `verify` do that.
Why not just (add a flag to) enable incremental repair to validate checksums
for all sstables? The verifier will {{mutateRepairedAt(sstable.descriptor,
ActiveRepairService.UNREPAIRED_SSTABLE)}} on checksum failure, which then
allows incremental repair to re-repair that data.
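A rough sketch of that idea, for illustration only: checksum each sstable read-only and, on mismatch, reset its repaired-at marker so the next incremental repair picks it up again. {{SSTableStub}}, {{VerifySketch}} and the {{UNREPAIRED}} constant are hypothetical stand-ins, not the real Cassandra classes; the value 0 is assumed to mirror {{ActiveRepairService.UNREPAIRED_SSTABLE}}, and CRC32 is used here only as an example checksum.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical stand-in for an sstable; NOT Cassandra's SSTableReader.
final class SSTableStub {
    // Assumed to mirror ActiveRepairService.UNREPAIRED_SSTABLE.
    static final long UNREPAIRED = 0L;

    final byte[] data;          // file contents as read from disk
    final long storedChecksum;  // checksum recorded when the file was written
    long repairedAt;            // UNREPAIRED, or the time of the last repair

    SSTableStub(byte[] data, long storedChecksum, long repairedAt) {
        this.data = data;
        this.storedChecksum = storedChecksum;
        this.repairedAt = repairedAt;
    }
}

final class VerifySketch {
    // Compute a CRC32 over the whole byte array.
    static long crc32(byte[] bytes) {
        CRC32 crc = new CRC32();
        crc.update(bytes, 0, bytes.length);
        return crc.getValue();
    }

    // Read-only check: the data is never rewritten. On a checksum mismatch the
    // sstable is only marked unrepaired, leaving the fix to incremental repair.
    static boolean verify(SSTableStub sstable) {
        boolean ok = crc32(sstable.data) == sstable.storedChecksum;
        if (!ok) {
            sstable.repairedAt = SSTableStub.UNREPAIRED;
        }
        return ok;
    }

    public static void main(String[] args) {
        byte[] good = "partition data".getBytes(StandardCharsets.UTF_8);
        byte[] flipped = good.clone();
        flipped[0] ^= 1; // simulate a single flipped bit on disk

        SSTableStub intact = new SSTableStub(good, crc32(good), 1436318220000L);
        SSTableStub corrupt = new SSTableStub(flipped, crc32(good), 1436318220000L);

        System.out.println("intact ok: " + verify(intact));
        System.out.println("corrupt ok: " + verify(corrupt)
                + ", repairedAt reset: " + (corrupt.repairedAt == SSTableStub.UNREPAIRED));
    }
}
```

The point of the sketch is the division of labour: the verifier only detects and flags, and repair remains the one mechanism that moves data.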
was (Author: jjirsa):
Operator perspective, fwiw: I already have repair schedules. I already know
what needs to be repaired and what doesn't. What I didn't have, previously, was
a way to validate the files on disk actually matched what I believed they
matched, short of running scrub.
`verify` was very literally `read only scrub` - when I wrote 5791, I followed
the scrub code path very closely, because that was the use case I was worried
about when I wrote it (the concern was bit level corruption due to failing
HDD/RAID controller - scrub would do the job, but it's a heavy hammer hitting a
tiny nail). The notion of "verify this node has all the data" was already
covered by repair, so I never even considered having `verify` do that.
Why not just have incremental repair validate checksums for all sstables - the
verifier will {{mutateRepairedAt(sstable.descriptor,
ActiveRepairService.UNREPAIRED_SSTABLE)}} on checksum failure which then allows
incremental repair to re-repair that data?
> Nodetool verify
> ---------------
>
> Key: CASSANDRA-9742
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9742
> Project: Cassandra
> Issue Type: New Feature
> Components: Tools
> Reporter: Jonathan Ellis
> Fix For: 3.x
>
>
> We introduced incremental repair in 2.1 but it is difficult to make that the
> default without unpleasant surprises for incautious users.
> Additionally, while we now store sstable checksums, we leave verification to
> the user.
> I propose introducing a new command, {{nodetool verify}}, that would address
> both of these.
> Default operation would be to do an incremental repair, plus validate
> checksums on *all* sstables (not just unrepaired ones). We could also have
> --local mode (checksums only) and --full (classic repair).