[
https://issues.apache.org/jira/browse/CASSANDRA-14939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774116#comment-16774116
]
Marcus Eriksson commented on CASSANDRA-14939:
---------------------------------------------
in general, this looks very nice and will be a huge help
General stuff
* We keep the LocalSessions around for 1 day in system.repairs - I guess it is
possible to give incomplete information for summarizeRepaired after bounce for
example? (I have no real solution other than keeping them around for a longer
time)
ColumnFamilyStore;
* test(s) for getPendingRepairStats
* in getPendingRepairStats there is a potential NPE - the pending repair
status can be removed between the isPendingRepair check and getting the
pendingRepair UUID from the sstable
* when we use {{runWithCompactionsDisabled}} we could potentially pass in the
ranges we need to cancel compactions for (totally ok leaving this for a future
ticket though)
* potential NPE - {{sst.isPendingRepair}} call first, then
{{sst.getPendingRepair}} in the {{runWithCompactionsDisabled}} call
CompactionStrategyManager;
* in {{releaseRepairedData}} - we should probably make sure all the callables
are cancelled if we catch that exception there, otherwise we might keep the
sstables marked as compacting forever
PendingRepairManager;
* I think strategies can be removed even though we have the read lock -
{{Set<SSTableReader> sstables = get(sessionID).getSSTables();}} could NPE in
that case
Range;
* new {{Range.intersects(..)}} method should probably have a test
* same with new {{subtract}} method - both get tested indirectly, but might be
good with a few direct ones
LocalSessions;
* in {{getPendingStats}} sessionIDs set is not used
* in {{getPendingStats}} when checking {{if (!Iterables.any(ranges, r ->
r.intersects(session.ranges)))}} session could be null - there is a null check
right below which we could move up.
RepairedState;
* This class could use a more comments
PendingStat;
* In {{addSSTable}}, instead of checking
{{Preconditions.checkArgument(sessionID != null);}} we should probably just
skip the sstable as it means it has been moved out of pending
PendingStats;
* seems to be a mismatch in the columns in {{to/fromComposite}}
SchemaArgsParser;
* Untested
I changed {{RepairAdmin}} nodetool command to use subcommands to reduce some of
the manual parameter verification;
[https://github.com/krummas/cassandra/commits/blake/14939-trunk-nodetool]
> fix some operational holes in incremental repair
> ------------------------------------------------
>
> Key: CASSANDRA-14939
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14939
> Project: Cassandra
> Issue Type: Bug
> Reporter: Blake Eggleston
> Assignee: Blake Eggleston
> Priority: Major
> Fix For: 4.0
>
>
> Incremental repair has a few operational rough spots that make it more
> difficult to fully automate and operate at scale than it should be.
> * Visibility into whether pending repair data exists for a given token range.
> * Ability to force promotion/demotion of data for completed sessions instead
> of waiting for compaction.
> * Get the most recent repairedAt timestamp for a given token range.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]