[ 
https://issues.apache.org/jira/browse/CASSANDRA-14939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774116#comment-16774116
 ] 

Marcus Eriksson commented on CASSANDRA-14939:
---------------------------------------------

in general, this looks very nice and will be a huge help

General stuff
 * We keep the LocalSessions around for 1 day in system.repairs - I guess it is 
possible to give incomplete information for summarizeRepaired after bounce for 
example? (I have no real solution other than keeping them around for a longer 
time)

ColumnFamilyStore;
 * test(s) for getPendingRepairStats
 * in getPendingRepairStats there is a potential NPE - the pending repair 
status can be removed between the isPendingRepair check and getting the 
pendingRepair UUID from the sstable
 * when we use {{runWithCompactionsDisabled}} we could potentially pass in the 
ranges we need to cancel compactions for (totally ok leaving this for a future 
ticket though)
 * potential NPE - {{sst.isPendingRepair}} call first, then 
{{sst.getPendingRepair}} in the {{runWithCompactionsDisabled}} call

CompactionStrategyManager;
 * in {{releaseRepairedData}} - we should probably make sure all the callables 
are cancelled if we catch that exception there, otherwise we might keep the 
sstables marked as compacting forever

PendingRepairManager;
 * I think strategies can be removed even though we have the read lock - 
{{Set<SSTableReader> sstables = get(sessionID).getSSTables();}} could NPE in 
that case

Range;
 * new {{Range.intersects(..)}} method should probably have a test
 * same with new {{subtract}} method - both get tested indirectly, but might be 
good with a few direct ones

LocalSessions;
 * in {{getPendingStats}} sessionIDs set is not used
 * in {{getPendingStats}} when checking {{if (!Iterables.any(ranges, r -> 
r.intersects(session.ranges)))}} session could be null - there is a null check 
right below which we could move up.

RepairedState;
 * This class could use a more comments

PendingStat;
 * In {{addSSTable}}, instead of checking 
{{Preconditions.checkArgument(sessionID != null);}} we should probably just 
skip the sstable as it means it has been moved out of pending

PendingStats;
 * seems to be a mismatch in the columns in {{to/fromComposite}}

SchemaArgsParser;
 * Untested

I changed {{RepairAdmin}} nodetool command to use subcommands to reduce some of 
the manual parameter verification;
 [https://github.com/krummas/cassandra/commits/blake/14939-trunk-nodetool]

> fix some operational holes in incremental repair
> ------------------------------------------------
>
>                 Key: CASSANDRA-14939
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14939
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>            Priority: Major
>             Fix For: 4.0
>
>
> Incremental repair has a few operational rough spots that make it more 
> difficult to fully automate and operate at scale than it should be.
> * Visibility into whether pending repair data exists for a given token range.
> * Ability to force promotion/demotion of data for completed sessions instead 
> of waiting for compaction.
> * Get the most recent repairedAt timestamp for a given token range.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to