[
https://issues.apache.org/jira/browse/CASSANDRA-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202751#comment-16202751
]
Blake Eggleston commented on CASSANDRA-3200:
--------------------------------------------
bq. yeah I agree it duplicates a lot of code, but they also do different things
- the asymmetric ones don't need the merkle trees for example since we compare
everything outside of this class now. Let me know if you see a straight-forward
way to do it. I'll try to break out the common code in a separate class.
Hopefully the non-symmetric classes can be removed once we have confidence this
works as well.
Good point, I wasn’t paying attention to the stuff going on in their respective
base classes
bq. indentation looks good to me and they look good on github or am I
misunderstanding you?
The formatting of the matrices looks good, they just look weird starting at
column 0 when the rest of the method / comment is indented 8 spaces. iow,
something like this:
{code}
/*
... something ...
A B C D E
A = x x x
B x x x
C x x
D =
*/
{code}
Second round of review:
Everything looks good for the most part, and your optimization / stream
reduction stuff makes sense. There are just a few minor things:
HostDifferences:
* {{hasDifferencesFor}} isEmpty check is uneccesary
ReducedDifferenceHolder
* Probably don’t need this class, ImmutableMap<InetAddress, HostDifferences>
should be fine
RepairOption
* default for optimizeStreams seems to be false, but javadoc says it’s true
AsymmetricLocalSyncTask
* uncomment or remove logger info statement at line 95
AsymmetricSyncTask
* startTime is compared to Long.MIN_VALUE in {{finished}}, but it never
initialized to that value. Unless I’m mistaken, long values that aren’t
explicitly initialized to some value become 0 by default, so that branch in
finished will always run, even if {{run}} wasn’t called.
> Repair: compare all trees together (for a given range/cf) instead of by pair
> in isolation
> -----------------------------------------------------------------------------------------
>
> Key: CASSANDRA-3200
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3200
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Sylvain Lebresne
> Assignee: Marcus Eriksson
> Priority: Minor
> Labels: repair
> Fix For: 4.x
>
>
> Currently, repair compare merkle trees by pair, in isolation of any other
> tree. What that means concretely is that if I have three node A, B and C
> (RF=3) with A and B in sync, but C having some range r inconsitent with both
> A and B (since those are consistent), we will do the following transfer of r:
> A -> C, C -> A, B -> C, C -> B.
> The fact that we do both A -> C and C -> A is fine, because we cannot know
> which one is more to date from A or C. However, the transfer B -> C is
> useless provided we do A -> C if A and B are in sync. Not doing that transfer
> will be a 25% improvement in that case. With RF=5 and only one node
> inconsistent with all the others, that almost a 40% improvement, etc...
> Given that this situation of one node not in sync while the others are is
> probably fairly common (one node died so it is behind), this could be a fair
> improvement over what is transferred. In the case where we use repair to
> rebuild completely a node, this will be a dramatic improvement, because it
> will avoid the rebuilded node to get RF times the data it should get.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]