aweisberg commented on code in PR #3174:
URL: https://github.com/apache/cassandra/pull/3174#discussion_r1544720878
##########
src/java/org/apache/cassandra/db/streaming/CassandraStreamReceiver.java:
##########
@@ -246,7 +246,7 @@ public void finished()
checkNotNull(minVersion, "Unable to determine minimum cluster
version");
IAccordService accordService = AccordService.instance();
if (session.streamOperation().requiresBarrierTransaction()
- && cfs.metadata().isAccordEnabled()
+ && cfs.metadata().requiresAccordSupport()
Review Comment:
Need to puzzle through here if there is a race similar to mutations where
the mutation is supposed to write through Accord, but doesn't and so we don't
do a barrier here when need to.
##########
src/java/org/apache/cassandra/service/consensus/migration/ConsensusMigrationRepairResult.java:
##########
@@ -33,18 +34,30 @@ private
ConsensusMigrationRepairResult(ConsensusMigrationRepairType type, Epoch
this.minEpoch = minEpoch;
}
- public static ConsensusMigrationRepairResult fromCassandraRepair(Epoch
minEpoch, boolean migrationEligibleRepair)
+ public static ConsensusMigrationRepairResult fromRepair(Epoch minEpoch,
boolean paxosRepaired, boolean accordRepaired, boolean deadNodesExcluded)
{
- checkArgument(!migrationEligibleRepair ||
minEpoch.isAfter(Epoch.EMPTY), "Epoch should not be empty if Paxos and regular
repairs were performed");
- if (migrationEligibleRepair)
- return new
ConsensusMigrationRepairResult(ConsensusMigrationRepairType.paxos, minEpoch);
- else
- return new
ConsensusMigrationRepairResult(ConsensusMigrationRepairType.ineligible,
Epoch.EMPTY);
+ checkArgument((!paxosRepaired && !accordRepaired) ||
minEpoch.isAfter(Epoch.EMPTY), "Epoch should not be empty if Paxos and regular
repairs were performed");
+
+ if (deadNodesExcluded) return INELIGIBLE;
+ if (paxosRepaired && accordRepaired) return new
ConsensusMigrationRepairResult(ConsensusMigrationRepairType.either, minEpoch);
+ if (paxosRepaired) return new
ConsensusMigrationRepairResult(ConsensusMigrationRepairType.paxos, minEpoch);
+ if (accordRepaired) return new
ConsensusMigrationRepairResult(ConsensusMigrationRepairType.accord, minEpoch);
+ return INELIGIBLE;
+ }
+
+ public static ConsensusMigrationRepairResult fromPaxosOnlyRepair(Epoch
minEpoch, boolean deadNodesExcluded)
+ {
+ return fromRepair(minEpoch, true, false, deadNodesExcluded);
Review Comment:
I think this is mixing up how Paxos repair needs to work unless I am wrong.
Paxos repair doesn't finish at `ALL` and the requirement for Accord for
migration is that it be done at `ALL` so that eventually Accord can safely
switch to single replica reads.
That is why previously the Paxos only repair was not marked as migration
eligible.
If Paxos repair actually applies everything at all then this would work, but
I recall that it doesn't because that wouldn't be highly available.
##########
src/java/org/apache/cassandra/tools/nodetool/Repair.java:
##########
@@ -105,6 +105,9 @@ public class Repair extends NodeToolCmd
@Option(title = "paxos-only", name = {"-paxos-only", "--paxos-only"},
description = "If the --paxos-only flag is included, no table data is repaired,
only paxos operations..")
private boolean paxosOnly = false;
+ @Option(title = "accord-only", name = {"-accord-only", "--accord-only"},
description = "If the --accord-only flag is included, no table data is
repaired, only accord operations..")
+ private boolean accordOnly = false;
Review Comment:
Last we discussed this you actually pushed to not expose this to users in
repair and I still agree with that. An accord-only repair isn't something
useful to end users and we added a separate finish migration step to the JMX
interface to handle invoking the Accord only repair.
We definitely should as part of repair do the Accord repairs to help
maintain the contract people rely on where the repaired set is safe to read at
CL=1.
##########
src/java/org/apache/cassandra/repair/RepairJob.java:
##########
@@ -163,19 +196,59 @@ public void onFailure(Throwable t)
return;
}
+ Future<Void> accordRepair;
+ if (doAccordRepair)
+ {
+ accordRepair = paxosRepair.flatMap(unused -> {
+ logger.info("{} {}.{} starting accord repair",
session.previewKind.logPrefix(session.getId()), desc.keyspace,
desc.columnFamily);
+ IPartitioner partitioner =
desc.ranges.iterator().next().left.getPartitioner();
+ AccordRepair repair = new AccordRepair(null, partitioner,
desc.keyspace, desc.ranges);
+ return repair.repair(taskExecutor);
+ }, taskExecutor);
+ }
+ else
+ {
+ accordRepair = paxosRepair.flatMap(unused -> {
+ logger.info("{} {}.{} not running accord repair",
session.previewKind.logPrefix(session.getId()), desc.keyspace,
desc.columnFamily);
+ return ImmediateFuture.success(null);
+ });
+ }
+
+ if (session.accordOnly)
+ {
+ accordRepair.addCallback(new FutureCallback<Void>()
+ {
+ public void onSuccess(Void ignored)
+ {
+ logger.info("{} {}.{} accord repair completed",
session.previewKind.logPrefix(session.getId()), desc.keyspace,
desc.columnFamily);
+ trySuccess(new RepairResult(desc, Collections.emptyList(),
ConsensusMigrationRepairResult.fromAccordOnlyRepair(repairStartingEpoch,
session.excludedDeadNodes)));
+ }
+
+ /**
+ * Snapshot, validation and sync failures are all handled here
Review Comment:
Comment is wrong I think
##########
test/distributed/org/apache/cassandra/distributed/test/accord/AccordMigrationTest.java:
##########
@@ -570,7 +570,7 @@ public void testAccordToPaxos() throws Exception
assertTargetPaxosWrite(runCasNoApply, 1,
paxosMigratingKeys.next(), 2, 1, 1, 1, 1);
// Repair the currently migrating range from when targets
were switched, but it's not an Accord repair, this is to make sure the wrong
repair type doesn't trigger progress
- nodetool(coordinator, "repair", "-st",
upperMidToken.toString(), "-et", maxAlignedWithLocalRanges.toString());
+ nodetool(coordinator, "repair", "-st",
upperMidToken.toString(), "-et", maxAlignedWithLocalRanges.toString(),
"--paxos-only");
Review Comment:
Paxos only is not sufficient here, we need to repair both Paxos and the base
table for migration. We are also trying to address partially applied mutations
that aren't from Paxos.
##########
src/java/org/apache/cassandra/repair/RepairJob.java:
##########
@@ -163,19 +196,59 @@ public void onFailure(Throwable t)
return;
}
+ Future<Void> accordRepair;
+ if (doAccordRepair)
+ {
+ accordRepair = paxosRepair.flatMap(unused -> {
+ logger.info("{} {}.{} starting accord repair",
session.previewKind.logPrefix(session.getId()), desc.keyspace,
desc.columnFamily);
+ IPartitioner partitioner =
desc.ranges.iterator().next().left.getPartitioner();
+ AccordRepair repair = new AccordRepair(null, partitioner,
desc.keyspace, desc.ranges);
+ return repair.repair(taskExecutor);
Review Comment:
To confirm my understand, the Accord repair will complete before the regular
repair so everything it puts in memtables will be included in the regular
repair?
##########
src/java/org/apache/cassandra/repair/messages/RepairOption.java:
##########
@@ -199,21 +198,16 @@ public static RepairOption parse(Map<String, String>
options, IPartitioner parti
boolean ignoreUnreplicatedKeyspaces =
Boolean.parseBoolean(options.get(IGNORE_UNREPLICATED_KS));
boolean repairPaxos =
Boolean.parseBoolean(options.get(REPAIR_PAXOS_KEY));
boolean paxosOnly = Boolean.parseBoolean(options.get(PAXOS_ONLY_KEY));
- boolean accordRepair =
Boolean.parseBoolean(options.get(ACCORD_REPAIR_KEY));
+ boolean accordOnly =
Boolean.parseBoolean(options.get(ACCORD_ONLY_KEY));
+
+ if (paxosOnly && accordOnly)
+ throw new IllegalArgumentException("Cannot repair paxos and repair
only");
if (previewKind != PreviewKind.NONE)
{
Preconditions.checkArgument(!repairPaxos, "repairPaxos must be set
to false for preview repairs");
Preconditions.checkArgument(!paxosOnly, "paxosOnly must be set to
false for preview repairs");
- Preconditions.checkArgument(!accordRepair, "accordRepair must be
set to false for preview repairs");
- }
-
- if (accordRepair)
- {
- Preconditions.checkArgument(!paxosOnly, "paxosOnly must be set to
false for Accord repairs");
- Preconditions.checkArgument(previewKind == PreviewKind.NONE,
"Can't perform preview repair with an Accord repair");
- Preconditions.checkArgument(!force, "Accord repair only requires a
quorum to work so force is not supported");
Review Comment:
Is the force check done anywhere else?
We should still be checking for dead nodes so technically yes we can run
with force and only fail if it actually does something, but that might be less
useful then just failing immediately.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]