aratno commented on code in PR #4378:
URL: https://github.com/apache/cassandra/pull/4378#discussion_r2364903990
##########
src/java/org/apache/cassandra/db/AbstractReadCommandVerbHandler.java:
##########
@@ -107,6 +108,7 @@ else if
(localComparisonEpoch.isAfter(readCommand.serializedAtEpoch()))
private ClusterMetadata checkTokenOwnership(ClusterMetadata metadata,
Message<T> message)
{
+ boolean acceptsTransient = message.verb() != Verb.READ_REQ;
Review Comment:
Confirming this will include RANGE_REQ later on
##########
src/java/org/apache/cassandra/db/compaction/CompactionStrategyManager.java:
##########
@@ -1388,7 +1371,7 @@ public CleanupSummary
releaseRepairData(Collection<TimeUUID> sessions)
readLock.lock();
try
{
- for (PendingRepairManager prm :
Iterables.concat(pendingRepairs.getManagers(), transientRepairs.getManagers()))
+ for (PendingRepairManager prm :
Iterables.concat(pendingRepairs.getManagers()))
Review Comment:
No need to concat anymore
##########
src/java/org/apache/cassandra/tools/SSTableMetadataViewer.java:
##########
@@ -422,8 +422,9 @@ private void printSStableMetadata(File file, boolean scan)
throws IOException
field("ClusteringTypes", clusteringTypes.toString());
field("StaticColumns", FBUtilities.toString(statics));
field("RegularColumns", FBUtilities.toString(regulars));
- if (stats != null)
- field("IsTransient", stats.isTransient);
+ // TODO (review): This should be removed right? Nothing depends on
it?
+// if (stats != null)
+// field("IsTransient", stats.isTransient);
Review Comment:
I support removal 👍
##########
src/java/org/apache/cassandra/db/ReadCommand.java:
##########
@@ -1430,7 +1393,7 @@ private void serializeHeader(ReadCommand command,
DataOutputPlus out, int versio
out.writeByte(
digestFlag(command.isDigestQuery())
| indexFlag(null != command.indexQueryPlan())
- | acceptsTransientFlag(command.acceptsTransient())
+ // | acceptsTransientFlag(false) Deprecated flag, could be reused?
|
needsReconciliationFlag(command.rowFilter().needsReconciliation())
Review Comment:
`needsReconciliationFlag` was added for non-strict filtering, should
probably rename for clarity now that read reconciliation is its own concept
##########
src/java/org/apache/cassandra/cql3/statements/schema/CreateTableStatement.java:
##########
@@ -167,11 +166,12 @@ public Keyspaces apply(ClusterMetadata metadata)
TableMetadata table = builder.build();
table.validate();
- if (keyspace.replicationStrategy.hasTransientReplicas()
- && table.params.readRepair != ReadRepairStrategy.NONE)
- {
- throw ire("read_repair must be set to 'NONE' for transiently
replicated keyspaces");
- }
+ // TODO (review): This can be removed right? ReadRepair is effectively
not done anymore so the setting doesn't matter
+// if (keyspace.replicationStrategy.hasTransientReplicas()
+// && table.params.readRepair != ReadRepairStrategy.NONE)
+// {
+// throw ire("read_repair must be set to 'NONE' for transiently
replicated keyspaces");
+// }
Review Comment:
I think we should require BLOCKING for tracked keyspaces for now. Users
could previously switch to NONE if they preferred write atomicity over read
monotonicity and wanted slightly better performance, but with mutation tracking
there wouldn't be a semantics tradeoff, and the performance of read
reconciliation should be even better than read repair anyway.
The reason I think we should defer support for NONE is that we may decide to
do an equivalent of full repair with range reads, and there isn't a strong
justification to support both modes yet.
##########
src/java/org/apache/cassandra/db/Keyspace.java:
##########
@@ -629,8 +634,42 @@ private Future<?> applyInternalTracked(Mutation mutation,
Promise<?> future)
continue;
}
+ // If this range is only witnessed then don't apply the
update to the underlying column family store
+ // We still want the mutation tracking log to see the
update so that it can witness it
+ // and participate in reconciliation of the mutation
+ AbstractReplicationStrategy replicationStrategy =
cfs.keyspace.getReplicationStrategy();
+ if (replicationStrategy.hasTransientReplicas())
+ {
+ RangesAtEndpoint localRanges =
replicationStrategy.getLocalRanges(cm);
+ Token token = upd.partitionKey().getToken();
+ boolean foundMatch = false;
+ // TODO (review): Races and this being an out of range
mutation, it's silently accepting the mutation to the log
+ // but not applying to the CF. This was already
checked, but in a race what happens? Double check ack handling
+ // and whether the voting group is checked to match
safely
+ for (Range<Token> r : localRanges.onlyFull().ranges())
Review Comment:
If we're losing the range (coordinator is behind and believes replica owns
the range), we may write something to an SSTable that's not read, and will be
purged on next cleanup.
If we're gaining the range (coordinator is ahead and believes replica owns
the range), TCM should try to catch up before we process the message and get
here.
##########
src/java/org/apache/cassandra/locator/ReplicaPlans.java:
##########
@@ -632,6 +632,9 @@ E select(ConsistencyLevel consistencyLevel, L liveAndDown,
L live)
}
};
+ // TODO (review): Cheap quorums are not a goal for now, would really
require speculation for writes
+ // and would trigger reconciliation work later, potentially quite a lot so
I don't think it makes sense.
+ // So just remove selector?
Review Comment:
I support removing support for cheap quorums. I _could_ imagine a world
where we deprioritize delivery of new writes to witnesses if reconciliation is
behind, but even a change like that would require more justification.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]