dcapwell commented on code in PR #56:
URL: https://github.com/apache/cassandra-accord/pull/56#discussion_r1297775035
##########
accord-core/src/main/java/accord/messages/ReadData.java:
##########
@@ -149,14 +149,16 @@ void read(SafeCommandStore safeStore, Timestamp
executeAt, PartialTxn txn)
Ranges unavailable = safeStore.ranges().unsafeToReadAt(executeAt);
txn.read(safeStore, executeAt).begin((next, throwable) -> {
- if (throwable != null)
+ // TODO (expected, exceptions): should send exception to client,
and consistency handle/propagate locally
+ logger.trace("{}: read failed for {}: {}", txnId, unsafeStore,
throwable);
+ synchronized (ReadData.this)
{
- // TODO (expected, exceptions): should send exception to
client, and consistency handle/propagate locally
- logger.trace("{}: read failed for {}: {}", txnId, unsafeStore,
throwable);
- node.reply(replyTo, replyContext, ReadNack.Error);
+ if (fail == null)
+ fail = throwable;
+ else
+ fail.addSuppressed(throwable);
}
- else
- readComplete(unsafeStore, next, unavailable);
+ readComplete(unsafeStore, next, unavailable);
Review Comment:
before this patch this is what I see
* accord.messages.ReadData#read would reply `Error` on the first exception
and not call `readComplete`. This would not update `waitingOn`, but more
importantly it would not call `ack`; which means `if (-1 == --waitingOnCount)`
will never be true!
* `accord.impl.AbstractFetchCoordinator.FetchRequest#readComplete` makes the
read 2 operations:`txn.read()` and `store.maxAppliedFor(...).onSuccess(a ->
{this.maxApplied = a; super.readComplete(...)})` (I overly simplified). Even
if a fetch had 2 reads (C* only has 1 atm), then we would still not see the ACK
so would never complete the operation....
* `accord.messages.ReadTxnData#readComplete` documents that it should
unregister its listener... but because we call `cancel` we will schedule
removal of the listener anyways... so thats prob fine.
> I am not married to that decision though. Sending back the first error is
fine.
I guess I would ask, what will the coordinator do differently with a list of
failures? If a single one exists we will do the same as if 100 existed... so
by only sending the first we have a smaller msg payload and can start handling
the exception faster! We could detect this case and just log locally that
these exceptions are getting ignored....
So, in short, I am in favor of failing fast and not pushing handling
exceptions properly to each impl.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]