[GitHub] [cassandra-accord] belliottsmith commented on a diff in pull request #10: Metadata persistence

GitBox Tue, 11 Oct 2022 02:47:49 -0700


belliottsmith commented on code in PR #10:
URL: https://github.com/apache/cassandra-accord/pull/10#discussion_r992097241



##########
accord-core/src/main/java/accord/messages/Apply.java:
##########
@@ -54,15 +63,59 @@ public Apply(Node.Id to, Topologies topologies, TxnId 
txnId, Txn txn, Key homeKe
         this.result = result;
     }
 
+    @VisibleForImplementation
+    public Apply(Keys scope, long waitForEpoch, TxnId txnId, Txn txn, Key 
homeKey, Timestamp executeAt, Deps deps, Writes writes, Result result)
+    {
+        super(scope, waitForEpoch);
+        this.txnId = txnId;
+        this.txn = txn;
+        this.homeKey = homeKey;
+        this.executeAt = executeAt;
+        this.deps = deps;
+        this.writes = writes;
+        this.result = result;
+    }
+
+    static Future<Void> waitAndReduce(Future<Void> left, Future<Void> right)

Review Comment:
   So, last night I went back and revisited a thought I had but discarded 
because it would be less efficient than callbacks. However, on further 
reflection I think the additional cost is probably acceptable.
   
   The basic idea is to avoid most of the extra costs associated with a 
`Future` by delaying the submission of work. So instead of a `Future` chain we 
have essentially an async operation builder chain. I have a prototype here: 
https://github.com/belliottsmith/cassandra/commit/fabb1372e23ea233cb5dc487952e60d216e90774
   
   You can only mutate the end of the chain (though we could add callbacks 
mid-chain if we wanted). The tail of the chain retains a reference to the head 
of the chain; when we add to the end of the chain, the old tail replaces its 
head pointer with the new tail pointer. When we execute, we replace the head 
pointer with the final callback, then invoke the head to start processing, 
which passes the result of execution down through the chain.
   
   With respect to _safety_ I think this solves both of our concerns neatly, as 
we have all the same mechanisms for encapsulating responsibility that a 
`Future` does, but we also do not even start the work until some callback has 
been provided that _at least_ handles exceptions, and we have no `get` method 
to misuse (though we can always add this feature if we want; or add a way to 
convert to `Future`).
   
   With respect to efficiency, the main additional cost is the encapsulating 
object that represents the deferred execution. 
   Since there's no concurrency with consumer / producer, we can get away with 
minimal state, and can directly extend the implementing classes where 
efficiency matters (even for transformations). Importantly, we have zero 
concurrency overhead at any step of execution (as opposed to every step with 
`Future`).
   
   We're still encouraging the use of lambdas and other allocating 
abstractions, but it's probably manageable, and as mentioned above we can 
reduce this where it matters.
   
   _Personally_ I still prefer callbacks. But, since this makes each step of 
chaining relatively fixed-cost, I think the cognitive burden I experience 
reasoning about chains of operations with this abstraction will be much lower 
than with `Future`. 
   
   WDYT? Would this be a proper compromise between our approaches?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [cassandra-accord] belliottsmith commented on a diff in pull request #10: Metadata persistence

Reply via email to