sijie commented on a change in pull request #1808: Allow to configure sticky reads
URL: https://github.com/apache/bookkeeper/pull/1808#discussion_r232909063
 
 

 ##########
 File path: bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java
 ##########
 @@ -96,13 +96,38 @@
             this.ensemble = ensemble;
             this.eId = eId;
 
+            long entryIdToConsiderForWriteSet = eId;
+
+            if (clientCtx.getConf().enableStickyReads
+                    && lh.getLedgerMetadata().getWriteQuorumSize() == lh.getLedgerMetadata().getEnsembleSize()) {
+                // When sticky reads are enabled we want to make sure to take
+                // advantage of read-ahead (or, anyway, from effciencies in
+                // reading sequential data from disk through the page cache).
+                // For this, all the entries that a given bookie prefetches,
+                // should read from that bookie.
+                // For example, with e=2, w=2, a=2 we would have
+                //      B-1   B-2
+                // e-0   X     X
+                // e-1   X     X
+                // e-2   X     X
+                //
+                // In this case we want all the requests to be issued to B-1 (by
+                // preference), so that cache hits will be maximized.
+                //
+                // We can only enable sticky reads if the ensemble==writeQuorum
+                // otherwise the same bookie will not have all the entries
+                // stored
+                entryIdToConsiderForWriteSet = 0;
 
 Review comment:
   Hmm, I don't think it is a good idea to stick to ensemble 0. If the first 
bookie of ensemble 0 goes down, all reads will incur a huge penalty. I think 
"stickiness" here should be at the ledger handle level: when a ledger handle is 
constructed, one bookie is chosen as the sticky bookie, and most requests are 
sent to it. When a failure occurs, a new sticky bookie is chosen. That being 
said, the ledger handle should probably implement a `Supplier<WriteSet>` that 
supplies the `WriteSet` for pending read ops to read.
   
   That way, the pending read ops only talk to this supplier to get the write 
set used to read entries.
   
   - For normal settings, the supplier returns 
`lh.distributionSchedule.getWriteSet(entryId)`.
   - If reordering is enabled, the supplier returns the reordered write set.
   - If sticky reads are enabled, the supplier returns a fixed write set most 
of the time. If an error occurs, the supplier updates the write set, so that 
subsequent reads choose a different sticky bookie to read from.
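
The supplier idea above could look roughly like the following sketch. It is only an illustration under assumptions: `StickyWriteSetSketch`, `writeSetFor`, and `onReadFailure` are hypothetical names, a plain `List<Integer>` of bookie indices stands in for the real `WriteSet` type, the non-sticky branch assumes a round-robin layout `(entryId + i) % ensembleSize`, and the reordering case is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not BookKeeper's actual API.
// A List<Integer> of bookie indices stands in for WriteSet.
class StickyWriteSetSketch {
    private final int ensembleSize;
    private final int writeQuorumSize;
    // Sticky reads only make sense when every bookie stores every entry.
    private final boolean stickyEnabled;
    // Index (within the ensemble) of the bookie currently preferred for reads.
    private int stickyBookieIndex;

    StickyWriteSetSketch(int ensembleSize, int writeQuorumSize,
                         boolean stickyRequested, int initialStickyIndex) {
        this.ensembleSize = ensembleSize;
        this.writeQuorumSize = writeQuorumSize;
        this.stickyEnabled = stickyRequested && ensembleSize == writeQuorumSize;
        this.stickyBookieIndex = initialStickyIndex % ensembleSize;
    }

    // Returns the bookie indices to try, in order, for the given entry.
    synchronized List<Integer> writeSetFor(long entryId) {
        // Sticky: always start at the sticky bookie, regardless of entryId.
        // Normal: assumed round-robin layout starting at entryId.
        long start = stickyEnabled ? stickyBookieIndex : entryId;
        List<Integer> writeSet = new ArrayList<>(writeQuorumSize);
        for (int i = 0; i < writeQuorumSize; i++) {
            writeSet.add((int) ((start + i) % ensembleSize));
        }
        return writeSet;
    }

    // Called by a read op when a read from the given bookie index failed.
    // Only a failure of the current sticky bookie moves the preference.
    synchronized void onReadFailure(int failedBookieIndex) {
        if (stickyEnabled && failedBookieIndex == stickyBookieIndex) {
            stickyBookieIndex = (stickyBookieIndex + 1) % ensembleSize;
        }
    }
}
```

With this shape, a pending read op would only call `writeSetFor(entryId)` and report failures through `onReadFailure(...)`; which bookie is preferred stays a per-ledger-handle decision rather than being pinned to the write set of entry 0.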

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
