keith-turner opened a new issue #967: Prototype adding async get methods to 
Transaction
URL: https://github.com/apache/fluo/issues/967
 
 
   [Fetching multiple cells](http://fluo.apache.org/tour/multi-get/) in the
Fluo tour describes the `get` methods that Fluo provides for quickly reading
multiple cells.  While using these methods is faster than calling
`get(row, col)` sequentially, they can be a bit cumbersome.  The same
performance could be achieved with asynchronous `get` methods; however, I am
not convinced this would be less cumbersome.  I have been thinking about this
idea for a while, but I have yet to convince myself it's a fully baked or good
idea.  I am opening this issue to share my thoughts, not to advocate for this
feature.
   
   Suppose the following methods were added to `SnapshotBase`.  These methods
would queue up the get operation in the background and return immediately.
   
   ```java
   CompletableFuture<String> getsAsync(String row, Column column);
   CompletableFuture<String> getsAsync(String row, Column column, String defaultValue);
   ```
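
To make the intended semantics concrete, here is a minimal sketch of how such methods could be layered on a blocking read.  It is an illustration only, not Fluo's implementation: `AsyncSnapshotSketch` is a hypothetical in-memory stand-in for a snapshot, columns are plain strings rather than `Column` objects, and each read is simply handed to the common thread pool.  A real implementation would presumably batch the queued reads instead.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for a snapshot; NOT Fluo's SnapshotBase.
class AsyncSnapshotSketch {
  private final Map<String, String> cells = new ConcurrentHashMap<>();

  private static String key(String row, String column) {
    return row + "\u0000" + column;
  }

  void put(String row, String column, String value) {
    cells.put(key(row, column), value);
  }

  // Blocking read, standing in for the existing gets(row, col).
  String gets(String row, String column) {
    return cells.get(key(row, column));
  }

  // Queues the read on the common pool and returns immediately.
  CompletableFuture<String> getsAsync(String row, String column) {
    return CompletableFuture.supplyAsync(() -> gets(row, column));
  }

  // Same as above, but substitutes defaultValue when the cell is absent.
  CompletableFuture<String> getsAsync(String row, String column, String defaultValue) {
    return getsAsync(row, column).thenApply(v -> v == null ? defaultValue : v);
  }

  public static void main(String[] args) {
    AsyncSnapshotSketch snap = new AsyncSnapshotSketch();
    snap.put("doc1", "content", "hello");
    System.out.println(snap.getsAsync("doc1", "content").join());            // prints hello
    System.out.println(snap.getsAsync("doc1", "processed", "false").join()); // prints false
  }
}
```
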
   Using these methods, this
[process()](https://gist.github.com/keith-turner/57e124c715c2542242f11eda85b3128c#file-contentobserver-java-L53)
method from a Fluo Tour exercise solution could be written as follows.
   
   ```java
     public void process(TransactionBase tx, String row, Column col) throws Exception {
   
       // Use Future here instead of CompletableFuture because it's shorter and has the
       // needed get method.  This should be much faster than calling three blocking
       // get methods.
       Future<String> content = tx.getsAsync(row, CONTENT_COL);
       Future<String> status = tx.getsAsync(row, REF_STATUS_COL);
       Future<String> processed = tx.getsAsync(row, PROCESSED_COL, "false");
   
       // Instead of calling status.equals below, we have to call status.get().equals.
       // Same with processed.
       if (status.get().equals("referenced") && processed.get().equals("false")) {
         adjustCounts(tx, +1, tokenize(content.get()));
         tx.set(row, PROCESSED_COL, "true");
       }
   
       if (status.get().equals("unreferenced")) {
         for (Column c : new Column[] {PROCESSED_COL, CONTENT_COL, REF_COUNT_COL, REF_STATUS_COL})
           tx.delete(row, c);
   
         if (processed.get().equals("true")) {
           adjustCounts(tx, -1, tokenize(content.get()));
         }
       }
     }
   ```
   
   The method above is only one line shorter, and I think having to call
`status.get()` instead of just using `status` is a bit more cumbersome and
possibly bug-prone.  For example, `status.equals("referenced")` above would
probably compile (because `equals` takes `Object`), but it would always return
false.
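
The pitfall can be reproduced without Fluo.  The sketch below uses an already completed `CompletableFuture` in place of the value returned by the proposed `getsAsync`; the class and method names are made up for illustration.  `CompletableFuture` does not override `equals`, so comparing the future itself to a string falls back to identity comparison.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;

// Hypothetical demo class; the names are illustrative only.
class FutureEqualsPitfall {

  // Compiles because equals takes Object, but compares the Future object
  // itself to the string, so the result is always false.
  static boolean compareFuture(Future<String> status) {
    return status.equals("referenced");
  }

  // Compares the value held by the future, which is what was intended.
  static boolean compareValue(CompletableFuture<String> status) {
    return status.join().equals("referenced");
  }

  public static void main(String[] args) {
    CompletableFuture<String> status = CompletableFuture.completedFuture("referenced");
    System.out.println(compareFuture(status)); // prints false
    System.out.println(compareValue(status));  // prints true
  }
}
```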
   
   The
[adjustCounts](https://gist.github.com/keith-turner/57e124c715c2542242f11eda85b3128c#file-contentobserver-java-L35)
method from a Tour exercise solution could be rewritten as follows.
   
   ```java
     // This method reads the current counts for the passed in words, and writes out
     // the current count plus the delta for each word.  It declares throws Exception
     // because Future.get() throws checked exceptions.
     private void adjustCounts(TransactionBase tx, int delta, List<String> words)
         throws Exception {
       
       List<Future<Void>> futures = new ArrayList<>();
   
       for (String word : new HashSet<>(words)) {
         Future<Void> future = tx.getsAsync("w:" + word, WORD_COUNT, "0")
             .thenApply(Integer::parseInt)
             .thenApply(count -> delta + count)
             .thenAccept(newCount -> {
               if (newCount == 0)
                 tx.delete("w:" + word, WORD_COUNT);
               else
                 tx.set("w:" + word, WORD_COUNT, newCount + "");
             });
         // remember the future so that it can be waited on below
         futures.add(future);
       }
   
       // wait for all futures to finish
       for (Future<Void> future : futures) {
         future.get();
       }
     }
   ```
   
   Personally I think this method is slightly easier to understand, given an 
understanding of CompletableFuture.  I found CompletableFuture a bit daunting 
when I first looked at it, but it grew on me.
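
For readers less familiar with `CompletableFuture`, the same read, transform, and write chain can be tried outside Fluo.  The following is a rough sketch under stated assumptions: a `ConcurrentHashMap` stands in for the transaction, `supplyAsync` stands in for the proposed `getsAsync`, and the class name `ChainSketch` is made up.  It also uses `CompletableFuture.allOf(...).join()` as one alternative to looping over the futures and calling `get()` on each.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class; a map stands in for the Fluo transaction.
class ChainSketch {

  // Returns a copy of counts with delta applied once per distinct word.
  static Map<String, String> adjust(Map<String, String> counts, int delta, List<String> words) {
    Map<String, String> store = new ConcurrentHashMap<>(counts);
    List<CompletableFuture<Void>> futures = new ArrayList<>();

    for (String word : new HashSet<>(words)) {
      futures.add(
          // Asynchronous read with a default of "0", standing in for getsAsync.
          CompletableFuture.supplyAsync(() -> store.getOrDefault(word, "0"))
              .thenApply(Integer::parseInt)
              .thenApply(count -> count + delta)
              .thenAccept(newCount -> {
                if (newCount == 0)
                  store.remove(word);
                else
                  store.put(word, Integer.toString(newCount));
              }));
    }

    // Wait for every chain to finish; join() rethrows failures unchecked.
    CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
    return store;
  }

  public static void main(String[] args) {
    Map<String, String> counts = new ConcurrentHashMap<>();
    counts.put("apple", "1");
    counts.put("pear", "2");
    // The "apple" count drops to zero and is removed; "pear" becomes "1".
    System.out.println(adjust(counts, -1, List.of("apple", "pear", "apple")));
  }
}
```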
   
   I am interested in seeing more use cases for these proposed async `get`
methods.  I am also interested in prototyping them in order to make it possible
to experiment with them.
   
   
   
