dcapwell commented on code in PR #2144:
URL: https://github.com/apache/cassandra/pull/2144#discussion_r1127195672


##########
src/java/org/apache/cassandra/service/accord/AccordStateCache.java:
##########
@@ -149,17 +145,12 @@ public NamedMap(String name)
         }
     }
 
-    public final Map<Object, Node<?, ?>> active = new HashMap<>();
     private final Map<Object, Node<?, ?>> cache = new HashMap<>();
-    private final Map<Object, WriteOnlyGroup<?, ?>> pendingWriteOnly = new 
HashMap<>();
-    private final Set<Instance<?, ?>> instances = new HashSet<>();
-
-    private final NamedMap<Object, Future<?>> loadFutures = new 
NamedMap<>("loadFutures");
-    private final NamedMap<Object, Future<?>> saveFutures = new 
NamedMap<>("saveFutures");
+    private final Set<Instance<?, ?, ?>> instances = new HashSet<>();
 
-    private final NamedMap<Object, Future<Data>> readFutures = new 
NamedMap<>("readFutures");
-    private final NamedMap<Object, Future<?>> writeFutures = new 
NamedMap<>("writeFutures");
+    private final NamedMap<Object, AsyncResult<Void>> saveResults = new 
NamedMap<>("saveResults");
 
+    private int linked = 0;

Review Comment:
   ATM `canEvict` is defined as:
   
   1) node.references == 0 (also known as linked)
   2) node.isLoaded // i feel this isn't needed, this happens only in the case 
of failed operations.  if the load also failed we want to evict right away and 
we already do that in 
`org.apache.cassandra.service.accord.AccordStateCache.Instance#release`, so 
these should only be Pending
   3) no pending saveResults
   
   So, lets say we remove #2, then the only case is #3 which is unsafe to evict 
until we know the result (failed saves can not be evicted...)...  
   
   So, if we don't `push` and instead do something as follows
   
   ```
   public void release(S safeRef)
   ...
               maybeEvict();
           }
   
           private void maybePush(Node<K, V> node)
           {
               if (node.references != 0) return;
               Invariants.checkState(node.isUnlinked());
               AsyncResult<Void> save = getAsyncResult(saveResults, node.key());
               if (save == null)
               {
                   logger.trace("Moving {} from active pool to cache", 
node.key());
                   push(node);
               }
               else if (!save.isDone())
               {
                   save.addCallback((s, f) -> {
                       if (f != null)
                       {
                           //TODO what do we do here?  Do we just leave in 
cache?  That is kinda what happens anyways
                           return;
                       }
                       maybePush(node);
                   }, TODONeedRefToExecutor());
               }
               else if (!save.isSuccess())
               {
                   //TODO what do we do here?  Do we just leave in cache?  That 
is kinda what happens anyways
               }
               else
               {
                   // save was success, but was another one added?  Was the 
node refereneced?
                   maybePush(node);
               }
           }
   ```
   
   This adds more complexity to things, but keeps non-evicitable nodes out of 
the list... Now, if we are full... we are full and most likely have other 
issues.  This logic doesn't prevent us from getting full as we just ignore 
nodes that failed to write, building up until we OOM (at least OOM from a cache 
point of view)...
   
   > That said, I’m not sure how this interplays with planned changes to 
persistence.
   
   I do not know what is planned, but right now our metadata is only durable if 
`saveResults` is empty or was successful, so if we have another system for 
durability, then we don't care about #3 either I believe... so all we care 
about is reference counts?  So how much we care about this case I guess is a 
function of what this planned work does in the face of failures?
   
   So yeah, I personally and ok with nodes with pending saveResults until we 
flesh out persistence... no reason to try to make this logic more complex just 
to throw away because persistence changed...
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to