imply-cheddar commented on a change in pull request #11717:
URL: https://github.com/apache/druid/pull/11717#discussion_r793223564



##########
File path: 
server/src/main/java/org/apache/druid/server/coordination/SegmentLoadDropHandler.java
##########
@@ -560,6 +560,11 @@ public void removeSegment(DataSegment segment, 
DataSegmentChangeCallback callbac
             },
             this::resolveWaitingFutures
         );
+      } else if (status.get().getState() == Status.STATE.SUCCESS) {

Review comment:
       So, the thing that is going on here is that the HTTP protocol allows for 
batches of change requests.  The HTTP request returns as soon as any one of 
the change requests in it completes: in a batch of 5, 1 might complete 
immediately while the others are still running.  When that 1 completes, the 
coordinator issues a new request.  In the meantime, one of the others that was 
requested might have completed.  The cache records that it completed so that, 
when the new request comes in, the server can immediately respond that it 
completed loading.  That response in turn causes the coordinator to issue 
another load request.
   
   So, even in the normal case, once `SUCCESS` (meaning that the request has 
been completed) has been reached, there is no longer any need to keep the 
entry in the cache.  This change makes those semantics explicit.
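
   To make the cache semantics above concrete, here is a minimal sketch.  It 
is not the actual `SegmentLoadDropHandler` code: the class 
`StatusCacheSketch`, its `State` enum and all method names are hypothetical, 
and it assumes the entry is evicted at the moment `SUCCESS` is handed back to 
the caller.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a per-request status cache; not Druid's classes.
class StatusCacheSketch {
    public enum State { PENDING, SUCCESS }

    private final Map<String, State> cache = new ConcurrentHashMap<>();

    // A change request was accepted and is still running.
    public void recordPending(String requestId) {
        cache.put(requestId, State.PENDING);
    }

    // A change request finished, possibly between two coordinator requests.
    public void recordSuccess(String requestId) {
        cache.put(requestId, State.SUCCESS);
    }

    // Called when the coordinator sends the request again. A cached SUCCESS is
    // reported once and then evicted, because nothing needs it afterwards.
    public Optional<State> report(String requestId) {
        State state = cache.get(requestId);
        if (state == State.SUCCESS) {
            cache.remove(requestId);
        }
        return Optional.ofNullable(state);
    }

    public int size() {
        return cache.size();
    }

    public static void main(String[] args) {
        StatusCacheSketch statuses = new StatusCacheSketch();
        statuses.recordPending("load-segment-A");
        statuses.recordSuccess("load-segment-A");
        System.out.println(statuses.report("load-segment-A"));
        System.out.println(statuses.size());
    }
}
```

   In this sketch a `PENDING` entry stays cached across reports, while a 
`SUCCESS` entry is gone after the first report that returns it.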
   
   There are still potential corner cases (as there are with any distributed 
protocol).  Those corner cases are covered by the idempotency of the API.  If, 
due to some corner case, the same thing gets requested multiple times, then it 
will only be loaded or dropped once: once it has been loaded, there's nothing 
more to load, and once it has been dropped, there's nothing more to drop.
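
   The idempotency argument can be illustrated with a small sketch.  The 
class `IdempotentSegmentStore` and its counters are hypothetical and only 
model the property described above; they are not how Druid tracks segments.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of an idempotent load/drop API.
class IdempotentSegmentStore {
    private final Set<String> loaded = new HashSet<>();
    private int loadsPerformed = 0;  // how many times real load work happened
    private int dropsPerformed = 0;  // how many times real drop work happened

    // Loading an already-loaded segment does no work.
    public synchronized void load(String segmentId) {
        if (loaded.add(segmentId)) {
            loadsPerformed++;
        }
    }

    // Dropping a segment that is not loaded does no work.
    public synchronized void drop(String segmentId) {
        if (loaded.remove(segmentId)) {
            dropsPerformed++;
        }
    }

    public synchronized boolean isLoaded(String segmentId) {
        return loaded.contains(segmentId);
    }

    public synchronized int getLoadsPerformed() {
        return loadsPerformed;
    }

    public synchronized int getDropsPerformed() {
        return dropsPerformed;
    }

    public static void main(String[] args) {
        IdempotentSegmentStore store = new IdempotentSegmentStore();
        store.load("segment-A");
        store.load("segment-A");  // duplicate request, no extra work
        System.out.println(store.getLoadsPerformed());
    }
}
```

   Repeating a request changes neither the final state nor the amount of 
work done, which is why duplicate requests from a corner case are harmless.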
   
   The coordinator assignment algorithm also doesn't expect that a request to 
load/drop will actually succeed.  It starts over from fresh state every time it 
wakes up because failures can happen.  This means that if something is lost, it 
will identify the correct current state and eventually recover on its own.  
I.e. even if there is some point in time when a "load-drop-load-load" occurs, 
that's not actually a problem because at the end the segment will actually be 
loaded.  There is only a problem if it goes into an infinite loop of 
"load-load-load" without ever actually making progress on the work.
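
   As a rough illustration of why starting over from fresh state recovers 
from lost requests, here is a toy model.  It is not the coordinator's actual 
algorithm: `ReconcileSketch`, `FlakyServer` and the "lose the first N 
requests" failure model are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy model of a coordinator that recomputes its work on every run.
class ReconcileSketch {

    // Hypothetical server that silently loses the first `failures` requests.
    public static class FlakyServer {
        public final Set<String> loaded = new HashSet<>();
        private int remainingFailures;

        public FlakyServer(int failures) {
            this.remainingFailures = failures;
        }

        private boolean lost() {
            if (remainingFailures > 0) {
                remainingFailures--;
                return true;
            }
            return false;
        }

        public void requestLoad(String id) {
            if (!lost()) {
                loaded.add(id);
            }
        }

        public void requestDrop(String id) {
            if (!lost()) {
                loaded.remove(id);
            }
        }
    }

    // One run: diff the desired set against a fresh snapshot of the server and
    // issue requests. Nothing is remembered from earlier runs.
    public static void runOnce(Set<String> desired, FlakyServer server) {
        Set<String> current = new HashSet<>(server.loaded);
        for (String id : desired) {
            if (!current.contains(id)) {
                server.requestLoad(id);
            }
        }
        for (String id : current) {
            if (!desired.contains(id)) {
                server.requestDrop(id);
            }
        }
    }

    // Returns the number of runs needed to converge, or -1 if maxRuns is hit.
    public static int runsUntilConverged(Set<String> desired, FlakyServer server, int maxRuns) {
        for (int run = 1; run <= maxRuns; run++) {
            runOnce(desired, server);
            if (server.loaded.equals(desired)) {
                return run;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        FlakyServer server = new FlakyServer(3);
        server.loaded.add("stale");
        Set<String> desired = new HashSet<>(Arrays.asList("a", "b"));
        System.out.println(runsUntilConverged(desired, server, 10));
    }
}
```

   Because each run only looks at the current state, a lost request just 
shows up as remaining work on the next run.  The loop fails to converge only 
if requests keep being lost, which corresponds to the "load-load-load" case 
above.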




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
