keith-turner commented on PR #3350:
URL: https://github.com/apache/accumulo/pull/3350#issuecomment-1530052719

   > I'm curious if you need a refresh column in the metadata. You could send 
the RPC to the hosting tablet server as a FaTE transaction and expect a 
response. If the call times out or fails, then you could retry as part of the 
FaTE framework.
   
   I think that could be done and it may be more efficient for the normal case 
when everything is stable and working well.  The refresh column handles the 
following cases that would not be handled by an RPC alone.
   
    * The refresh column is set with the location of the tserver at the time 
the files were added.  This makes it possible to detect whether the tablet has 
moved since the files were added; if it has moved, no refresh is needed.
    * In the case where the manager dies, the refresh column keeps track of 
what work is left across manager restarts.
    * The transaction id in the refresh column allows the tablet server to know 
if it has already recently loaded something.  For example, if 3 fate 
transactions all want to refresh a tablet at around the same time, the tablet 
can do one actual refresh and satisfy all requests.  Without the refresh column 
the tablet will always have to reload its metadata when requested, even if it's 
not needed (because a recent reload already satisfies the request).
    * The refresh column allows FATE to process the refresh operations in the 
isReady call using async RPCs, which means a FATE thread is not tied up.  Without 
the refresh column, a FATE thread would be tied up until all tablets are 
refreshed, because the tracking of what has been refreshed moves from the 
metadata table to memory.  The FATE operation would also have to use 
synchronous RPCs instead of async ones and keep all of that state in memory.
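
   To make the transaction id point above concrete, here is a minimal sketch of 
how a tablet might skip redundant reloads.  It is not Accumulo code: the class 
`RefreshDedupSketch`, its methods, and the `Supplier` that stands in for 
reading the refresh column are all hypothetical names chosen for illustration. 
It assumes that one metadata reload covers every transaction id present in the 
refresh column at the time of the read.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical illustration only; none of these names exist in Accumulo.
class RefreshDedupSketch {
  // Transaction ids whose refresh request was covered by an earlier reload.
  private final Set<Long> satisfied = new HashSet<>();
  private int reloads = 0;

  // readRefreshColumn simulates reloading tablet metadata and returns the
  // transaction ids found in the refresh column at that moment.
  // Returns true if a reload was actually performed for this request.
  synchronized boolean refresh(long txid, Supplier<Set<Long>> readRefreshColumn) {
    if (satisfied.contains(txid)) {
      return false; // a recent reload already covered this transaction
    }
    Set<Long> seen = readRefreshColumn.get();
    reloads++;
    satisfied.addAll(seen); // one reload satisfies every pending transaction
    satisfied.add(txid);    // the reload happened after this request arrived
    return true;
  }

  synchronized int reloadCount() {
    return reloads;
  }

  public static void main(String[] args) {
    RefreshDedupSketch tablet = new RefreshDedupSketch();
    Set<Long> pending = Set.of(1L, 2L, 3L);
    for (long txid : new long[] {1L, 2L, 3L}) {
      tablet.refresh(txid, () -> pending);
    }
    System.out.println("reloads=" + tablet.reloadCount()); // prints reloads=1
  }
}
```

   With an RPC that carries no such marker, each of the three requests would 
trigger its own reload.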
   
   Using only an RPC, I think we could do the following.  
   
    * Scan the metadata table for tablets with load markers and build a map of 
`Map<TServerLoc, List<KeyExtent>>`.  The map may contain some tservers that do 
not actually need to refresh because they were not the location when the files 
were set, but that does not hurt anything.
    * Continue sending synchronous RPCs until a positive response has been 
received for everything in the map. 
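
   The two steps above can be sketched roughly as follows.  This is only an 
illustration under simplifying assumptions: plain `String` values stand in for 
`TServerLoc` and `KeyExtent`, the metadata scan is represented by an 
already-built extent-to-location map, and the `BiPredicate` named `rpc` is a 
hypothetical stand-in for a synchronous refresh call that returns true on a 
positive response.  A real version would probably also need backoff between 
passes and handling for tablets that change location while it is retrying.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

// Hypothetical illustration only; none of these names exist in Accumulo.
class RpcOnlyRefreshSketch {

  // Step 1: group tablets that have load markers by their current location.
  // The input map (extent -> tserver) stands in for a metadata table scan.
  static Map<String, List<String>> groupByServer(Map<String, String> extentToLocation) {
    Map<String, List<String>> byServer = new HashMap<>();
    for (Map.Entry<String, String> e : extentToLocation.entrySet()) {
      byServer.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
    }
    return byServer;
  }

  // Step 2: keep calling each server synchronously until every entry has been
  // acknowledged. Returns the number of passes over the remaining servers.
  static int refreshAll(Map<String, List<String>> toRefresh,
      BiPredicate<String, List<String>> rpc) {
    Map<String, List<String>> pending = new HashMap<>(toRefresh); // leave caller's map intact
    int passes = 0;
    while (!pending.isEmpty()) {
      passes++;
      // Drop every server that gave a positive response on this pass.
      pending.entrySet().removeIf(e -> rpc.test(e.getKey(), e.getValue()));
    }
    return passes;
  }

  public static void main(String[] args) {
    Map<String, String> scan = new HashMap<>();
    scan.put("tablet1", "tserverA");
    scan.put("tablet2", "tserverA");
    scan.put("tablet3", "tserverB");
    Map<String, List<String>> grouped = groupByServer(scan);
    int passes = refreshAll(grouped, (server, extents) -> true);
    System.out.println("servers=" + grouped.size() + " passes=" + passes);
  }
}
```

   Note that the calling thread blocks inside `refreshAll` and holds the whole 
map in memory for the duration, which is the trade-off discussed in this 
comment.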
   
   Barring all the edge cases, this may be faster. The fate thread would be 
tied up for however long this takes. If all the fate threads are tied up, more 
can be added as long as memory supports it.  There are other fate operations 
that hold a bunch of stuff in memory and/or tie up a fate thread for an 
indefinite time period while waiting for a condition to become true.
   

