keith-turner commented on PR #3350:
URL: https://github.com/apache/accumulo/pull/3350#issuecomment-1530052719
> I'm curious if you need a refresh column in the metadata. You could send
> the RPC to the hosting tablet server as a FaTE transaction and expect a
> response. If the call times out or fails, then you could retry as part of the
> FaTE framework.
I think that could be done and it may be more efficient for the normal case
when everything is stable and working well. The refresh column handles the
following cases that would not be handled by an RPC alone.
* The refresh column is set with the location of the tserver at the time
the files were added. So it can detect if the tablet has moved since it added
the files and if it did move then it knows that no refresh is needed.
* In the case where the manager dies, the refresh column keeps track of
what work is left across manager restarts.
* The transaction id in the refresh column allows the tablet server to know
if it has already recently loaded something. For example if 3 fate
transactions all want to refresh a tablet at around the same time the tablet
can do one actual refresh and satisfy all requests. Without the refresh column
the tablet will always have to reload its metadata when requested, even if it's
not needed (because a recent reload already satisfies the request).
* The refresh column allows FATE to process the refresh operations in the
isReady call using async RPCs, which means a FATE thread is not tied up. Without
the refresh column, a FATE thread would be tied up until all tablets are
refreshed, because the tracking of what has been refreshed moves from the
metadata table to memory. The FATE operation would also have to use synchronous
RPCs instead of async ones and keep everything in memory.
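To illustrate the transaction id point, here is a minimal sketch of how a tablet could answer several refresh requests with a single metadata reload. It is not Accumulo code: `RefreshTracker`, `refresh`, and `pendingTxids` (standing in for the refresh entries read from the metadata table) are hypothetical names, and the bookkeeping is an assumption about one way this could work.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch, not Accumulo code.
class RefreshTracker {
  // Transaction ids already covered by a previous metadata reload.
  private final Set<Long> satisfied = new HashSet<>();
  private int reloadCount = 0;

  // Handles a refresh request for txid. pendingTxids is an assumed stand-in for
  // the transaction ids found in the tablet's refresh column at reload time.
  // Returns true if an actual reload was performed, false if it was skipped.
  synchronized boolean refresh(long txid, Set<Long> pendingTxids) {
    if (satisfied.contains(txid)) {
      return false; // a recent reload already satisfies this request
    }
    reloadCount++; // stands in for re-reading the tablet's metadata
    satisfied.addAll(pendingTxids);
    satisfied.add(txid);
    return true;
  }

  synchronized int reloadCount() {
    return reloadCount;
  }
}
```

With three transactions pending on the same tablet, the first request triggers the reload and the other two are skipped. A real implementation would also need to forget old transaction ids at some point, which this sketch ignores.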
Using only an RPC, I think we could do the following.
* Scan the metadata table for tablets with load markers and build a map of
`Map<TServerLoc, List<KeyExtent>>`. The map may contain some tservers that do
not actually need to refresh because they were not the location when the files
were set, but that does not hurt anything.
* Continue sending synchronous RPCs until a positive response has been
received for everything in the map.
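The two steps above could be sketched roughly as follows. This is only an illustration under assumptions: plain strings stand in for `TServerLoc` and `KeyExtent`, `RefreshRpc` is a hypothetical interface for the synchronous call, and `maxRounds` is added here only so the sketch terminates (the description above retries until everything succeeds).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Accumulo code.
class RpcOnlyRefresh {
  // Assumed stand-in for a synchronous refresh RPC; true means a positive response.
  interface RefreshRpc {
    boolean refresh(String tserver, List<String> extents);
  }

  // Each element is {extent, tserver location}, as if read from a metadata scan
  // of tablets with load markers.
  static Map<String, List<String>> groupByLocation(List<String[]> tablets) {
    Map<String, List<String>> byServer = new HashMap<>();
    for (String[] tablet : tablets) {
      byServer.computeIfAbsent(tablet[1], k -> new ArrayList<>()).add(tablet[0]);
    }
    return byServer;
  }

  // Retries until every tserver has answered positively; returns the rounds used.
  static int refreshAll(Map<String, List<String>> byServer, RefreshRpc rpc, int maxRounds) {
    Map<String, List<String>> remaining = new HashMap<>(byServer);
    int rounds = 0;
    while (!remaining.isEmpty() && rounds < maxRounds) {
      rounds++;
      Iterator<Map.Entry<String, List<String>>> it = remaining.entrySet().iterator();
      while (it.hasNext()) {
        Map.Entry<String, List<String>> entry = it.next();
        boolean ok;
        try {
          ok = rpc.refresh(entry.getKey(), entry.getValue());
        } catch (RuntimeException e) {
          ok = false; // treat a failed or timed-out call as "retry later"
        }
        if (ok) {
          it.remove();
        }
      }
    }
    if (!remaining.isEmpty()) {
      throw new IllegalStateException("not refreshed: " + remaining.keySet());
    }
    return rounds;
  }
}
```

Everything here lives in the calling thread's memory, which is the trade-off described above: the thread stays busy, and the map is lost if the process dies.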
Barring all the edge cases, this may be faster. The fate thread would be
tied up for however long this takes. If all the fate threads are tied up, more
can be added as long as memory supports it. There are other fate operations
that hold a bunch of stuff in memory and/or tie up a fate thread for an
indefinite time period while waiting for a condition to become true.