milleruntime commented on pull request #1803:
URL: https://github.com/apache/accumulo/pull/1803#issuecomment-763612129


   > The hiccup I can see with this is that I don't know what "name of the 
tserver" is. The most common identifier for a tserver is the hostname and port, 
but these are not necessarily unique (tservers fail and restart on the same 
host/port).
   
   I think that is fine for recovery sorting.  If a tserver fails or restarts, 
then I don't think we want to restart the sort anyway.  I don't know if it is 
possible, or worth the trouble, to try to recover an interrupted sort.  I was 
thinking hostname and port was enough to indicate a sort was kicked off and by 
which server.  We also wouldn't need additional code to store the unique ID if 
we use hostname and port.
   
   Good point about updating the GC.  I think we may also need to introduce an 
age-off property (or reuse another) to know when we should give up waiting for 
a sort to finish.  I am not sure how long we would want to wait for the sort to 
finish.  Waiting too long will prevent tablets from loading, but larger 
clusters can take hours to recover.
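   
   A rough sketch of the idea, in Java.  Everything here is hypothetical and 
not existing Accumulo code: the `SortMarker` class, its fields, and the 
`maxAge` argument (standing in for whatever age-off property ends up being 
used) are assumptions for illustration only.  It records which server started 
a sort by hostname and port, and answers whether the sort has been running 
longer than the age-off limit.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical marker: which tserver started a recovery sort, and when. */
public final class SortMarker {
    private final String host;
    private final int port;
    private final Instant startedAt;

    public SortMarker(String host, int port, Instant startedAt) {
        this.host = host;
        this.port = port;
        this.startedAt = startedAt;
    }

    /** Identifies the sorting server by hostname and port only (no unique ID). */
    public String owner() {
        return host + ":" + port;
    }

    /**
     * Returns true once the sort has been running strictly longer than maxAge.
     * maxAge stands in for an assumed age-off property value.
     */
    public boolean isExpired(Instant now, Duration maxAge) {
        return Duration.between(startedAt, now).compareTo(maxAge) > 0;
    }
}
```

   With a limit of a few hours, a marker older than that would be treated as 
abandoned so something else could take over or clean it up; what the right 
limit is remains the open question above.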


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

