ddanielr opened a new pull request, #6219:
URL: https://github.com/apache/accumulo/pull/6219

   This PR adds a property that lets the manager skip scanning the metadata 
table when it deletes a table. 
   
   ### DeleteMarker Creation Optimization ###
   
   When the manager deletes a table, it first performs an optimization step: 
it creates a batch scanner with 8 threads (not configurable), scans the file 
references of all other tables in the metadata table, and checks that none of 
them point to files under the given table's volumes.
   
   If no such references are found, the manager deletes the table's data on 
the volumes directly, as opposed to writing delete markers and leaving the 
tablet file deletions to the GC.
   
   This is a nice optimization when dealing with a small, static set of 
tables. However, when tables are created dynamically, these scans can cause 
unnecessary delays and/or hanging fate processes, because all metadata tablets 
must be scanned in order to process a single table delete. 
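
   The decision described above can be sketched roughly as follows. This is a 
simplified model, not the Accumulo code: `skipRefScan` is a hypothetical 
stand-in for the new property (its real name is not given here), and 
`CleanupAction` and `chooseCleanup` are made-up names. It also assumes that 
when the scan is skipped the manager falls back to delete markers, since 
shared files can no longer be ruled out.

```java
import java.util.function.BooleanSupplier;

public class TableDeleteCleanupSketch {

  /** The two cleanup paths described in the text. */
  enum CleanupAction { DELETE_DIRECTLY, WRITE_DELETE_MARKERS }

  /**
   * @param skipRefScan hypothetical stand-in for the new property
   * @param otherTablesShareFiles the expensive metadata scan, only run on demand
   */
  static CleanupAction chooseCleanup(boolean skipRefScan,
      BooleanSupplier otherTablesShareFiles) {
    if (skipRefScan) {
      // Assumption: without the scan, shared files cannot be ruled out,
      // so write delete markers and let the GC remove the files.
      return CleanupAction.WRITE_DELETE_MARKERS;
    }
    // The scan runs only on this path.
    return otherTablesShareFiles.getAsBoolean()
        ? CleanupAction.WRITE_DELETE_MARKERS
        : CleanupAction.DELETE_DIRECTLY;
  }

  public static void main(String[] args) {
    // Scan enabled, no shared refs found: files can be deleted directly.
    System.out.println(chooseCleanup(false, () -> false));
    // Scan skipped: the supplier is never invoked.
    System.out.println(chooseCleanup(true, () -> false));
  }
}
```

   Passing the scan as a supplier keeps the costly part lazy, so the skip 
path never touches the metadata table.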
   
   ### Unnecessary File Ref Counting ###
    
   The batch scanner only needs to find a single shared file ref to trigger 
the creation of delete markers, because the code only checks whether 
`refCount` is equal to zero.
   
   However, the existing code counts every ref it finds before making that 
check. This PR adds a fast break to the batch scanner's iterator loop so that 
the scan stops at the first ref found.
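
   The difference between the two loops can be illustrated with a minimal 
sketch. It uses plain strings instead of the real batch scanner entries; the 
class name, method names, sample paths, and the `"/" + tableId + "/"` match 
are assumptions made for illustration, not the code in this PR.

```java
import java.util.List;

public class EarlyBreakSketch {

  /** Old behaviour: every entry is examined and every match is counted. */
  static int countAllRefs(Iterable<String> fileRefs, String tableId) {
    int refCount = 0;
    for (String ref : fileRefs) {
      if (ref.contains("/" + tableId + "/")) {
        refCount++;
      }
    }
    return refCount;
  }

  /** New behaviour: stop at the first match, so refCount is 0 or 1. */
  static int countUntilFirstRef(Iterable<String> fileRefs, String tableId) {
    int refCount = 0;
    for (String ref : fileRefs) {
      if (ref.contains("/" + tableId + "/")) {
        refCount++;
        break; // one ref is enough to make refCount non-zero
      }
    }
    return refCount;
  }

  public static void main(String[] args) {
    // Made-up sample references held by other tables.
    List<String> refs = List.of(
        "hdfs://nn/accumulo/tables/2/t-0001/F1.rf",
        "hdfs://nn/accumulo/tables/5/t-0002/F2.rf",
        "hdfs://nn/accumulo/tables/5/t-0003/F3.rf");
    // Both variants agree on the only question asked: is refCount zero?
    System.out.println(countAllRefs(refs, "5") == 0);
    System.out.println(countUntilFirstRef(refs, "5") == 0);
  }
}
```

   Since the caller only compares the result with zero, both variants lead to 
the same decision; the second simply stops reading entries sooner.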

