keith-turner commented on pull request #2096:
URL: https://github.com/apache/accumulo/pull/2096#issuecomment-842584575


   Omitting a lot of detail, the following are the basics of what the current 
system enables.  The first step uses a pluggable planner.
   
   1. Plan compaction: Take the tablet's current files and the compactions 
currently running for the tablet, and emit a set of compaction jobs.  A 
compaction job is a set of files to compact, a priority, and a destination queue.
   2. Queue jobs: Attempt to queue the jobs emitted by the planner by doing 
the following.
      1. Cancel anything that the planner previously queued for the tablet.  If 
this fails, go back to the planning step and try again, because things 
probably changed during planning.
      2. Queue the new jobs from the planner on the desired priority queues.
   3. When an internal or external compactor thread finishes a task, it takes 
the next highest priority job from the queue and actually runs the compaction.
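
   The three steps above can be sketched roughly as follows.  This is only an 
illustration under my own assumptions, not the actual Accumulo code: every name 
here (`CompactionQueueSketch`, `Job`, `Planner`, `queueJobs`, `takeNext`) is 
hypothetical, files are plain strings, and a single lock stands in for the real 
concurrency handling.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch; none of these names come from Accumulo.
class CompactionQueueSketch {

  enum Status { QUEUED, RUNNING, CANCELED }

  /** A job is a set of files to compact, a priority, and a destination queue. */
  static final class Job {
    final String tablet;
    final Set<String> files;
    final int priority;
    final String queue;
    private final AtomicReference<Status> status = new AtomicReference<>(Status.QUEUED);

    Job(String tablet, Set<String> files, int priority, String queue) {
      this.tablet = tablet;
      this.files = files;
      this.priority = priority;
      this.queue = queue;
    }

    // Both transitions only succeed from QUEUED, so a job is canceled or run, never both.
    boolean cancel() { return status.compareAndSet(Status.QUEUED, Status.CANCELED); }
    boolean start() { return status.compareAndSet(Status.QUEUED, Status.RUNNING); }
  }

  /** Step 1: the pluggable planner (assumed shape). */
  interface Planner {
    List<Job> plan(String tablet, Set<String> files, List<Job> running);
  }

  private static final Comparator<Job> HIGHEST_FIRST =
      Comparator.comparingInt((Job j) -> j.priority).reversed();

  private final Map<String, PriorityQueue<Job>> queues = new HashMap<>();
  private final Map<String, List<Job>> queuedByTablet = new HashMap<>();

  /** Step 2: returns false when the caller should go back to planning. */
  synchronized boolean queueJobs(String tablet, List<Job> planned) {
    boolean allCanceled = true;
    List<Job> previous = queuedByTablet.remove(tablet);
    if (previous != null) {
      for (Job old : previous) {
        // Fails if a compactor already started the job, i.e. things changed.
        allCanceled &= old.cancel();
      }
    }
    if (!allCanceled) {
      return false;
    }
    for (Job job : planned) {
      queues.computeIfAbsent(job.queue, q -> new PriorityQueue<>(HIGHEST_FIRST)).add(job);
    }
    queuedByTablet.put(tablet, new ArrayList<>(planned));
    return true;
  }

  /** Step 3: a compactor thread takes the highest priority job that is still queued. */
  synchronized Job takeNext(String queue) {
    PriorityQueue<Job> q = queues.get(queue);
    if (q == null) {
      return null;
    }
    Job job;
    while ((job = q.poll()) != null) {
      if (job.start()) {
        return job; // canceled jobs are skipped because start() fails for them
      }
    }
    return null;
  }

  public static void main(String[] args) {
    CompactionQueueSketch sketch = new CompactionQueueSketch();
    // Trivial planner: one job with all files, on a queue named "small".
    Planner planner = (tablet, files, running) ->
        List.of(new Job(tablet, files, files.size(), "small"));
    Set<String> files = Set.of("F1.rf", "F2.rf");
    while (!sketch.queueJobs("tablet1", planner.plan("tablet1", files, List.of()))) {
      // Cancel failed, so plan again with the latest state.
    }
    Job next = sketch.takeNext("small");
    System.out.println("compacting " + next.files + " at priority " + next.priority);
  }
}
```

   In this sketch canceled jobs are left in the queue and dropped lazily when a 
compactor polls them, which keeps cancellation cheap.  The point is that 
queueing, cancellation, and priority are all visible to the caller.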
   
   I don't think the interface `BiFunction<Set<File>, CompactionConfig, 
Future<File>>` would be sufficient to achieve this functionality.  It would not 
support the current functionality of priority queues and canceling queued work 
when things change (new files arrive, jobs start or finish during planning).  In 
the current implementation the priority queue for local compactions is very 
precise.  For external compactions there is essentially a global priority queue 
that is approximate (eventually consistent): it may not always start the highest 
priority job next, but it usually will.
   
   



