keith-turner edited a comment on pull request #2096:
URL: https://github.com/apache/accumulo/pull/2096#issuecomment-842584575
Omitting a lot of detail, the following are the basics of what the current
system does. The first step uses a pluggable planner.
1. Plan compaction: Take the tablet's current files and the compactions
currently running for the tablet, and emit a set of compaction jobs. A
compaction job is a set of files to compact, a priority, and a destination queue.
2. Queue jobs: An attempt to queue the jobs emitted by the planner is
made by doing the following.
   1. Cancel anything that was previously queued by the planner for the
   tablet. If this fails, go back to the planning step and try again, because
   things probably changed during planning.
   2. Queue the new jobs from the planner on the desired priority queues.
3. When an internal or external compactor thread finishes a task, it takes
the next highest priority job from the queue and actually runs a compaction.
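
To make the three steps above concrete, here is a minimal sketch of the cancel-then-queue and take-next-highest-priority behavior. It is not Accumulo's actual code: `CompactionQueueSketch`, `Job`, `requeue`, and `takeNext` are hypothetical names made up for illustration, and the planner itself is left out (its output is just passed in as a list of jobs).

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Set;

// Hypothetical illustration only; these are not Accumulo classes.
public class CompactionQueueSketch {

  // A job as described in step 1: files to compact, a priority, a destination queue.
  static final class Job {
    final Set<String> files;
    final int priority;
    final String queue;

    Job(Set<String> files, int priority, String queue) {
      this.files = files;
      this.priority = priority;
      this.queue = queue;
    }
  }

  // Higher priority values are taken first.
  private static final Comparator<Job> HIGHEST_FIRST =
      Comparator.<Job>comparingInt(j -> j.priority).reversed();

  // One priority queue per destination queue name.
  private final Map<String, PriorityQueue<Job>> queues = new HashMap<>();

  // Step 2: cancel what the planner queued earlier for a tablet, then queue the new jobs.
  // Returns false (and changes nothing) if any earlier job is no longer queued,
  // for example because a compactor already took it; the caller should then re-plan.
  synchronized boolean requeue(List<Job> previouslyQueued, List<Job> planned) {
    for (Job old : previouslyQueued) {
      PriorityQueue<Job> q = queues.get(old.queue);
      if (q == null || !q.contains(old)) {
        return false;
      }
    }
    for (Job old : previouslyQueued) {
      queues.get(old.queue).remove(old);
    }
    for (Job job : planned) {
      queues.computeIfAbsent(job.queue, k -> new PriorityQueue<>(HIGHEST_FIRST)).add(job);
    }
    return true;
  }

  // Step 3: a compactor thread that finished a task takes the highest priority job,
  // or null if the named queue is empty or unknown.
  synchronized Job takeNext(String queueName) {
    PriorityQueue<Job> q = queues.get(queueName);
    return q == null ? null : q.poll();
  }
}
```

The point of the sketch is that queuing needs to be able to fail and send the caller back to planning, which is why `requeue` returns a boolean rather than a future for a single output file.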
I don't think the interface `BiFunction<Set<File>, CompactionConfig,
Future<File>>` would be sufficient to achieve this functionality. It would not
support the current functionality of priority queues and canceling queued work
when things change (new files arrive, jobs start or finish during planning). In
the current impl the priority queue for local compactions is very precise. For
external compactions there is essentially a global priority queue that is
approximate (eventually consistent); it may not always start the highest
priority job next, but usually will.