keith-turner opened a new issue, #3528:
URL: https://github.com/apache/accumulo/issues/3528

   In the elasticity branch the manager scans tablets and for each tablet it 
calls the compaction planner plugin to generate zero or more compaction jobs.  
The generated jobs are placed in a bounded priority queue in the manager.  
Consider the following case which could happen with the current code in the 
elasticity branch.
   
    1. The manager scans all tablets and generates a single compaction job for 
tablet A.  The job is placed in the priority queue.
    2. The manager scans all tablets and generates zero compaction jobs for 
tablet A, because the configuration or the tablet's files changed.
    3. The job generated by the first scan is still in the queue and is picked 
up by a compactor process to work on.
   
   Ideally the existing job would be removed from the queue in step 2 when the 
planner generated zero jobs for tablet A.
   
   One possible way to handle this is to add two methods to the bounded 
priority queue of compaction jobs, `beginTabletScan()` and 
`finishTabletScan()`.  In TabletGroupWatcher we could do the following on each 
full metadata scan.
   
   1. call beginTabletScan() on the job queue
   2. scan all tablets adding new compaction jobs
   3. call finishTabletScan() on the job queue.  This call should cause it to 
delete any jobs that were added before beginTabletScan() was called.
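   
   A minimal sketch of the three steps above, to make the idea concrete.  
Everything in it is an assumption for illustration: the class name 
`ScanAwareJobQueue`, the `Entry` type, using a string as the tablet id, and 
keeping one job per tablet are all hypothetical and are not the existing queue 
code in the elasticity branch.  It stamps each job with a scan generation so 
that `finishTabletScan()` can drop anything added before the latest 
`beginTabletScan()`.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical sketch; names and structure are assumptions, not Accumulo's actual queue.
class ScanAwareJobQueue {

  // Stand-in for a queued compaction job (one job per tablet to keep the sketch short).
  static final class Entry {
    final String tablet;
    final int priority;
    final long generation; // scan during which the job was added
    final long seq;        // tie-breaker so equal priorities remain distinct

    Entry(String tablet, int priority, long generation, long seq) {
      this.tablet = tablet;
      this.priority = priority;
      this.generation = generation;
      this.seq = seq;
    }
  }

  private final int maxSize;
  private final TreeSet<Entry> byPriority = new TreeSet<>((a, b) -> {
    int c = Integer.compare(b.priority, a.priority); // highest priority first
    return c != 0 ? c : Long.compare(a.seq, b.seq);
  });
  private final Map<String, Entry> byTablet = new HashMap<>();
  private long generation = 0;
  private long nextSeq = 0;

  ScanAwareJobQueue(int maxSize) {
    if (maxSize < 1) {
      throw new IllegalArgumentException("maxSize must be positive");
    }
    this.maxSize = maxSize;
  }

  // Step 1: mark the start of a full metadata scan.
  synchronized void beginTabletScan() {
    generation++;
  }

  // Step 2: add or replace the job for a tablet, stamped with the current generation.
  synchronized boolean add(String tablet, int priority) {
    Entry old = byTablet.remove(tablet);
    if (old != null) {
      byPriority.remove(old);
    }
    if (byPriority.size() >= maxSize) {
      Entry lowest = byPriority.last();
      if (lowest.priority >= priority) {
        return false; // queue is full of jobs that are at least as important
      }
      byPriority.remove(lowest);
      byTablet.remove(lowest.tablet);
    }
    Entry e = new Entry(tablet, priority, generation, nextSeq++);
    byPriority.add(e);
    byTablet.put(tablet, e);
    return true;
  }

  // Step 3: remove every job added before the most recent beginTabletScan().
  synchronized int finishTabletScan() {
    int removed = 0;
    Iterator<Entry> it = byPriority.iterator();
    while (it.hasNext()) {
      Entry e = it.next();
      if (e.generation < generation) {
        it.remove();
        byTablet.remove(e.tablet);
        removed++;
      }
    }
    return removed;
  }

  // Returns the tablet of the highest priority job, or null if the queue is empty.
  synchronized String poll() {
    Entry e = byPriority.pollFirst();
    if (e == null) {
      return null;
    }
    byTablet.remove(e.tablet);
    return e.tablet;
  }

  synchronized int size() {
    return byPriority.size();
  }
}
```

   With this shape, a second scan that re-adds a job for tablet B but 
generates nothing for tablet A would leave A's job with an older generation, 
so `finishTabletScan()` would remove it before a compactor could pick it up.  
A job polled by a compactor while a scan is still in progress is not covered 
by this sketch.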
   
   Not sure if this is the best way to solve the problem.
   
   

