keith-turner commented on issue #4454:
URL: https://github.com/apache/accumulo/issues/4454#issuecomment-2051984089

   In 2.1 and forward user compactions reserve a set of files per tablet and 
then create one or more jobs to compact these reserved files.  The reserved 
files are not available for system compactions.  It is possible that the 
following scenario happens.
   
    1. Tablet A has many files that are currently candidates for system 
compaction to reduce the tablet's file count
    2. A user compaction is started that includes Tablet A and its files are 
reserved.
    3. Once the files are reserved for Tablet A a compaction job CJ1 is queued 
to process some subset of the files
    4. CJ1 sits in a queue for hours and does not run.
    5. System compactions are prevented from running during the time the files 
are reserved.
   
   The property referenced in this ticket deals with the above situation by 
allowing the system compaction to cancel the reservation if nothing has 
happened for a while.   The user compaction will eventually acquire a new 
reservation on the files after this happens.
   
   The way the reservation cancelation works, it can only happen when zero jobs 
have run against a tablet for a user compaction. Once a single job has run, the 
reservation can not be canceled.  This is done to avoid wasting work.  
Canceling the reservation when zero jobs have run wastes no work.
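   
   As a rough illustration of the rule above, here is a minimal sketch.  It is 
not the 2.1 code; the class, its methods, and the timeout parameter are all 
hypothetical names made up for this example.

```java
import java.time.Duration;

// Hypothetical model of a per-tablet user compaction reservation.
// None of these names come from Accumulo; they only illustrate the rule.
final class UserCompactionReservation {
  private final long reservedAtMillis;
  private int jobsCompleted = 0;
  private boolean canceled = false;

  UserCompactionReservation(long reservedAtMillis) {
    this.reservedAtMillis = reservedAtMillis;
  }

  // Called when a compaction job for this reservation finishes.
  void jobCompleted() {
    jobsCompleted++;
  }

  // A system compaction may cancel the reservation only when zero jobs have
  // run and the reservation has sat idle for at least the (assumed) timeout.
  boolean tryCancel(long nowMillis, Duration timeout) {
    if (canceled) {
      return false; // already canceled, nothing to do
    }
    boolean idleTooLong = nowMillis - reservedAtMillis >= timeout.toMillis();
    if (jobsCompleted == 0 && idleTooLong) {
      canceled = true; // no job has run, so no work is wasted
      return true;
    }
    return false;
  }

  boolean isCanceled() {
    return canceled;
  }
}
```

   With this sketch, a reservation whose job CJ1 never ran can be canceled once 
the timeout passes, while a reservation with even one completed job can not be 
canceled no matter how long it has existed.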
   
   In elasticity there is currently nothing in place for system compactions to 
cancel a user compaction reservation.  In elasticity there is a new selected 
files column that holds the per tablet reserved files.  This selected files 
column is created by the Fate operation that drives user compactions.  The 
tablet group watcher sees it and creates compaction jobs based on it.  The 
coordinator modifies it using conditional mutations as compaction jobs run.
   
   We could possibly do something like the following in elasticity.
   
    1. Add a new field to the selected files column with a count of the number 
of jobs that have completed.  This would be updated by the coordinator when 
it commits compactions using conditional mutations.
    2. When the above count is zero the tablet group watcher could queue system 
and user compaction jobs.  When the count is >0 only user compaction jobs 
could be scheduled.
    3. When the selected files column exists, the compaction coordinator will 
only allow system compaction jobs to start when the count is zero.  If a system 
compaction does start it will remove the selected files column, forcing the 
fate op to recreate it when the system compaction is done.
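   
   The three steps above could be sketched roughly as follows.  This is only 
a toy model under stated assumptions: an AtomicReference compare-and-set stands 
in for a conditional mutation on the tablet's metadata, and every class and 
method name here is hypothetical, not an existing Accumulo API.

```java
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

// Toy model of one tablet's selected files column. Hypothetical names only.
final class SelectedFilesSketch {

  // Immutable stand-in for the column value, including the proposed count.
  static final class SelectedFiles {
    final Set<String> files;
    final int completedJobs;

    SelectedFiles(Set<String> files, int completedJobs) {
      this.files = Set.copyOf(files);
      this.completedJobs = completedJobs;
    }
  }

  // null means the column does not exist for this tablet.
  private final AtomicReference<SelectedFiles> column = new AtomicReference<>();

  // Fate op: create the column only if it is absent.
  boolean select(Set<String> files) {
    return column.compareAndSet(null, new SelectedFiles(files, 0));
  }

  // Step 1: the coordinator bumps the count when it commits a user job.
  // Fails if the column is missing or changed since it was read.
  boolean commitUserJob() {
    SelectedFiles seen = column.get();
    return seen != null
        && column.compareAndSet(seen, new SelectedFiles(seen.files, seen.completedJobs + 1));
  }

  // Step 2: the tablet group watcher queues system jobs only at count zero.
  boolean canQueueSystemJob() {
    SelectedFiles seen = column.get();
    return seen == null || seen.completedJobs == 0;
  }

  // Step 3: a system job may start only at count zero, and starting it
  // removes the column so the fate op has to recreate it later.
  boolean startSystemJob() {
    SelectedFiles seen = column.get();
    if (seen == null) {
      return true; // no user compaction selection to conflict with
    }
    if (seen.completedJobs > 0) {
      return false; // user compaction has made progress, do not discard it
    }
    return column.compareAndSet(seen, null);
  }

  boolean hasSelection() {
    return column.get() != null;
  }
}
```

   The compare-and-set in each method fails if another actor changed the column 
after it was read, which is the property the conditional mutations would need 
to provide.  Note the sketch has no notion of time, matching the open question 
about timeouts.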
   
   However, I am not sure how to handle time in the above situation.  The above 
is conceptually what 2.1 does, but it's implemented in a completely different way 
using conditional mutations vs. in-memory data structs in a tablet server.   
Maybe we could drop the notion of a timeout and always let system compactions 
run.  The reason the timeout was added was to prevent system compactions from 
starving user compactions in the case where a tablet always has new small files 
arriving.

