keith-turner opened a new issue, #4664:
URL: https://github.com/apache/accumulo/issues/4664

   Compactors poll for work with exponential backoff.  When all compactors have 
been idle for a while and a surge of jobs arrives, it can take some time for all 
of them to start working.  
   
   One possible way to improve this is to modify how polling works. The 
coordinator could hold requests from compactors for a time period when nothing 
is currently queued.  When something is queued, it could be given immediately to 
a held compactor RPC request.  We would not want to hold an RPC request for too 
long because it could belong to a dead compactor.  The coordinator could hold a 
request for some time period, like 60 to 90 seconds, and return nothing if the 
queue is still empty.  If the compactor is still alive, it can make another 
request for work, which will be held again if the queue is currently empty.
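
   The hold-and-release behavior described above could be sketched roughly as 
follows.  This is only an illustration of the idea, not Accumulo code: the class 
name `HoldingJobQueue`, the methods `requestJob` and `addJob`, and the use of a 
plain `String` for a job are all made up for the example, and it assumes Java 9+ 
for `CompletableFuture.completeOnTimeout`.

```java
import java.util.ArrayDeque;
import java.util.Optional;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a coordinator-side queue that holds requests
// from compactors while no job is available.
class HoldingJobQueue {
  private final Queue<String> jobs = new ArrayDeque<>();
  private final Queue<CompletableFuture<Optional<String>>> held = new ArrayDeque<>();
  private final long holdMillis;

  HoldingJobQueue(long holdMillis) {
    this.holdMillis = holdMillis;
  }

  // Called when a compactor asks for work. Returns a job right away if one
  // is queued; otherwise holds the request until a job arrives or the hold
  // time expires, in which case the future completes with an empty Optional.
  synchronized CompletableFuture<Optional<String>> requestJob() {
    String job = jobs.poll();
    if (job != null) {
      return CompletableFuture.completedFuture(Optional.of(job));
    }
    CompletableFuture<Optional<String>> request = new CompletableFuture<>();
    request.completeOnTimeout(Optional.empty(), holdMillis, TimeUnit.MILLISECONDS);
    held.add(request);
    return request;
  }

  // Called when a new job is queued. Hands the job to the oldest held
  // request that has not timed out yet; queues the job if there is none.
  synchronized void addJob(String job) {
    CompletableFuture<Optional<String>> request;
    while ((request = held.poll()) != null) {
      // complete() returns false if the request already timed out,
      // so the job is never handed to an expired request.
      if (request.complete(Optional.of(job))) {
        return;
      }
    }
    jobs.add(job);
  }
}
```

   In this sketch a timed-out request stays in the `held` queue until the next 
`addJob` call discards it, which a real implementation would probably want to 
clean up eagerly.  It also cannot tell whether the compactor behind a held 
request is still alive, which is why the hold time needs to stay bounded.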
   
   
   Decreasing this latency is good for a system that has lots of small files 
arriving constantly at tablets.  With a polling model like this and #4618, 
very low latency could be achieved for compaction of newly bulk imported files.  
For minor compacted files, there would not be a signal like the one #4618 
provides for bulk imports to queue compaction jobs for a tablet.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
