keith-turner opened a new issue, #5177:
URL: https://github.com/apache/accumulo/issues/5177

   **Describe the bug**
   
   When a compactor requests a compaction job from the manager, the following 
things happen:
   
    * The request runs in the general client thrift thread pool (note that 
this thread pool slowly and automatically grows).
    * The request reads tablet metadata and then writes a conditional mutation 
to add the compaction to the tablet metadata.
   
   When there are many compactors and there is a problem writing to the 
metadata table, the number of these threads grows without bound. I saw this 
problem occur, and it was probably caused by not having the fixes for #5155 and 
#5168. However, many other things could cause threads to get stuck or be 
slow.
   
   **Expected behavior**
   
   The number of threads concurrently executing compaction reservations should 
be constrained in some way.  This must be done without blocking other manager 
functionality.  A simple way to achieve this would be to add a semaphore 
around this functionality. However, that would block general thrift threads, 
which execute all manager functionality, and this could cause other problems. 
Maybe that is OK since the manager thread pool always grows.
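
A minimal sketch of the semaphore idea in plain Java, assuming the wait for a permit is bounded so that a thrift thread is never parked indefinitely. `ReservationLimiter`, `tryReserve`, and the wait parameter are hypothetical names used only for illustration; they are not existing Accumulo classes or properties.

```java
import java.util.Optional;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical helper (not Accumulo code): caps concurrent reservation attempts. */
class ReservationLimiter {
  private final Semaphore permits;

  ReservationLimiter(int maxConcurrent) {
    this.permits = new Semaphore(maxConcurrent);
  }

  /**
   * Runs the reservation if a permit frees up within maxWaitMillis. Returns
   * empty when the limiter is saturated, so the caller can tell the compactor
   * to retry later instead of holding a thrift thread.
   */
  <T> Optional<T> tryReserve(Supplier<T> reservation, long maxWaitMillis) {
    boolean acquired;
    try {
      acquired = permits.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt status
      return Optional.empty();
    }
    if (!acquired) {
      return Optional.empty();
    }
    try {
      return Optional.ofNullable(reservation.get());
    } finally {
      permits.release(); // always return the permit, even on failure
    }
  }

  int availablePermits() {
    return permits.availablePermits();
  }
}
```

With a bounded wait, the worst case for a thrift thread is `maxWaitMillis` plus the reservation itself, at the cost of compactors occasionally getting an empty response when the metadata table is slow.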
   
   One possible way to achieve this goal would be to use #5018 and have the 
async code run compaction reservations in a limited thread pool. #5018 was 
created for performance reasons, but it could also easily satisfy this goal of 
protecting manager memory.  
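
A rough sketch of the limited thread pool idea, assuming reservations are handed off as async tasks. It does not reflect the actual design in #5018; `BoundedReservationExecutor` and its methods are hypothetical names, and only standard `java.util.concurrent` classes are used.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical helper (not Accumulo code): runs reservations on a fixed-size pool. */
class BoundedReservationExecutor implements AutoCloseable {
  private final ThreadPoolExecutor pool;

  BoundedReservationExecutor(int threads, int queueCapacity) {
    // Fixed thread count plus a bounded queue, so neither threads nor
    // queued requests can grow without limit.
    this.pool = new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
        new ArrayBlockingQueue<>(queueCapacity), new ThreadPoolExecutor.AbortPolicy());
  }

  /** Returns a future that fails fast when both the pool and the queue are full. */
  <T> CompletableFuture<T> submit(Supplier<T> reservation) {
    try {
      return CompletableFuture.supplyAsync(reservation, pool);
    } catch (RejectedExecutionException e) {
      return CompletableFuture.failedFuture(e);
    }
  }

  int largestPoolSize() {
    return pool.getLargestPoolSize();
  }

  @Override
  public void close() {
    pool.shutdown();
  }
}
```

The calling thread returns immediately with a future, so slow metadata writes would tie up at most `threads` workers plus `queueCapacity` queued requests rather than an ever-growing number of thrift threads.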
   
   Another possible way to achieve this goal would be to run another thrift 
server with its own port for getting compaction jobs and limit that server's 
thread pool size.
   

