pchang388 opened a new issue #12055:
URL: https://github.com/apache/druid/issues/12055


   ### Affected Version
   0.22.0
   
   ### Description
   We have a working Druid cluster running version 0.22.0:
   - 5 Middle Managers with 10 peons each (50 total worker slots)
   - According to the formula, we have 5 slots available for auto compaction:
     - min(totalWorkerCapacity * compactionTaskSlotRatio, maxCompactionTaskSlots), with the default compactionTaskSlotRatio = 0.1
   - We have auto compaction enabled for 4 of the 5 datasources (with default settings)
   - Each of the 4 enabled datasources has "Max num concurrent sub tasks"/maxNumConcurrentSubTasks set to 1
   - When auto compaction runs (every 30 minutes by default), only some datasources are compacted in that run, because 2+ auto compaction tasks are launched per datasource instead of the expected 1 per datasource
   - I do not see any related error messages in the overlord logs, but I can attach log dumps if that would help
   - Screenshots of the relevant settings and examples are attached
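
For reference, the slot arithmetic above can be sketched as follows. This is only an illustration of the formula as quoted, not Druid's actual code: the function name is hypothetical, rounding down is an assumption, and the default used for maxCompactionTaskSlots (effectively unlimited) is also an assumption.

```python
import math

# Default ratio quoted in this report.
DEFAULT_SLOT_RATIO = 0.1
# Assumption: maxCompactionTaskSlots is effectively unlimited by default.
ASSUMED_MAX_SLOTS = 2**31 - 1


def compaction_task_slots(total_worker_capacity,
                          ratio=DEFAULT_SLOT_RATIO,
                          max_slots=ASSUMED_MAX_SLOTS):
    """min(totalWorkerCapacity * compactionTaskSlotRatio, maxCompactionTaskSlots).

    Rounding down to a whole slot is an assumption made for this sketch.
    """
    return min(math.floor(total_worker_capacity * ratio), max_slots)


# 5 Middle Managers with 10 peons each.
total_capacity = 5 * 10
print(compaction_task_slots(total_capacity))  # 5
```

With 50 worker slots and the default ratio, this gives the 5 compaction slots mentioned above.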
   
   Because a single datasource uses 2+ tasks (sometimes all 5) of the 5 slots available for auto compaction, some of the other datasources are rarely or never compacted: the task slots are already full. From my reading of the documentation, the "Max num concurrent sub tasks"/maxNumConcurrentSubTasks property should restrict each datasource to 1 auto compaction task, but that does not seem to be the case.
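
To make the symptom concrete, here is a toy model of the slot starvation described above. It is not how Druid schedules compaction; the function, the datasource names, and the queue contents are all hypothetical, and the only inputs taken from this report are the 5 available slots and the expectation of 1 task per datasource.

```python
from collections import Counter


def fill_slots(pending, total_slots, per_datasource_cap=None):
    """Assign queued compaction tasks to slots in queue order.

    pending: list of datasource names, one entry per queued task.
    per_datasource_cap: optional limit on concurrent tasks per datasource.
    """
    running = Counter()
    for datasource in pending:
        if sum(running.values()) >= total_slots:
            break  # every compaction slot is taken
        if per_datasource_cap is not None and running[datasource] >= per_datasource_cap:
            continue  # this datasource already has its allowed share
        running[datasource] += 1
    return dict(running)


# Hypothetical queue: two datasources have several intervals waiting.
pending = ["ds_a"] * 3 + ["ds_b"] * 2 + ["ds_c", "ds_d"]

# Behavior resembling what I observe: no per-datasource limit is applied.
print(fill_slots(pending, total_slots=5))
# {'ds_a': 3, 'ds_b': 2}

# Behavior I expected: at most 1 task per datasource.
print(fill_slots(pending, total_slots=5, per_datasource_cap=1))
# {'ds_a': 1, 'ds_b': 1, 'ds_c': 1, 'ds_d': 1}
```

In the first case ds_c and ds_d get no slot in that run, which matches the pattern of some datasources rarely being compacted.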
   
   Any help on understanding/resolving this issue would be appreciated!
   
![CompactSettings](https://user-images.githubusercontent.com/51681873/145635917-fa10256a-1638-49b0-b3ca-491f4ae99bef.png)
   
![compactTasks](https://user-images.githubusercontent.com/51681873/145635918-b3bf5fae-83aa-4b95-8de6-c31a037cab47.png)
   
![DatasourcesOverview](https://user-images.githubusercontent.com/51681873/145635920-02ccc62b-5b98-47b7-b32e-6c7c9fe2cea3.png)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]