keith-turner opened a new issue, #4088:
URL: https://github.com/apache/accumulo/issues/4088

   **Describe the bug**
   
   The default compaction planner operates with the following three constraints 
when looking for files to compact for system compactions.
   
    1. A max number of files.
    2. A max total file size; this constraint is optional and based on config.
    3. A compaction ratio.
   
   It attempts to find sets of files that satisfy all three constraints. Its 
search uses a sliding window based on the [number of 
files](https://github.com/apache/accumulo/blob/f9897862dd4e6ff4892239ff5ebeb8ed6e34bc68/core/src/main/java/org/apache/accumulo/core/spi/compaction/DefaultCompactionPlanner.java#L474-L481),
 but the set of files considered for the size constraint is a [fixed 
list](https://github.com/apache/accumulo/blob/f9897862dd4e6ff4892239ff5ebeb8ed6e34bc68/core/src/main/java/org/apache/accumulo/core/spi/compaction/DefaultCompactionPlanner.java#L459-L472).
  We need to determine whether the planner could include total file size, in 
addition to file count, in its sliding window.
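
   To make the difference between the two steps concrete, here is a minimal 
sketch of the search as described above. It is a simplified model, not the code 
in `DefaultCompactionPlanner`: the class name, the method names, and the exact 
form of the ratio test (total size at least ratio times the largest file) are 
assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Simplified model only; not the code in DefaultCompactionPlanner.
class FixedPrefixWindowSketch {

  // Assumed ratio test: the set's total size is at least ratio * its largest file.
  // Expects the set sorted from smallest to largest.
  static boolean meetsRatio(List<Long> sortedSet, double ratio) {
    if (sortedSet.size() < 2) {
      return false;
    }
    long sum = 0;
    for (long s : sortedSet) {
      sum += s;
    }
    long largest = sortedSet.get(sortedSet.size() - 1);
    return sum >= ratio * largest;
  }

  static List<Long> find(List<Long> sizes, double ratio, int maxFiles, long maxSize) {
    List<Long> sorted = new ArrayList<>(sizes);
    Collections.sort(sorted);

    // Fixed list: the longest smallest-first prefix whose total stays within maxSize.
    long sum = 0;
    int end = 0;
    while (end < sorted.size() && sum + sorted.get(end) <= maxSize) {
      sum += sorted.get(end);
      end++;
    }
    List<Long> candidates = sorted.subList(0, end);

    // Sliding window: at most maxFiles consecutive files, moved one file at a time.
    int width = Math.min(maxFiles, candidates.size());
    for (int start = 0; start + width <= candidates.size(); start++) {
      // Within a window, try the largest set first, then drop the biggest file.
      for (int k = width; k >= 2; k--) {
        List<Long> set = candidates.subList(start, start + k);
        if (meetsRatio(set, ratio)) {
          return new ArrayList<>(set);
        }
      }
    }
    return Collections.emptyList();
  }
}
```

   In this model the size cut happens once, before the window starts moving, so 
the window never sees files past the cut. For example, with sizes 1, 10, 10, 10, 
a ratio of 3, at most 3 files, and a max size of 30, the cut keeps only 1, 10, 
10 and `find` returns an empty list. Whether the real planner behaves the same 
way would need to be checked against its code.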
   
   The first step of this work is coming up with an example set of files 
where a sliding window based on the sum of sizes is needed. The example should 
prove this work is needed: a set that the current algorithm will not find but 
that an algorithm with a different sliding window technique would find.
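
   One way to hunt for such an example is to compare the planner's result 
against an exhaustive search on small inputs. The sketch below is a 
hypothetical test oracle, not part of Accumulo: the class name is made up, and 
the ratio test (total size at least ratio times the largest file) is an 
assumption. It is exponential in the number of files, so it only suits tiny 
file sets.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical test oracle; exponential in the number of files, so tiny inputs only.
class ExhaustiveCompactionSearch {

  // Returns some subset of sizes that satisfies all three constraints, or an empty list.
  static List<Long> find(List<Long> sizes, double ratio, int maxFiles, long maxSize) {
    int n = sizes.size();
    if (n > 20) {
      throw new IllegalArgumentException("too many files for exhaustive search");
    }
    // Each bit of mask selects one file; every non-empty subset is visited once.
    for (int mask = 1; mask < (1 << n); mask++) {
      int count = Integer.bitCount(mask);
      if (count < 2 || count > maxFiles) {
        continue; // constraint 1: number of files (a compaction needs at least two)
      }
      long sum = 0;
      long largest = 0;
      List<Long> subset = new ArrayList<>();
      for (int i = 0; i < n; i++) {
        if ((mask & (1 << i)) != 0) {
          long s = sizes.get(i);
          subset.add(s);
          sum += s;
          largest = Math.max(largest, s);
        }
      }
      // Constraint 2: total size. Constraint 3: assumed form of the ratio test.
      if (sum <= maxSize && sum >= ratio * largest) {
        return subset;
      }
    }
    return new ArrayList<>();
  }
}
```

   With sizes 1, 10, 10, 10, a ratio of 3, at most 3 files, and a max size of 
30, this search returns the three files of size 10. Any input where the oracle 
finds a set and the planner does not would be a candidate example, subject to 
confirming that the oracle's constraints match the planner's real ones.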
   
   **Expected behavior**
   Ideally, if a set of files exists in a tablet that meets the configured 
constraints for compaction, then the default compaction planner should be able 
to find that set if it is computationally feasible to do so.

