suneet-s opened a new pull request, #13852:
URL: https://github.com/apache/druid/pull/13852

   ### Description
   
   This change introduces the ability to have auto-compaction continuously
   schedule compaction tasks as slots become available.
   
   Previously, each run of the CompactSegments duty built an iterator based on
   the latest segment metadata available. This meant that if the compaction tasks
   that were scheduled ran into any issue, such as task lock contention or an
   interval that cannot be compacted because of a bug, auto-compaction would
   be stuck on the cluster.
       
   With this change, CompactSegments refreshes its view of the segments based
   on the `druid.coordinator.compaction.searchPolicyRefreshPeriod` property.
   Until the search policy is refreshed, auto-compaction can continue to make
   progress on other intervals even if an interval fails to compact.
       
   Compaction statistics for the cluster are refreshed only when the search
   policy is refreshed. This is because, to collect statistics, the task has
   to run through the entire list of available segments on the cluster, which
   can take a long time on large clusters.
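
The refresh behavior described above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the class `RefreshingSearchPolicy`, the method `nextBatch`, the string interval names, and the candidate count used as a stand-in for compaction statistics are all hypothetical. Only the idea of a `resetIfNeeded` step driven by a refresh period comes from the description.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Supplier;

/**
 * Hypothetical sketch: keeps a queue of candidate intervals across duty runs
 * and rebuilds it only once the refresh period has elapsed.
 */
class RefreshingSearchPolicy
{
  private final Duration refreshPeriod;
  // Assumption: stands in for reading the latest segment metadata.
  private final Supplier<List<String>> snapshotSupplier;
  private final Deque<String> pending = new ArrayDeque<>();
  private Instant lastRefresh = null;
  // Assumption: stands in for cluster-wide compaction statistics.
  private int candidateCountAtLastRefresh = 0;

  RefreshingSearchPolicy(Duration refreshPeriod, Supplier<List<String>> snapshotSupplier)
  {
    this.refreshPeriod = refreshPeriod;
    this.snapshotSupplier = snapshotSupplier;
  }

  /** Rebuilds the queue on first use or once the refresh period has elapsed. */
  boolean resetIfNeeded(Instant now)
  {
    if (lastRefresh != null
        && Duration.between(lastRefresh, now).compareTo(refreshPeriod) < 0) {
      return false;
    }
    pending.clear();
    pending.addAll(snapshotSupplier.get());
    // The full scan happens only here, so statistics change only on refresh.
    candidateCountAtLastRefresh = pending.size();
    lastRefresh = now;
    return true;
  }

  /** One duty run: hands out at most availableSlots intervals from the queue. */
  List<String> nextBatch(Instant now, int availableSlots)
  {
    resetIfNeeded(now);
    List<String> batch = new ArrayList<>();
    while (batch.size() < availableSlots && !pending.isEmpty()) {
      batch.add(pending.poll());
    }
    return batch;
  }

  int candidateCountAtLastRefresh()
  {
    return candidateCountAtLastRefresh;
  }
}
```

In this sketch, an interval that was handed out and then failed is not offered again until the next refresh, so later runs move on to the remaining intervals instead of retrying the same one.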
       
   To enable this behavior on the cluster, add something like this to the
   coordinator runtime properties:
       
   ```
   druid.coordinator.dutyGroups=["compaction"]
   druid.coordinator.compaction.duties=["compactSegments"]
   druid.coordinator.compaction.period=PT60S
   ```
   
   #### Release note
   New: You can now schedule automatic compaction to run continuously and configure
   how frequently it should consider new segments for compaction.
   
   <hr>
   
   ##### Key changed/added classes in this PR
    * CompactionSegmentSearchPolicy#resetIfNeeded
    * CompactSegments#makeStats
   
   <hr>
   
   This PR has:
   
   - [ ] been self-reviewed.
      - [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] a release note entry in the PR description.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
   - [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
   - [ ] added integration tests.
   - [ ] been tested in a test Druid cluster.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

