[
https://issues.apache.org/jira/browse/HBASE-19646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305008#comment-16305008
]
stack commented on HBASE-19646:
-------------------------------
(Hey [~eclark]!)
[~belugabehr] Yeah, the jitter comes from experience running in production.
A sysadmin can force a compaction if they don't want to wait (once a file has
been major compacted, it won't be major compacted again).
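
To make the jitter concrete, here is a rough sketch of the window it produces. It is a simplified model, not HBase's actual code: it assumes the jitter is a fraction applied either side of the configured period, and the 0.5 used in the usage note is an illustrative value. The function names are hypothetical.

```python
import random

# Default hbase.hregion.majorcompaction per the docs quoted in the issue: 7 days in ms.
WEEK_MS = 7 * 24 * 60 * 60 * 1000


def jitter_window(period_ms, jitter):
    """Return (earliest, latest) delay in ms.

    Assumption: jitter is a fraction of the period applied on both sides.
    """
    spread = int(period_ms * jitter)
    return period_ms - spread, period_ms + spread


def next_delay(period_ms, jitter, rng=random):
    """Pick a delay somewhere inside the jitter window (simplified model)."""
    earliest, latest = jitter_window(period_ms, jitter)
    return rng.randint(earliest, latest)
```

Under these assumptions, `jitter_window(WEEK_MS, 0.5)` spans 3.5 to 10.5 days, so regions started together do not all come due for major compaction at the same moment.
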
The scheduling in hbase is coarse. It is good enough for many install types.
For those that need better, rather than rely on hbase and its crude
scheduling, have the operator manage scheduling externally; an operator, with
their context, will know better than hbase when best to run the compactions.
Offpeak is extra, rough scheduling tooling to contain compactions inside a
particular period. It is less effective than an operator running compactions
externally (which is a commitment), but it takes less investment and may be
sufficient for certain deploys.
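
As a sketch of what externally managed scheduling could look like, the script below pipes a `major_compact` command into `hbase shell -n` only when the current hour falls in a quiet window chosen by the operator. The helper names, the window hours, and the table name are hypothetical; it assumes `hbase` is on the PATH of the user running it.

```python
import subprocess
from datetime import datetime


def in_window(hour, start_hour, end_hour):
    """True if hour is in [start_hour, end_hour); the window may wrap past midnight."""
    if start_hour == end_hour:
        return False  # treat an empty window as "never"
    if start_hour < end_hour:
        return start_hour <= hour < end_hour
    return hour >= start_hour or hour < end_hour


def major_compact_command(table):
    """Return (argv, stdin_text) for a non-interactive hbase shell call."""
    return ["hbase", "shell", "-n"], f"major_compact '{table}'\n"


def run_if_quiet(table, start_hour, end_hour, now=None, runner=subprocess.run):
    """Request a major compaction of `table` if `now` is inside the quiet window."""
    now = now or datetime.now()
    if not in_window(now.hour, start_hour, end_hour):
        return False
    argv, stdin_text = major_compact_command(table)
    runner(argv, input=stdin_text, text=True, check=True)
    return True


if __name__ == "__main__":
    # Hypothetical table and window: 22:00 to 06:00.
    run_if_quiet("my_table", 22, 6)
```

An operator would run something like this from cron or another scheduler, with time-based major compactions disabled by setting hbase.hregion.majorcompaction to 0 as the quoted docs describe.
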
> Add CRON To Major Compaction
> ----------------------------
>
> Key: HBASE-19646
> URL: https://issues.apache.org/jira/browse/HBASE-19646
> Project: HBase
> Issue Type: Bug
> Components: Compaction
> Reporter: BELUGA BEHR
> Priority: Minor
>
> HBase provides _hbase.hregion.majorcompaction_
> {quote}
> Time between major compactions, expressed in milliseconds. Set to 0 to
> disable time-based automatic major compactions. User-requested and size-based
> major compactions will still run. This value is multiplied by
> hbase.hregion.majorcompaction.jitter to cause compaction to start at a
> somewhat-random time during a given window of time. The default value is 7
> days, expressed in milliseconds. If major compactions are causing disruption
> in your environment, you can configure them to run at off-peak times for your
> deployment, or disable time-based major compactions by setting this parameter
> to 0, and run major compactions in a cron job or by another external
> mechanism.
> {quote}
> Instead of this existing mechanism, which adds randomness into a production
> system (ugh), let's simply allow users to specify a cron string in place of
> this simple periodic (+jitter) scheduling mechanism. CRON is useful for
> systems that have known windows of time (e.g. weekends, nights) that are
> good times for compaction.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)