[ 
https://issues.apache.org/jira/browse/HBASE-19646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16304785#comment-16304785
 ] 

Elliott Clark commented on HBASE-19646:
---------------------------------------

Jitter in a production system is one of the most crucial design components for 
ensuring even response times and reducing spiky resource usage in distributed 
systems. We should not suggest configurations that remove this functionality. 
From experience, having no jitter is the cause of significant issues in many 
different HBase users' deployments.

Compactions (including major compactions) should always be an optimization that 
most users never need to know about. The default configuration of the system 
should endeavor to make sure that compactions happen in the background at a 
reasonable frequency. To that end we have chosen 1 week (seven days) as the 
amount of time between forced compactions. For most workloads, it's pretty 
reasonable to spread compactions out over a period of several days.
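
To illustrate why jitter spreads the load, here is a minimal sketch in Python. 
It is not HBase's actual implementation: the function names, the uniform 
distribution, and the 0.5 jitter fraction are assumptions made for the example. 
Each region gets the one-week base period shifted by a random offset, so 
regions do not all hit their forced compaction at the same moment.

```python
import random

# One week in milliseconds, matching the 7-day period discussed above.
WEEK_MS = 7 * 24 * 60 * 60 * 1000


def jittered_period_ms(base_ms, jitter_fraction, rng):
    """Return base_ms shifted by a uniform offset within +/- jitter_fraction.

    Hypothetical helper for illustration only; HBase's own calculation
    may differ in its details.
    """
    max_offset = base_ms * jitter_fraction
    offset = rng.uniform(-max_offset, max_offset)
    return int(base_ms + offset)


def spread(num_regions, base_ms=WEEK_MS, jitter_fraction=0.5, seed=0):
    """Compute one jittered compaction period per region."""
    rng = random.Random(seed)
    return [
        jittered_period_ms(base_ms, jitter_fraction, rng)
        for _ in range(num_regions)
    ]


if __name__ == "__main__":
    periods = spread(5)
    for region, period in enumerate(periods):
        print("region %d: %.2f days" % (region, period / (24 * 60 * 60 * 1000)))
```

With a jitter fraction of 0, every region would get exactly the same period, 
which is the synchronized, spiky behaviour that jitter is meant to avoid.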

If someone has more specialized knowledge or business needs, then we should give 
them the ability to script what they want. There's no reason to rebuild cron in 
HBase. Almost all the systems that HBase can run on already have crond 
installed.
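
As a rough sketch of that scripting approach, a small Python wrapper could be 
run from crond to request major compactions through the HBase shell. The 
script path, table names, and schedule below are hypothetical, and the sketch 
assumes the `hbase` launcher is on the PATH and that time-based major 
compactions were disabled by setting hbase.hregion.majorcompaction to 0.

```python
import subprocess
import sys

# Example crontab entry (hypothetical path and table name), 02:00 on Saturdays:
#   0 2 * * 6 /usr/bin/python3 /opt/scripts/major_compact.py my_table


def build_shell_input(table_names):
    """Build HBase shell input: one major_compact command per table."""
    return "".join("major_compact '{}'\n".format(name) for name in table_names)


def run_major_compaction(table_names, hbase_bin="hbase"):
    """Pipe the commands into a non-interactive HBase shell.

    Assumes `hbase shell -n` is available on this host; raises
    CalledProcessError if the shell exits with a non-zero status.
    """
    return subprocess.run(
        [hbase_bin, "shell", "-n"],
        input=build_shell_input(table_names),
        text=True,
        check=True,
    )


# Only contact the cluster when table names were passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    run_major_compaction(sys.argv[1:])
```

Keeping the schedule in crontab means the timing policy lives outside HBase, 
so it can be changed without touching the cluster configuration.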

> Add CRON To Major Compaction
> ----------------------------
>
>                 Key: HBASE-19646
>                 URL: https://issues.apache.org/jira/browse/HBASE-19646
>             Project: HBase
>          Issue Type: Bug
>          Components: Compaction
>            Reporter: BELUGA BEHR
>            Priority: Minor
>
> HBase provides _hbase.hregion.majorcompaction_ 
> {quote}
> Time between major compactions, expressed in milliseconds. Set to 0 to 
> disable time-based automatic major compactions. User-requested and size-based 
> major compactions will still run. This value is multiplied by 
> hbase.hregion.majorcompaction.jitter to cause compaction to start at a 
> somewhat-random time during a given window of time. The default value is 7 
> days, expressed in milliseconds. If major compactions are causing disruption 
> in your environment, you can configure them to run at off-peak times for your 
> deployment, or disable time-based major compactions by setting this parameter 
> to 0, and run major compactions in a cron job or by another external 
> mechanism.
> {quote}
> Instead of this existing mechanism, which adds randomness into a production 
> system (ugh), let's simply allow users to specify a cron string to replace 
> this simple periodic (+jitter) scheduling mechanism.  CRON is useful for 
> systems that have known windows of time (e.g. weekends, nights) that are 
> good times for compaction.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
