[
https://issues.apache.org/jira/browse/HBASE-22760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16899587#comment-16899587
]
Viraj Jasani commented on HBASE-22760:
--------------------------------------
Thanks for all your input and suggestions. They are really helpful!
Based on them, here are some observations from my side:
* The shell command balance_switch turns the balancer on or off, and I agree
this is useful for consistency on a running cluster. Whether the balancer runs
is decided by the data read from the ZooKeeper znode "balancer".
Every time we run balance_switch, ZK is updated and stays the source of
truth. Hence, even after an HMaster failover, the new active master reads the
state from ZK and runs the balancer only if required. With a shell command, we
don't need to worry about HMaster failover.
* Implementing ConfigurationObserver is also a good option for halting snapshot
cleanup. However, the user will have to be careful to update the config on all
HMasters (not just the active one) and on any new backup master that is
supposed to join them. Open source Ambari lets you update a config in one
place (the UI) and propagates the value to all HMaster/RegionServer instances
via heartbeat, but I am not sure whether HBase itself provides such "update
config in one place" functionality (and not everyone uses Ambari to manage
HBase clusters). Hence I feel that updating configs manually on all instances
might be error-prone. Please correct me if I am wrong.
* A slightly off-topic observation: when we turn the balancer off with
balance_switch, the balancer chore continues to run but doesn't perform any
operation. Shouldn't we unschedule the chore in this case and reschedule it
when the user turns the balancer back on, rather than always keeping it alive,
running every 30 min, just to let it find out that there is nothing for it to
do?
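To make the unschedule/reschedule idea in the last point concrete, here is a minimal sketch in plain Java. It does not use HBase's chore classes: SwitchableChore and its methods are hypothetical names, and a ScheduledExecutorService stands in for the master's chore scheduling. Returning the previous state from setEnabled is only meant to mirror how balance_switch reports the previous balancer state.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not an HBase class): a periodic task that is unscheduled while switched off. */
public class SwitchableChore {
  private final ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
  private final Runnable task;
  private final long periodMillis;
  private ScheduledFuture<?> handle; // null while the chore is unscheduled

  public SwitchableChore(Runnable task, long periodMillis) {
    this.task = task;
    this.periodMillis = periodMillis;
  }

  /** Schedules or cancels the task and returns the previous state. */
  public synchronized boolean setEnabled(boolean enabled) {
    boolean previous = handle != null;
    if (enabled && !previous) {
      // Turned on: start running the task periodically.
      handle = pool.scheduleAtFixedRate(task, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    } else if (!enabled && previous) {
      // Turned off: cancel future runs instead of letting the task wake up and do nothing.
      handle.cancel(false);
      handle = null;
    }
    return previous;
  }

  public synchronized boolean isScheduled() {
    return handle != null;
  }

  public void shutdown() {
    pool.shutdownNow();
  }
}
```

With this shape, a disabled chore costs nothing at runtime, at the price of one extra code path that has to re-create the schedule when the switch is turned back on.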
Overall, I feel that to halt/stop any chore, using ZK as the source of truth
and a shell command to trigger the operation is the right option. Please
share your suggestions or let me know if I am missing something.
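As a rough illustration of why a shared source of truth survives failover, here is a small self-contained Java sketch. Everything in it is hypothetical: SharedSwitch is an in-memory stand-in for the ZK znode (a real implementation would read and write ZooKeeper), and Master is not the HBase HMaster class.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class CleanupSwitchDemo {
  /** In-memory stand-in for the shared znode; assumed to outlive any single master. */
  static final class SharedSwitch {
    private final AtomicBoolean enabled = new AtomicBoolean(true);

    boolean get() {
      return enabled.get();
    }

    /** Stores the new state and returns the previous one. */
    boolean set(boolean on) {
      return enabled.getAndSet(on);
    }
  }

  /** Hypothetical master that keeps no local copy of the switch. */
  static final class Master {
    private final String name;
    private final SharedSwitch sw;

    Master(String name, SharedSwitch sw) {
      this.name = name;
      this.sw = sw;
    }

    /** What the cleanup chore would decide on each run. */
    String runCleanupChore() {
      return sw.get()
          ? name + ": cleaning expired snapshots"
          : name + ": cleanup disabled, skipping";
    }
  }

  public static void main(String[] args) {
    SharedSwitch znode = new SharedSwitch();
    Master active = new Master("master-1", znode);
    znode.set(false); // what the shell command would trigger
    System.out.println(active.runCleanupChore());

    // Failover: a different master becomes active and reads the same shared state.
    Master promoted = new Master("master-2", znode);
    System.out.println(promoted.runCleanupChore());
  }
}
```

Because neither master caches the flag from its own config file, the promoted master sees the disabled state without any per-host config update.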
Thanks again for your help and suggestions [~reidchan] [~anoop.hbase]
[~apurtell] FYI
> Stop/Resume Snapshot Auto-Cleanup activity with shell command
> -------------------------------------------------------------
>
> Key: HBASE-22760
> URL: https://issues.apache.org/jira/browse/HBASE-22760
> Project: HBase
> Issue Type: Improvement
> Components: Admin, shell, snapshots
> Affects Versions: 3.0.0, 1.5.0, 2.3.0, 2.2.1, 1.4.11
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Major
>
> For any scheduled snapshot backup activity, we would like to disable the
> TTL-based auto-cleaner for snapshots. As per HBASE-22648, we already have a
> config to disable the snapshot auto-cleaner,
> hbase.master.cleaner.snapshot.disable, but it takes effect only upon
> HMaster restart, just like any other hbase-site config.
> For a running cluster, we should be able to stop/resume auto-cleanup
> activity for snapshots with a shell command. Something similar to the
> commands below should be able to stop/start the cleanup chore:
> hbase(main):001:0> auto_snapshot_cleaner false (disable auto-cleaner)
> hbase(main):001:0> auto_snapshot_cleaner true (enable auto-cleaner)