I agree, changing the config option is not needed in this case. Thanks for the suggestions.

On Fri, Aug 4, 2017 at 11:03 AM, Karthik Ramasamy <[email protected]> wrote:

> That is better. If the setting is in healthmgr.yaml, you could use the cli
> to turn it on for desired topologies, and that gives independent control
> at a per-topology level.
>
> cheers
> /karthik
>
> On Fri, Aug 4, 2017 at 11:00 AM, Sanjeev Kulkarni <[email protected]>
> wrote:
>
>> Having a setting that enables it clusterwide is indeed one of the desired
>> properties, as mentioned in Ashvin's first email. The setting in
>> healthmgr.yaml would control that. It would be set to false by default.
>> Users interested in trying it out could change that and submit it.
>>
>>
>> On Fri, Aug 4, 2017 at 10:55 AM, Karthik Ramasamy <[email protected]>
>> wrote:
>>
>> > If we enable it in healthmgr.yaml, it becomes cluster wide, which is
>> > not the desired option. For the cli, there are no changes in the code.
>> > The config property function is built in already. All you have to do
>> > is determine what the key and its value should be.
>> >
>> > cheers
>> > /karthik
>> >
>> > On Fri, Aug 4, 2017 at 10:51 AM, Sanjeev Kulkarni <[email protected]>
>> > wrote:
>> >
>> > > Enabling the health manager doesn't sound like an API. Thus I agree
>> > > that Config is not the right place for a setting like this.
>> > > I also don't like overloading the cli with this. IMO the cli is
>> > > already overloaded with a bunch of things that it shouldn't be.
>> > > Why can't we make this part of healthmgr.yaml itself? Or maybe
>> > > heron_internals.yaml?
>> > >
>> > > On Fri, Aug 4, 2017 at 10:12 AM, Karthik Ramasamy <[email protected]>
>> > > wrote:
>> > >
>> > > > Ashvin -
>> > > >
>> > > > Instead of adding a Config API to enable self-healing per topology,
>> > > > an interested user can enable the config using --config-property
>> > > > during heron submit. For example,
>> > > >
>> > > > heron submit <cluster-name> --config-property
>> > > > "heron.config.topology.healthmanager.mode=enable" <topology-file>
>> > > > <topology-class> <topology-name>
>> > > >
>> > > > The advantage of this approach is that there is no hard-coded
>> > > > config in the code that will require later removal. Thoughts?
>> > > >
>> > > > cheers
>> > > > /karthik
>> > > >
>> > > >
>> > > > On Fri, Aug 4, 2017 at 8:57 AM, Ashvin A <[email protected]> wrote:
>> > > >
>> > > > > Hi,
>> > > > >
>> > > > > We are in the process of merging the core building blocks of the
>> > > > > topology health manager (HM) based on Dhalion. This integration
>> > > > > is still experimental and needs to be tested thoroughly. So it
>> > > > > is desired that the HM be activated on demand and remain
>> > > > > disabled by default. Accordingly, we are proposing the following
>> > > > > scheme to launch the HM process.
>> > > > >
>> > > > > We are thinking of satisfying the following constraints:
>> > > > >
>> > > > >    1. Launch on container-0, colocated with the scheduler and
>> > > > >    the metrics cache.
>> > > > >    2. Initially HM will be disabled by default. This means the
>> > > > >    HM process should not be started, to avoid any side effects.
>> > > > >    Once HM is well tested, a system-wide configuration would
>> > > > >    enable HM for all topologies submitted afterwards.
>> > > > >    3. If a topology explicitly opts in through its
>> > > > >    configuration, HM will be started and will take actions as
>> > > > >    per the configuration, i.e. healthmgr.yaml.
>> > > > >    4. Like other Heron processes, the executor should manage the
>> > > > >    HM's life cycle.
>> > > > >
>> > > > > Accordingly we propose the following.
>> > > > >
>> > > > >    1. Add a new Config API to enable self-healing per topology:
>> > > > >    Config.enableHealthManager(Topology.HealthManagerMode mode).
>> > > > >    The default value will be "system", meaning use the
>> > > > >    system-wide configuration.
>> > > > >    2. Add a new config to heron_internals.yaml:
>> > > > >    "heron.healthmgr.default.mode". The value will be "disabled".
>> > > > >    3. The Scheduler will read the default value of HM mode from
>> > > > >    the heron_internals config file, as is done in
>> > > > >    SchedulerMain.setupLogging [3]. It will provide either the
>> > > > >    user-configured mode value or the default mode value to the
>> > > > >    executor as a command line argument.
>> > > > >    4. Add HM mode to the command line arguments of
>> > > > >    heron_executor.py. This is similar to the executor command
>> > > > >    line arguments for checkpointing [2].
>> > > > >    5. The executor will launch HM if the mode is not "disabled".
>> > > > >    6. Later, if the default HM mode value is set to "dryrun" or
>> > > > >    "self-healing", HM will be launched for all newly submitted
>> > > > >    topologies.
>> > > > >
>> > > > >
>> > > > > What do you think about this approach?
>> > > > >
>> > > > > Thanks,
>> > > > > Ashvin
>> > > > >
>> > > > >
>> > > > > [1] https://github.com/twitter/heron/pull/2132
>> > > > > [2] https://github.com/twitter/heron/blob/master/heron/executor/src/python/heron_executor.py#L58
>> > > > > [3] https://github.com/twitter/heron/blob/master/heron/scheduler-core/src/java/com/twitter/heron/scheduler/SchedulerMain.java#L277
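
For reference, the mode resolution described in steps 3 and 5 of the quoted proposal could look roughly like the sketch below. This is only an illustration under stated assumptions, not code from Heron: the function names resolve_hm_mode and should_launch_healthmgr are hypothetical, and the mode strings ("system", "disabled", "dryrun", "self-healing") are taken from the proposal text rather than from any merged implementation.

```python
# Hypothetical sketch of the proposed health manager (HM) mode resolution.
# None of these names exist in Heron; they only mirror the proposal's wording.

SYSTEM = "system"        # per-topology value meaning "use the system default"
DISABLED = "disabled"    # HM process must not be started
LAUNCH_MODES = {"dryrun", "self-healing"}
VALID_MODES = LAUNCH_MODES | {DISABLED}


def resolve_hm_mode(topology_mode, system_default):
    """Return the effective HM mode for a topology.

    topology_mode: value set by the user, or None / "system" if unset.
    system_default: value of the system-wide default mode setting.
    """
    if topology_mode in (None, SYSTEM):
        mode = system_default
    else:
        mode = topology_mode
    if mode not in VALID_MODES:
        raise ValueError("unknown health manager mode: %r" % (mode,))
    return mode


def should_launch_healthmgr(mode):
    """The executor starts HM for any effective mode other than 'disabled'."""
    return mode != DISABLED
```

With the default set to "disabled", only topologies that explicitly opt in would get an HM process; flipping the default later would enable it for all newly submitted topologies, as step 6 describes.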
