Re: Newbie Question: How do I replace a machine in a deployed Model?

Raghurama Bhat Tue, 05 Sep 2017 09:59:36 -0700

Hi Dmitrii,

Thank you very much for your detailed response. I now understand that it is not 
in the scope of Juju to monitor instances and replace them. I had come across 
elastisys, but that is more for horizontal scaling and I am primarily looking 
for High Availability and self healing although elastic scaling would be nice 
too.

The approach you suggested, monitoring the instances and using libjuju to 
replace a broken machine with a new one and then running add-unit on it, makes 
sense.

Thanks,

--Raghu

From: Dmitrii Shcherbakov <[email protected]>
Date: Saturday, September 2, 2017 at 12:50 PM
To: Raghurama Bhat <[email protected]>
Cc: "[email protected]" <[email protected]>
Subject: Re: Newbie Question: How do I replace a machine in a deployed Model?

Hi Raghurama,

> Does Juju controller monitor the cluster and request MaaS for a new machine 
> if it detects one of the machines is gone?

No, it doesn't.

> Even if this has to be done manually, I did not see a replace-machine option 
> to Juju.

There's no such functionality - either an operator needs to make a decision to 
do it or you need an automated system to do that depending on some custom logic.

> Only add and remove units and machines. How does this work?

Juju itself does not know anything about applications you deploy - any 
application-specific knowledge must be present in charms.

What you are looking for is an orchestrator type of capability - it will be a 
layer on top of Juju or a charm with 'super cow powers' (namely, with admin 
access to a juju controller).

A proof of concept would be a charmscaler from elastisys - it talks to a juju 
controller directly and scales based upon CPU usage from nodes collected via 
telegraf:
https://github.com/elastisys/layer-charmscaler-base/blob/164d163b4104cc47dcb1a32019509ba3f61d91eb/config.yaml#L9-L28
https://github.com/elastisys/layer-charmscaler#how-the-charmscaler-operates
https://jujucharms.com/u/elastisys/

https://github.com/elastisys/bundle-autoscaled-kubernetes
https://jujucharms.com/u/elastisys/autoscaled-kubernetes/bundle/0

https://elastisys.com/cloud-platform-features/predictive-auto-scaling/

You could build your own orchestrator with the help of 
https://github.com/juju/python-libjuju, depending on your criteria. The whole 
system could look as follows:

telegraf with your own juju input plugin -> prometheus alerts -> orchestrator 
-> juju controller

The telegraf plugin would query juju and/or MAAS periodically to determine the 
number of non-failed workers and send those metrics to prometheus.
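
As a rough illustration of the collection step, here is a minimal Python sketch 
of a script that telegraf's exec input could run. It is not an existing plugin. 
It assumes the juju CLI is installed and logged in on the host, that the JSON 
from "juju status --format=json" has the applications/units/workload-status 
layout used below, and that "active" is the right definition of a healthy 
unit. The measurement name juju_units and the helper names are made up for 
this example.

```python
import json
import subprocess
import sys

# Assumption: a unit counts as healthy when its workload status is "active".
HEALTHY_WORKLOAD_STATES = {"active"}


def count_healthy_units(status, application):
    """Count units of `application` whose workload status looks healthy.

    `status` is the parsed output of `juju status --format=json`.
    """
    units = status.get("applications", {}).get(application, {}).get("units") or {}
    return sum(
        1
        for unit in units.values()
        if unit.get("workload-status", {}).get("current") in HEALTHY_WORKLOAD_STATES
    )


def to_line_protocol(application, healthy):
    """Render one metric in InfluxDB line protocol (hypothetical metric name)."""
    return f"juju_units,application={application} healthy={healthy}i"


def main(application):
    # Shells out to the juju CLI; fails if juju is missing or not logged in.
    raw = subprocess.run(
        ["juju", "status", "--format=json"],
        check=True, capture_output=True, text=True,
    ).stdout
    print(to_line_protocol(application, count_healthy_units(json.loads(raw), application)))


# Only query juju when an application name is passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])
```

Invoked with an application name such as kubernetes-worker as its argument, 
the script prints a single metric line. A Prometheus alert rule could then fire 
when the healthy count drops below the number of workers you expect.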

After a quick search, I found an HTTP server, 
https://github.com/imgix/prometheus-am-executor, that handles Prometheus 
alerts sent as HTTP requests. An HTTP server of that kind could hold your 
orchestration logic and use python-libjuju to perform the necessary 
add-unit-on-failure operations.
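
To make the idea concrete, here is a minimal Python sketch of such a server. It 
is not how prometheus-am-executor works. It assumes alerts arrive in the 
Alertmanager webhook JSON format (a top-level "alerts" list whose entries carry 
"status" and "labels"), and that python-libjuju is installed and can connect to 
the currently active model. The alert name JujuWorkerDown, the application 
name, and the port are hypothetical. The exact connect call and event loop 
helper may differ between libjuju releases.

```python
import asyncio
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical names, for illustration only.
ALERT_NAME = "JujuWorkerDown"
APPLICATION = "kubernetes-worker"


def units_to_add(payload):
    """Return how many replacement units a webhook payload asks for.

    One unit per firing alert named ALERT_NAME; resolved alerts are ignored.
    """
    return sum(
        1
        for alert in payload.get("alerts", [])
        if alert.get("status") == "firing"
        and alert.get("labels", {}).get("alertname") == ALERT_NAME
    )


async def add_units(count):
    # Imported lazily so the decision logic can be exercised without libjuju.
    from juju.model import Model

    model = Model()
    # Assumption: with no arguments this connects to the currently active model.
    await model.connect()
    try:
        await model.applications[APPLICATION].add_unit(count=count)
    finally:
        await model.disconnect()


class AlertHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        count = units_to_add(payload)
        if count:
            asyncio.run(add_units(count))
        self.send_response(204)
        self.end_headers()


def serve(port=8080):
    """Listen on localhost for Alertmanager webhook notifications."""
    HTTPServer(("127.0.0.1", port), AlertHandler).serve_forever()
```

Calling serve() starts the listener. Alertmanager re-sends notifications for 
alerts that keep firing, so a real orchestrator would also need to remember 
which failures it has already acted on and to remove the failed machine.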

To sum up: it is important to understand the difference between Juju and 
charms. Juju itself doesn't know anything about application-specific logic - 
charms do. Charm code is executed by Juju agents at certain events and this is 
where application-specific logic is actually executed. Any orchestration code 
must have admin access to the juju controller and contain subjective logic 
about how to scale-up or scale-down your application.

https://github.com/juju/juju/blob/develop/doc/architectural-overview.md#juju-components

I hope that helps.

Best Regards,
Dmitrii Shcherbakov

Field Software Engineer
IRC (freenode): Dmitrii-Sh

On Fri, Sep 1, 2017 at 7:34 PM, Raghurama Bhat <[email protected]> wrote:
Any comments?

Thanks,

--Raghu

From: <[email protected]> on behalf of Raghurama Bhat <[email protected]>
Date: Thursday, August 31, 2017 at 8:07 AM
To: "[email protected]" <[email protected]>
Subject: Newbie Question: How do I replace a machine in a deployed Model?

Hi,

I have a newbie question. I deployed a two-node Kubernetes Core cluster using 
Juju into a MaaS setup. Now, if one of the machines has a hardware failure, 
what is the process for replacing it with another machine? Does Juju controller 
monitor the cluster and request MaaS for a new machine if it detects one of the 
machines is gone? Even if this has to be done manually, I did not see a 
replace-machine option to Juju. Only add and remove units and machines. How 
does this work?

Thanks,

--Raghu

--
Juju mailing list
[email protected]
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju

