I'd support adding these parameters in some form in
/etc/init.d/cloud-early-config. I agree that the OOM killer is of no use here.
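
As a rough sketch of what "in some form" could look like: the helper below writes the three sysctl settings from the quoted mail into a drop-in file so they survive a reboot. The function name and the default file path are assumptions for illustration only; nothing like this exists in cloud-early-config today as far as this thread shows.

```shell
#!/bin/sh
# Hypothetical helper, not part of the actual cloud-early-config script.
# Writes the sysctl settings proposed in this thread to a drop-in file.
write_oom_panic_sysctl() {
    # $1: target file; the default path below is an assumption
    conf="${1:-/etc/sysctl.d/99-panic-on-oom.conf}"
    cat > "$conf" <<'EOF'
# Panic instead of letting the OOM killer pick victims
vm.panic_on_oom = 1
# Panic on a kernel oops as well
kernel.panic_on_oops = 1
# Reboot 10 seconds after a panic
kernel.panic = 10
EOF
}
```

Called as root with no argument, it would write the default file; running `sysctl -p` on that file afterwards should apply the values without waiting for a reboot.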

On 9/4/13 11:15 AM, "Roeland Kuipers" <rkuip...@schubergphilis.com> wrote:

>Hi Dev!
>
>We have experienced a serious customer outage due to the OOM killer on a
>member of a redundant routing VM pair. Somehow the MASTER node ran out of
>memory and the OOM killer decided to kill random processes, causing
>HAProxy to go down. But since keepalived was still running and
>functioning, a failover never happened.
>In our experience, we would rather panic on OOM than pray that the
>OOM killer will do the right thing, when in 99% of the cases
>it just renders a machine useless.
>If this RvR had panicked and rebooted, we would have had a nice
>keepalived failure/failover without much impact on our customer.
>
>So we decided to configure the following sysctl options:
>        vm.panic_on_oom = 1
>        kernel.panic_on_oops = 1
>        kernel.panic = 10
>
>That way a VM panics and reboots after 10 seconds, and the router comes
>back in a healthy state instead of being crippled by the OOM killer.
>
>But we hit a problem here with VPC routers: their configuration is not
>persistent across reboots when they are rebooted outside CloudStack,
>because they are not configured (entirely) using kernel parameters
>(/var/cache/cloud/cmdline). The configuration is applied only when they
>are started by CloudStack.
>
>It would be nice if the VPC router config were persistent across
>reboots, even when rebooted outside CloudStack, and used the same
>mechanism as the other system VMs to make things more consistent and
>reliable.
>
>What is your opinion on this? Otherwise we will add it to our backlog to
>contribute improvements in this area.
>
>See also:
>
>https://issues.apache.org/jira/browse/CLOUDSTACK-4605
>https://issues.apache.org/jira/browse/CLOUDSTACK-4606
>https://issues.apache.org/jira/browse/CLOUDSTACK-4607
>
>
>Thanks & Cheers,
>Roeland Kuipers
>
>
>
