[jira] [Commented] (CLOUDSTACK-4616) When system Vms fail to start when host is down , link local Ip addresses do not get released resulting in all the link local Ip addresses being consumed eventually.

Sangeetha Hariharan (JIRA) Wed, 08 Jan 2014 11:03:18 -0800

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13865749#comment-13865749 ]


Sangeetha Hariharan commented on CLOUDSTACK-4616:
-------------------------------------------------

Tested with the latest build from 4.3.

In my setup I have 1 zone, 1 cluster, and 1 host with the SSVM, the CPVM,
1 router, and a user VM running.

I power down the host.

The host continues to be in the “UP” state, which is as expected.

We would expect the SSVM and CPVM to be marked as “Stopped” and attempts to
be made to start them.
I don’t see this happen. The VMs stay in the “Running” state and the agent
state is "UP".
This differs from the behavior noted in the bug, where the SSVM and CPVM
actually get marked as "Stopped" and constant attempts are made to restart
them.

I do see the timeouts happening in the logs:


2014-01-08 13:05:46,704 WARN  [c.c.a.m.AgentAttache] (AgentTaskPool-5:ctx-71686d7f) Seq 6-1177616388: Timed out on Seq 6-1177616388:  { Cmd , MgmtId: 112516401760401, via: 6(v-30-MyTestVM), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckHealthCommand":{"wait":50}}] }
2014-01-08 13:06:46,716 WARN  [c.c.a.m.AgentAttache] (AgentTaskPool-9:ctx-8cd341ad) Seq 6-1177616389: Timed out on Seq 6-1177616389:  { Cmd , MgmtId: 112516401760401, via: 6(v-30-MyTestVM), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckHealthCommand":{"wait":50}}] }
2014-01-08 13:07:46,738 WARN  [c.c.a.m.AgentAttache] (AgentTaskPool-13:ctx-3fc81ec7) Seq 6-1177616390: Timed out on Seq 6-1177616390:  { Cmd , MgmtId: 112516401760401, via: 6(v-30-MyTestVM), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckHealthCommand":{"wait":50}}] }
2014-01-08 13:08:46,744 WARN  [c.c.a.m.AgentAttache] (AgentTaskPool-2:ctx-6e0e5ac0) Seq 6-1177616391: Timed out on Seq 6-1177616391:  { Cmd , MgmtId: 112516401760401, via: 6(v-30-MyTestVM), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckHealthCommand":{"wait":50}}] }
2014-01-08 13:09:46,756 WARN  [c.c.a.m.AgentAttache] (AgentTaskPool-6:ctx-15e6dc40) Seq 6-1177616392: Timed out on Seq 6-1177616392:  { Cmd , MgmtId: 112516401760401, via: 6(v-30-MyTestVM), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckHealthCommand":{"wait":50}}] }
2014-01-08 13:10:46,772 WARN  [c.c.a.m.AgentAttache] (AgentTaskPool-14:ctx-e3162a4a) Seq 6-1177616393: Timed out on Seq 6-1177616393:  { Cmd , MgmtId: 112516401760401, via: 6(v-30-MyTestVM), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckHealthCommand":{"wait":50}}] }
2014-01-08 13:11:46,787 WARN  [c.c.a.m.AgentAttache] (AgentTaskPool-1:ctx-6c539f44) Seq 6-1177616394: Timed out on Seq 6-1177616394:  { Cmd , MgmtId: 112516401760401, via: 6(v-30-MyTestVM), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckHealthCommand":{"wait":50}}] }
2014-01-08 13:12:46,802 WARN  [c.c.a.m.AgentAttache] (AgentTaskPool-8:ctx-07ad89f9) Seq 6-1177616395: Timed out on Seq 6-1177616395:  { Cmd , MgmtId: 112516401760401, via: 6(v-30-MyTestVM), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckHealthCommand":{"wait":50}}] }
2014-01-08 13:13:46,818 WARN  [c.c.a.m.AgentAttache] (AgentTaskPool-5:ctx-c2954c2b) Seq 6-1177616396: Timed out on Seq 6-1177616396:  { Cmd , MgmtId: 112516401760401, via: 6(v-30-MyTestVM), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.CheckHealthCommand":{"wait":50}}] }


> When system Vms fail to start when host is down ,  link local Ip addresses do 
> not get released resulting in all the link local Ip addresses being consumed 
> eventually.
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-4616
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-4616
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public (Anyone can view this level - this is the default.)
>          Components: Management Server
>    Affects Versions: 4.2.1
>         Environment: Build from 4.2-forward
>            Reporter: Sangeetha Hariharan
>            Assignee: Murali Reddy
>            Priority: Critical
>             Fix For: 4.3.0
>
>         Attachments: hostdown.rar
>
>
> When system Vms fail to start when host is down ,  link local Ip addresses do 
> not get released resulting in all the link local Ip addresses being consumed 
> eventually.
> Steps to reproduce the problem:
> Advanced zone with 1 cluster having 1 host (XenServer).
> Had SSVM, CPVM, 2 routers, and a few user VMs running on the host.
> Power down the host.
> When the host was powered down, it was still marked as being in the "Up"
> state. This bug is tracked in CLOUDSTACK-2140.
> Attempts to restart all the system VMs on the host that is down are made
> continuously, and they fail.
> These failed attempts do not release the link local IPs, resulting in all
> link local IPs being consumed.
> When the host is actually powered on, attempts to start the system VMs fail
> because of the following exception seen in the management-server logs:
> 013-09-05 12:00:09,551 INFO  [cloud.vm.VirtualMachineManagerImpl] (secstorage-1:null) Insufficient capacity
> com.cloud.exception.InsufficientAddressCapacityException: Insufficient link local address capacityScope=interface com.cloud.dc.DataCenter; id=1
>         at com.cloud.network.guru.ControlNetworkGuru.reserve(ControlNetworkGuru.java:156)
>         at com.cloud.network.NetworkManagerImpl.prepareNic(NetworkManagerImpl.java:2157)
>         at com.cloud.network.NetworkManagerImpl.prepare(NetworkManagerImpl.java:2127)
>         at com.cloud.vm.VirtualMachineManagerImpl.advanceStart(VirtualMachineManagerImpl.java:886)
>         at com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:578)
>         at com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:571)
>         at com.cloud.storage.secondary.SecondaryStorageManagerImpl.startSecStorageVm(SecondaryStorageManagerImpl.java:267)
>         at com.cloud.storage.secondary.SecondaryStorageManagerImpl.allocCapacity(SecondaryStorageManagerImpl.java:696)
>         at com.cloud.storage.secondary.SecondaryStorageManagerImpl.expandPool(SecondaryStorageManagerImpl.java:1300)
>         at com.cloud.secstorage.PremiumSecondaryStorageManagerImpl.scanPool(PremiumSecondaryStorageManagerImpl.java:123)
>         at com.cloud.secstorage.PremiumSecondaryStorageManagerImpl.scanPool(PremiumSecondaryStorageManagerImpl.java:50)
>         at com.cloud.vm.SystemVmLoadScanner.loadScan(SystemVmLoadScanner.java:104)
>         at com.cloud.vm.SystemVmLoadScanner.access$100(SystemVmLoadScanner.java:33)
>         at com.cloud.vm.SystemVmLoadScanner$1.reallyRun(SystemVmLoadScanner.java:81)
>         at com.cloud.vm.SystemVmLoadScanner$1.run(SystemVmLoadScanner.java:72)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>         at java.lang.Thread.run(Thread.java:679)
> mysql> select * from op_dc_link_local_ip_address_alloc where data_center_id=1 and taken is null;
> Empty set (0.00 sec)
>   



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
