Sangeetha Hariharan created CLOUDSTACK-7742:
-----------------------------------------------

             Summary: Xenserver HA - SSVM failing to start since it is running 
out of management ip address 
                 Key: CLOUDSTACK-7742
                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-7742
             Project: CloudStack
          Issue Type: Bug
      Security Level: Public (Anyone can view this level - this is the default.)
          Components: Management Server
    Affects Versions: 4.5.0
         Environment: Build from master

            Reporter: Sangeetha Hariharan


HA - SSVM failing to start since it is running out of management ip address 

Set up:

Cluster with 3 Xenserver hosts.
I am executing host HA scenarios where a host is brought down (or where a 
control path network failure / storage network failure is simulated).

After a couple of such scenarios, I see that the SSVM fails to start as part 
of HA because it runs out of management IP addresses:


management server logs:

2014-10-16 12:15:44,311 DEBUG [c.c.u.d.T.Transaction] (Work-Job-Executor-106:ctx-323991ca job-771/job-943 ctx-3a2e9ed6) Rolling back the transaction: Time = 1 Name =  Work-Job-Executor-106; called by -TransactionLegacy.rollback:902-DataCenterIpAddressDaoImpl.takeIpAddress:61-GeneratedMethodAccessor493.invoke:-1-DelegatingMethodAccessorImpl.invoke:43-Method.invoke:606-AopUtils.invokeJoinpointUsingReflection:317-ReflectiveMethodInvocation.invokeJoinpoint:183-ReflectiveMethodInvocation.proceed:150-TransactionContextInterceptor.invoke:34-ReflectiveMethodInvocation.proceed:161-ExposeInvocationInterceptor.invoke:91-ReflectiveMethodInvocation.proceed:172
2014-10-16 12:15:44,312 INFO  [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-106:ctx-323991ca job-771/job-943 ctx-3a2e9ed6) Insufficient capacity
com.cloud.exception.InsufficientAddressCapacityException: Unable to get a management ip addressScope=interface com.cloud.dc.Pod; id=1
        at com.cloud.network.guru.PodBasedNetworkGuru.reserve(PodBasedNetworkGuru.java:123)
        at com.cloud.network.guru.StorageNetworkGuru.reserve(StorageNetworkGuru.java:122)
        at org.apache.cloudstack.engine.orchestration.NetworkOrchestrator.prepareNic(NetworkOrchestrator.java:1338)
        at org.apache.cloudstack.engine.orchestration.NetworkOrchestrator.prepare(NetworkOrchestrator.java:1309)
        at com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:970)
        at com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:4590)
        at sun.reflect.GeneratedMethodAccessor210.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
        at com.cloud.vm.VirtualMachineManagerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4746)
        at com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.java:102)
        at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:513)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
        at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:470)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2014-10-16 12:15:44,324 DEBUG [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-106:ctx-323991ca job-771/job-943 ctx-3a2e9ed6) Cleaning up resources for the vm VM[SecondaryStorageVm|s-115-VM] in Starting state


There are 3 issues here:

1. Some of the SSVMs that are in Destroyed state still have not released their 
management IPs back to the free pool.
 
2. Some of these destroyed SSVMs have 2 management IP addresses associated with 
them. Why is this the case?

3. I still see 1 management IP address that is free, but the SSVM is still not 
able to come up.
 
mysql>  select id,name,state from vm_instance where id in (1,7,18,71);
+----+---------+-----------+
| id | name    | state     |
+----+---------+-----------+
|  1 | v-1-VM  | Running   |
|  7 | s-7-VM  | Destroyed |
| 18 | s-18-VM | Destroyed |
| 71 | s-71-VM | Destroyed |
+----+---------+-----------+
4 rows in set (0.00 sec)

mysql> select instance_id from nics where id in (select nic_id from 
op_dc_ip_address_alloc where taken is not null);
+-------------+
| instance_id |
+-------------+
|           1 |
|           7 |
|           7 |
|          18 |
|          18 |
|          71 |
|          71 |
+-------------+
7 rows in set (0.00 sec)

mysql> select  * from op_dc_ip_address_alloc ;
+----+--------------+----------------+--------+--------+--------------------------------------+---------------------+-------------+
| id | ip_address   | data_center_id | pod_id | nic_id | reservation_id                       | taken               | mac_address |
+----+--------------+----------------+--------+--------+--------------------------------------+---------------------+-------------+
|  1 | 10.223.59.69 |              1 |      1 |    261 | 5cf7268d-cf6c-421d-86d7-fe124b055bd5 | 2014-10-15 00:20:44 |           1 |
|  2 | 10.223.59.70 |              1 |      1 |      3 | a5254254-b384-4783-adab-05bb7d6355be | 2014-10-16 17:34:42 |           2 |
|  3 | 10.223.59.71 |              1 |      1 |    261 | a89646ab-ce75-4518-b685-5fa49e98faca | 2014-10-13 22:05:14 |           3 |
|  4 | 10.223.59.72 |              1 |      1 |     26 | 4c197653-f794-438b-b743-f80cc7348be0 | 2014-10-10 18:47:44 |           4 |
|  5 | 10.223.59.73 |              1 |      1 |      3 | 2c15e101-b534-4b4e-bebb-f580ea284c01 | 2014-10-16 18:31:42 |           5 |
|  6 | 10.223.59.74 |              1 |      1 |     27 | 4c197653-f794-438b-b743-f80cc7348be0 | 2014-10-10 18:47:44 |           6 |
|  7 | 10.223.59.75 |              1 |      1 |     49 | 680321c2-9290-4915-aec4-1b37dc5aefbd | 2014-10-11 00:24:44 |           7 |
|  8 | 10.223.59.76 |              1 |      1 |     48 | 680321c2-9290-4915-aec4-1b37dc5aefbd | 2014-10-11 00:24:44 |           8 |
|  9 | 10.223.59.77 |              1 |      1 |   NULL | NULL                                 | NULL                |           9 |
| 10 | 10.223.59.78 |              1 |      1 |      3 | 9980f72c-4bc8-40ed-87bf-99fd375bd13d | 2014-10-10 18:48:12 |          10 |
| 11 | 10.223.59.79 |              1 |      1 |    260 | 5cf7268d-cf6c-421d-86d7-fe124b055bd5 | 2014-10-15 00:20:44 |          11 |
| 12 | 10.223.59.80 |              1 |      1 |    260 | a89646ab-ce75-4518-b685-5fa49e98faca | 2014-10-13 22:05:14 |          12 |
+----+--------------+----------------+--------+--------+--------------------------------------+---------------------+-------------+
12 rows in set (0.00 sec)
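
The three queries above could be combined into a single join that lists every 
management IP still held by a Destroyed VM. The following is only a sketch: the 
table and column names are the ones used in the queries above, but the helper 
names (build_demo_db, leaked_ips) are hypothetical, the demo runs against an 
in-memory SQLite database rather than the real MySQL schema, and the mapping of 
nic ids 26/27 to instance 7 (and nic 3 to instance 1) is an assumption made for 
the demo data, since the output above does not show it.

```python
import sqlite3

# Diagnostic join: management IPs that are still marked as taken
# although the owning VM is in Destroyed state.
LEAKED_IPS_SQL = """
SELECT vm.id, vm.name, vm.state, n.id AS nic_id, a.ip_address, a.taken
FROM op_dc_ip_address_alloc a
JOIN nics n         ON n.id  = a.nic_id
JOIN vm_instance vm ON vm.id = n.instance_id
WHERE a.taken IS NOT NULL
  AND vm.state = 'Destroyed'
ORDER BY vm.id, a.ip_address
"""


def build_demo_db():
    """Create a tiny in-memory stand-in for the three tables (demo only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE vm_instance (id INTEGER PRIMARY KEY, name TEXT, state TEXT);
        CREATE TABLE nics (id INTEGER PRIMARY KEY, instance_id INTEGER);
        CREATE TABLE op_dc_ip_address_alloc (
            id INTEGER PRIMARY KEY, ip_address TEXT, nic_id INTEGER, taken TEXT);
    """)
    conn.executemany(
        "INSERT INTO vm_instance VALUES (?, ?, ?)",
        [(1, "v-1-VM", "Running"), (7, "s-7-VM", "Destroyed")],
    )
    # ASSUMPTION: this nic-to-instance mapping is illustrative only.
    conn.executemany("INSERT INTO nics VALUES (?, ?)", [(3, 1), (26, 7), (27, 7)])
    conn.executemany(
        "INSERT INTO op_dc_ip_address_alloc VALUES (?, ?, ?, ?)",
        [
            (2, "10.223.59.70", 3, "2014-10-16 17:34:42"),
            (4, "10.223.59.72", 26, "2014-10-10 18:47:44"),
            (6, "10.223.59.74", 27, "2014-10-10 18:47:44"),
            (9, "10.223.59.77", None, None),  # the one free address
        ],
    )
    return conn


def leaked_ips(conn):
    """Return the rows for IPs held by Destroyed VMs."""
    return conn.execute(LEAKED_IPS_SQL).fetchall()


if __name__ == "__main__":
    for row in leaked_ips(build_demo_db()):
        print(row)
```

The SELECT statement itself uses only standard joins, so it should also be 
usable from the mysql prompt against the cloud database, but I have not 
verified that here.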

mysql>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
