venkata swamybabu budumuru created CLOUDSTACK-4080:
------------------------------------------------------

             Summary: Router VM is stopped by scavenger thread as part of DeployVMCmd if the network.gc is set to low value like "10" seconds
                 Key: CLOUDSTACK-4080
                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-4080
             Project: CloudStack
          Issue Type: Bug
      Security Level: Public (Anyone can view this level - this is the default.)
          Components: Network Controller
    Affects Versions: 4.2.0
         Environment: commit # 7522f811672f66bc0cc13a33f4f3737ef03f22af
            Reporter: venkata swamybabu budumuru
            Priority: Blocker
             Fix For: 4.2.0


Steps to reproduce:

1. Have the latest CloudStack setup with at least one advanced zone.
2. Make sure network.gc.interval and network.gc.wait are both set to "10" seconds.
3. Have at least one network offering of type "isolated" with all services 
enabled, where LB is provided by NetScaler (NS) and the other services are 
provided by the virtual router (VR).


mysql> select * from network_offerings where id=15\G
*************************** 1. row ***************************
                       id: 15
                     name: NetworkOffering with NS
                     uuid: 4aaf5c58-6d45-4213-8c26-0b2b6f6792c5
              unique_name: NetworkOffering with NS
             display_text: NetworkOffering with NS
                  nw_rate: NULL
                  mc_rate: 10
             traffic_type: Guest
                     tags: NULL
              system_only: 0
             specify_vlan: 0
      service_offering_id: NULL
            conserve_mode: 0
                  created: 2013-08-05 07:30:38
                  removed: NULL
                  default: 0
             availability: Optional
     dedicated_lb_service: 0
shared_source_nat_service: 0
                 sort_key: 0
 redundant_router_service: 0
                    state: Enabled
               guest_type: Isolated
       elastic_ip_service: 0
  eip_associate_public_ip: 0
       elastic_lb_service: 0
        specify_ip_ranges: 0
                   inline: 0
            is_persistent: 0
              internal_lb: 0
                public_lb: 1
    egress_default_policy: 1
   concurrent_connections: NULL


mysql> select * from ntwk_offering_service_map where network_offering_id=15;
+----+---------------------+----------------+---------------+---------------------+
| id | network_offering_id | service        | provider      | created             |
+----+---------------------+----------------+---------------+---------------------+
| 58 |                  15 | Dhcp           | VirtualRouter | 2013-08-05 07:30:38 |
| 55 |                  15 | Dns            | VirtualRouter | 2013-08-05 07:30:38 |
| 60 |                  15 | Firewall       | VirtualRouter | 2013-08-05 07:30:38 |
| 59 |                  15 | Lb             | Netscaler     | 2013-08-05 07:30:38 |
| 54 |                  15 | PortForwarding | VirtualRouter | 2013-08-05 07:30:38 |
| 56 |                  15 | SourceNat      | VirtualRouter | 2013-08-05 07:30:38 |
| 53 |                  15 | StaticNat      | VirtualRouter | 2013-08-05 07:30:38 |
| 57 |                  15 | UserData       | VirtualRouter | 2013-08-05 07:30:38 |
| 61 |                  15 | Vpn            | VirtualRouter | 2013-08-05 07:30:38 |
+----+---------------------+----------------+---------------+---------------------+


mysql> select * from host_details where host_id=4;
+----+---------+-------------------+-------------------------------------------+
| id | host_id | name              | value                                     |
+----+---------+-------------------+-------------------------------------------+
| 13 |       4 | deviceName        | NetscalerVPXLoadBalancer                  |
| 11 |       4 | guid              | 1cf71bde-3994-42eb-80e0-046278a1763d      |
| 21 |       4 | ip                | 10.147.60.26                              |
| 19 |       4 | lbdevicededicated | false                                     |
| 23 |       4 | lbdeviceid        | 1                                         |
| 16 |       4 | name              | 201-NetscalerVPXLoadBalancer-10.147.60.26 |
| 17 |       4 | numretries        | 2                                         |
| 18 |       4 | password          | ck3EWqTylg79ZMj4gG2sHA==                  |
| 20 |       4 | physicalNetworkId | 201                                       |
| 15 |       4 | privateinterface  | 1/2                                       |
| 14 |       4 | publicinterface   | 1/3                                       |
| 12 |       4 | username          | nsroot                                    |
| 22 |       4 | zoneId            | 2                                         |
+----+---------+-------------------+-------------------------------------------+

4. As a non-ROOT domain user, try to deploy a VM on a network created from the 
above network offering.


mysql> select * from networks where id=242\G
*************************** 1. row ***************************
                   id: 242
                 name: test
                 uuid: c8028134-77ab-415a-bdb9-d9378754479b
         display_text: test
         traffic_type: Guest
broadcast_domain_type: Vlan
        broadcast_uri: NULL
              gateway: 10.0.48.1
                 cidr: 10.0.48.0/20
                 mode: Dhcp
  network_offering_id: 15
  physical_network_id: 200
       data_center_id: 1
            guru_name: ExternalGuestNetworkGuru
                state: Allocated
              related: 242
            domain_id: 2
           account_id: 4
                 dns1: 10.103.128.16
                 dns2: NULL
            guru_data: NULL
           set_fields: 0
             acl_type: Account
       network_domain: cs4cloud.internal
       reservation_id: 803e1334-ed30-4980-a6f2-299427724bb9
           guest_type: Isolated
     restart_required: 0
              created: 2013-08-05 11:27:19
              removed: NULL
    specify_ip_ranges: 0
               vpc_id: NULL
          ip6_gateway: NULL
             ip6_cidr: NULL
         network_cidr: NULL
      display_network: 1
       network_acl_id: NULL


Observations:

(i) deployVMCmd completes without any issues, but the network scavenger shuts 
down the network immediately after the StartAnswer for the user VM arrives.

(ii) Here is the deployVMCmd:


2013-08-05 16:57:19,427 DEBUG [cloud.api.ApiServlet] (catalina-exec-22:null) ===END===  10.252.192.25 -- GET  command=deployVirtualMachine&zoneId=7b6b3c07-7e33-483f-b2a1-2f89a0d9ff96&templateId=4643adee-fd8e-11e2-9c07-069f2c0000aa&hypervisor=KVM&serviceOfferingId=d42e0af6-370b-4a4f-a318-98d1d2a9a8e3&networkIds=c8028134-77ab-415a-bdb9-d9378754479b&displayname=test&name=test&response=json&sessionkey=YGtsbnrjR7V3vmhttXR20I2v8L0%3D&_=1375702049488


(iii) The above command initiated a router VM deployment

2013-08-05 16:57:25,237 DEBUG [agent.transport.Request] (Job-Executor-8:job-59 
= [ 17db0422-7b06-4bb6-b78a-ae9728bd26d1 ]) Seq 1-1321009484: Sending  { Cmd , 
MgmtId: 7280707764394, via: 1, Ver: v1, Flags: 100011, 
[{"com.cloud.agent.api.StartCommand":{"vm":{"id":14,"name":"r-14-VM","type":"DomainRouter","cpus":1,"minSpeed":500,"maxSpeed":500,"minRam":134217728,"maxRam":134217728,"arch":"x86_64","os":"Debian
 GNU/Linux 5.0 (32-bit)","bootArgs":" template=domP name=r-14-VM 
eth2ip=10.147.44.67 eth2mask=255.255.255.0 gateway=10.147.44.1 eth0ip=10.0.48.1 
eth0mask=255.255.240.0 domain=cs4cloud.internal dhcprange=10.0.48.1 
eth1ip=169.254.3.57 eth1mask=255.255.0.0 type=router disable_rp_filter=true 
dns1=10.103.128.16","rebootOnCrash":false,"enableHA":true,"limitCpuUse":false,"enableDynamicallyScaleVm":false,"vncPassword":"821327d55071659e","params":{},"uuid":"3980de9d-1b91-4e7c-ae36-2b9a9d08ef38","disks":[{"data":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid":"a11a7197-2aed-40e6-9569-6787c577ab2c","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"5458182e-bfcb-351c-97ed-e7223bca2b8e","id":1,"poolType":"NetworkFilesystem","host":"10.147.28.7","path":"/export/home/swamy/primary.campo.kvm.1.zone","port":2049}},"name":"ROOT-14","size":276162048,"path":"458958b5-5497-477e-9e53-1227a0187688","volumeId":17,"vmName":"r-14-VM","accountId":4,"format":"QCOW2","id":17,"hypervisorType":"None"}},"diskSeq":0,"type":"ROOT"}],"nics":[{"deviceId":2,"networkRateMbps":200,"defaultNic":true,"uuid":"65fd9e0c-5b15-4181-9ab1-4a13fea4739e","ip":"10.147.44.67","netmask":"255.255.255.0","gateway":"10.147.44.1","mac":"06:2a:3a:00:00:12","dns1":"10.103.128.16","broadcastType":"Vlan","type":"Public","broadcastUri":"vlan://44","isolationUri":"vlan://44","isSecurityGroupEnabled":false},{"deviceId":0,"networkRateMbps":200,"defaultNic":false,"uuid":"d2a0ff89-a6ce-4005-a875-d36787318342","ip":"10.0.48.1","netmask":"255.255.240.0","gateway":"10.0.48.1","mac":"02:00:6b:6d:00:02","dns1":"10.103.128.16","broadcastType":"Vlan","type":"Guest","broadcastUri":"vlan://909","isolationUri":"vlan://909","isSecurityGroupEnabled":false},{"deviceId":1,"networkRateMbps":-1,"defaultNic":false,"uuid":"0a187643-dc28-48b5-bf99-8acbe5f753c2","ip":"169.254.3.57","netmask":"255.255.0.0","gateway":"169.254.0.1","mac":"0e:00:a9:fe:03:39","broadcastType":"LinkLocal","type":"Control","isSecurityGroupEnabled":false}]},"hostIp":"10.147.40.11","executeInSequence":false,"wait":0}},{"com.cloud.agent.api.check.CheckSshCommand":{"ip":"169.254.3.57","port":3922,"interval":6,"retries":100,"name":"r-14-VM","wait":0}},{"com.cloud.agent.api.GetDomRVersionCmd":{"accessDetails"
:{"router.ip":"169.254.3.57","router.name":"r-14-VM"},"wait":0}},{},{"com.cloud.agent.api.routing.IpAssocCommand":{"ipAddresses":[{"accountId":4,"publicIp":"10.147.44.67","sourceNat":true,"add":true,"oneToOneNat":false,"firstIP":true,"vlanId":"44","vlanGateway":"10.147.44.1","vlanNetmask":"255.255.255.0","vifMacAddress":"06:b0:5a:00:00:12","networkRate":200,"trafficType":"Public"}],"accessDetails":{"router.guest.ip":"10.0.48.1","zone.network.type":"Advanced","router.ip":"169.254.3.57","router.name":"r-14-VM"},"wait":0}}]
 }


(iv) Router VM started successfully

2013-08-05 16:58:40,412 DEBUG [agent.transport.Request] 
(AgentManager-Handler-5:null) Seq 1-1321009484: Processing:  { Ans: , MgmtId: 
7280707764394, via: 1, Ver: v1, Flags: 10, 
[{"com.cloud.agent.api.StartAnswer":{"vm":{"id":14,"name":"r-14-VM","type":"DomainRouter","cpus":1,"minSpeed":500,"maxSpeed":500,"minRam":134217728,"maxRam":134217728,"arch":"x86_64","os":"Debian
 GNU/Linux 5.0 (32-bit)","bootArgs":" template=domP name=r-14-VM 
eth2ip=10.147.44.67 eth2mask=255.255.255.0 gateway=10.147.44.1 eth0ip=10.0.48.1 
eth0mask=255.255.240.0 domain=cs4cloud.internal dhcprange=10.0.48.1 
eth1ip=169.254.3.57 eth1mask=255.255.0.0 type=router disable_rp_filter=true 
dns1=10.103.128.16","rebootOnCrash":false,"enableHA":true,"limitCpuUse":false,"enableDynamicallyScaleVm":false,"vncPassword":"821327d55071659e","vncAddr":"10.147.40.11","params":{},"uuid":"3980de9d-1b91-4e7c-ae36-2b9a9d08ef38","disks":[{"data":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid":"a11a7197-2aed-40e6-9569-6787c577ab2c","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"5458182e-bfcb-351c-97ed-e7223bca2b8e","id":1,"poolType":"NetworkFilesystem","host":"10.147.28.7","path":"/export/home/swamy/primary.campo.kvm.1.zone","port":2049}},"name":"ROOT-14","size":276162048,"path":"458958b5-5497-477e-9e53-1227a0187688","volumeId":17,"vmName":"r-14-VM","accountId":4,"format":"QCOW2","id":17,"hypervisorType":"None"}},"diskSeq":0,"type":"ROOT"}],"nics":[{"deviceId":2,"networkRateMbps":200,"defaultNic":true,"uuid":"65fd9e0c-5b15-4181-9ab1-4a13fea4739e","ip":"10.147.44.67","netmask":"255.255.255.0","gateway":"10.147.44.1","mac":"06:2a:3a:00:00:12","dns1":"10.103.128.16","broadcastType":"Vlan","type":"Public","broadcastUri":"vlan://44","isolationUri":"vlan://44","isSecurityGroupEnabled":false},{"deviceId":0,"networkRateMbps":200,"defaultNic":false,"uuid":"d2a0ff89-a6ce-4005-a875-d36787318342","ip":"10.0.48.1","netmask":"255.255.240.0","gateway":"10.0.48.1","mac":"02:00:6b:6d:00:02","dns1":"10.103.128.16","broadcastType":"Vlan","type":"Guest","broadcastUri":"vlan://909","isolationUri":"vlan://909","isSecurityGroupEnabled":false},{"deviceId":1,"networkRateMbps":-1,"defaultNic":false,"uuid":"0a187643-dc28-48b5-bf99-8acbe5f753c2","ip":"169.254.3.57","netmask":"255.255.0.0","gateway":"169.254.0.1","mac":"0e:00:a9:fe:03:39","broadcastType":"LinkLocal","type":"Control","isSecurityGroupEnabled":false}]},"result":true,"wait":0}},{"com.cloud.agent.api.check.CheckSshAnswer":{"result":true,"wait":0}},{"com.cloud.agent.api.GetDomRVersionAnswer":{"templateVersion":"Cloudstack
 Release 4.2.0 Thu Jun 13 04:15:09 UTC 
2013","scriptsVersion":"0026a7d7d957616f59bdeab0c49258bb","result":true,"details":"Cloudstack
 Release 4.2.0 Thu Jun 13 04:15:09 UTC 
2013&0026a7d7d957616f59bdeab0c49258bb","wait":0}},{"com.cloud.agent.api.NetworkUsageAnswer":{"routerName":"r-14-VM","bytesSent":0,"bytesReceived":0,"result":true,"wait":0}},{"com.cloud.agent.api.routing.IpAssocAnswer":{"results":["10.147.44.67
 - success"],"result":true,"wait":0}}] }

(v) After the router VM was up, it triggered the StartCommand for the user VM, 
and that went fine as well.

2013-08-05 16:58:43,001 DEBUG [agent.transport.Request] 
(AgentManager-Handler-14:null) Seq 1-1321009496: Processing:  { Ans: , MgmtId: 
7280707764394, via: 1, Ver: v1, Flags: 10, 
[{"com.cloud.agent.api.StartAnswer":{"vm":{"id":13,"name":"i-4-13-VM","type":"User","cpus":1,"minSpeed":500,"maxSpeed":500,"minRam":536870912,"maxRam":536870912,"arch":"x86_64","os":"CentOS
 5.5 
(64-bit)","bootArgs":"","rebootOnCrash":false,"enableHA":false,"limitCpuUse":false,"enableDynamicallyScaleVm":false,"vncPassword":"b1f3f5cddd630be3","vncAddr":"10.147.40.11","params":{},"uuid":"773ccd08-cfef-41ad-8c43-30a169630d0b","disks":[{"data":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid":"ffa1c5ad-cdca-40dc-93bd-03c1dbdd4caf","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"5458182e-bfcb-351c-97ed-e7223bca2b8e","id":1,"poolType":"NetworkFilesystem","host":"10.147.28.7","path":"/export/home/swamy/primary.campo.kvm.1.zone","port":2049}},"name":"ROOT-13","size":8589934592,"path":"ad4b1306-a145-4944-b296-775671c9624b","volumeId":16,"vmName":"i-4-13-VM","accountId":4,"format":"QCOW2","id":16,"hypervisorType":"None"}},"diskSeq":0,"type":"ROOT"},{"data":{"org.apache.cloudstack.storage.to.TemplateObjectTO":{"id":0,"format":"ISO","accountId":0,"hvm":false}},"diskSeq":3,"type":"ISO"}],"nics":[{"deviceId":0,"networkRateMbps":200,"defaultNic":true,"uuid":"302aa348-f8aa-4872-ad59-142500ed9f63","ip":"10.0.49.81","netmask":"255.255.240.0","gateway":"10.0.48.1","mac":"02:00:1a:0c:00:01","dns1":"10.103.128.16","broadcastType":"Vlan","type":"Guest","broadcastUri":"vlan://909","isolationUri":"vlan://909","isSecurityGroupEnabled":false}]},"result":true,"wait":0}}]
 }

(vi) One additional observation: after the router VM started, the nics.state 
for the user VM stayed "Allocated" for some time.


*************************** 33. row ***************************
            id: 33
          uuid: 302aa348-f8aa-4872-ad59-142500ed9f63
   instance_id: 13
   mac_address: 02:00:1a:0c:00:01
   ip4_address: 10.0.49.81
       netmask: NULL
       gateway: NULL
       ip_type: Ip4
 broadcast_uri: NULL
    network_id: 242
          mode: Dhcp
         state: Allocated
      strategy: Start
 reserver_name: ExternalGuestNetworkGuru
reservation_id: NULL
     device_id: 0
   update_time: 2013-08-05 16:57:19
 isolation_uri: NULL
   ip6_address: NULL
   default_nic: 1
       vm_type: User
       created: 2013-08-05 11:27:19
       removed: NULL
   ip6_gateway: NULL
      ip6_cidr: NULL
  secondary_ip: 0
   display_nic: 1

(vii) Once the user VM is up, the nics.state for the user VM NIC is 
"Reserved".


mysql> select * from nics where instance_id=13\G
*************************** 1. row ***************************
            id: 33
          uuid: 302aa348-f8aa-4872-ad59-142500ed9f63
   instance_id: 13
   mac_address: 02:00:1a:0c:00:01
   ip4_address: 10.0.49.81
       netmask: 255.255.240.0
       gateway: 10.0.48.1
       ip_type: Ip4
 broadcast_uri: vlan://909
    network_id: 242
          mode: Dhcp
         state: Reserved
      strategy: Start
 reserver_name: ExternalGuestNetworkGuru
reservation_id: 803e1334-ed30-4980-a6f2-299427724bb9
     device_id: 0
   update_time: 2013-08-05 16:58:40
 isolation_uri: vlan://909
   ip6_address: NULL
   default_nic: 1
       vm_type: User
       created: 2013-08-05 11:27:19
       removed: NULL
   ip6_gateway: NULL
      ip6_cidr: NULL
  secondary_ip: 0
   display_nic: 1
1 row in set (0.00 sec)


(viii) As soon as the StartAnswer arrives for the user VM, a network shutdown 
is initiated by the network scavenger thread.


Here is the snippet from the management server logs:


2013-08-05 16:58:45,756 DEBUG [agent.manager.AgentManagerImpl] (AgentManager-Handler-2:null) SeqA 2-790: Processing Seq 2-790:  { Cmd , MgmtId: -1, via: 2, Ver: v1, Flags: 11, [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_proxyVmId":2,"_loadInfo":"{\n  \"connections\": []\n}","wait":0}}] }
2013-08-05 16:58:45,762 DEBUG [agent.manager.AgentManagerImpl] (AgentManager-Handler-2:null) SeqA 2-790: Sending Seq 2-790:  { Ans: , MgmtId: 7280707764394, via: 2, Ver: v1, Flags: 100010, [{"com.cloud.agent.api.AgentControlAnswer":{"result":true,"wait":0}}] }
2013-08-05 16:58:46,685 DEBUG [network.resource.NetscalerResource] (DirectAgent-169:null) Netscaler load balancer 10.147.60.26 successfully executed IPAssocCommand to remove IP com.cloud.agent.api.to.IpAddressTO@1e6c629
2013-08-05 16:58:46,685 DEBUG [agent.manager.DirectAgentAttache] (DirectAgent-169:null) Seq 8-1242890258: Response Received:
2013-08-05 16:58:46,685 DEBUG [agent.transport.Request] (DirectAgent-169:null) Seq 8-1242890258: Processing:  { Ans: , MgmtId: 7280707764394, via: 8, Ver: v1, Flags: 10, [{"com.cloud.agent.api.routing.IpAssocAnswer":{"results":["null - success"],"result":true,"wait":0}}] }
2013-08-05 16:58:46,686 DEBUG [agent.transport.Request] (Network-Scavenger-1:null) Seq 8-1242890258: Received:  { Ans: , MgmtId: 7280707764394, via: 8, Ver: v1, Flags: 10, { IpAssocAnswer } }
2013-08-05 16:58:46,800 DEBUG [cloud.network.ExternalLoadBalancerDeviceManagerImpl] (Network-Scavenger-1:null) External load balancer has shut down the guest network for account dom1Acc2(id = 4) with VLAN tag 909
2013-08-05 16:58:46,804 DEBUG [cloud.network.NetworkManagerImpl] (Network-Scavenger-1:null) Sending network shutdown to VirtualRouter
2013-08-05 16:58:46,808 DEBUG [network.router.VirtualNetworkApplianceManagerImpl] (Network-Scavenger-1:null) Stopping router VM[DomainRouter|r-14-VM]
2013-08-05 16:58:46,817 DEBUG [cloud.capacity.CapacityManagerImpl] (Network-Scavenger-1:null) VM state transitted from :Running to Stopping with event: StopRequestedvm's original host id: 1 new host id: 1 host id before state transition: 1
2013-08-05 16:58:46,945 DEBUG [agent.transport.Request] (AgentManager-Handler-1:null) Seq 1-1321009498: Processing:  { Ans: , MgmtId: 7280707764394, via: 1, Ver: v1, Flags: 10, [{"com.cloud.agent.api.NetworkUsageAnswer":{"routerName":"r-14-VM","bytesSent":0,"bytesReceived":0,"result":true,"details":"","wait":0}}] }
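
To isolate the scavenger activity from a larger management server log, a small filter along these lines may help. It is only a sketch: it assumes each entry sits on a single line with the "timestamp LEVEL [logger] (thread:context) message" prefix seen in the snippet above (entries wrapped by a mail client would need to be rejoined first), and the name scavenger_events is made up for illustration.

```python
import re

# Prefix of a management-server log entry, as seen in this report, e.g.
#   2013-08-05 16:58:46,804 DEBUG [cloud.network.NetworkManagerImpl] (Network-Scavenger-1:null) ...
ENTRY = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^:)]+)(?::[^)]*)?\)\s*(?P<msg>.*)$"
)

def scavenger_events(lines, thread_prefix="Network-Scavenger"):
    """Yield (timestamp, logger, message) for entries logged by the given thread."""
    for line in lines:
        m = ENTRY.match(line.rstrip("\n"))
        if m and m.group("thread").startswith(thread_prefix):
            yield m.group("ts"), m.group("logger"), m.group("msg")

# Three entries copied from the snippet above (rejoined onto single lines).
sample = [
    "2013-08-05 16:58:46,685 DEBUG [agent.manager.DirectAgentAttache] (DirectAgent-169:null) Seq 8-1242890258: Response Received:",
    "2013-08-05 16:58:46,804 DEBUG [cloud.network.NetworkManagerImpl] (Network-Scavenger-1:null) Sending network shutdown to VirtualRouter",
    "2013-08-05 16:58:46,808 DEBUG [network.router.VirtualNetworkApplianceManagerImpl] (Network-Scavenger-1:null) Stopping router VM[DomainRouter|r-14-VM]",
]

events = list(scavenger_events(sample))
for ts, logger, msg in events:
    print(ts, logger, msg)
```

Against a real log file, the same generator can be fed an open file handle instead of the sample list.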


I am attaching all the required logs, along with the DB dump, to the bug.
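
For a sense of scale, the timestamps in the log excerpts above can be compared with the "10" second GC value from step 2. The sketch below only does that arithmetic; the variable names are illustrative, and it makes no claim about how the scavenger actually decides which networks to shut down.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S,%f"  # timestamp format used in the management server log

def seconds_between(start, end):
    """Elapsed seconds between two log timestamps."""
    return (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds()

GC_SECONDS = 10  # the "10" second network.gc value from step 2

# Timestamps copied from the log excerpts in this report.
deploy_api = "2013-08-05 16:57:19,427"        # deployVirtualMachine request logged
router_start = "2013-08-05 16:57:25,237"      # StartCommand sent for r-14-VM
router_answer = "2013-08-05 16:58:40,412"     # StartAnswer processed for r-14-VM
uservm_answer = "2013-08-05 16:58:43,001"     # StartAnswer processed for i-4-13-VM
network_shutdown = "2013-08-05 16:58:46,804"  # "Sending network shutdown to VirtualRouter"

router_boot = seconds_between(router_start, router_answer)
deploy_total = seconds_between(deploy_api, uservm_answer)
shutdown_lag = seconds_between(uservm_answer, network_shutdown)

print(f"router start:           {router_boot:.3f}s ({router_boot / GC_SECONDS:.1f}x the GC value)")
print(f"deploy call to VM up:   {deploy_total:.3f}s ({deploy_total / GC_SECONDS:.1f}x the GC value)")
print(f"VM up to network stop:  {shutdown_lag:.3f}s")
```

With these inputs the router start alone spans several GC periods, and the shutdown follows the user VM's StartAnswer within a few seconds.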

