[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-9575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15882794#comment-15882794
 ] 

Mani Prashanth Varma Manthena commented on CLOUDSTACK-9575:
-----------------------------------------------------------

Here are the possible fixes for this issue:
https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1006232
https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2127554

> ACS 4.9 + VMware/ESXi 6.0: Sometimes VRs are failing to fully come up into 
> running state
> ----------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-9575
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9575
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the 
> default.) 
>          Components: Management Server, VMware
>    Affects Versions: 4.9.0
>            Reporter: Mani Prashanth Varma Manthena
>            Priority: Critical
>              Labels: vmware
>         Attachments: hostd.log, management-server.log.zip
>
>
> Here is the exception in the cloudstack-management server log:
> {noformat}
> 2016-10-21 07:33:53,361 WARN  [c.c.n.v.VpcManagerImpl] 
> (API-Job-Executor-62:ctx-749d66c8 job-102 ctx-25e96d70) (logid:502d5b39) 
> Failed to start vpc [VPC [9-nuageVpcWithoutStaticNatNetACL] due to
> com.cloud.exception.AgentUnavailableException: Resource [Host:3] is 
> unreachable: Host 3: Unable to start instance due to Unable to start  
> VM:bd09d7ab-81f1-4053-a838-869d108cb827 due to error in finalizeStart, not 
> retrying
>         at 
> com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:1120)
>         at 
> com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:4592)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at 
> com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>         at 
> com.cloud.vm.VirtualMachineManagerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4753)
>         at 
> com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.java:102)
>         at 
> org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:554)
>         at 
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>         at 
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>         at 
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>         at 
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>         at 
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>         at 
> org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:502)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: com.cloud.utils.exception.ExecutionException: Unable to start  
> VM:bd09d7ab-81f1-4053-a838-869d108cb827 due to error in finalizeStart, not 
> retrying
>         at 
> com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:1084)
>         at 
> com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:4592)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         ... 18 more
> 2016-10-21 07:33:53,362 DEBUG [c.c.n.v.VpcManagerImpl] 
> (API-Job-Executor-62:ctx-749d66c8 job-102 ctx-25e96d70) (logid:502d5b39) 
> Destroying vpc [VPC [9-nuageVpcWithoutStaticNatNetACL] that failed to start
> 2016-10-21 07:33:53,362 DEBUG [c.c.n.v.VpcManagerImpl] 
> (API-Job-Executor-62:ctx-749d66c8 job-102 ctx-25e96d70) (logid:502d5b39) 
> Destroying vpc [VPC [9-nuageVpcWithoutStaticNatNetACL]
> 2016-10-21 07:33:53,363 DEBUG [c.c.n.v.VpcManagerImpl] 
> (API-Job-Executor-62:ctx-749d66c8 job-102 ctx-25e96d70) (logid:502d5b39) 
> Updating VPC [VPC [9-nuageVpcWithoutStaticNatNetACL] with state Inactive as a 
> part of vpc delete
> 2016-10-21 07:33:53,373 DEBUG [c.c.n.v.VpcManagerImpl] 
> (API-Job-Executor-62:ctx-749d66c8 job-102 ctx-25e96d70) (logid:502d5b39) 
> Shutting down vpc [VPC [9-nuageVpcWithoutStaticNatNetACL]
> 2016-10-21 07:33:53,376 DEBUG [c.c.n.r.NetworkHelperImpl] 
> (API-Job-Executor-62:ctx-749d66c8 job-102 ctx-25e96d70) (logid:502d5b39) 
> Attempting to destroy router 13
> {noformat}
> Note: We are hitting this issue 1 out of three times in our regressions.
> Here are the failures on the ESXi 6.0 host:
> {noformat}
> 2016-10-25T05:54:12.964Z error hostd[2B981B70] [Originator@6876 sub=Hbrsvc] 
> Failed to retrieve VM config (id=7)                                           
>                                                           
> 2016-10-25T05:54:12.964Z error hostd[2B981B70] [Originator@6876 sub=Hbrsvc] 
> Replicator: VmReconfig failed to retrieve replication config for VM 7, 
> ignoring: vim.fault.ReplicationVmConfigFault
> 2016-10-25T05:54:12.996Z warning hostd[2BB81B70] [Originator@6876 
> sub=Vmsvc.vm:/vmfs/volumes/469adbc4-fd0b07aa/r-7-VM/r-7-VM.vmx 
> opID=76510a94-ea-f9-04a0 user=vpxuser:VSPHERE.LOCAL\Administrator] 
> CannotRetrieveCorefiles: VM is in an invalid state
> 2016-10-25T05:54:13.022Z info hostd[2BB81B70] [Originator@6876 sub=Libs 
> opID=76510a94-ea-f9-04a0 user=vpxuser:VSPHERE.LOCAL\Administrator] Error 
> getting extended config name from config file.                     
> 2016-10-25T05:54:13.022Z info hostd[2BB81B70] [Originator@6876 sub=Libs 
> opID=76510a94-ea-f9-04a0 user=vpxuser:VSPHERE.LOCAL\Administrator] Error 
> getting extended config file path.
> 2016-10-25T05:54:55.764Z info hostd[2BC85B70] [Originator@6876 
> sub=Vmsvc.vm:/vmfs/volumes/469adbc4-fd0b07aa/r-7-VM/r-7-VM.vmx 
> opID=HB-SpecSync-host-3855@93-2beb0d5-87-06ad user=vpxuser] State Transition 
> (VM_STATE_ON -> VM_STATE_RECONFIGURING)
> 2016-10-25T05:54:55.765Z info hostd[2BC85B70] [Originator@6876 
> sub=Vmsvc.vm:/vmfs/volumes/469adbc4-fd0b07aa/r-7-VM/r-7-VM.vmx 
> opID=HB-SpecSync-host-3855@93-2beb0d5-87-06ad user=vpxuser] State Transition 
> (VM_STATE_RECONFIGURING -> VM_STATE_ON)
> 2016-10-25T05:54:55.766Z info hostd[2BBC2B70] [Originator@6876 
> sub=Guestsvc.GuestFileTransferImpl] Entered VmPowerStateListener              
>                                                                        
> 2016-10-25T05:54:55.766Z info hostd[2BBC2B70] [Originator@6876 
> sub=Guestsvc.GuestFileTransferImpl] VmPowerStateListener succeeded            
>                                                                        
> 2016-10-25T05:54:55.766Z info hostd[2BBC2B70] [Originator@6876 sub=Hbrsvc] 
> Replicator: powerstate change VM: 7 Old: 1 New: 1                             
>                                                            
> 2016-10-25T05:54:55.766Z info hostd[2BC85B70] [Originator@6876 
> sub=Vmsvc.vm:/vmfs/volumes/469adbc4-fd0b07aa/r-7-VM/r-7-VM.vmx 
> opID=HB-SpecSync-host-3855@93-2beb0d5-87-06ad user=vpxuser] Send config 
> update invoked
> 2016-10-25T05:54:55.783Z info hostd[2BC85B70] [Originator@6876 
> sub=Vimsvc.TaskManager opID=HB-SpecSync-host-3855@93-2beb0d5-87-06ad 
> user=vpxuser] Task Completed : 
> haTask-ha-root-pool-vim.ResourcePool.updateChildResourceConfiguration-1928 
> Status success
> 2016-10-25T05:54:58.330Z info hostd[2A59FB70] [Originator@6876 
> sub=Vmsvc.vm:/vmfs/volumes/469adbc4-fd0b07aa/r-7-VM/r-7-VM.vmx] Send config 
> update invoked                                                           
> 2016-10-25T05:54:58.366Z info hostd[2A59FB70] [Originator@6876 
> sub=Vmsvc.vm:/vmfs/volumes/469adbc4-fd0b07aa/r-7-VM/r-7-VM.vmx] Send config 
> update invoked                                                           
> 2016-10-25T05:54:58.385Z info hostd[2A59FB70] [Originator@6876 
> sub=Vmsvc.vm:/vmfs/volumes/469adbc4-fd0b07aa/r-7-VM/r-7-VM.vmx] Send config 
> update invoked                                                           
> 2016-10-25T05:54:58.438Z info hostd[2A59FB70] [Originator@6876 
> sub=Vmsvc.vm:/vmfs/volumes/469adbc4-fd0b07aa/r-7-VM/r-7-VM.vmx] Tools 
> manifest version status changed from guestToolsNotInstalled to 
> guestToolsUnmana
> 2016-10-25T05:54:58.447Z info hostd[2A59FB70] [Originator@6876 
> sub=Vmsvc.vm:/vmfs/volumes/469adbc4-fd0b07aa/r-7-VM/r-7-VM.vmx] Send config 
> update invoked                                                           
> 2016-10-25T05:54:58.469Z info hostd[2A59FB70] [Originator@6876 
> sub=Vmsvc.vm:/vmfs/volumes/469adbc4-fd0b07aa/r-7-VM/r-7-VM.vmx] Send config 
> update invoked                                                           
> 2016-10-25T05:54:58.492Z info hostd[2A59FB70] [Originator@6876 
> sub=Vmsvc.vm:/vmfs/volumes/469adbc4-fd0b07aa/r-7-VM/r-7-VM.vmx] Tools 
> manifest version status changed from guestToolsUnmanaged to 
> guestToolsUnmanaged
> 2016-10-25T05:54:58.498Z info hostd[2A59FB70] [Originator@6876 
> sub=Vmsvc.vm:/vmfs/volumes/469adbc4-fd0b07aa/r-7-VM/r-7-VM.vmx] Send config 
> update invoked                                                           
> 2016-10-25T05:54:58.515Z info hostd[2A59FB70] [Originator@6876 
> sub=Vmsvc.vm:/vmfs/volumes/469adbc4-fd0b07aa/r-7-VM/r-7-VM.vmx] Setting the 
> tools properties cache.                                                  
> 2016-10-25T05:55:05.555Z info hostd[FFA02A70] [Originator@6876 
> sub=Vimsvc.TaskManager opID=134074df-7f-06cc 
> user=vpxuser:VSPHERE.LOCAL\vpxd-extension-57107a00-4014-11e6-85ca-90e2ba4b0016]
>  Task Created : haTask-ha-host-vim.HostSystem.acquireCimServicesTicket-1943
> 2016-10-25T05:55:05.556Z info hostd[2BBC2B70] [Originator@6876 
> sub=Vimsvc.TaskManager opID=134074df-7f-06cc 
> user=vpxuser:VSPHERE.LOCAL\vpxd-extension-57107a00-4014-11e6-85ca-90e2ba4b0016]
>  Task Completed : haTask-ha-host-vim.HostSystem.acquireCimServicesTicket-1943 
> Status success
> 2016-10-25T05:55:05.780Z info hostd[2A59FB70] [Originator@6876 
> sub=Vimsvc.TaskManager opID=HB-host-3855@531-609c8d99-79-06d3 user=vpxuser] 
> Task Created : haTask--vim.OverheadService.downloadVMXConfig-1946        
> 2016-10-25T05:55:05.781Z info hostd[2A59FB70] [Originator@6876 
> sub=Vimsvc.TaskManager opID=HB-host-3855@531-609c8d99-79-06d3 user=vpxuser] 
> Task Completed : haTask--vim.OverheadService.downloadVMXConfig-1946 Status 
> success
> 2016-10-25T05:55:28.986Z info hostd[2BBC2B70] [Originator@6876 
> sub=Vimsvc.TaskManager opID=358a7cd8-cf-06ea 
> user=vpxuser:VSPHERE.LOCAL\Administrator] Task Created : 
> haTask-7-vim.VirtualMachine.shutdownGuest-1957
> 2016-10-25T05:55:28.987Z info hostd[2BBC2B70] [Originator@6876 
> sub=Vmsvc.vm:/vmfs/volumes/469adbc4-fd0b07aa/r-7-VM/r-7-VM.vmx 
> opID=358a7cd8-cf-06ea user=vpxuser:VSPHERE.LOCAL\Administrator] State 
> Transition (VM_STATE_ON -> VM_STATE_SHUTTING_DOWN)
> 2016-10-25T05:55:28.987Z info hostd[2BBC2B70] [Originator@6876 
> sub=Vimsvc.ha-eventmgr opID=358a7cd8-cf-06ea 
> user=vpxuser:VSPHERE.LOCAL\Administrator] Event 496 : Guest OS shut down for 
> r-7-VM on esxi-2.mvdcdev28.us.alcatel-lucent.com in ha-datacenter
> 2016-10-25T05:55:29.202Z info hostd[2BBC2B70] [Originator@6876 
> sub=Vmsvc.vm:/vmfs/volumes/469adbc4-fd0b07aa/r-7-VM/r-7-VM.vmx 
> opID=358a7cd8-cf-06ea user=vpxuser:VSPHERE.LOCAL\Administrator] State 
> Transition (VM_STATE_SHUTTING_DOWN -> VM_STATE_ON_SHUTTING_DOWN)
> 2016-10-25T05:55:29.203Z info hostd[2BBC2B70] [Originator@6876 
> sub=Vimsvc.TaskManager opID=358a7cd8-cf-06ea 
> user=vpxuser:VSPHERE.LOCAL\Administrator] Task Completed : 
> haTask-7-vim.VirtualMachine.shutdownGuest-1957 Status success
> 2016-10-25T05:55:30.163Z info hostd[2BB81B70] [Originator@6876 
> sub=Vmsvc.vm:/vmfs/volumes/469adbc4-fd0b07aa/r-7-VM/r-7-VM.vmx] Send config 
> update invoked                                                           
> 2016-10-25T05:55:30.182Z info hostd[2BB81B70] [Originator@6876 
> sub=Vmsvc.vm:/vmfs/volumes/469adbc4-fd0b07aa/r-7-VM/r-7-VM.vmx] Send config 
> update invoked                                                           
> [LikewiseGetDomainJoinInfo:355] QueryInformation(): ERROR_FILE_NOT_FOUND 
> (2/0):                                                                        
>                                                              
> 2016-10-25T05:55:48.770Z info hostd[2A59FB70] [Originator@6876 sub=Hostsvc 
> opID=6ea70743] VsanSystemVmkProvider : GetRuntimeInfo: Complete, runtime 
> info: (vim.vsan.host.VsanRuntimeInfo) {                         
> -->    accessGenNo = 0,                                                       
>                                                                               
>                                                         
> --> }                                                                         
>                                                                               
>                                                         
> 2016-10-25T05:55:54.814Z info hostd[2B981B70] [Originator@6876 
> sub=Vmsvc.vm:/vmfs/volumes/469adbc4-fd0b07aa/r-7-VM/r-7-VM.vmx] Send config 
> update invoked                                                           
> 2016-10-25T05:55:54.825Z info hostd[2C080B70] [Originator@6876 
> sub=Vimsvc.ha-eventmgr] Event 497 : The dvPort 262 link was down in the 
> vSphere Distributed Switch  in ha-datacenter                                 
> 2016-10-25T05:55:54.887Z warning hostd[FFA43B70] [Originator@6876 sub=Libs] 
> VigorOnlineStatusCb: Failed to find RPC Request in pending list for sequence 
> 0                                                          
> 2016-10-25T05:55:54.887Z info hostd[FFA43B70] [Originator@6876 sub=Libs] 
> SOCKET 5 (44)                                                                 
>                                                              
> 2016-10-25T05:55:54.887Z info hostd[FFA43B70] [Originator@6876 sub=Libs] recv 
> detected client closed connection                                             
>                                                         
> 2016-10-25T05:55:54.887Z info hostd[FFA43B70] [Originator@6876 sub=Libs] 
> VigorTransportClientProcessError: Remote connection failure                   
>                                                              
> 2016-10-25T05:55:54.887Z info hostd[FFA43B70] [Originator@6876 sub=Libs] 
> VigorTransportClientDrainRecv: draining read.                                 
>                                                              
> 2016-10-25T05:55:54.887Z info hostd[FFA43B70] [Originator@6876 sub=Libs] 
> SOCKET 5 (44)                                                                 
>                                                              
> 2016-10-25T05:55:54.887Z info hostd[FFA43B70] [Originator@6876 sub=Libs] recv 
> detected client closed connection                                             
>                                                         
> 2016-10-25T05:55:54.887Z info hostd[FFA43B70] [Originator@6876 sub=Libs] 
> VigorTransportClientProcessError: closing connection.                         
>                                                              
> 2016-10-25T05:55:54.887Z info hostd[FFA43B70] [Originator@6876 sub=Libs] 
> VigorTransportClientManageConnection: connection closed.                      
>                                                              
> 2016-10-25T05:55:54.892Z info hostd[FFA43B70] [Originator@6876 sub=Libs] 
> CnxAuthdProtoSecureConnect: Unencrypted connection, skipping thumbprint 
> exchange.                                                          
> 2016-10-25T05:55:54.900Z info hostd[FFA43B70] [Originator@6876 sub=Libs] 
> CnxConnectAuthd: Returning false because CnxAuthdProtoConnect failed          
>                                                              
> 2016-10-25T05:55:54.900Z info hostd[FFA43B70] [Originator@6876 sub=Libs] 
> Cnx_Connect: Returning false because CnxConnectAuthd failed                   
>                                                              
> 2016-10-25T05:55:54.900Z info hostd[FFA43B70] [Originator@6876 
> sub=vm:Cnx_Connect: Error message: There is no VMware process running for 
> config file /vmfs/volumes/469adbc4-fd0b07aa/r-7-VM/r-7-VM.vmx]             
> 2016-10-25T05:55:54.900Z warning hostd[FFA43B70] [Originator@6876 
> sub=vm:VigorTransportClientManageConnection: Failed to re-connect to VM 
> /vmfs/volumes/469adbc4-fd0b07aa/r-7-VM/r-7-VM.vmx]                        
> 2016-10-25T05:55:54.902Z info hostd[FFA43B70] [Originator@6876 sub=Libs] 
> VigorOnlineDisconnectCb: connection closed (is final).                        
>                                                              
> 2016-10-25T05:55:54.920Z info hostd[2B981B70] [Originator@6876 sub=Hostsvc] 
> Lookupvm: Cartel ID not set for VM 7                                          
>                                                           
> 2016-10-25T05:55:54.920Z warning hostd[2B981B70] [Originator@6876 
> sub=Vmsvc.vm:/vmfs/volumes/469adbc4-fd0b07aa/r-7-VM/r-7-VM.vmx] Unable to get 
> resource settings for a powered on VM                               
> 2016-10-25T05:55:54.920Z info hostd[2B981B70] [Originator@6876 sub=Hostsvc] 
> Lookupvm: Cartel ID not set for VM 7                                          
>                                                           
> 2016-10-25T05:55:54.931Z info hostd[2B981B70] [Originator@6876 
> sub=Vmsvc.vm:/vmfs/volumes/469adbc4-fd0b07aa/r-7-VM/r-7-VM.vmx] Send config 
> update invoked                                                           
> 2016-10-25T05:55:54.952Z info hostd[2B981B70] [Originator@6876 sub=Hostsvc] 
> Lookupvm: Cartel ID not set for VM 7                                          
>                                                           
> 2016-10-25T05:55:54.952Z warning hostd[2B981B70] [Originator@6876 
> sub=Vmsvc.vm:/vmfs/volumes/469adbc4-fd0b07aa/r-7-VM/r-7-VM.vmx] Unable to get 
> resource settings for a powered on VM                               
> 2016-10-25T05:55:54.952Z info hostd[2B981B70] [Originator@6876 sub=Hostsvc] 
> Lookupvm: Cartel ID not set for VM 7                                          
>                                                           
> 2016-10-25T05:55:54.962Z info hostd[2B981B70] [Originator@6876 
> sub=Vmsvc.vm:/vmfs/volumes/469adbc4-fd0b07aa/r-7-VM/r-7-VM.vmx] State 
> Transition (VM_STATE_ON_SHUTTING_DOWN -> VM_STATE_OFF)
> 2016-10-25T05:55:54.963Z info hostd[2A59FB70] [Originator@6876 
> sub=Guestsvc.GuestFileTransferImpl] Entered VmPowerStateListener              
>                                                                        
> 2016-10-25T05:55:54.963Z info hostd[2A59FB70] [Originator@6876 
> sub=Guestsvc.GuestFileTransferImpl] VmPowerStateListener succeeded            
>                                                                        
> 2016-10-25T05:55:54.963Z info hostd[2A59FB70] [Originator@6876 sub=Hbrsvc] 
> Replicator: powerstate change VM: 7 Old: 1 New: 0                             
>                                                            
> 2016-10-25T05:55:54.963Z info hostd[2A59FB70] [Originator@6876 sub=Hbrsvc] 
> Replicator: Poweroff for VM: (id=7)                                           
>                                                            
> 2016-10-25T05:55:54.963Z warning hostd[2B981B70] [Originator@6876 
> sub=Vmsvc.vm:/vmfs/volumes/469adbc4-fd0b07aa/r-7-VM/r-7-VM.vmx] Failed to 
> find activation record, event user unknown.                             
> 2016-10-25T05:55:54.964Z info hostd[2B981B70] [Originator@6876 
> sub=Vimsvc.ha-eventmgr] Event 498 : r-7-VM on  
> esxi-2.mvdcdev28.us.alcatel-lucent.com in ha-datacenter is powered off
> {noformat}
> Workaround: Redo/retry the operation that brings up the VRs (e.g. create 
> VPCs/networks, restart VPCs/networks with cleanup, etc.).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to