[
https://issues.apache.org/jira/browse/CLOUDSTACK-3137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Animesh Chaturvedi updated CLOUDSTACK-3137:
-------------------------------------------
These blockers and critical issues are resolved but not verified. Reporters of
these issues please verify the fixes and help close these issues
> systemvms HA fails on latest master (4591f94)
> ---------------------------------------------
>
> Key: CLOUDSTACK-3137
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-3137
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public(Anyone can view this level - this is the
> default.)
> Affects Versions: 4.2.0
> Reporter: Prasanna Santhanam
> Assignee: Koushik Das
> Priority: Blocker
> Fix For: 4.2.0
>
> Attachments: logs.tar.bz2
>
>
> SystemVMs fail to start during HA on latest master
> (4591f94a0bb69875dca769c9738998d3b5b96106)
> It looks like the HA of system VMs is failing. On our automated test
> environment we reboot the management server before the systemVMs spin up to
> ensure that the global settings are affected and the configuration for the
> secstorage.internal.sites is rightly set in the secondary storage VM. The
> management server upon reboot discovers that SSVM was put to perform HA in
> the op_ha_work table and starts to re-spin the SSVM. At this time the CPVM is
> also booting up. No further activity is observed after two HA attempts.
> SystemVM start before Management server start:
> Jun 24 22:30:49 cloudstack-centos63 local0: 2013-06-25 05:30:49,738 DEBUG
> [cloud.vm.VirtualMachineManagerImpl] (secstorage-1:null) Allocating entries
> for VM: VM[SecondaryStorageVm|s-1-VM]
> Jun 24 22:30:49 cloudstack-centos63 local0: 2013-06-25 05:30:49,771 DEBUG
> [cloud.vm.VirtualMachineManagerImpl] (secstorage-1:null) Allocating nics for
> VM[SecondaryStorageVm|s-1-VM]
> Jun 24 22:30:49 cloudstack-centos63 local0: 2013-06-25 05:30:49,774 DEBUG
> [cloud.network.NetworkManagerImpl] (secstorage-1:null) Allocating nic for vm
> VM[SecondaryStorageVm|s-1-VM] in network Ntwk[200|Public|1] with requested
> profile NicProfile[0-0-null-null-null
> Jun 24 22:30:49 cloudstack-centos63 local0: 2013-06-25 05:30:49,843 DEBUG
> [cloud.network.NetworkManagerImpl] (secstorage-1:null) Allocating nic for vm
> VM[SecondaryStorageVm|s-1-VM] in network Ntwk[202|Control|3] with requested
> profile null
> Jun 24 22:30:49 cloudstack-centos63 local0: 2013-06-25 05:30:49,870 DEBUG
> [cloud.network.NetworkManagerImpl] (secstorage-1:null) Allocating nic for vm
> VM[SecondaryStorageVm|s-1-VM] in network Ntwk[201|Management|2] with
> requested profile null
> Jun 24 22:30:49 cloudstack-centos63 local0: 2013-06-25 05:30:49,916 DEBUG
> [cloud.network.NetworkManagerImpl] (secstorage-1:null) Allocating nic for vm
> VM[SecondaryStorageVm|s-1-VM] in network Ntwk[203|Storage|4] with requested
> profile null
> ...
> ...
> Jun 24 22:30:52 cloudstack-centos63 local0: 2013-06-25 05:30:52,552 DEBUG
> [agent.transport.Request] (consoleproxy-1:null) Seq 3-2067660811: Sending {
> Cmd , MgmtId: 200973787296321, via: 3, Ver: v1, Flags: 100111,
> [{"org.apache.cloudstack.storage.command.CopyCommand":{"srcTO":{"org.apache.cloudstack.storage.to.TemplateObjectTO":{"path":"template/tmpl/1/1/","origUrl":"http://nfs/templates/acton/acton-systemvm-02062012.vhd.bz2","uuid":"1","id":1,"format":"VHD","accountId":1,"checksum":"f613f38c96bf039f2e5cbf92fa8ad4f8","hvm":false,"displayText":"SystemVM
> Template
> (XenServer)","imageDataStore":{"com.cloud.agent.api.to.NfsTO":{"_url":"nfs://nfs.fmt.vmops.com:/expo...
> Jun 24 22:31:01 cloudstack-centos63 local0: 2013-06-25 05:31:01,824 INFO
> [context.support.XmlWebApplicationContext] (Thread-26:null) Closing Root
> WebApplicationContext: startup date [Mon Jun 24 23:25:50 UTC 2013]; root of
> context hierarchy
> ^^^^^^^^^^ Management Server Restarts ^^^^^^^^^^^^^^
> After restart SSVM scheduled for HA
> Jun 24 22:32:52 cloudstack-centos63 local0: 2013-06-25 05:32:52,394 INFO
> [cloud.ha.HighAvailabilityManagerImpl] (Timer-1:null) Schedule vm for HA:
> VM[SecondaryStorageVm|s-1-VM]
> Jun 24 22:32:52 cloudstack-centos63 local0: 2013-06-25 05:32:52,455 INFO
> [cloud.vm.VirtualMachineManagerImpl] (Timer-1:null) Handling unfinished work
> item: ItWork[1805a89b-d03a-4f1f-b717-ac9f5da6edea-Starting-2-Prepare]
> Jun 24 22:32:52 cloudstack-centos63 local0: 2013-06-25 05:32:52,504 INFO
> [cloud.ha.HighAvailabilityManagerImpl] (Timer-1:null) Schedule vm for HA:
> VM[ConsoleProxy|v-2-VM]
> ...
> Jun 24 22:32:52 cloudstack-centos63 local0: 2013-06-25 05:32:52,540 INFO
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-0:null) Starting work
> Jun 24 22:32:52 cloudstack-centos63 local0: 2013-06-25 05:32:52,547 INFO
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-1:null) Starting work
> ...
> Jun 24 22:32:52 cloudstack-centos63 local0: 2013-06-25 05:32:52,815 INFO
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-2:work-1) Processing
> HAWork[1-HA-1-Starting-Investigating]
> ...
> Jun 24 22:32:53 cloudstack-centos63 local0: 2013-06-25 05:32:53,289 DEBUG
> [cloud.ha.CheckOnAgentInvestigator] (HA-Worker-0:work-2) Unable to reach the
> agent for VM[ConsoleProxy|v-2-VM]: Resource [Host:3] is unreachable: Host 3:
> Host with specified id is not in the right state: Disconnected
> Jun 24 22:32:53 cloudstack-centos63 local0: 2013-06-25 05:32:53,290 INFO
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-0:work-2)
> SimpleInvestigator found VM[ConsoleProxy|v-2-VM]to be alive? null
> Jun 24 22:32:53 cloudstack-centos63 local0: 2013-06-25 05:32:53,293 DEBUG
> [cloud.ha.CheckOnAgentInvestigator] (HA-Worker-2:work-1) Unable to reach the
> agent for VM[SecondaryStorageVm|s-1-VM]: Resource [Host:2] is unreachable:
> Host 2: Host with specified id is not in the right state: Disconnected
> Jun 24 22:32:53 cloudstack-centos63 local0: 2013-06-25 05:32:53,298 INFO
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-2:work-1)
> SimpleInvestigator found VM[SecondaryStorageVm|s-1-VM]to be alive? null
> Jun 24 22:32:53 cloudstack-centos63 local0: 2013-06-25 05:32:53,336 INFO
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-0:work-2)
> XenServerInvestigator found VM[ConsoleProxy|v-2-VM]to be alive? null
> Jun 24 22:32:53 cloudstack-centos63 local0: 2013-06-25 05:32:53,338 DEBUG
> [cloud.ha.UserVmDomRInvestigator] (HA-Worker-0:work-2) Not a User Vm, unable
> to determine state of VM[ConsoleProxy|v-2-VM] returning null
> Jun 24 22:32:53 cloudstack-centos63 local0: 2013-06-25 05:32:53,338 INFO
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-0:work-2) null found
> VM[ConsoleProxy|v-2-VM]to be alive? null
> Jun 24 22:32:53 cloudstack-centos63 local0: 2013-06-25 05:32:53,338 DEBUG
> [cloud.ha.ManagementIPSystemVMInvestigator] (HA-Worker-0:work-2) Testing if
> VM[ConsoleProxy|v-2-VM] is alive
> ....
> Rescheduled HA
> Jun 24 22:32:53 cloudstack-centos63 local0: 2013-06-25 05:32:53,623 INFO
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-0:work-2) Rescheduling
> HAWork[2-HA-2-Starting-Investigating] to try again at Tue Jun 25 05:43:07 UTC
> 2013
> Jun 24 22:32:53 cloudstack-centos63 local0: 2013-06-25 05:32:53,626 INFO
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-2:work-1) Rescheduling
> HAWork[1-HA-1-Starting-Investigating] to try again at Tue Jun 25 05:43:07 UTC
> 2013
> ...
> Jun 24 22:43:11 cloudstack-centos63 local0: 2013-06-25 05:43:11,586 INFO
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-3:work-1) Processing
> HAWork[1-HA-1-Starting-Investigating]
> Jun 24 22:43:11 cloudstack-centos63 local0: 2013-06-25 05:43:11,592 INFO
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-3:work-1) HA on
> VM[SecondaryStorageVm|s-1-VM]
> Jun 24 22:43:11 cloudstack-centos63 local0: 2013-06-25 05:43:11,593 INFO
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-3:work-1) VM
> VM[SecondaryStorageVm|s-1-VM] has been changed. Current State = Starting
> Previous State = Starting last updated = 6 previous updated = 2
> Jun 24 22:43:11 cloudstack-centos63 local0: 2013-06-25 05:43:11,593 INFO
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-3:work-1) Completed
> HAWork[1-HA-1-Starting-Investigating]
> Jun 24 22:43:11 cloudstack-centos63 local0: 2013-06-25 05:43:11,611 INFO
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-2:work-2) Processing
> HAWork[2-HA-2-Starting-Investigating]
> Jun 24 22:43:11 cloudstack-centos63 local0: 2013-06-25 05:43:11,618 INFO
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-2:work-2) HA on
> VM[ConsoleProxy|v-2-VM]
> Jun 24 22:43:11 cloudstack-centos63 local0: 2013-06-25 05:43:11,618 INFO
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-2:work-2) VM
> VM[ConsoleProxy|v-2-VM] has been changed. Current State = Starting Previous
> State = Starting last updated = 6 previous updated = 2
> Jun 24 22:43:11 cloudstack-centos63 local0: 2013-06-25 05:43:11,618 INFO
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-2:work-2) Completed
> HAWork[2-HA-2-Starting-Investigating]
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira