Indra, While I see there is a solution to your problem due to cloudstack changes with web servers, there is also another tangent to this issue..
Let me know if your issue is resolved by leveraging a different web server for image downloads. But if not, or someone else comes across this message thread, here is a another tangent to this problem. ---- Below is a summary to the problem we've seen that seem very similar In the nutshell, here is what we've seen and what we believe cause the issue. CloudStack Java Agent initiates 2 https sessions when attempting to download. 1) Session is established to web server to get the header information and keep it open.. 2) Session is established to download the content - this is usually a longer stream operation What we've seen happen, as session # 2 goes on downloading, somehow http session #1 gets corrupted. Session #2 is eventually terminated as it completes the operation, yet session #1 remains to be open. This defies the logic in a way, because no packets are sent over the open session between SSVM and HTTP server. I'd expect for one of two parties sending termination packet - but that never happens. As the result - the 1st session hangs and prevents agent from proceed to the next stage of installing the template. I've seen the socket being open for days on SSVM and yet not a single packet comes - not even keep alive. Patch should be out soon. Credit goes to Marcus for figuring this out. I also have a shell script that can resolve the specific stuck template download, but i've made it geared toward QCOW2. I can post it if need be, but we should probably release a proper fix.. Regards ilya On Wed, May 11, 2016 at 4:37 AM, Indra Pramana <in...@sg.or.id> wrote: > Dear all, > > I setup a test environment using CloudStack 4.8.0 (latest from repository), > one management server and two KVM agent hosts running Ubuntu 14.04 LTS. > Setup a zone with basic networking because the test environment doesn't > support VLAN and the test management server only have 1 NIC. The private > and public network are using the same subnet. Using Ceph RBD as primary > storage and NFS as secondary storage. > > Zone is created successfully, the two hosts are added successfully and > connected to the management servers. SSVM and CPVM are created without any > issues. However, the default > "CentOS 5.5(64-bit) no GUI (KVM)" template is stuck at "Installing > Template" status and Ready = "No" although it seems to be downloaded > successfully. I also tried registering an Ubuntu ISO and it faces the same > problem, it's downloaded successfully but then will stuck at "Installing > ISO" status. > > I ran the SSVM health check and everything is OK. Also go through the > troubleshooting steps as per below documentation but still unable to > determine the root cause: > > > https://cwiki.apache.org/confluence/display/CLOUDSTACK/SSVM,+templates,+Secondary+storage+troubleshooting > > I tried restarting the SSVM service, stop and start the SSVM and restarting > the management-server service, it will cause the download of the ISO and > template to be restarted, but it will then stuck at installing again. > > Logs from management server shows below lines which keeps on appearing / > looping every several seconds: > > === > 2016-05-11 19:31:29,031 DEBUG [c.c.h.o.r.Ovm3HypervisorGuru] > (Timer-6:ctx-369bfc57) (logid:0b1754a2) getCommandHostDelegation: class > org.apache.cloudstack.storage.command.DownloadProgressCommand > 2016-05-11 19:31:29,031 DEBUG [c.c.h.XenServerGuru] (Timer-6:ctx-369bfc57) > (logid:0b1754a2) getCommandHostDelegation: class > org.apache.cloudstack.storage.command.DownloadProgressCommand > 2016-05-11 19:31:29,031 DEBUG [o.a.c.s.RemoteHostEndPoint] > (Timer-6:ctx-369bfc57) (logid:0b1754a2) Sending command > org.apache.cloudstack.storage.command.DownloadProgressCommand to host: 3 > 2016-05-11 19:31:29,034 DEBUG [c.c.a.t.Request] (Timer-6:ctx-369bfc57) > (logid:0b1754a2) Seq 3-3028107799453237894: Sending { Cmd , MgmtId: > 181122461655966, via: 3(s-3-VM), Ver: v1, Flags: 100011, > [{"org.apache.cloudstack.storage.command > > .DownloadProgressCommand":{"jobId":"cab4ada2-e52f-4c25-9c30-9ec23dc3fcd4","request":"GET_STATUS","hvm":false,"description":"CentOS > 5.5(64-bit) no GUI > > (KVM)","checksum":"ed0e788280ff2912ea40f7f91ca7a249","maxDownloadSizeInBytes":536870912 > > 00,"id":4,"resourceType":"TEMPLATE","installPath":"template/tmpl/1/4","_store":{"com.cloud.agent.api.to.NfsTO":{"_url":"nfs://x.x.0.242/export/cloudstack","_role":"Image"}},"url":" > http://download.cloud.com/releases/2.2.0/eec2209b-987 > > 5-3c8d-92be-c001bd8a0faf.qcow2.bz2","format":"QCOW2","accountId":1,"name":"centos55-x86_64","secUrl":"nfs://x.x.0.242/export/cloudstack","wait":0}}] > } > 2016-05-11 19:31:29,081 DEBUG [c.c.a.t.Request] > (AgentManager-Handler-8:null) (logid:) Seq 3-3028107799453237894: > Processing: { Ans: , MgmtId: 181122461655966, via: 3, Ver: v1, Flags: 10, > [{"com.cloud.agent.api.storage.DownloadAnswer":{ > > "jobId":"cab4ada2-e52f-4c25-9c30-9ec23dc3fcd4","downloadPct":100,"errorString":"Downloaded > 472312242 bytes (download complete > > remote=472312242bytes)","downloadStatus":"DOWNLOAD_IN_PROGRESS","downloadPath":"/mnt/SecStorage/1b8d363c-1a86-3 > > 25d-b1b8-3c030ea93856/template/tmpl/1/4/dnld3801843575178925235tmp_","installPath":"template/tmpl/1/4","templateSize":0,"templatePhySicalSize":0,"checkSum":"ed0e788280ff2912ea40f7f91ca7a249","result":true,"details":"Downloaded > 472312242 > bytes (download complete remote=472312242bytes)","wait":0}}] } > 2016-05-11 19:31:33,039 DEBUG [c.c.h.d.HostDaoImpl] (ClusteredAgentManager > Timer:ctx-b98dc395) (logid:46e91a46) Resetting hosts suitable for reconnect > 2016-05-11 19:31:33,041 DEBUG [c.c.h.d.HostDaoImpl] (ClusteredAgentManager > Timer:ctx-b98dc395) (logid:46e91a46) Completed resetting hosts suitable for > reconnect > 2016-05-11 19:31:33,041 DEBUG [c.c.h.d.HostDaoImpl] (ClusteredAgentManager > Timer:ctx-b98dc395) (logid:46e91a46) Acquiring hosts for clusters already > owned by this management server > 2016-05-11 19:31:33,042 DEBUG [c.c.h.d.HostDaoImpl] (ClusteredAgentManager > Timer:ctx-b98dc395) (logid:46e91a46) Completed acquiring hosts for clusters > already owned by this management server > 2016-05-11 19:31:33,042 DEBUG [c.c.h.d.HostDaoImpl] (ClusteredAgentManager > Timer:ctx-b98dc395) (logid:46e91a46) Acquiring hosts for clusters not owned > by any management server > 2016-05-11 19:31:33,043 DEBUG [c.c.h.d.HostDaoImpl] (ClusteredAgentManager > Timer:ctx-b98dc395) (logid:46e91a46) Completed acquiring hosts for clusters > not owned by any management server > 2016-05-11 19:31:33,520 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentManager-Handler-10:null) (logid:) SeqA 4-3326: Processing Seq > 4-3326: { Cmd , MgmtId: -1, via: 4, Ver: v1, Flags: 11, > [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{ > "_proxyVmId":4,"_loadInfo":"{\n \"connections\": []\n}","wait":0}}] } > 2016-05-11 19:31:33,608 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentManager-Handler-10:null) (logid:) SeqA 4-3326: Sending Seq 4-3326: { > Ans: , MgmtId: 181122461655966, via: 4, Ver: v1, Flags: 100010, > [{"com.cloud.agent.api.AgentControlAnswe > r":{"result":true,"wait":0}}] } > 2016-05-11 19:31:37,910 INFO [o.a.c.f.j.i.AsyncJobManagerImpl] > (AsyncJobMgr-Heartbeat-1:ctx-e8a869c3) (logid:8c9bcefd) Begin cleanup > expired async-jobs > 2016-05-11 19:31:37,916 INFO [o.a.c.f.j.i.AsyncJobManagerImpl] > (AsyncJobMgr-Heartbeat-1:ctx-e8a869c3) (logid:8c9bcefd) End cleanup expired > async-jobs > 2016-05-11 19:31:37,966 DEBUG [o.a.c.s.i.BaseImageStoreDriverImpl] > (RemoteHostEndPoint-4:ctx-02c29196) (logid:cc402ab0) Performing image store > createTemplate async callback > 2016-05-11 19:31:38,522 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentManager-Handler-11:null) (logid:) SeqA 4-3327: Processing Seq > 4-3327: { Cmd , MgmtId: -1, via: 4, Ver: v1, Flags: 11, > [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{ > "_proxyVmId":4,"_loadInfo":"{\n \"connections\": []\n}","wait":0}}] } > 2016-05-11 19:31:38,604 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentManager-Handler-11:null) (logid:) SeqA 4-3327: Sending Seq 4-3327: { > Ans: , MgmtId: 181122461655966, via: 4, Ver: v1, Flags: 100010, > [{"com.cloud.agent.api.AgentControlAnswe > r":{"result":true,"wait":0}}] } > 2016-05-11 19:31:39,081 DEBUG [o.a.c.s.i.BaseImageStoreDriverImpl] > (RemoteHostEndPoint-4:ctx-7ac61961) (logid:f3094baf) Performing image store > createTemplate async callback > 2016-05-11 19:31:39,219 DEBUG [c.c.s.d.DownloadListener] > (Timer-5:ctx-a9ee8fd5) (logid:b445ac5e) Scheduling timeout at 30000 ms, > TEMPLATE: 203 at host 3 > 2016-05-11 19:31:39,472 DEBUG [c.c.s.d.DownloadListener] > (Timer-6:ctx-e44234cf) (logid:bb7ce357) Scheduling timeout at 30000 ms, > TEMPLATE: 4 at host 3 > 2016-05-11 19:31:41,786 DEBUG [o.s.b.f.a.InjectionMetadata] > (secstorage-1:ctx-1a3a9719) (logid:6bcbe476) Processing injected method of > bean 'org.apache.cloudstack.storage.image.store.ImageStoreImpl': > AutowiredFieldElement for com.cloud.s > torage.dao.VMTemplateDao > org.apache.cloudstack.storage.image.store.ImageStoreImpl.imageDao > 2016-05-11 19:31:41,786 DEBUG [o.s.b.f.s.DefaultListableBeanFactory] > (secstorage-1:ctx-1a3a9719) (logid:6bcbe476) Returning cached instance of > singleton bean 'VMTemplateDaoImpl' > 2016-05-11 19:31:41,786 DEBUG [o.s.b.f.a.InjectionMetadata] > (secstorage-1:ctx-1a3a9719) (logid:6bcbe476) Processing injected method of > bean 'org.apache.cloudstack.storage.image.store.ImageStoreImpl': > AutowiredFieldElement for private org > .apache.cloudstack.storage.datastore.ObjectInDataStoreManager > org.apache.cloudstack.storage.image.store.ImageStoreImpl.objectInStoreMgr > 2016-05-11 19:31:41,786 DEBUG [o.s.b.f.s.DefaultListableBeanFactory] > (secstorage-1:ctx-1a3a9719) (logid:6bcbe476) Returning cached instance of > singleton bean 'objectInDataStoreManagerImpl' > 2016-05-11 19:31:41,786 DEBUG [o.s.b.f.a.InjectionMetadata] > (secstorage-1:ctx-1a3a9719) (logid:6bcbe476) Processing injected method of > bean 'org.apache.cloudstack.storage.image.store.ImageStoreImpl': > AutowiredFieldElement for private com > .cloud.capacity.dao.CapacityDao > org.apache.cloudstack.storage.image.store.ImageStoreImpl.capacityDao > 2016-05-11 19:31:41,787 DEBUG [o.s.b.f.s.DefaultListableBeanFactory] > (secstorage-1:ctx-1a3a9719) (logid:6bcbe476) Returning cached instance of > singleton bean 'capacityDaoImpl' > 2016-05-11 19:31:41,789 DEBUG [o.s.b.f.a.InjectionMetadata] > (secstorage-1:ctx-1a3a9719) (logid:6bcbe476) Processing injected method of > bean 'org.apache.cloudstack.storage.image.store.ImageStoreImpl': > AutowiredFieldElement for com.cloud.storage.dao.VMTemplateDao > org.apache.cloudstack.storage.image.store.ImageStoreImpl.imageDao > 2016-05-11 19:31:41,789 DEBUG [o.s.b.f.s.DefaultListableBeanFactory] > (secstorage-1:ctx-1a3a9719) (logid:6bcbe476) Returning cached instance of > singleton bean 'VMTemplateDaoImpl' > 2016-05-11 19:31:41,789 DEBUG [o.s.b.f.a.InjectionMetadata] > (secstorage-1:ctx-1a3a9719) (logid:6bcbe476) Processing injected method of > bean 'org.apache.cloudstack.storage.image.store.ImageStoreImpl': > AutowiredFieldElement for private > org.apache.cloudstack.storage.datastore.ObjectInDataStoreManager > org.apache.cloudstack.storage.image.store.ImageStoreImpl.objectInStoreMgr > 2016-05-11 19:31:41,789 DEBUG [o.s.b.f.s.DefaultListableBeanFactory] > (secstorage-1:ctx-1a3a9719) (logid:6bcbe476) Returning cached instance of > singleton bean 'objectInDataStoreManagerImpl' > 2016-05-11 19:31:41,789 DEBUG [o.s.b.f.a.InjectionMetadata] > (secstorage-1:ctx-1a3a9719) (logid:6bcbe476) Processing injected method of > bean 'org.apache.cloudstack.storage.image.store.ImageStoreImpl': > AutowiredFieldElement for private com.cloud.capacity.dao.CapacityDao > org.apache.cloudstack.storage.image.store.ImageStoreImpl.capacityDao > 2016-05-11 19:31:41,790 DEBUG [o.s.b.f.s.DefaultListableBeanFactory] > (secstorage-1:ctx-1a3a9719) (logid:6bcbe476) Returning cached instance of > singleton bean 'capacityDaoImpl' > 2016-05-11 19:31:41,792 DEBUG [o.s.b.f.a.InjectionMetadata] > (secstorage-1:ctx-1a3a9719) (logid:6bcbe476) Processing injected method of > bean 'org.apache.cloudstack.storage.image.store.ImageStoreImpl': > AutowiredFieldElement for com.cloud.storage.dao.VMTemplateDao > org.apache.cloudstack.storage.image.store.ImageStoreImpl.imageDao > 2016-05-11 19:31:41,792 DEBUG [o.s.b.f.s.DefaultListableBeanFactory] > (secstorage-1:ctx-1a3a9719) (logid:6bcbe476) Returning cached instance of > singleton bean 'VMTemplateDaoImpl' > 2016-05-11 19:31:41,792 DEBUG [o.s.b.f.a.InjectionMetadata] > (secstorage-1:ctx-1a3a9719) (logid:6bcbe476) Processing injected method of > bean 'org.apache.cloudstack.storage.image.store.ImageStoreImpl': > AutowiredFieldElement for private > org.apache.cloudstack.storage.datastore.ObjectInDataStoreManager > org.apache.cloudstack.storage.image.store.ImageStoreImpl.objectInStoreMgr > 2016-05-11 19:31:41,793 DEBUG [o.s.b.f.s.DefaultListableBeanFactory] > (secstorage-1:ctx-1a3a9719) (logid:6bcbe476) Returning cached instance of > singleton bean 'objectInDataStoreManagerImpl' > 2016-05-11 19:31:41,793 DEBUG [o.s.b.f.a.InjectionMetadata] > (secstorage-1:ctx-1a3a9719) (logid:6bcbe476) Processing injected method of > bean 'org.apache.cloudstack.storage.image.store.ImageStoreImpl': > AutowiredFieldElement for private com.cloud.capacity.dao.CapacityDao > org.apache.cloudstack.storage.image.store.ImageStoreImpl.capacityDao > 2016-05-11 19:31:41,793 DEBUG [o.s.b.f.s.DefaultListableBeanFactory] > (secstorage-1:ctx-1a3a9719) (logid:6bcbe476) Returning cached instance of > singleton bean 'capacityDaoImpl' > 2016-05-11 19:31:41,794 DEBUG [o.a.c.s.SecondaryStorageManagerImpl] > (secstorage-1:ctx-1a3a9719) (logid:6bcbe476) Zone 1 is ready to launch > secondary storage VM > 2016-05-11 19:31:42,212 DEBUG [o.s.b.f.a.InjectionMetadata] > (consoleproxy-1:ctx-0006665c) (logid:35d3c02d) Processing injected method > of bean 'org.apache.cloudstack.storage.image.store.ImageStoreImpl': > AutowiredFieldElement for com.cloud.storage.dao.VMTemplateDao > org.apache.cloudstack.storage.image.store.ImageStoreImpl.imageDao > 2016-05-11 19:31:42,212 DEBUG [o.s.b.f.s.DefaultListableBeanFactory] > (consoleproxy-1:ctx-0006665c) (logid:35d3c02d) Returning cached instance of > singleton bean 'VMTemplateDaoImpl' > 2016-05-11 19:31:42,212 DEBUG [o.s.b.f.a.InjectionMetadata] > (consoleproxy-1:ctx-0006665c) (logid:35d3c02d) Processing injected method > of bean 'org.apache.cloudstack.storage.image.store.ImageStoreImpl': > AutowiredFieldElement for private > org.apache.cloudstack.storage.datastore.ObjectInDataStoreManager > org.apache.cloudstack.storage.image.store.ImageStoreImpl.objectInStoreMgr > 2016-05-11 19:31:42,212 DEBUG [o.s.b.f.s.DefaultListableBeanFactory] > (consoleproxy-1:ctx-0006665c) (logid:35d3c02d) Returning cached instance of > singleton bean 'objectInDataStoreManagerImpl' > 2016-05-11 19:31:42,212 DEBUG [o.s.b.f.a.InjectionMetadata] > (consoleproxy-1:ctx-0006665c) (logid:35d3c02d) Processing injected method > of bean 'org.apache.cloudstack.storage.image.store.ImageStoreImpl': > AutowiredFieldElement for private com.cloud.capacity.dao.CapacityDao > org.apache.cloudstack.storage.image.store.ImageStoreImpl.capacityDao > 2016-05-11 19:31:42,213 DEBUG [o.s.b.f.s.DefaultListableBeanFactory] > (consoleproxy-1:ctx-0006665c) (logid:35d3c02d) Returning cached instance of > singleton bean 'capacityDaoImpl' > 2016-05-11 19:31:42,216 DEBUG [c.c.c.ConsoleProxyManagerImpl] > (consoleproxy-1:ctx-0006665c) (logid:35d3c02d) Zone 1 is ready to launch > console proxy > 2016-05-11 19:31:43,523 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentManager-Handler-15:null) (logid:) SeqA 4-3328: Processing Seq > 4-3328: { Cmd , MgmtId: -1, via: 4, Ver: v1, Flags: 11, > > [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_proxyVmId":4,"_loadInfo":"{\n > \"connections\": []\n}","wait":0}}] } > 2016-05-11 19:31:43,606 DEBUG [c.c.a.m.AgentManagerImpl] > (AgentManager-Handler-15:null) (logid:) SeqA 4-3328: Sending Seq 4-3328: { > Ans: , MgmtId: 181122461655966, via: 4, Ver: v1, Flags: 100010, > [{"com.cloud.agent.api.AgentControlAnswer":{"result":true,"wait":0}}] } > 2016-05-11 19:31:47,908 INFO [o.a.c.f.j.i.AsyncJobManagerImpl] > (AsyncJobMgr-Heartbeat-1:ctx-7a3cd65f) (logid:2832c3eb) Begin cleanup > expired async-jobs > 2016-05-11 19:31:47,917 INFO [o.a.c.f.j.i.AsyncJobManagerImpl] > (AsyncJobMgr-Heartbeat-1:ctx-7a3cd65f) (logid:2832c3eb) End cleanup expired > async-jobs > ==== > > Anyone can advise what could be the problem? > > Looking forward to your reply, thank you. > > Cheers. > > -ip- >