venkata swamybabu budumuru created CLOUDSTACK-4399:
------------------------------------------------------

             Summary: [Templates] template entries are deleted from 
template_store_ref when downloadTemplate times out
                 Key: CLOUDSTACK-4399
                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-4399
             Project: CloudStack
          Issue Type: Bug
      Security Level: Public (Anyone can view this level - this is the default.)
          Components: Management Server
    Affects Versions: 4.2.0
         Environment: commit id # 3c40e8bb3f6278f78c24c6317d513bd5ad599944
            Reporter: venkata swamybabu budumuru
            Priority: Critical
             Fix For: 4.2.0


Steps to reproduce:

1. Set up the latest CloudStack build with at least one advanced zone.
2. Add at least one KVM cluster.
3. Make sure the system VMs are up.
4. As a non-ROOT admin, deploy at least one user VM (which also brings up a 
router) on the KVM host.
5. Add another cluster of type XenServer.

Observations:

(i) As soon as the host was added, the download of the systemVM template from 
download.cloud.com started, and it was taking quite a long time.
(ii) When the download was at around 11%, I restarted the SSVM and then 
destroyed it.
(iii) These operations resulted in a template download timeout, as shown in the 
mgmt server logs.

2013-08-19 16:04:22,136 DEBUG [storage.download.DownloadListener] (Timer-8:null) Scheduling timeout at 30000 ms, TEMPLATE: 1 at host 2

(iv) Before the timeout, template_store_ref had an entry for template 1 in 
state DOWNLOAD_IN_PROGRESS. After the timeout, this entry was deleted from 
template_store_ref altogether.

2013-08-19 16:04:33,386 DEBUG [storage.download.DownloadListener] (Timer-8:null) Send command failed
com.cloud.utils.exception.CloudRuntimeException: Unable to send message
        at org.apache.cloudstack.storage.RemoteHostEndPoint.sendMessageAsync(RemoteHostEndPoint.java:179)
        at com.cloud.storage.download.DownloadListener.sendCommand(DownloadListener.java:187)
        at com.cloud.storage.download.DownloadListener$StatusTask.run(DownloadListener.java:82)
        at java.util.TimerThread.mainLoop(Timer.java:534)
        at java.util.TimerThread.run(Timer.java:484)
Caused by: com.cloud.exception.AgentUnavailableException: Resource [Host:2] is unreachable: Host 2: Host with specified id is not in the right state: Disconnected
        at com.cloud.agent.manager.ClusteredAgentManagerImpl.getAttache(ClusteredAgentManagerImpl.java:540)
        at com.cloud.agent.manager.AgentManagerImpl.send(AgentManagerImpl.java:526)
        at org.apache.cloudstack.storage.RemoteHostEndPoint.sendMessageAsync(RemoteHostEndPoint.java:177)
        ... 4 more
2013-08-19 16:04:33,386 WARN  [storage.download.DownloadListener] (Timer-8:null) Unable to monitor download progress of TEMPLATE: 1 at host 2
2013-08-19 16:04:33,387 DEBUG [storage.image.BaseImageStoreDriverImpl] (Timer-8:null) Performing image store createTemplate async callback
2013-08-19 16:04:33,531 WARN  [storage.download.DownloadListener] (Timer-8:null) Entering download error state because the storage host disconnected, TEMPLATE: 1 at host 2
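
To follow one template through a longer management-server log, the DownloadListener entries can be filtered out. This is only a rough sketch: the regular expression is inferred from the excerpts in this report, it assumes each log entry sits on a single line (as in the original log file, not as wrapped in this mail), and `download_events` is a hypothetical helper name.

```python
import re

# Shape inferred from the excerpts in this report:
# "<date> <time> <LEVEL> [<logger>] (<thread>) <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]*)\)\s+(?P<msg>.*)$"
)


def download_events(lines, template_id):
    """Return (timestamp, level, message) for DownloadListener entries
    that mention the given template id."""
    # Trailing space so that "TEMPLATE: 1 " does not match "TEMPLATE: 10".
    needle = f"TEMPLATE: {template_id} "
    events = []
    for line in lines:
        m = LOG_LINE.match(line)
        if (
            m
            and m["logger"] == "storage.download.DownloadListener"
            and needle in m["msg"]
        ):
            events.append((m["ts"], m["level"], m["msg"]))
    return events


# Sample input: four entries from this report, unwrapped to one line each.
sample = [
    "2013-08-19 16:04:22,136 DEBUG [storage.download.DownloadListener] "
    "(Timer-8:null) Scheduling timeout at 30000 ms, TEMPLATE: 1 at host 2",
    "2013-08-19 16:04:33,386 DEBUG [storage.download.DownloadListener] "
    "(Timer-8:null) Send command failed",
    "2013-08-19 16:04:33,386 WARN  [storage.download.DownloadListener] "
    "(Timer-8:null) Unable to monitor download progress of TEMPLATE: 1 at host 2",
    "2013-08-19 16:04:33,531 WARN  [storage.download.DownloadListener] "
    "(Timer-8:null) Entering download error state because the storage host "
    "disconnected, TEMPLATE: 1 at host 2",
]
events = download_events(sample, 1)
```

Stack-trace lines and entries from other loggers do not match the filter and are skipped.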


(v) After this, there was no way to get the systemVM template downloaded 
automatically. I tried restarting and destroying the SSVM, but nothing 
happened.

Expected Result:
============

- When the timeout happens, the entry's state should be set to "DOWNLOAD_ERROR" 
so that the SSVM can identify the situation during sync and download the 
template again.
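
The expected behaviour can be modelled as a small state transition. The sketch below is not CloudStack code: `on_download_timeout` and `templates_needing_download` are hypothetical names, a plain dict stands in for the template_store_ref table, and the store id 1 is arbitrary. It only illustrates why keeping the row in DOWNLOAD_ERROR, rather than deleting it, leaves something for a later sync to retry.

```python
# Hypothetical in-memory stand-in for template_store_ref, keyed by
# (template_id, store_id). Field names are assumptions for illustration.
table = {
    (1, 1): {"download_state": "DOWNLOAD_IN_PROGRESS", "download_pct": 11},
}


def on_download_timeout(table, template_id, store_id):
    """Expected handling: keep the row and mark it DOWNLOAD_ERROR."""
    row = table[(template_id, store_id)]
    row["download_state"] = "DOWNLOAD_ERROR"
    return row


def templates_needing_download(table):
    """What a later sync pass could do: collect templates whose rows
    were left in DOWNLOAD_ERROR so the download can be started again."""
    return sorted(
        template_id
        for (template_id, _store_id), row in table.items()
        if row["download_state"] == "DOWNLOAD_ERROR"
    )


on_download_timeout(table, template_id=1, store_id=1)
retry = templates_needing_download(table)
```

With the behaviour reported here, the row is removed instead, so a lookup like `templates_needing_download` has nothing to return for template 1.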


A few additional observations:
========================

(a) The same issue occurs with user templates as well. A quick way to reproduce 
it:

- Log in as a non-ROOT domain user.
- Run registerTemplate with a template from download.cloud.com.
- While the download is in progress, keep rebooting and then destroying the 
SSVM so that the download command times out after 30 seconds.
- After the timeout, check the cloud.template_store_ref table: the user 
template's entry has disappeared completely.
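
The final check in the list can be scripted. The sketch below uses an in-memory SQLite table as a stand-in for cloud.template_store_ref, which in a real deployment lives in the management server's MySQL database. The column names `template_id`, `store_id`, and `download_state`, and the helper `template_entry_states`, are assumptions made for illustration, so verify them against the actual schema before running a similar query.

```python
import sqlite3


def template_entry_states(conn, template_id):
    """Return the download_state of every row for the template,
    or an empty list if the rows are gone."""
    rows = conn.execute(
        "SELECT download_state FROM template_store_ref WHERE template_id = ?",
        (template_id,),
    ).fetchall()
    return [row[0] for row in rows]


# Stand-in table with an assumed, heavily reduced schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE template_store_ref ("
    "template_id INTEGER, store_id INTEGER, download_state TEXT)"
)
conn.execute(
    "INSERT INTO template_store_ref VALUES (1, 1, 'DOWNLOAD_IN_PROGRESS')"
)
before = template_entry_states(conn, 1)

# Simulate the reported behaviour: the row is deleted on timeout.
conn.execute("DELETE FROM template_store_ref WHERE template_id = 1")
after = template_entry_states(conn, 1)
```

An empty result after the timeout corresponds to the bug described here. A result of `DOWNLOAD_ERROR` would correspond to the expected behaviour.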

I am attaching the required logs and a DB dump to the bug.


