CS User created CLOUDSTACK-7775:
-----------------------------------

             Summary: Xen S3 backed secondary storage - local volume snapshots 
fail
                 Key: CLOUDSTACK-7775
                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-7775
             Project: CloudStack
          Issue Type: Bug
      Security Level: Public (Anyone can view this level - this is the default.)
    Affects Versions: 4.4.0, 4.3.1
            Reporter: CS User


Xenserver, 6.01 latest hotfixes, cloudstack 4.3 and 4.4. Snapshot of volume on 
local disk, to be transferred to S3. When the xenhost attemps the PUT request, 
the snapshot is no longer present and so the request fails. Snapshots of 
volumes stored on Primary storage work fine and are uploaded to the S3 backed 
secondary storage as expected. 

Also tried upgrading the hosts to Xenserver 6.2, however it still has the same 
issue. On another environment with cloudstack 4.3, with Xenserver 6.01 (no 
recent xen hotfixes and no S3 backed secondary storage), local snapshots work 
fine. 

We see this in the management server logs:

(I have removed references to any hosts from the logs and amended IP's)

{noformat}
2014-10-21 12:36:55,332 WARN  [c.c.h.x.r.CitrixResourceBase] 
(DirectAgent-235:ctx-55338794) Task failed! Task record:                 uuid: 
c520bf5a-1bc1-57ca-f9d0-960179118118
           nameLabel: Async.host.call_plugin
     nameDescription:
   allowedOperations: []
   currentOperations: {}
             created: Tue Oct 21 12:36:56 BST 2014
            finished: Tue Oct 21 12:36:56 BST 2014
              status: failure
          residentOn: com.xensource.xenapi.Host@1dd85be5
            progress: 1.0
                type: <none/>
              result:
           errorInfo: [XENAPI_PLUGIN_FAILURE, getSnapshotSize, 
CommandException, 2]
         otherConfig: {}
           subtaskOf: com.xensource.xenapi.Task@aaf13f6f
            subtasks: []

2014-10-21 12:36:55,386 WARN  [c.c.h.x.r.CitrixResourceBase] 
(DirectAgent-235:ctx-55338794) callHostPlugin failed for cmd: getSnapshotSize 
with args snapshotUuid: 0cef2f39-03e8-458d-bf19-7a2294c40ac7, isISCSI: true, 
primaryStorageSRUuid: b1802ff2-5a63-d1e7-04de-dea7ba7eab27,  due to Task 
failed! Task record:                 uuid: c520bf5a-1bc1-57ca-f9d0-960179118118
           nameLabel: Async.host.call_plugin
     nameDescription:
   allowedOperations: []
   currentOperations: {}
             created: Tue Oct 21 12:36:56 BST 2014
            finished: Tue Oct 21 12:36:56 BST 2014
              status: failure
          residentOn: com.xensource.xenapi.Host@1dd85be5
            progress: 1.0
                type: <none/>
              result:
           errorInfo: [XENAPI_PLUGIN_FAILURE, getSnapshotSize, 
CommandException, 2]
         otherConfig: {}
           subtaskOf: com.xensource.xenapi.Task@aaf13f6f
            subtasks: []

Task failed! Task record:                 uuid: 
c520bf5a-1bc1-57ca-f9d0-960179118118
           nameLabel: Async.host.call_plugin
     nameDescription:
   allowedOperations: []
   currentOperations: {}
             created: Tue Oct 21 12:36:56 BST 2014
            finished: Tue Oct 21 12:36:56 BST 2014
              status: failure
          residentOn: com.xensource.xenapi.Host@1dd85be5
            progress: 1.0
                type: <none/>
              result:
           errorInfo: [XENAPI_PLUGIN_FAILURE, getSnapshotSize, 
CommandException, 2]
         otherConfig: {}
           subtaskOf: com.xensource.xenapi.Task@aaf13f6f
            subtasks: []
        at 
com.cloud.hypervisor.xen.resource.CitrixResourceBase.checkForSuccess(CitrixResourceBase.java:3293)
        at 
com.cloud.hypervisor.xen.resource.CitrixResourceBase.callHostPluginAsync(CitrixResourceBase.java:3507)
        at 
com.cloud.hypervisor.xen.resource.XenServerStorageProcessor.getSnapshotSize(XenServerStorageProcessor.java:1211)
        at 
com.cloud.hypervisor.xen.resource.XenServerStorageProcessor.backupSnapshot(XenServerStorageProcessor.java:1401)
        at 
com.cloud.storage.resource.StorageSubsystemCommandHandlerBase.execute(StorageSubsystemCommandHandlerBase.java:94)
        at 
com.cloud.storage.resource.StorageSubsystemCommandHandlerBase.handleStorageCommands(StorageSubsystemCommandHandlerBase.java:52)
        at 
com.cloud.hypervisor.xen.resource.CitrixResourceBase.executeRequest(CitrixResourceBase.java:546)
        at 
com.cloud.hypervisor.xen.resource.XenServer56Resource.executeRequest(XenServer56Resource.java:61)
        at 
com.cloud.hypervisor.xen.resource.XenServer610Resource.executeRequest(XenServer610Resource.java:102)
        at 
com.cloud.agent.manager.DirectAgentAttache$Task.runInContext(DirectAgentAttache.java:216)
        at 
org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
        at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
        at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
        at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
        at 
org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)
2014-10-21 12:36:55,391 WARN  [c.c.h.x.r.CitrixResourceBase] 
(DirectAgent-235:ctx-55338794) unable to destroy 
task(com.xensource.xenapi.Task@84832980) on 
host(45026915-2bf9-4ec1-8b57-f211194ff29b) due to
You gave an invalid object reference.  The object may have recently been 
deleted.  The class parameter gives the type of reference given, and the handle 
parameter echoes the bad value given.
        at com.xensource.xenapi.Types.checkResponse(Types.java:693)
        at com.xensource.xenapi.Connection.dispatch(Connection.java:350)
        at 
com.cloud.hypervisor.xen.resource.XenServerConnectionPool$XenServerConnection.dispatch(XenServerConnectionPool.java:456)
        at com.xensource.xenapi.Task.destroy(Task.java:616)
        at 
com.cloud.hypervisor.xen.resource.CitrixResourceBase.callHostPluginAsync(CitrixResourceBase.java:3523)
        at 
com.cloud.hypervisor.xen.resource.XenServerStorageProcessor.getSnapshotSize(XenServerStorageProcessor.java:1211)
        at 
com.cloud.hypervisor.xen.resource.XenServerStorageProcessor.backupSnapshot(XenServerStorageProcessor.java:1401)
        at 
com.cloud.storage.resource.StorageSubsystemCommandHandlerBase.execute(StorageSubsystemCommandHandlerBase.java:94)
        at 
com.cloud.storage.resource.StorageSubsystemCommandHandlerBase.handleStorageCommands(StorageSubsystemCommandHandlerBase.java:52)
        at 
com.cloud.hypervisor.xen.resource.CitrixResourceBase.executeRequest(CitrixResourceBase.java:546)
        at 
com.cloud.hypervisor.xen.resource.XenServer56Resource.executeRequest(XenServer56Resource.java:61)
        at 
com.cloud.hypervisor.xen.resource.XenServer610Resource.executeRequest(XenServer610Resource.java:102)
        at 
com.cloud.agent.manager.DirectAgentAttache$Task.runInContext(DirectAgentAttache.java:216)
        at 
org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
        at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
        at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
        at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
        at 
org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)
2014-10-21 12:36:56,777 DEBUG [c.c.a.m.AgentManagerImpl] 
(AgentManager-Handler-12:null) SeqA 132-143: Processing Seq 132-143:  { Cmd , 
MgmtId: -1, via: 132, Ver: v1, Flags: 11, 
[{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_proxyVmId":64,"_loadInfo":"{\n
  \"connections\": []\n}","wait":0}}] }
2014-10-21 12:36:56,804 DEBUG [c.c.h.x.r.CitrixResourceBase] 
(DirectAgent-235:ctx-55338794) Host 192.14.225.18 
OpaqueRef:a0fd0b23-f978-5f67-b626-1eadc343a141: Removing SR
2014-10-21 12:36:56,808 DEBUG [c.c.a.m.AgentManagerImpl] 
(AgentManager-Handler-12:null) SeqA 132-143: Sending Seq 132-143:  { Ans: , 
MgmtId: 345052759949, via: 132, Ver: v1, Flags: 100010, 
[{"com.cloud.agent.api.AgentControlAnswer":{"result":true,"wait":0}}] }
2014-10-21 12:36:56,818 DEBUG [c.c.h.x.r.CitrixResourceBase] 
(DirectAgent-235:ctx-55338794) Host 192.14.225.18 
OpaqueRef:b02a715a-229e-3758-f4e4-88a7709810b5: Unplugging pbd
2014-10-21 12:36:57,488 DEBUG [c.c.h.x.r.CitrixResourceBase] 
(DirectAgent-235:ctx-55338794) Host 192.14.225.18 
OpaqueRef:a0fd0b23-f978-5f67-b626-1eadc343a141: Forgetting
2014-10-21 12:36:57,504 WARN  [c.c.h.x.r.XenServerStorageProcessor] 
(DirectAgent-235:ctx-55338794) BackupSnapshot Failed due to S3 upload of 
snapshots 0cef2f39-03e8-458d-bf19-7a2294c40ac7 failed
com.cloud.utils.exception.CloudRuntimeException: S3 upload of snapshots 
0cef2f39-03e8-458d-bf19-7a2294c40ac7 failed
        at 
com.cloud.hypervisor.xen.resource.XenServerStorageProcessor.backupSnapshot(XenServerStorageProcessor.java:1421)
        at 
com.cloud.storage.resource.StorageSubsystemCommandHandlerBase.execute(StorageSubsystemCommandHandlerBase.java:94)
        at 
com.cloud.storage.resource.StorageSubsystemCommandHandlerBase.handleStorageCommands(StorageSubsystemCommandHandlerBase.java:52)
        at 
com.cloud.hypervisor.xen.resource.CitrixResourceBase.executeRequest(CitrixResourceBase.java:546)
        at 
com.cloud.hypervisor.xen.resource.XenServer56Resource.executeRequest(XenServer56Resource.java:61)
        at 
com.cloud.hypervisor.xen.resource.XenServer610Resource.executeRequest(XenServer610Resource.java:102)
        at 
com.cloud.agent.manager.DirectAgentAttache$Task.runInContext(DirectAgentAttache.java:216)
        at 
org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
        at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
        at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
        at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
        at 
org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)
2014-10-21 12:36:57,885 DEBUG [c.c.a.ApiServlet] 
(catalina-exec-11:ctx-3e69dc37) ===START===  192.168.20.10 -- GET  
command=queryAsyncJobResult&jobId=436e0dd3-283a-471c-aba1-5ecbdd19e0e5&response=json&sessionkey=bC690U9amEIq2aZn5caH5oi0Tog%3D&_=1413891416702
{noformat}

On the xenserver we see the following in the /var/log/cloud/vmops.log file:

{noformat}
2014-10-21 12:36:11    DEBUG [root] #### VMOPS enter  
create_secondary_storage_folder ####
2014-10-21 12:36:11    DEBUG [root] create_secondary_storage_folder, args: 
{'newFolder': 'snapshots/2/257', 'remoteMountPath': 
'hostname.fake.net:/CS_TEST_SEC_01'}
2014-10-21 12:36:11    DEBUG [root] Successfully mounted 
hostname.fake.net:/CS_TEST_SEC_01 to 
/var/run/cloud_mount/17652326-6d41-44fc-b7e5-7ea0ab7e2d4d
2014-10-21 12:36:11    DEBUG [root] Successfully unmounted 
/var/run/cloud_mount/17652326-6d41-44fc-b7e5-7ea0ab7e2d4d
2014-10-21 12:36:11    DEBUG [root] #### VMOPS exit  
create_secondary_storage_folder ####
2014-10-21 12:36:31    DEBUG [root] #### VMOPS enter  get_rule_logs_for_vms ####
2014-10-21 12:36:31    DEBUG [root] #### VMOPS enter  
network_rules_for_rebooted_vm ####
2014-10-21 12:36:31    DEBUG [root] #### VMOPS enter  check_domid_changed ####
2014-10-21 12:36:31    DEBUG [root] #### VMOPS exit  check_domid_changed ####
2014-10-21 12:36:31    DEBUG [root] #### VMOPS exit  
network_rules_for_rebooted_vm ####
2014-10-21 12:36:31    DEBUG [root] #### VMOPS enter  
network_rules_for_rebooted_vm ####
2014-10-21 12:36:31    DEBUG [root] #### VMOPS enter  check_domid_changed ####
2014-10-21 12:36:31    DEBUG [root] #### VMOPS exit  check_domid_changed ####
2014-10-21 12:36:31    DEBUG [root] #### VMOPS exit  
network_rules_for_rebooted_vm ####
2014-10-21 12:36:31    DEBUG [root] #### VMOPS exit  get_rule_logs_for_vms ####
2014-10-21 12:36:56    DEBUG [root] primarySRPath: 
/dev/VG_XenStorage-b1802ff2-5a63-d1e7-04de-dea7ba7eab27
2014-10-21 12:36:58    DEBUG [root] #### VMOPS enter  deleteSnapshotBackup ####
2014-10-21 12:36:58    DEBUG [root] Calling deleteSnapshotBackup with 
{'localMountPoint': 
'/var/run/cloud_mount/ccd200be-0636-32f6-ad98-fb8e2fa8c70c', 'backupUUID': 
'0cef2f39-03e8-458d-bf19-7a2294c40ac7', 'secondaryStora
geMountPath': 'hostname.fake.net:/CS_TEST_SEC_01', 'path': 'snapshots/2/257'}
2014-10-21 12:36:58    DEBUG [root] Successfully mounted 
hostname.fake.net:/CS_TEST_SEC_01/snapshots to 
/var/run/cloud_mount/ccd200be-0636-32f6-ad98-fb8e2fa8c70c
2014-10-21 12:36:58    DEBUG [root] Chdired to 
/var/run/cloud_mount/ccd200be-0636-32f6-ad98-fb8e2fa8c70c/2/257
2014-10-21 12:36:58    DEBUG [root] checking existence of 
0cef2f39-03e8-458d-bf19-7a2294c40ac7.vhd
2014-10-21 12:36:58    DEBUG [root] backupVHD 
0cef2f39-03e8-458d-bf19-7a2294c40ac7.vhd exists.
2014-10-21 12:36:58    DEBUG [root] #### VMOPS exit  deleteSnapshotBackup ####
{noformat}

On the xenserver we see the following in the /var/log/cloud/s3xen.log file:

{noformat}
2014-10-21 12:36:57    DEBUG [root] #### VMOPS enter s3 #### ####
2014-10-21 12:36:57    DEBUG [root] #### VMOPS Operation put on file 
/dev/VG_XenStorage-b1802ff2-5a63-d1e7-04de-dea7ba7eab27/VHD-0cef2f39-03e8-458d-bf19-7a2294c40ac7
 from/in bucket CS_TEST_SEC_01 key 
snapshots/2/257/VHD-0cef2f39-03e8-458d-bf19-7a2294c40ac7 ####
2014-10-21 12:36:57    DEBUG [root] #### VMOPS Traceback (most recent call 
last):
  File "/etc/xapi.d/plugins/s3xen", line 414, in s3
    client.put(bucket, key, filename, maxSingleUploadBytes)
  File "/etc/xapi.d/plugins/s3xen", line 325, in put
    raise Exception(
Exception: Attempt to put 
/dev/VG_XenStorage-b1802ff2-5a63-d1e7-04de-dea7ba7eab27/VHD-0cef2f39-03e8-458d-bf19-7a2294c40ac7
 that does not exist.
 ####
2014-10-21 12:36:57    DEBUG [root] #### VMOPS exit s3 with result false ####
{noformat}

It appears that Xen is deleting the file on the local filesystem before the S3 
script has a chance to upload it, however the snapshot does appear to have been 
successfully transferred to the secondary storage staging area, after which the 
S3 transfer is attempted from the local lvm volume. When this fails the 
snapshot is deleted, as you can see from the above logs. 

It appears that either the issue with Xen deleting the snapshot needs to be 
resolved (has this behavior changed recently?) or the PUT request to S3 needs 
to send the file which is located in the staging area, rather than the file on 
local disk, as happens with the primary storage volume snapshots. 




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to