[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-5370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sangeetha Hariharan reopened CLOUDSTACK-5370:
---------------------------------------------


Tested with latest build from 4.3

The following behavior is still observed:

Bring down the NFS secondary store.
1. Attempt a snapshot on a ROOT volume.
When the snapshot fails, the entry with store_role="primary" in 
snapshot_store_ref is left behind in the DB. This entry needs to be cleaned up.
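The leftover-row check described above can be sketched in Python. Rows are simulated as dicts; the column names (snapshot_id, store_role) and the "Primary" role value are assumptions based on this comment, not the exact CloudStack schema:

```python
# Sketch of the cleanup this comment asks for: after a failed snapshot,
# snapshot_store_ref rows with store_role = "Primary" should be removed.
# Column names and role values here are assumptions, not the exact schema.

def stale_primary_refs(rows, failed_snapshot_ids):
    """Return snapshot_store_ref rows left behind by failed snapshots."""
    return [
        r for r in rows
        if r["store_role"] == "Primary" and r["snapshot_id"] in failed_snapshot_ids
    ]

# Example: snapshot 33 failed, snapshot 34 succeeded (ids are illustrative).
rows = [
    {"snapshot_id": 33, "store_role": "Primary"},
    {"snapshot_id": 34, "store_role": "Primary"},
    {"snapshot_id": 34, "store_role": "Image"},
]
leftovers = stale_primary_refs(rows, failed_snapshot_ids={33})
# leftovers holds only the snapshot-33 row, the one that should be deleted
```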

In this case, the vhd count on the primary store increases by 1. Two vhd 
entries were created when the snapshot was attempted, one of which was 
deleted as part of the snapshot failure.

2. Now attempt another snapshot on the same ROOT volume.
This time, the CreateObject command itself fails.

In this case, 2 vhd entries are left behind on the primary store.
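One way to measure the leak described above is to count .vhd files under the primary store's SR mount before and after each attempt. The path argument below is a hedged sketch; the SR mount location varies by setup (on XenServer it is typically under /var/run/sr-mount/<SR uuid>):

```python
# Count VHD files under a storage-repository path; comparing the count
# before and after a failed snapshot attempt shows the leak (+1 or +2
# per attempt per this report). The SR path is supplied by the caller.
from pathlib import Path

def vhd_count(sr_path):
    """Return the number of *.vhd files directly under sr_path."""
    return sum(1 for _ in Path(sr_path).glob("*.vhd"))
```

Running this before and after a failed attempt and diffing the two counts reproduces the numbers reported in this comment.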

Following is the exception seen when the CreateObject command fails:

2014-01-06 19:02:09,075 DEBUG [c.c.a.t.Request] (DirectAgent-156:ctx-5f88b79f) 
Seq 2-1271792415: Processing: { Ans: , MgmtId: 112516401760401, via: 2, Ver: 
v1, Flags: 10, 
[{"org.apache.cloudstack.storage.command.CreateObjectAnswer":{"result":false,"details":"create
 snapshot operation Failed for snapshotId: 33, reason: 
com.cloud.utils.exception.CloudRuntimeException: callHostPlugin failed for cmd: 
getVhdParent with args snapshotUuid: 85f1b2ae-9160-4a50-aa59-9b908927b3d4, 
isISCSI: false, primaryStorageSRUuid: 64acdb4c-a455-1072-c9c0-8d40869fcc29, due 
to There was a failure communicating with the plugin.","wait":0}}] }

Reopening this issue.

> Xenserver - Snapshots - vhd entries get accumulated on the primary store when 
> snapshot creation fails because of not being able to reach the secondary 
> store.
> -------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-5370
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-5370
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the 
> default.) 
>          Components: Management Server
>    Affects Versions: 4.3.0
>         Environment: Build from 4.3
>            Reporter: Sangeetha Hariharan
>            Assignee: edison su
>            Priority: Critical
>             Fix For: 4.3.0
>
>         Attachments: xennfsdown.rar
>
>
> Set up:
> Advanced Zone with 2 Xenserver 6.2 hosts:
> 1. Deploy 5 VMs on each host, each with a 10 GB ROOT volume, so we start 
> with 10 VMs.
> 2. Start concurrent snapshots for the ROOT volumes of all the VMs.
> 3. Shut down the secondary storage server while the snapshots are in 
> progress. (In my case I stopped the NFS server.)
> 4. Bring the secondary storage server up after 12 hours. (In my case I 
> started the NFS server.)
> While the secondary (NFS) server was down for about 12 hours, hourly 
> snapshots were attempted every hour and failed in the "CreatedOnPrimary" 
> state. I see many entries being created on the primary store (I see 120 
> entries, but I have only 14 VMs).
> We accumulate 2 vhd files on the primary store for every snapshot that is 
> attempted.
> When the secondary store is brought back up and another snapshot is 
> attempted and succeeds, all of the vhd files are cleared out.
> It is a problem that we accumulate so many vhd files on the primary store 
> (in the case of VMware and KVM, where there are no delta snapshots, the 
> accumulated size would be significantly higher). 
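The accumulation and cleanup behavior in the description above can be modeled with a minimal sketch. The per-failure cost of 2 vhd files and the clear-on-success behavior are taken from this report; the attempt counts are illustrative:

```python
# Minimal model of the reported behavior: each failed snapshot attempt
# leaves 2 VHD files on the primary store, and the first successful
# snapshot clears all leftovers for that volume.

def leaked_vhds(attempts):
    """attempts: sequence of booleans, True = snapshot succeeded."""
    leaked = 0
    for ok in attempts:
        if ok:
            leaked = 0   # a successful snapshot clears the leftovers
        else:
            leaked += 2  # each failure leaves 2 VHDs behind
    return leaked

# 12 hourly failures while secondary storage is down, per volume:
print(leaked_vhds([False] * 12))           # 24 leaked VHDs
# one success after the outage clears the backlog:
print(leaked_vhds([False] * 12 + [True]))  # 0
```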



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
