[
https://issues.apache.org/jira/browse/CLOUDSTACK-5370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sangeetha Hariharan reopened CLOUDSTACK-5370:
---------------------------------------------
Reopening this issue:
The following behavior is observed:
Bring down the NFS secondary store.
1. Attempt a snapshot on a ROOT volume.
When the snapshot fails, we see that the entry with store_role="primary" in
snapshot_store_ref is left behind in the DB. This needs to be cleaned up.
In this case, I see the vhd count on the primary store increase by 1. Two vhd
entries were created when the snapshot was attempted, one of which got deleted
as part of the snapshot failing.
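The missing cleanup can be sketched as follows. This is a minimal illustration, not CloudStack's actual DAO code: rows of snapshot_store_ref are modeled as plain dicts, and the function name `cleanup_failed_snapshot` is hypothetical.

```python
# Sketch of the cleanup the report says is missing: when a snapshot
# fails, the snapshot_store_ref row with store_role='primary' for that
# snapshot should be removed. Rows are modeled as dicts here; in
# CloudStack this would be a DAO/SQL operation against the cloud schema.

def cleanup_failed_snapshot(snapshot_store_ref, snapshot_id):
    """Return the table contents with the failed snapshot's
    primary-store entry dropped."""
    return [
        row for row in snapshot_store_ref
        if not (row["snapshot_id"] == snapshot_id
                and row["store_role"] == "primary")
    ]

# Example: snapshot id 33 (the id from the log below) failed on primary.
rows = [
    {"snapshot_id": 33, "store_role": "primary", "state": "Created"},
    {"snapshot_id": 32, "store_role": "secondary", "state": "Ready"},
]
rows = cleanup_failed_snapshot(rows, 33)
print(rows)  # only the snapshot 32 entry remains
```

With the stale primary-store row removed, a retry of the snapshot would start from a clean state instead of tripping over the leftover entry.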
2. Now attempt another snapshot on the same ROOT volume.
We see that now the CreateObject command itself fails.
In this case, 2 vhd entries are left behind on the primary store.
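The leftover vhd entries could be detected by comparing what is on the SR against what the database still references. The sketch below is illustrative only (the function name and data are hypothetical, not CloudStack code):

```python
# Sketch of orphan detection on the primary store: any vhd on the SR
# that no volume or snapshot record references was left behind by a
# failed snapshot attempt and is a candidate for garbage collection.

def find_orphan_vhds(vhds_on_sr, referenced_uuids):
    """Return vhd uuids present on the SR but unknown to the DB."""
    return sorted(set(vhds_on_sr) - set(referenced_uuids))

# Example: two extra vhds were left behind by the failed attempts.
on_sr = {"aaa", "bbb", "ccc", "ddd"}
referenced = {"aaa", "bbb"}
print(find_orphan_vhds(on_sr, referenced))  # ['ccc', 'ddd']
```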
Following is the exception seen when the CreateObject command fails:
2014-01-06 19:02:09,075 DEBUG [c.c.a.t.Request] (DirectAgent-156:ctx-5f88b79f)
Seq 2-1271792415: Processing: { Ans: , MgmtId: 112516401760401, via: 2, Ver:
v1, Flags: 10,
[{"org.apache.cloudstack.storage.command.CreateObjectAnswer":{"result":false,"details":"create
snapshot operation Failed for snapshotId: 33, reason:
com.cloud.utils.exception.CloudRuntimeException: callHostPlugin failed for cmd:
getVhdParent with args snapshotUuid: 85f1b2ae-9160-4a50-aa59-9b908927b3d4,
isISCSI: false, primaryStorageSRUuid: 64acdb4c-a455-1072-c9c0-8d40869fcc29,
due to There was a failure communicating with the plugin.","wait":0}}] }
> Xenserver - Snapshots - vhd entries get accumulated on the primary store when
> snapshot creation fails because of not being able to reach the secondary
> store.
> -------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: CLOUDSTACK-5370
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-5370
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public(Anyone can view this level - this is the
> default.)
> Components: Management Server
> Affects Versions: 4.3.0
> Environment: Build from 4.3
> Reporter: Sangeetha Hariharan
> Assignee: edison su
> Priority: Critical
> Fix For: 4.3.0
>
>
> Set up:
> Advanced Zone with 2 Xenserver 6.2 hosts:
> 1. Deploy 5 VMs on each of the hosts with a 10 GB ROOT volume size, so we
> start with 10 VMs.
> 2. Start concurrent snapshots for the ROOT volumes of all the VMs.
> 3. Shut down the secondary storage server while the snapshots are in
> progress. (In my case I stopped the NFS server.)
> 4. Bring the secondary storage server up after 12 hours. (In my case I
> started the NFS server.)
> When the secondary server (NFS server) was down for about 12 hours, I see
> that hourly snapshots get attempted every hour and fail in the
> "CreatedOnPrimary" state. I see many entries being created on the primary
> store (I see 120 entries, but I have only 14 VMs).
> We accumulate 2 vhd files on the primary store for every snapshot that is
> attempted.
> When the secondary store is brought back up, and another snapshot is
> attempted and succeeds, we see the vhd files all being cleared out.
> It is a problem that we accumulate so many vhd files on the primary store
> (in the case of VMware and KVM, where there are no delta snapshots, this
> size would be significantly higher).
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)