[
https://issues.apache.org/jira/browse/CLOUDSTACK-5499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Abhinandan Prateek updated CLOUDSTACK-5499:
-------------------------------------------
Assignee: Sateesh Chodapuneedi
> VMware - When NFS was down for about 12 hours and then brought back up again,
> snapshots are not being attempted for some of the volumes which have
> snapshots that are in "CreatedOnPrimary" state.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: CLOUDSTACK-5499
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-5499
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public(Anyone can view this level - this is the
> default.)
> Components: Management Server
> Affects Versions: 4.3.0
> Environment: Build from 4.3
> Reporter: Sangeetha Hariharan
> Assignee: Sateesh Chodapuneedi
> Priority: Critical
> Fix For: 4.3.0
>
> Attachments: nfs12down.rar
>
>
> VMware - When NFS was down for about 12 hours and then brought back up again,
> snapshots are not being attempted for some of the volumes which have
> snapshots that are in "CreatedOnPrimary" state.
> Setup:
> Advanced Zone with 2 ESXi 5.1 hosts.
> Steps to reproduce the problem:
> 1. Deploy 5 VMs in each of the hosts, so we start with 11 VMs.
> 2. Start concurrent snapshots for the ROOT volumes of all the VMs.
> 3. Shut down the secondary storage server while the snapshots are in
> progress.
> 4. Bring the secondary storage server back up after 12 hours.
> The following issues are seen in this run:
> 1. The snapshots that were in progress report failures only after
> 12 hours, even though backup.snapshot.wait is set to 12 hours.
> 2. New snapshot requests that were executed while the NFS server was down do
> not report failure immediately. In my case, I see that such requests
> eventually succeeded when the NFS server was brought up. Is this the expected
> behavior? Should we not expect them to fail right away, instead of holding on
> to such active sessions?
> 3. Some of the snapshot failures resulted in snapshots that are in
> "CreatedOnPrimary" state. For such volumes, snapshots are not being
> attempted at all, even though the NFS server was brought up.
> Volumes in this state are 16, 17, 18 and 22.
> There are instances where I have seen snapshots being scheduled and
> succeeding even when the previous state was "CreatedOnPrimary". Why are we
> able to schedule snapshots in some such cases, and not in others?
> mysql> select volume_id,status,created from snapshots where volume_id=18;
> +-----------+------------------+---------------------+
> | volume_id | status           | created             |
> +-----------+------------------+---------------------+
> |        18 | Destroyed        | 2013-12-12 23:24:14 |
> |        18 | CreatedOnPrimary | 2013-12-12 23:53:39 |
> |        18 | BackedUp         | 2013-12-13 01:53:38 |
> |        18 | CreatedOnPrimary | 2013-12-13 03:53:38 |
> +-----------+------------------+---------------------+
> mysql> select volume_id,status,created from snapshots;
> +-----------+------------------+---------------------+
> | volume_id | status           | created             |
> +-----------+------------------+---------------------+
> |        22 | Destroyed        | 2013-12-12 23:24:13 |
> |        21 | Destroyed        | 2013-12-12 23:24:13 |
> |        20 | Destroyed        | 2013-12-12 23:24:14 |
> |        19 | Destroyed        | 2013-12-12 23:24:14 |
> |        18 | Destroyed        | 2013-12-12 23:24:14 |
> |        17 | Destroyed        | 2013-12-12 23:24:14 |
> |        16 | Destroyed        | 2013-12-12 23:24:14 |
> |        14 | Destroyed        | 2013-12-12 23:24:15 |
> |        25 | Destroyed        | 2013-12-12 23:24:15 |
> |        24 | Destroyed        | 2013-12-12 23:24:15 |
> |        23 | Destroyed        | 2013-12-12 23:24:15 |
> |        22 | CreatedOnPrimary | 2013-12-12 23:53:38 |
> |        21 | Destroyed        | 2013-12-12 23:53:38 |
> |        20 | Destroyed        | 2013-12-12 23:53:38 |
> |        19 | Destroyed        | 2013-12-12 23:53:39 |
> |        18 | CreatedOnPrimary | 2013-12-12 23:53:39 |
> |        17 | CreatedOnPrimary | 2013-12-12 23:53:40 |
> |        16 | CreatedOnPrimary | 2013-12-12 23:53:40 |
> |        14 | Destroyed        | 2013-12-12 23:53:40 |
> |        25 | Destroyed        | 2013-12-12 23:53:41 |
> |        24 | Destroyed        | 2013-12-12 23:53:41 |
> |        23 | Destroyed        | 2013-12-12 23:53:42 |
> |        21 | Destroyed        | 2013-12-13 00:53:37 |
> |        19 | Destroyed        | 2013-12-13 00:53:38 |
> |        22 | BackedUp         | 2013-12-13 01:53:37 |
> |        21 | Destroyed        | 2013-12-13 01:53:38 |
> |        20 | Destroyed        | 2013-12-13 01:53:38 |
> |        19 | Destroyed        | 2013-12-13 01:53:38 |
> |        18 | BackedUp         | 2013-12-13 01:53:38 |
> |        17 | BackedUp         | 2013-12-13 01:53:38 |
> |        16 | BackedUp         | 2013-12-13 01:53:39 |
> |        14 | Destroyed        | 2013-12-13 01:53:39 |
> |        25 | Destroyed        | 2013-12-13 01:53:39 |
> |        24 | Destroyed        | 2013-12-13 01:53:39 |
> |        23 | Destroyed        | 2013-12-13 01:53:40 |
> |        22 | CreatedOnPrimary | 2013-12-13 03:53:37 |
> |        21 | Destroyed        | 2013-12-13 03:53:38 |
> |        20 | Destroyed        | 2013-12-13 03:53:38 |
> |        19 | Destroyed        | 2013-12-13 03:53:38 |
> |        18 | CreatedOnPrimary | 2013-12-13 03:53:38 |
> |        17 | CreatedOnPrimary | 2013-12-13 03:53:38 |
> |        16 | CreatedOnPrimary | 2013-12-13 03:53:39 |
> |        14 | Destroyed        | 2013-12-13 03:53:39 |
> |        24 | Destroyed        | 2013-12-13 08:53:37 |
> |        25 | Destroyed        | 2013-12-13 09:53:37 |
> |        23 | Destroyed        | 2013-12-13 10:53:37 |
> |        21 | Destroyed        | 2013-12-13 16:53:37 |
> |        20 | Destroyed        | 2013-12-13 16:53:38 |
> |        19 | Destroyed        | 2013-12-13 16:53:38 |
> |        14 | Destroyed        | 2013-12-13 16:53:38 |
> |        21 | BackedUp         | 2013-12-13 18:53:37 |
> |        20 | BackedUp         | 2013-12-13 18:53:38 |
> |        19 | BackedUp         | 2013-12-13 18:53:38 |
> |        14 | BackedUp         | 2013-12-13 18:53:38 |
> |        25 | BackedUp         | 2013-12-13 18:53:38 |
> |        24 | BackedUp         | 2013-12-13 18:53:38 |
> |        23 | BackedUp         | 2013-12-13 18:53:39 |
> |        21 | BackedUp         | 2013-12-13 19:53:37 |
> |        20 | BackedUp         | 2013-12-13 19:53:38 |
> |        19 | BackedUp         | 2013-12-13 19:53:38 |
> |        14 | BackedUp         | 2013-12-13 19:53:38 |
> |        25 | BackedUp         | 2013-12-13 19:53:38 |
> |        24 | BackedUp         | 2013-12-13 19:53:39 |
> |        23 | BackedUp         | 2013-12-13 19:53:39 |
> +-----------+------------------+---------------------+
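
Picking the affected volumes out of the full listing by eye is error-prone, so a query that keeps only volumes whose most recent snapshot is still "CreatedOnPrimary" may help when triaging. The following is a minimal sketch, not part of the original report. It assumes a simplified three-column snapshots table (the real CloudStack table has more columns), loads a subset of the rows shown above for volumes 18 and 21 into an in-memory SQLite database as a stand-in for the management server database, and uses hypothetical names (ROWS, LATEST_STUCK_SQL, stuck_volumes). The SQL uses only a grouped subquery and a join, so it should be adaptable to the mysql prompt, but that has not been verified here.

```python
import sqlite3

# Subset of the rows from the report (volumes 18 and 21 only).
ROWS = [
    (18, "Destroyed", "2013-12-12 23:24:14"),
    (18, "CreatedOnPrimary", "2013-12-12 23:53:39"),
    (18, "BackedUp", "2013-12-13 01:53:38"),
    (18, "CreatedOnPrimary", "2013-12-13 03:53:38"),
    (21, "Destroyed", "2013-12-13 16:53:37"),
    (21, "BackedUp", "2013-12-13 18:53:37"),
    (21, "BackedUp", "2013-12-13 19:53:37"),
]

# For each volume, find its newest snapshot row and keep the volume only
# if that newest row is still in the CreatedOnPrimary state.
LATEST_STUCK_SQL = """
SELECT s.volume_id
FROM snapshots AS s
JOIN (
    SELECT volume_id, MAX(created) AS latest
    FROM snapshots
    GROUP BY volume_id
) AS m ON m.volume_id = s.volume_id AND m.latest = s.created
WHERE s.status = 'CreatedOnPrimary'
ORDER BY s.volume_id
"""


def stuck_volumes(rows):
    """Return ids of volumes whose latest snapshot is CreatedOnPrimary."""
    conn = sqlite3.connect(":memory:")
    try:
        # Assumption: a simplified table with only the three columns
        # selected in the report. Timestamps are stored as text; the
        # 'YYYY-MM-DD HH:MM:SS' format sorts correctly as a string.
        conn.execute(
            "CREATE TABLE snapshots "
            "(volume_id INTEGER, status TEXT, created TEXT)"
        )
        conn.executemany("INSERT INTO snapshots VALUES (?, ?, ?)", rows)
        return [row[0] for row in conn.execute(LATEST_STUCK_SQL)]
    finally:
        conn.close()


if __name__ == "__main__":
    print(stuck_volumes(ROWS))
```

With this subset, volume 18 is reported because its newest row (2013-12-13 03:53:38) is "CreatedOnPrimary", while volume 21 is not because its newest row is "BackedUp".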
--
This message was sent by Atlassian JIRA
(v6.1.4#6159)