Hi Steve,

No firm diagnosis or solution, but this could be something going on during
snapshot cleanup, when the volumes are being unmounted if you are using the
HOTADD transport. One possibility (but by no means the only one) is:

kb.vmware.com/kb/2010953

One way to try to isolate this is to manually create a snapshot, and leave
it in place for the same amount of time the backup normally runs for that
VM. Do you see the frozen VM issue occur? Next, remove the snapshot you just
created. Does the VM seem to freeze while the snapshot is being removed?

The big factor is how long the backup took and how busy the target VM's I/O
was. The longer the backup with high I/O, the larger the redo log becomes,
requiring longer time to consolidate the disks.

I'm not sure what those error log messages are about, but I cannot tie them
to the problem you are describing in this thread.

Best regards,

- Andy

____________________________________________________________________________

Andrew Raibeck | Tivoli Storage Manager Level 3 Technical Lead | [email protected]

IBM Tivoli Storage Manager links:
Product support: http://www.ibm.com/support/entry/portal/Overview/Software/Tivoli/Tivoli_Storage_Manager
Online documentation: http://www.ibm.com/support/knowledgecenter/SSGSG7/welcome
Product Wiki: https://www.ibm.com/developerworks/community/wikis/home/wiki/Tivoli%20Storage%20Manager

"ADSM: Dist Stor Manager" <[email protected]> wrote on 2015-03-09 10:08:31:

> From: "Schaub, Steve" <[email protected]>
> To: [email protected]
> Date: 2015-03-09 10:12
> Subject: Re: VE 7.1.1.1 backup "freezing" a VM & question about "megablocks"
> Sent by: "ADSM: Dist Stor Manager" <[email protected]>
>
> Andy,
>
> These are the only odd messages I saw from these backups. Just to
> clarify, the backups didn't freeze, but the servers became
> unresponsive to the end user while the backup was running. As soon
> as the backup completed, the servers became responsive again.
> Looking at the Windows app event logs, there is a gap showing no
> activity for the duration of the backup. There were some entries in
> the system event log during the backup. It was like someone hit the
> "pause" button.
> Which I could understand if it looked like either VSS or VMware
> Snapshot manager hung, but as far as I can tell they were successful.
>
> Thanks,
> -steve
>
> 03/05/2015 07:58:19 ANS9365E VMware vStorage API error for virtual machine 'XXXXXXXX'.
> TSM function name : VixDiskLib_Read
> TSM file : vmvddksdk.cpp (2766)
> API return code : 1
> API error message : NBD_ERR_DISKLIB
> 03/05/2015 07:58:19 ANS0361I DIAG: VmProcessExtent(): Retrying failed read: vddksdkRead() rc=4398, startSector=31955968, numSectorsToRead=512
> 03/05/2015 09:14:51 ANS9365E VMware vStorage API error for virtual machine 'ZZZZZZZZ'.
> TSM function name : VixDiskLib_Read
> TSM file : vmvddksdk.cpp (2766)
> API return code : 1
> API error message : NBD_ERR_DISKLIB
> 03/05/2015 09:14:51 ANS0361I DIAG: VmProcessExtent(): Retrying failed read: vddksdkRead() rc=4398, startSector=24463360, numSectorsToRead=512
> 03/05/2015 09:16:02 ANS9365E VMware vStorage API error for virtual machine 'XXXXXXXX'.
> TSM function name : VixDiskLib_Read
> TSM file : vmvddksdk.cpp (2766)
> API return code : 1
> API error message : NBD_ERR_DISKLIB
> 03/05/2015 09:16:02 ANS0361I DIAG: VmProcessExtent(): Retrying failed read: vddksdkRead() rc=4398, startSector=76280832, numSectorsToRead=512
> 03/05/2015 09:19:05 ANS9365E VMware vStorage API error for virtual machine 'XXXXXXXX'.
> TSM function name : VixDiskLib_Read
> TSM file : vmvddksdk.cpp (2766)
> API return code : 1
> API error message : NBD_ERR_DISKLIB
> 03/05/2015 09:19:05 ANS0361I DIAG: VmProcessExtent(): Retrying failed read: vddksdkRead() rc=4398, startSector=82475520, numSectorsToRead=512
> 03/05/2015 10:06:56 ANS9365E VMware vStorage API error for virtual machine 'XXXXXXXX'.
> TSM function name : VixDiskLib_Read
> TSM file : vmvddksdk.cpp (2766)
> API return code : 1
> API error message : NBD_ERR_DISKLIB
>
> -----Original Message-----
> From: ADSM: Dist Stor Manager [mailto:[email protected]] On Behalf Of Andrew Raibeck
> Sent: Monday, March 09, 2015 9:24 AM
> To: [email protected]
> Subject: Re: [ADSM-L] VE 7.1.1.1 backup "freezing" a VM & question about "megablocks"
>
> Hi Steve,
>
> Off-hand I am not sure what this is, but a couple of things:
>
> 1. Are there any anomalous messages in the error log during the
> timeframe of the backup that exhibits the problem?
>
> 2. Consider capturing a dump (*) of the TSM client backup process,
> e.g., dsmcsc.exe, when the backup is in the "frozen" state, open a
> PMR with TSM support, and send in the dump, the dsmerror.log, and
> the dsmsched.log.
>
> (*) In case you are not familiar with capturing a dump: start task
> manager, find the TSM client process that appears frozen, right-click
> on the process, and select "Create Dump File".
>
> Best regards,
>
> - Andy
>
> "ADSM: Dist Stor Manager" <[email protected]> wrote on 2015-03-06 11:28:19:
>
> > From: "Schaub, Steve" <[email protected]>
> > To: [email protected]
> > Date: 2015-03-06 11:29
> > Subject: VE 7.1.1.1 backup "freezing" a VM & question about "megablocks"
> > Sent by: "ADSM: Dist Stor Manager" <[email protected]>
> >
> > First, many thanks to Wanda & others who have been so helpful in
> > answering my previous VE questions!
> >
> > We had a situation yesterday where 2 VE backups were causing the VMs
> > to go unresponsive. No response to ping, unable to RDP, etc.
> > As soon as the backup finished (or was killed in one case), the
> > servers picked back up where they left off. They never rebooted, but
> > you can actually see in the Windows event logs a gap where no activity
> > happens. Has anyone seen this behavior before? VE is at 7.1.1.1, the
> > hosts are ESXi 5.0 U2, vCenter is 5.5, Windows 2008 R2.
> >
> > Secondly, while reading the docs, I ran across the idea of performing
> > periodic full backups in VE due to fragmentation of "megablocks". Is
> > this needed? If so, how do you manage it (how frequently, do you try
> > to scatter the fulls across every day, how do these interact with
> > daily incrementals, etc.)? If it matters, all our backups land on a
> > VTL.
> >
> > Thanks,
> >
> > Steve Schaub
> > Systems Engineer II, Backup/Recovery
> > Blue Cross Blue Shield of Tennessee
> > 423-535-6574 (desk)
> > 423-785-7347 (cell)
> >
> > -----------------------------------------------------
> > Please see the following link for the BlueCross BlueShield of
> > Tennessee E-mail disclaimer:
> > http://www.bcbst.com/email_disclaimer.shtm
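
For anyone comparing the ANS9365E entries quoted in this thread against the gap in the Windows event logs, here is a minimal Python sketch that groups those errors by VM and prints the first and last timestamp for each. It is only an illustration, not anything shipped with TSM: the function name errors_by_vm and the SAMPLE text are made up for this example, the timestamp is assumed to be MM/DD/YYYY as the thread dates suggest, and the message wording is taken from the excerpt above rather than from any product documentation.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches the ANS9365E line as it appears in the quoted excerpt.
# \s+ tolerates the line wrapping that mail clients introduce.
ERROR_RE = re.compile(
    r"(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) ANS9365E VMware vStorage API "
    r"error for virtual\s+machine\s+'([^']+)'"
)


def errors_by_vm(log_text):
    """Return {vm_name: [datetime, ...]} for every ANS9365E entry found."""
    found = defaultdict(list)
    for stamp, vm in ERROR_RE.findall(log_text):
        # Assumption: dsmerror.log timestamps here are MM/DD/YYYY HH:MM:SS.
        found[vm].append(datetime.strptime(stamp, "%m/%d/%Y %H:%M:%S"))
    return dict(found)


# Hypothetical sample built from three of the entries quoted in the thread.
SAMPLE = """\
03/05/2015 07:58:19 ANS9365E VMware vStorage API error for virtual machine 'XXXXXXXX'.
TSM function name : VixDiskLib_Read
API error message : NBD_ERR_DISKLIB
03/05/2015 09:14:51 ANS9365E VMware vStorage API error for virtual machine 'ZZZZZZZZ'.
03/05/2015 09:16:02 ANS9365E VMware vStorage API error for virtual machine 'XXXXXXXX'.
"""

if __name__ == "__main__":
    for vm, stamps in sorted(errors_by_vm(SAMPLE).items()):
        print(vm, len(stamps), min(stamps), max(stamps))
```

To use it on a real log, read dsmerror.log into a string and pass it to errors_by_vm, then compare each VM's first and last error time with the start and end of the event-log gap. A match in timing would not by itself show that the read errors cause the freeze; it only narrows down where to look.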
