Hi Chavdar and Michael, Thanks for your thoughts and help.
I added "memoryefficientbackup", but the sessions still keep crashing. Once a session crashes, I get a whole bunch of errors for storage pool directories, and in fact the whole pool becomes unavailable. I run "update stgpooldir ... access=readwrite" and everything is accessible again. Some of the containers are in an unavailable state and need an audit.

Our container storage is on a Dell PowerEdge R730xd with 24 CPUs allocated, 64 GB memory and 110 TB disk. The disks are declared as VMDKs. The network is on a 10Gb Intel 82588 card. Nothing I can see points to a lack of resources.

Everything worked fine until 4 days ago. That is why I suspected the Windows updates, but since I have rolled them back, that explanation no longer makes sense.
I am quite at a loss where to look next ...

Thanks,
David

[Server side]

20-08-2023 19:47:22 ANR0839I Session 197902 started for node MEDFS2 (WinNT) (SSL medspice.bgu.ac.il[132.72.73.246]:53184) on STOREWARE13.auth.ad.bgu.ac.il:1502. (SESSION: 197902)
20-08-2023 19:47:26 ANR8592I Session 197903 connection is using protocol TLSV13, cipher specification TLS_AES_256_GCM_SHA384, certificate TSM Self-Signed Certificate. (SESSION: 197903)
20-08-2023 19:47:26 ANR0839I Session 197903 started for node MEDFS2 (WinNT) (SSL medspice.bgu.ac.il[132.72.73.246]:53185) on STOREWARE13.auth.ad.bgu.ac.il:1502. (SESSION: 197903)
20-08-2023 19:47:55 ANR2012W Error encountered for storage pool directory: \\medbackup.med.ad.bgu.ac.il\tsmc20 in storage pool: CPOOL. (SESSION: 197881)
20-08-2023 19:47:55 ANR1181E sdtxn.c(1404): Data storage transaction 0:83236375 was aborted. (SESSION: 197881)
20-08-2023 19:47:55 ANR0204I The container state for \\medbackup.med.ad.bgu.ac.il\tsmc17\18\0000000000001853.ncf is updated from AVAILABLE to UNAVAILABLE. (SESSION: 197883)
20-08-2023 19:47:55 ANR3660E An unexpected error occurred while opening or writing to the container.
Container \\medbackup.med.ad.bgu.ac.il\tsmc17\18\0000000000001853.ncf in stgpool CPOOL has been marked as UNAVAILABLE and should be audited to validate accessibility and content. (SESSION: 197883)

[From the client side:]

During the incr of a large filespace:

Normal File--> 7.132.827 \\medfs2\e$\medusers14\angel\17.8.23 BU - E\MyDocs(E)-PrevOLD-D\MyDocs (D)\PERSON-CRITER\FAMILY\OMRI's folder 313843070\OMRI 1-16 medical issue\MRIs - CTs - OMRI\MY PROCESSING of MRI and general MRI data\For-Crop-T2W - coronal Copy.pptx ** Unsuccessful **
ANS1228E Sending of object '\\medfs2\e$\medusers14\angel\17.8.23 BU - E\MyDocs(E)-PrevOLD-D\MyDocs (D)\PERSON-CRITER\FAMILY\OMRI's folder 313843070\OMRI 1-16 medical issue\MRIs - CTs - OMRI\MY PROCESSING of MRI and general MRI data\For-Crop-T2W - coronal Copy.pptx' failed.
ANS1311E Server out of data storage space

[I ran sel of the latest file. It failed because all containerdirs were unavailable.]

ANS1804E Selective Backup processing of '\\medfs2\e$\medusers14\angel\17.8.23 BU - E\MyDocs(E)-PrevOLD-D\MyDocs (D)\PERSON-CRITER\FAMILY\OMRI's folder 313843070\OMRI 1-16 medical issue\MRIs - CTs - OMRI\MY PROCESSING of MRI and general MRI data\For-Crop-T2W - coronal Copy.pptx' finished with failures.

Total number of objects inspected: 1
Total number of objects backed up: 0
Total number of objects updated: 0
Total number of objects rebound: 0
Total number of objects deleted: 0
Total number of objects expired: 0
Total number of objects failed: 1
...
Network data transfer rate: 148.306,35 KB/sec
Aggregate data transfer rate: 211,50 KB/sec
Objects compressed by: 0%
Total data reduction ratio: 0.23%
Subfile objects reduced by: 0%
Elapsed processing time: 00:00:32
ANS1311E Server out of data storage space

[Then I updated the containerdirs to readwrite and ran the selective backup.
No problem]

-----------------------------------------------------------------------------------------------------------
Protect> sel '\\medfs2\e$\medusers14\angel\17.8.23 BU - E\MyDocs(E)-PrevOLD-D\MyDocs (D)\PERSON-CRITER\FAMILY\OMRI's folder 313843070\OMRI 1-16 medical issue\MRIs - CTs - OMRI\MY PROCESSING of MRI and general MRI data\For-Crop-T2W - coronal Copy.pptx'
Selective Backup function invoked.

Normal File--> 7.132.827 \\medfs2\e$\medusers14\angel\17.8.23 BU - E\MyDocs(E)-PrevOLD-D\MyDocs (D)\PERSON-CRITER\FAMILY\OMRI's folder 313843070\OMRI 1-16 medical issue\MRIs - CTs - OMRI\MY PROCESSING of MRI and general MRI data\For-Crop-T2W - coronal Copy.pptx [Sent]
Selective Backup processing of '\\medfs2\e$\medusers14\angel\17.8.23 BU - E\MyDocs(E)-PrevOLD-D\MyDocs (D)\PERSON-CRITER\FAMILY\OMRI's folder 313843070\OMRI 1-16 medical issue\MRIs - CTs - OMRI\MY PROCESSING of MRI and general MRI data\For-Crop-T2W - coronal Copy.pptx' finished without failure.

-----Original Message-----
From: ADSM: Dist Stor Manager <ADSM-L@VM.MARIST.EDU> On Behalf Of Chavdar Cholev
Sent: Sunday, August 20, 2023 3:43 PM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] INCR backups fail ! TSM 8.1.17 Windows Server and client

Just to make sure that we are on the same page... You have TSM installed on a VM running on VMware. This VM has a few LUNs presented, and those LUNs are used for containers?

Shot in the dark:
1. Check whether the VM resources match the IBM TSM blueprint.
2. Check the LUN/HDD response time in perf. monitor. The response time should be around 20-30 ms during the backup operation.
3. Do you know if the HDDs for those LUNs are .vmdk or RDM (raw device mapping)?

Thank you!
Chavdar

On Saturday, August 19, 2023, David L.A. De Leeuw <da...@bgu.ac.il> wrote:
> Hi TSM experts,
>
> Our incr backup fails consistently in the last few days.
> It starts alright, but after a few gigabytes on the client we get the error:
>
> ANS1301E This operation cannot continue due to an error on the IBM
> Spectrum Protect server. See your IBM Spectrum Protect server
> administrator for assistance.
>
> On the server side we see:
>
> 18-08-2023 22:57:25 ANR2012W Error encountered for storage pool directory:
> \\medbackup.med.ad.bgu.ac.il\tsmc1 in storage pool:
> CPOOL. (SESSION: 194578)
> 18-08-2023 22:57:25 ANR0530W Transaction failed for session 194578 for node
> MEDFS2 (WinNT) - internal server error detected.
> (SESSION: 194578)
> 18-08-2023 22:57:26 ANR2012W Error encountered for storage pool directory:
> \\medbackup.med.ad.bgu.ac.il\tsmc1 in storage pool:
> CPOOL. (SESSION: 194578)
>
> Then we find one or more containers unavailable. We fix the containers
> with "audit container ... action=scanall".
> No errors are found, but the next backup fails again.
>
> The server is on 8.1.17, the client as well.
> The containers are on a number of disks on a shared Windows Server 2019 machine.
> There have been some updates on the Windows server recently
> (KB5029247, KB5029647).
>
> The audits are fine and the data is accessible, but backups fail.
> Any ideas?
>
> David de Leeuw
> Ben-Gurion University of the Negev
> Beer Sheva Israel
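
The server-side excerpts in this thread all follow the same pattern (timestamp, ANR message ID, text), so a short script can tally which storage pool directories raise ANR2012W and show whether the failures are spread across all directories or concentrated on a few. The following is only a rough sketch, not an IBM tool: the function names `split_messages` and `failing_directories` are made up for illustration, the regular expressions are assumptions derived solely from the log lines quoted above, and the sample text is copied from those lines.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpts in this thread:
# DD-MM-YYYY HH:MM:SS ANRnnnnX <message text>
MSG_START = re.compile(r"(\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}) (ANR\d{4}[IWE])")
POOL_DIR = re.compile(r"storage pool directory:\s*(\S+)\s+in storage pool:\s*(\w+)")

# Sample lines copied from the messages quoted in this thread.
SAMPLE = r"""
18-08-2023 22:57:25 ANR2012W Error encountered for storage pool directory:
\\medbackup.med.ad.bgu.ac.il\tsmc1 in storage pool:
CPOOL. (SESSION: 194578)
18-08-2023 22:57:25 ANR0530W Transaction failed for session 194578 for node
MEDFS2 (WinNT) - internal server error detected. (SESSION: 194578)
18-08-2023 22:57:26 ANR2012W Error encountered for storage pool directory:
\\medbackup.med.ad.bgu.ac.il\tsmc1 in storage pool:
CPOOL. (SESSION: 194578)
20-08-2023 19:47:55 ANR2012W Error encountered for storage pool directory:
\\medbackup.med.ad.bgu.ac.il\tsmc20 in storage pool: CPOOL. (SESSION: 197881)
"""


def split_messages(text):
    """Split wrapped activity-log text into (timestamp, msg_id, body) tuples."""
    starts = list(MSG_START.finditer(text))
    messages = []
    for i, match in enumerate(starts):
        end = starts[i + 1].start() if i + 1 < len(starts) else len(text)
        # Collapse line wraps so a message that spans several lines becomes one string.
        body = " ".join(text[match.end():end].split())
        messages.append((match.group(1), match.group(2), body))
    return messages


def failing_directories(text):
    """Count ANR2012W warnings per (storage pool, directory) pair."""
    counts = Counter()
    for _, msg_id, body in split_messages(text):
        if msg_id == "ANR2012W":
            hit = POOL_DIR.search(body)
            if hit:
                counts[(hit.group(2), hit.group(1))] += 1
    return counts


if __name__ == "__main__":
    for (pool, directory), n in failing_directories(SAMPLE).most_common():
        print(f"{pool}  {directory}  {n} warning(s)")
```

Feeding it a saved copy of the activity log instead of `SAMPLE` would give a per-directory count; if every directory under \\medbackup shows up at the same timestamps, that would hint at the share or network path rather than a single disk.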