Thanks for the detailed explanations / experiences / suggestions. I greatly appreciate them and will store them away in case I ever try this again.
Yes, we do have lots of clients backing up at once. We easily hit 40 simultaneous sessions, which is the reason for the high mount limit. NUMOPENVOLSALLOWED is set to 10 for this server.

I had not planned to empty it every night. This server doesn't have that many incoming backups. It has been running for a month without needing to force migration to tape.

On Fri, Feb 13, 2015 at 1:39 PM, Prather, Wanda <[email protected]> wrote:

> Probably just bad luck….
>
> When I set up FILE pools for customers, I usually have to tweak them a
> couple of times to get the sizing right, depends on the load, the number of
> concurrent sessions, etc. Been there, done that, got the scars.
>
> Assumptions you should change:
>
> • Unlike a disk pool, if there is no space available in a TYPE=FILE
> pool, backups don't fail over to the NEXT stgpool. WAD. I don't know why
> it's that way. I think some RFE pressure is indicated, it causes me grief.
>
> • In a seq pool on disk, you need to be much more aggressive about
> reclaims. If you have reclaim set at 59, you are saying you are willing to
> live with 59% of your disk space dead/expired and unusable! That means you
> need to size your disk pool so that 41% is big enough to hold the entire
> night's backup. I set reclaim on my disk pools to 20%, or 15% if the disk
> throughput is sufficient to tolerate the I/O.
>
> • Migration from a sequential pool may not be working like you
> think; read the DEFINE STGPOOL HIGHMIG option definition in the admin ref
> for your version. I always set MAXSCRATCH to 0 for a sequential file pool
> and use pre-defined volumes instead of scratch so I have better control
> over what happens.
>
> • You have mountlimit set to 40 in the devclass; how many concurrent
> client sessions do you have writing to that pool?
>
> • Also check server option NUMOPENVOLSALLOWED to make sure you can
> have enough volumes in use at once to do concurrent backups plus reclaims
> plus backup stgpool plus migration etc etc etc.
>
> • If you are going to fill this pool and empty it out via migration
> every night, best to force the migration yourself with a MIGRATE STGPOOL
> command rather than relying on the threshold. And if reclaims don't kick
> in on their own regularly, set up a RECLAIM STGPOOL schedule to fire daily
> anyway. Won't hurt.
>
> • The usual problem I see is that people don't have enough volumes
> defined in the pool to account for all the concurrent sessions, plus some
> empty volumes to allow for reclaims, plus a high enough
> NUMOPENVOLSALLOWED. You've defined your volumes at 50G, so you should have
> enough. One of these other issues is probably your problem.
>
> • While things are working well, check daily to see what is a
> "normal" value of the number of "empty" volumes in that pool. Then set
> yourself an alert to let you know when the number of "empty" volumes drops
> below the "normal" value so you can investigate before disaster sets in.
>
> Good luck!
>
> Wanda Prather
> TSM Consultant
> ICF International Enterprise and Cybersecurity Systems Division
>
>
> -----Original Message-----
> From: ADSM: Dist Stor Manager [mailto:[email protected]] On Behalf Of
> Zoltan Forray
> Sent: Friday, February 13, 2015 12:13 PM
> To: [email protected]
> Subject: [ADSM-L] DEVCLASS=FILE - what am I missing
>
> Up until recently, I have always used DEVCLASS=DISK for disk storage and
> always preformatted/allocated the disk volumes into multiple chunks to allow
> for multi-I/O benefits.
>
> When I recently stood-up a new server, I decided to try DEVCLASS=FILE for
> disk-based storage/incoming backups.
>
> I thought I understood that FILE type storage was basically
> "tape/sequential files on disk" and would act accordingly and things like
> reclamation now applied so when the file chunks (I defined 50GB file sizes)
> got below the reclaim value, it would reclaim such files, create new ones
> and delete the old ones automagically.
>
> Well, last night became a disaster.
> Backups failing all over because it couldn't allocate any more files and
> also would not automatically shift to use the "nextpool" which is defined
> as a tape pool.
>
> So, what am I doing wrong? What assumptions are wrong? Here is the
> devclass values with the empty values left out...:
>
>        Device Class Name: TSMFS
>   Device Access Strategy: Sequential
>       Storage Pool Count: 1
>              Device Type: FILE
>                   Format: DRIVE
>    Est/Max Capacity (MB): 51,200.0
>              Mount Limit: 40
>                Directory: /tsmpool
>
> Here is the lone stgpool that used this devclass:
>
> 12:06:21 PM GALAXY : q stg backuppool f=d
>                      Storage Pool Name: BACKUPPOOL
>                      Storage Pool Type: Primary
>                      Device Class Name: TSMFS
>                     Estimated Capacity: 7,106 G
>                     Space Trigger Util: 84.5
>                               Pct Util: 80.9
>                               Pct Migr: 80.9
>                            Pct Logical: 99.2
>                           High Mig Pct: 85
>                            Low Mig Pct: 75
>                        Migration Delay: 0
>                     Migration Continue: Yes
>                    Migration Processes: 1
>                  Reclamation Processes: 1
>                      Next Storage Pool: PRIMARY-ONSITE
>                   Reclaim Storage Pool:
>                 Maximum Size Threshold: No Limit
>                                 Access: Read/Write
>                            Description:
>                      Overflow Location:
>                  Cache Migrated Files?:
>                             Collocate?: No
>                  Reclamation Threshold: 59
>              Offsite Reclamation Limit:
>        Maximum Scratch Volumes Allowed: 143
>         Number of Scratch Volumes Used: 137
>          Delay Period for Volume Reuse: 0 Day(s)
>                 Migration in Progress?: No
>                   Amount Migrated (MB): 0.00
>       Elapsed Migration Time (seconds): 1,009
>               Reclamation in Progress?: No
>        Last Update by (administrator): ZFORRAY
>                  Last Update Date/Time: 02/13/2015 11:44:23
>               Storage Pool Data Format: Native
>                   Copy Storage Pool(s):
>                    Active Data Pool(s):
>                Continue Copy on Error?: Yes
>                               CRC Data: No
>                       Reclamation Type: Threshold
>            Overwrite Data when Deleted:
>                      Deduplicate Data?: No
>   Processes For Identifying Duplicates:
>              Duplicate Data Not Stored:
>                         Auto-copy Mode: Client
> Contains Data Deduplicated by Client?: No
>
> I calculated the "Max Scratch Volumes" value based on having ~7.6TB
> filesystem so 50GB * 143 = 7.1TB
>
> This morning when I
> checked, there were plenty of volumes with <40% utilized. SO why didn't
> reclaim kick-in? or am I totally off on this assumption? I manually
> performed move data on them and it freed things up.
>
> --
> *Zoltan Forray*
> TSM Software & Hardware Administrator
> BigBro / Hobbit / Xymon Administrator
> Virginia Commonwealth University
> UCC/Office of Technology Services
> [email protected] - 804-828-4807
> Don't be a phishing victim - VCU and other reputable organizations will
> never use email to request that you reply with your password, social
> security number or confidential personal information. For more details
> visit http://infosecurity.vcu.edu/phishing.html

--
*Zoltan Forray*
TSM Software & Hardware Administrator
BigBro / Hobbit / Xymon Administrator
Virginia Commonwealth University
UCC/Office of Technology Services
[email protected] - 804-828-4807
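
For anyone redoing the sizing arithmetic from this thread, here is a rough Python sketch of the rule of thumb Wanda describes: with a reclamation threshold of N, plan on only (100 - N)% of the FILE pool being guaranteed to hold live data. It is only an illustration, not anything TSM itself provides. The function names are made up for this example, the volume counts and sizes come from the `q stg` output quoted above, and the "normal" empty-volume count in the alert check is a hypothetical value you would replace with your own daily observation.

```python
def guaranteed_live_gb(volume_count, volume_gb, reclaim_pct):
    """Worst-case space (GB) holding live data if every volume sits just
    under the reclamation threshold, per the rule of thumb in this thread."""
    return volume_count * volume_gb * (100 - reclaim_pct) / 100


def scratch_headroom(max_scratch, scratch_used):
    """Scratch volumes the pool may still allocate before hitting MAXSCRATCH."""
    return max_scratch - scratch_used


def empty_volume_alert(empty_now, normal_empty):
    """True when the count of empty volumes has dropped below the usual value.
    normal_empty is whatever you observe while things are working well."""
    return empty_now < normal_empty


if __name__ == "__main__":
    # Figures taken from the storage pool output in this thread.
    volumes, size_gb = 143, 50
    for reclaim in (59, 20):
        live = guaranteed_live_gb(volumes, size_gb, reclaim)
        print(f"reclaim={reclaim}: plan on {live} GB of usable space")
    print("scratch volumes left:", scratch_headroom(143, 137))
    # 20 is a hypothetical "normal" count, used only to show the check.
    print("alert:", empty_volume_alert(6, 20))
```

With 143 volumes of 50 GB, a reclaim threshold of 59 leaves 2931.5 GB you can count on, versus 5720 GB at a threshold of 20. The pool in question also had only 6 scratch volumes left to allocate against sessions that can number 40 at once.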
