"Prather, Wanda" wrote:

> 1. IT DEPENDS. (You KNEW I was going to say that, didn't you?!?)
Yes.

> 2. There is no "right" answer (so you will probably be getting LOTS of
> opinions back about this question ;>).

Understood.

> 3. If you only have a few clients to back up, it just doesn't matter.

That's me; however, I was thinking in terms of tuning my repository for the quickest throughput.

> 4. The purpose of your disk pool is to act as a "buffer", so that you can
> have many more clients backing up concurrently than you have tape drives.
> As long as your disk pool is large enough so that your clients can back up
> without waiting for a tape, you have an "adequate" configuration.

Understood; however, I come back to my sizing question: more smaller volumes, or fewer bigger ones? But you said it doesn't matter, which probably answers my question.

> 5. The rule of thumb is that you want to have enough space in your disk
> pool to hold at least one day's backups. That way if there is a tape
> problem (more common than disk problems), you will have time to fix it when
> you arrive in the morning without any backups failing. Now you have a
> "better" configuration.

Understood.

> 6. After that, what you do with your disk pool is work on increasing
> throughput/performance, i.e. the "best" combination of function and
> throughput. And it depends a lot on your client mix and a lot on your
> hardware.

Assume very few clients (1-2), each with many, many filesystems.

> IF you have "n" clients backing up concurrently, and you have at least "n"
> disk pool volumes, TSM will start "n" I/Os in parallel to the different
> disk pool volumes. So to take advantage of MANY volumes, you have to have
> MANY concurrent backups in progress. But if you are writing to 1 spindle
> of disk with a zillion tiny diskpool volumes, you will get more thrashing of
> the disk than throughput.

This is important information I could not confirm elsewhere. Thank you for this.

> So, if writing to a raw (not RAID) spindle, you probably want 2-3 diskpool
> volumes per spindle.
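For the archive, here is how I would lay that out concretely: two JBOD spindles with three volumes each. The pool and path names are invented, the sizes are placeholders, and the dsmfmt/define syntax is from memory, so check the Administrator's Reference for your server level before using it:

```
/* Hypothetical layout: 2 JBOD spindles, 3 volumes each (6 total).  */
/* First, from the OS shell, pre-format each volume file (size MB): */
/*   dsmfmt -m -data /tsm/disk1/vol01.dsm 4096                      */

/* Then, from a dsmadmc session (ltopool is the assumed tape pool): */
define stgpool diskpool disk description="LTO staging pool" nextstgpool=ltopool
define volume diskpool /tsm/disk1/vol01.dsm
define volume diskpool /tsm/disk1/vol02.dsm
define volume diskpool /tsm/disk1/vol03.dsm
define volume diskpool /tsm/disk2/vol04.dsm
define volume diskpool /tsm/disk2/vol05.dsm
define volume diskpool /tsm/disk2/vol06.dsm
```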
No, it's just JBOD spindles now, maybe RAID in the future.

> On the other hand, lots of people use some type of RAID these days. The
> disk pool I/O is sequential; some people have reported good results with
> striping across multiple physical spindles. And if you are writing to a
> Shark with gobs of cache memory in front of it, that buffers the effect of
> the disk head movement and you may not be able to come up with many
> configurations where you can measure the difference between a "few" and
> "many" disk pool volumes.

Good information, thank you.

> 7. NONE of that is really going to affect your throughput to LTO.
>
> - no matter how many (or few) diskpool volumes you have, if they are full
> of lots of itty bitty files (actually aggregates of files), remember that
> TSM has to update the database each time it moves one. It's hard to stream
> enough data to the tape during this process to get great throughput numbers.

Good information.

> - no matter how many (or few) diskpool volumes you have, if they contain a
> few BIG files, TSM has a better chance of pushing the data to LTO fast,
> because it doesn't have to make database updates as it goes.
>
> - many people report better throughput on BIG files when writing direct
> from the client to tape, not the disk pool.
>
> This should give people LOTS of targets to respond to!!

Thanks for your detailed response. From my standpoint, I guess I should allocate large volumes in a number that slightly exceeds the number of client streams I anticipate. I am also running multiple streams from the same client at the same time. But you answered my overall question of many-small versus few-big, and I think I have enough information to cobble together a nice architecture here.

Thank you again.

Mitch

> Wanda Prather
> Johns Hopkins University Applied Physics Laboratory
> 443-778-8769
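P.S. For my own planning I sketched the sizing arithmetic as a little script. This is just back-of-the-envelope math from the rules of thumb above (one day's backups in the pool, one volume per concurrent session, ~3 volumes per spindle); nothing here is TSM code, and all the numbers are made up:

```python
# Back-of-the-envelope disk pool sizing, based on the rules of thumb
# in this thread: the pool should hold at least one day's backups,
# volume count should cover the concurrent client sessions (so TSM can
# start parallel I/Os), capped at ~3 volumes per physical spindle to
# avoid thrashing. All figures are hypothetical.

def plan_disk_pool(daily_backup_gb, concurrent_sessions, spindles,
                   max_vols_per_spindle=3):
    """Return (volume_count, volume_size_gb) for the staging pool."""
    # Enough volumes for one I/O per session, but no more than the
    # spindles can reasonably carry.
    volumes = min(max(concurrent_sessions, spindles),
                  spindles * max_vols_per_spindle)
    # Size each volume so the whole pool holds a full day's backups.
    size_gb = -(-daily_backup_gb // volumes)   # ceiling division
    return volumes, size_gb

# Example: 120 GB/night, 8 concurrent sessions, 4 JBOD spindles.
vols, size = plan_disk_pool(120, 8, 4)
print(vols, size)   # → 8 volumes of 15 GB each
```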
