Hi all-
For quite some time now, I have been trying to track down an elusive bottleneck in my TSM environment relating to disk-to-tape performance. This is a long read, but I would be very grateful for any suggestions. Hopefully some of you folks out there who are much smarter than me will be able to point me in the right direction. If any other LTO3 or LTO4 users could give me some examples of their real-world performance, along with a little detail on their config, that would be most helpful as well!

My current environment consists of:

* TSM server = p570 LPAR w/4 1.9GHz processors and 8GB RAM, (6) 2Gb HBAs (2 for disk and 4 for tape traffic), and a 10Gb Ethernet adapter.
* TSM 5.4.1.2 on AIX 5.3 TL6
* 3584 w/14 LTO3 drives at primary site
* 3584 w/12 LTO1 drives at DR/hotsite (copypool volumes are written directly to this library via SAN routing)
* DB (80GB -- 4GB DBVOL size) residing on IBM DS8300 behind IBM SVC
* Log (11GB -- single LOGVOL) residing on IBM DS8300 behind IBM SVC
* Primary storage pool in question (2.5TB -- 20GB volume size), DISK device class, residing on IBM DS8300 behind IBM SVC

I currently back up about 4.5TB / night, of which ~2TB is written directly to my primary LTO3 tape pool with a simultaneous write to my copypool across town. So, each morning I'm left with about 2.5TB of data to copy and migrate from my disk pool(s) to copypool and onsite tape respectively.

My backup stg performance to LTO1 tape (copypool) is about what I would expect. I run 5 threads for this process (5 mount points used), and I consistently average 20-25MB/sec/drive. Fair enough. I don't know of anyone getting a whole lot more than that out of an LTO1 drive.

After that is complete, I then migrate that data to my LTO3 tape here onsite. That performance is pretty lousy compared to what I would expect to get out of LTO3. I run 6 migration threads (6 mount points used), and I average around 25MB/sec/drive going to LTO3 as well.
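For reference, the figures above work out roughly as follows. This is only a back-of-the-envelope sketch in Python: the function names are made up for illustration, 1TB is treated as 1,000,000MB, and the ~200MB/sec payload rate for a 2Gb Fibre Channel link is an assumed nominal value, not something measured here.

```python
# Rough arithmetic on the throughput figures quoted in this post.
# Assumptions (not measurements): decimal units, and ~200 MB/s of
# usable payload per 2Gb Fibre Channel HBA.

MB_PER_TB = 1_000_000
ASSUMED_2GB_HBA_MB_PER_SEC = 200


def aggregate_mb_per_sec(drives: int, mb_per_sec_per_drive: float) -> float:
    """Total throughput when every drive streams at the same rate."""
    return drives * mb_per_sec_per_drive


def hours_to_move(terabytes: float, mb_per_sec: float) -> float:
    """Hours needed to move the given amount of data at a steady rate."""
    return terabytes * MB_PER_TB / mb_per_sec / 3600


if __name__ == "__main__":
    # Migration to LTO3: 6 mount points at about 25 MB/s each.
    migration_rate = aggregate_mb_per_sec(6, 25)
    print(f"migration aggregate: {migration_rate:.0f} MB/s")
    print(f"2.5 TB migration:    {hours_to_move(2.5, migration_rate):.1f} h")

    # Copypool to LTO1: 5 mount points at 20-25 MB/s each.
    for per_drive in (20, 25):
        rate = aggregate_mb_per_sec(5, per_drive)
        print(f"2.5 TB backup stg at {per_drive} MB/s/drive: "
              f"{hours_to_move(2.5, rate):.1f} h")

    # Compare the migration aggregate with one assumed 2Gb HBA.
    share = migration_rate / ASSUMED_2GB_HBA_MB_PER_SEC
    print(f"share of one 2Gb HBA: {share:.0%}")
```

With these inputs the six migration streams total 150MB/sec, which is under the assumed ceiling of a single 2Gb HBA, let alone the four dedicated to tape traffic.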
All SAN links between the TSM server and the LTO3 drives are a minimum of 2Gb, so that is my lowest common denominator. I've tried using fewer threads to see if perhaps I was saturating an HBA rather than the drive. Same speed. I've tried separating my DB and STG pools on different storage subsystems. Same speed.

I've opened PMRs with IBM support, and they have pored over all of my TSM server settings / config and found nothing to go on. We've had IBM ATS teams evaluate the situation, and they've never been able to pinpoint a problem. I've tried various tools (tapewrite, nmon, filemon, etc.) and I've not found a smoking gun.

At this point, my gut is that SVC is the bottleneck, but for those of you familiar with SVC, you know that trying to obtain meaningful performance statistics on the SVC cluster itself is frustrating.

I know there are folks out there getting much better performance out of LTO3 drives, so please tell me how you're doing it! Suggestions? Questions?

Thank you!
-Kevin
