The most serious problem we have encountered is the effect of reclamation on backup throughput. We have a 1 GB backup network that is used for servers to write directly to the Data Domain, bypassing TSM. When only one client is writing directly to the DD we see backup network utilization around 85%. When reclamation is running the client gets 5% backup network utilization. I've cancelled reclamation and watched the client throughput increase, then drop again when autoreclamation restarts. We now only run reclamation when the client's 1 TB daily backup is complete (a 4- to 5-hour process).
Collocation will increase the number of files used to store data and to be reclaimed. If you can run reclamation when no other processing is running there should be no impact, but watch your network stats to verify.

Side comment: We run NDMP backups across fiber to a VTL on the Data Domain. There is no effect on backup network utilization when the NDMP backups are running. I'm still puzzled by this. Reclamation seems to use enough DD resources to slow backup network data ingestion, but NDMP backups running with a higher throughput don't use enough DD processing power to slow (or even affect) a direct write by a client over Ethernet.

Jim Schneider

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:[email protected]] On Behalf Of Richard Rhodes
Sent: Thursday, April 19, 2012 8:28 AM
To: [email protected]
Subject: [ADSM-L] DataDomain and dedup per node

Hi Everyone,

As we have been implementing our two new DD boxes we have been setting them up like our existing two DD boxes: file devices with the pool NOT collocated. This is what DD recommends and it seems to work very well this way.

But, I've been thinking about collocating anyway! I was poking around the DD command line and found that you can get the dedup/compression information for any individual directory or file.
For example, below are the dedup/comp factors for a file volume in a pool with one node I'm testing with:

rsbkup:/tsmdata/tsm_scripts==>./run_cmd.ksh tsm2 "q nodedata WVLOGS01P" | grep isdd2260
WVLOGS01p  /isdd2260/tsm2/test/0002267E.BFS  TEST-PRI-ISDD2260  30,551.83
WVLOGS01P  /isdd2260/tsm2/test/0002267F.BFS  TEST-PRI-ISDD2260  30,621.15
WVLOGS01P  /isdd2260/tsm2/test/00022680.BFS  TEST-PRI-ISDD2260  30,601.55
WVLOGS01P  /isdd2260/tsm2/test/00022682.BFS  TEST-PRI-ISDD2260  30,604.08
WVLOGS01P  /isdd2260/tsm2/test/00022683.BFS  TEST-PRI-ISDD2260  30,620.86
WVLOGS01P  /isdd2260/tsm2/test/00022684.BFS  TEST-PRI-ISDD2260   4,731.24

rsbkup:/tsmdata/tsm_scripts==>./run_cmd.ksh tsm2 "q vol /isdd2260/tsm2/test/0002267E.BFS"
/isdd2260/tsm2/test/0002267E.BFS  TEST-PRI-ISDD2260  TEST  30.6 G  100.0  Full

sysadmin@isdd2260# filesys show compression /data/col1/tsm2/test/0002267e.bfs
Total files: 1;  bytes/storage_used: 4.6
       Original Bytes:   32,332,636,620
  Globally Compressed:   30,695,597,675
   Locally Compressed:    6,930,888,022
            Meta-data:       98,615,480

In this case, this vol is getting a 4.6x overall dedup/comp factor.

So, if I collocate the pool in TSM I should be able to use "q nodedata <node>" to get a list of vols used by a node, then I can query the DD to get the dedup/comp stats for that node. A little scripting and I can generate a report of dedup/comp ratios by TSM node. This would help us determine which nodes make sense to put/keep on the DD.

Just curious if anyone is using collocation for a DD file pool? To do so would use more volumes and more filling volumes, but I can't think of any real reason to not collocate.

Rick
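The per-node report Rick describes could be sketched roughly as follows. This is only an illustration in Python, not anything from the thread: the function names are hypothetical, the parsing assumes the output looks exactly like the "q nodedata" and "filesys show compression" samples quoted above, and the formula (original bytes divided by locally compressed bytes plus meta-data) is an assumption that happens to reproduce the 4.6 shown for the sample volume. Actually running the commands against TSM and the DD, and mapping TSM volume names to DD paths, is site-specific and left out.

```python
import re

# Sample text copied from the "filesys show compression" output quoted above.
SAMPLE_COMPRESSION = """\
Total files: 1;  bytes/storage_used: 4.6
       Original Bytes:   32,332,636,620
  Globally Compressed:   30,695,597,675
   Locally Compressed:    6,930,888,022
            Meta-data:       98,615,480
"""

FIELDS = ("Original Bytes", "Globally Compressed",
          "Locally Compressed", "Meta-data")


def parse_compression(text):
    """Extract the four byte counts from compression output text."""
    stats = {}
    for name in FIELDS:
        match = re.search(re.escape(name) + r":\s*([\d,]+)", text)
        if match is None:
            raise ValueError(f"missing field: {name}")
        stats[name] = int(match.group(1).replace(",", ""))
    return stats


def overall_factor(stats):
    """Assumed formula: original bytes / (locally compressed + meta-data)."""
    stored = stats["Locally Compressed"] + stats["Meta-data"]
    return stats["Original Bytes"] / stored


def node_factor(per_volume_stats):
    """Aggregate over a node's volumes: total original / total stored."""
    original = sum(s["Original Bytes"] for s in per_volume_stats)
    stored = sum(s["Locally Compressed"] + s["Meta-data"]
                 for s in per_volume_stats)
    return original / stored


def volumes_from_nodedata(text):
    """Return volume paths from grep-filtered 'q nodedata' lines.

    Assumes four whitespace-separated columns per line:
    node, volume, storage pool, space occupied.
    """
    volumes = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[1].startswith("/"):
            volumes.append(parts[1])
    return volumes


if __name__ == "__main__":
    stats = parse_compression(SAMPLE_COMPRESSION)
    print(f"overall factor: {overall_factor(stats):.1f}x")
```

Summing bytes across volumes before dividing, rather than averaging the per-volume factors, keeps a small, partly filled volume from skewing the node's figure.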
