Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for primary pool
Like it says in the document, it's a recommendation and not a technical limit. However, having the server running at 100% utilization all the time doesn't seem like a healthy scenario. Why aren't you deduplicating files larger than 1 GB? In my experience, data files from SQL, Exchange and the like have a very high dedup ratio, while TSM's deduplication skips files smaller than 2 KB. I have a customer up north who used this configuration on an HP EVA-based box with SATA disks. The disks were breaking down so fast that the arrays within the box were in a constant rebuild phase. HP claimed it was TSM dedup that was breaking the disks (they actually claimed TSM was writing so often that the disks broke), a scenario I find very hard to believe.

Best Regards

Daniel

Daniel Sparrman
Exist i Stockholm AB
Tel: 08-754 98 00
Fax: 08-754 97 30
daniel.sparr...@exist.se
http://www.existgruppen.se
Posthusgatan 1 761 30 NORRTÄLJE

-----"ADSM: Dist Stor Manager" ADSM-L@VM.MARIST.EDU wrote: -----

To: ADSM-L@VM.MARIST.EDU
From: Colwell, William F. bcolw...@draper.com
Sent by: ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU
Date: 09/28/2011 20:43
Subject: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for primary pool

Hi Daniel, I remember hearing about a 6 TB limit for dedup in a webinar or conference call, but what I recall is that it was a daily throughput limit. In the same section of the redbook as you quote is this paragraph:

"Experienced administrators already know that Tivoli Storage Manager database expiration was one of the more processor-intensive activities on a Tivoli Storage Manager server. Expiration is still processor intensive, albeit less so in Tivoli Storage Manager V6.1, but this is now second to deduplication in terms of consumption of processor cycles. Calculating the MD5 hash for each object and the SHA1 hash for each chunk is a processor-intensive activity."

I can say this is absolutely correct; my processor is frequently running at or near 100%.
I have gone way beyond 6 TB of storage for dedup storage pools, as this SQL shows for the 2 instances on my server:

select cast(stgpool_name as char(12)) as Stgpool,
       cast(sum(num_files) / 1024 / 1024 as decimal(4,1)) as Mil_Files,
       cast(sum(physical_mb) / 1024 / 1024 as decimal(4,1)) as Physical_TB,
       cast(sum(logical_mb) / 1024 / 1024 as decimal(4,1)) as Logical_TB,
       cast(sum(reporting_mb) / 1024 / 1024 as decimal(4,1)) as Reporting_TB
from occupancy
where stgpool_name in (select stgpool_name from stgpools where deduplicate = 'YES')
group by stgpool_name

Stgpool       Mil_Files  Physical_TB  Logical_TB  Reporting_TB
------------  ---------  -----------  ----------  ------------
BKP_2             368.0          0.0        30.0          95.8
BKP_2X            341.0          0.0        23.9          58.6

Stgpool       Mil_Files  Physical_TB  Logical_TB  Reporting_TB
------------  ---------  -----------  ----------  ------------
BKP_2             224.0          0.0        35.7          74.1
BKP_FS_2           49.0          0.0        21.0          45.5

Also, I am not using any random disk pool; all the disk storage is scratch-allocated file-class volumes. There is also a tape library (LTO5) for files larger than 1 GB, which are excluded from deduplication.

Regards,

Bill Colwell
Draper Lab

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Daniel Sparrman
Sent: Wednesday, September 28, 2011 3:49 AM
To: ADSM-L@VM.MARIST.EDU
Subject: Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for primary pool

To be honest, it doesn't really say. The information is from the Tivoli Storage Manager Technical Guide:

"Note: In terms of sizing Tivoli Storage Manager V6.1 deduplication, we currently recommend using Tivoli Storage Manager to deduplicate up to 6 TB total of storage pool space for the deduplicated pools. This is a rule of thumb only and exists solely to give an indication of where to start investigating VTL or filer deduplication. The reason that a particular figure is mentioned is for guidance in typical scenarios on commodity hardware. If more than 6 TB of real disk space is to be deduplicated, you can either use Tivoli Storage Manager or a hardware deduplication device."
"The 6 TB is in addition to whatever disk is required by non-deduplicated storage pools. This rule of thumb will change as processor and disk technologies advance, because the recommendation is not an architectural, support, or testing limit."

http://www.redbooks.ibm.com/redbooks/pdfs/sg247718.pdf

I'm guessing it's server-side, since client-side shouldn't use any resources at the server.
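Bill's observation about hash cost (the redbook's MD5 per object plus SHA-1 per chunk) can be illustrated with a short sketch. This is not TSM code: the fixed 64 KB chunk size and the helper name are assumptions for illustration only, since TSM's real dedup uses variable-sized, content-defined chunks.

```python
import hashlib

# Illustrative fixed chunk size; TSM actually uses variable,
# content-defined chunk boundaries (assumption for this sketch).
CHUNK_SIZE = 64 * 1024

def hash_object(data, chunk_size=CHUNK_SIZE):
    """One MD5 over the whole object, plus one SHA-1 per chunk --
    the two hash passes the redbook describes as processor intensive."""
    object_md5 = hashlib.md5(data).hexdigest()
    chunk_sha1s = [hashlib.sha1(data[i:i + chunk_size]).hexdigest()
                   for i in range(0, len(data), chunk_size)]
    return object_md5, chunk_sha1s

# A 200 KB object yields one object-level MD5 and four chunk-level SHA-1s.
md5sum, sha1s = hash_object(b"x" * 200 * 1024)
print(md5sum[:8], len(sha1s))
```

Every byte is hashed twice (once for the object digest, once inside a chunk digest), which is why a busy ingest stream keeps the processor pinned.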
Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for primary pool
I'm not fully aware of how the DD replicates data, but if you have 15-20 TB/day being written to your main DD, and that data is then replicated to the off-site DD, how much data is actually replicated? With a 1 Gb/s connection you could hit values up to 360 GB per hour (assuming 100 MB/s, which should be theoretically possible, but it's usually lower than that on a 1 Gb/s connection), which means about 8.6 TB per 24 hours. So the data is both deduplicated and compressed before you send it offsite? Does the DD do the dedup within the same box, or does it require a separate box for dedup? You're also running the same risk as the previous poster: you're relying entirely on the fact that your DD setup won't break. Is this how the DD is sold? (Buy 2 DDs, replicate between them and you're safe?) I know it's (as the previous poster stated) always a question of costs vs. mitigating risks, but if I got to choose, I'd rather have fast restores from my main site and slow restores from my off-site, as long as I can restore the data, instead of fast from main and fast from off-site but with a chance I might not be able to restore at all.
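Daniel's link arithmetic checks out; a quick sketch of the numbers (100 MB/s is his assumed effective rate, below the ~125 MB/s theoretical line rate of a 1 Gb/s link):

```python
# Back-of-the-envelope replication throughput on a 1 Gb/s link.
LINK_GBPS = 1.0
theoretical_mb_per_s = LINK_GBPS * 1000 / 8     # ~125 MB/s raw line rate
assumed_mb_per_s = 100                          # the mail's assumed real-world rate

gb_per_hour = assumed_mb_per_s * 3600 / 1000    # 360 GB per hour
tb_per_day = gb_per_hour * 24 / 1000            # ~8.6 TB per 24 hours
print(theoretical_mb_per_s, gb_per_hour, round(tb_per_day, 2))
```

So a raw 15-20 TB/day stream would not fit on the link; it only keeps up because the DD replicates post-dedup, post-compression data.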
If DD claims they have data invulnerability I'd really like to see how they hit 100% protection, since it would be the first system in the world to actually have managed to secure that last 0.0001% risk ;) RAID was usually secure until someone made an error, put in a blank disk and forgot to rebuild :)

Best Regards

Daniel

Daniel Sparrman
Exist i Stockholm AB
Tel: 08-754 98 00
Fax: 08-754 97 30
daniel.sparr...@exist.se
http://www.existgruppen.se
Posthusgatan 1 761 30 NORRTÄLJE

-----"ADSM: Dist Stor Manager" ADSM-L@VM.MARIST.EDU wrote: -----

To: ADSM-L@VM.MARIST.EDU
From: Shawn Drew shawn.d...@americas.bnpparibas.com
Sent by: ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU
Date: 09/28/2011 22:26
Subject: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for primary pool

We average between 15-20 TB/day at our main site, and that goes directly to a single DD890 (no random pool): single pool, file devclass, NFS mounted on 2x10Gb crossover connections. It replicates over a 1 Gb WAN link to another DD890. (I spent all the money on the DD boxes; I didn't have enough left over for 10Gb switches!) That other DD890 backs up another 7-10 TB/day, replicating to the main site (bi-directional replication). All with file devclasses, and there is not more than a one-hour lag in replication by the time I show up in the morning. TSM doesn't have to do replication or backup stgpools anymore, so I can actually afford to do full db backups every day now. (I was doing an incremental scheme before.) IBM has a similar recommended configuration with their ProtecTIER solution, so they do support a single-pool, backend-replication solution. Data Domain also claims that data invulnerability, which should catch any data corruption issue as soon as the data is written, and not later, when you try to restore.
Regards,

Shawn Drew

Internet daniel.sparr...@exist.se Sent by: ADSM-L@VM.MARIST.EDU 09/28/2011 02:13 AM
Please respond to ADSM-L@VM.MARIST.EDU
To: ADSM-L
Subject: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for primary pool

How many TB of data is common in this configuration? In a large environment, where databases are 5-10 TB each and you have a demand to back up 5-10-15-20 TB of data each night, this would require you to have 10 Gb/s for every host, something that would also cost a pretty penny, especially since the DD needs to be configured with the throughput to write all those TB within a limited amount of time. Does the DD do dedup within the same box (meaning, can I have one box that handles normal storage and does dedup) or do I need a second box? The same issue also arises with the file pool: you're moving a lot of data around completely unnecessarily every day when you do reclaim. If I'm right, it also sounds like (from your description in the previous mails) you're not only using the DD for TSM storage. That sounds like putting all the eggs in the same basket.

Best Regards

Daniel

Daniel Sparrman
Exist i Stockholm AB
Tel: 08-754 98 00
Fax: 08-754 97 30
daniel.sparr...@exist.se
http://www.existgruppen.se
Posthusgatan 1 761 30 NORRTÄLJE

-----"ADSM: Dist Stor Manager" ADSM-L@VM.MARIST.EDU wrote: -----

To: ADSM-L@VM.MARIST.EDU
From: Allen S. Rout a...@ufl.edu
Sent by: ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU
Date: 09/27/2011 18:55
Subject: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for primary pool

On 09/27/2011 12:02 PM, Rick Adamson wrote:

> The bigger question I have is, since the file based storage is native to TSM, why exactly is using file based storage not supported?

Not supported by what? If you've got a DD, then the simplest way to connect it to TSM is via files. Some backup apps require
Re: vtl versus file systems for primary pool
On Sep 29, 2011, at 12:30 AM, Daniel Sparrman wrote:

> I'm not fully aware of how the DD replicates data, but if you have 15-20 TB/day being written to your main DD, and that data is then replicated to the off-site DD, how much data is actually replicated? With a 1 Gb/s connection, you could hit values up to 360 GB per hour (assuming 100 MB/s, which should be theoretically possible, but it's usually lower than that on a 1 Gb/s connection), which means about 8.6 TB per 24 hours. So the data is both deduplicated and compressed before you send it offsite?

It's certainly de-duped before being replicated; it's probably compressed as well, but that's less obvious to me.

> Does the DD do the dedup within the same box, or require a separate box for dedup?

Same box, as an in-line process. They're very proud of that.

Nick

Daniel Sparrman
Exist i Stockholm AB
Tel: 08-754 98 00
Fax: 08-754 97 30
daniel.sparr...@exist.se
http://www.existgruppen.se
Posthusgatan 1 761 30 NORRTÄLJE
Preschedule problem with Essbase backup
Hi TSMers,

I have a preschedule command that runs two cmd files to put an Essbase database in read-only mode and then dump the database to a flat file. When the first command file is finished, it must start another command file to do the same, just on a different database. But when the first command finishes, the backup starts without the second cmd file being executed. Can anyone tell me where the problem is?

Example: startesscmd BackupBUD_EP.scr

BackupBUD_EP.scr:
LOG x x x;
BEGIN ARCHIVE BUD_EP JointVen BUD_JVResult;
SELECT 'BUD_EP JointVen ;
EXPORT bud_JV.txt 2 1;
BEGIN ARCHIVE BUD_EP Operator BUD_OPResult;
SELECT 'BUD_EP Operator ;
EXPORT bud_op.txt 2 1;
BEGIN ARCHIVE BUD_EP EP BUD_EPResult;
SELECT 'BUD_EP EP ;
EXPORT bud_ep.txt 2 1;
EXIT;

Regards

Bo Nielsen
Senior Technology Consultant, UNIX Server, SAN and Backup
DONG Energy
Klædemålet 9
2100 København Ø
Tlf. +45 99 55 54 34
bo...@dongenergy.dk
www.dongenergy.com
Ang: Re: [ADSM-L] vtl versus file systems for primary pool
Yepp, we have the same thing with our Sepaton; all deduplication is done inline. The reason I asked is that there seem to be other manufacturers who need a second box to do deduplication.

Regards

Daniel

Daniel Sparrman
Exist i Stockholm AB
Tel: 08-754 98 00
Fax: 08-754 97 30
daniel.sparr...@exist.se
http://www.existgruppen.se
Posthusgatan 1 761 30 NORRTÄLJE

-----"ADSM: Dist Stor Manager" ADSM-L@VM.MARIST.EDU wrote: -----

To: ADSM-L@VM.MARIST.EDU
From: Nick Laflamme dplafla...@gmail.com
Sent by: ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU
Date: 09/29/2011 13:34
Subject: Re: [ADSM-L] vtl versus file systems for primary pool
Re: Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for primary pool
> So the data is both deduplicated and compressed before you send it offsite?

Yes, that is how the DD handles replication. DD is an inline dedup system. When data comes into the DD it is deduped, what is left is compressed, and then it is written to disk. Only the new unique data is replicated (yes, there must be metadata, and new unique dedup hashes must be sent somehow). In general, the replication data stream reflects the dedup/compression ratio.

> Does the DD do the dedup within the same box, or require a separate box for dedup?

A DD is nothing more than a powerful PC server with lots of memory, SATA disks, and a Linux OS. The secret sauce is the code to handle dedup, compression, replication, NFS, CIFS, VTL, the log-structured filesystem, snapshots, etc.

> You're also running with the same risk as the previous poster, you're relying entirely on the fact that your DD setup won't break.

There is a security in tape's pieces/parts: a drive can fail but the rest keep running; a cartridge can get chewed up but it's only one cartridge. (We have two DDs, but also still have two large 3584 libraries.) If a DD were to have a complete meltdown, all backups on it are gone. This is true, and something you have to come to grips with if moving to any disk-based backup system. As has been mentioned, it's a question of risk and cost. You could have dual onsite DDs, with one for the primary pool and a second for a TSM copy pool, but that doubles your cost. I will say that from what I see of our DDs, DD put a lot of time and effort into making the box highly reliable. Now, we implemented ours with a front-end disk pool. The main reason is that we still wanted backups not to rely directly on the availability of the DD. If the DD is down for some reason (code upgrade, broken processor, etc.) then backups still run.

> Is this how the DD is sold? (Buy 2 DD's, replicate between them and you're safe)?

You can run two DDs and use the built-in replication.
You can also use it as just a primary pool with a normal copy pool on tape. A DD (or any dedup system) doesn't change TSM, but it makes you think hard about how you configure and run TSM.

> If DD claims they have data invulnerability I'd really like to see how they hit 100% protection, since it would be the first system in the world to actually have managed to secure that last 0.0001% risk ;) RAID usually was secure until someone made an error, put in a blank disk and forgot to rebuild :)

Agreed. Ask the vendors for their stats on data loss events! Don't believe what they say, but ask anyway. I have to say I am impressed with our DDs (ouch, that hurt! It also shows that EMC didn't design it). It runs its own log-based filesystem (new data is always appended at the end, not updated in place), which requires periodic (weekly) compactions. It has snapshots. It has checksums built in, and runs on RAID 6. Remember that since it's inline dedup/compression, it doesn't put as high an I/O load on the actual spindles as a straight filesystem would. They truly did design it to make sure your data is safe. Of course, all it takes is a firmware bug to destroy everything! What we decided is that a major data loss event on the DD will trigger a disaster situation for the TSM system.

Rick

-----"ADSM: Dist Stor Manager" ADSM-L@VM.MARIST.EDU wrote: -----

To: ADSM-L@VM.MARIST.EDU
From: Shawn Drew shawn.d...@americas.bnpparibas.com
Sent by: ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU
Date: 09/28/2011 22:26
Subject: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for primary pool
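Richard's description of the inline pipeline (fingerprint each chunk, compress and store only chunks not seen before, replicate only those new chunks) can be sketched as a toy model. Fixed-size chunking, SHA-1 fingerprints and zlib are assumptions for illustration, not how a Data Domain box actually works internally:

```python
import hashlib
import zlib

class InlineDedupStore:
    """Toy inline dedup store: only never-before-seen chunks are
    compressed, stored, and would need to be replicated."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # fingerprint -> compressed chunk bytes

    def ingest(self, data):
        """Return the fingerprints that were new, i.e. the only data
        that would need to cross the replication link."""
        new = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            fp = hashlib.sha1(chunk).hexdigest()
            if fp not in self.chunks:
                self.chunks[fp] = zlib.compress(chunk)
                new.append(fp)
        return new

dd = InlineDedupStore()
print(len(dd.ingest(b"A" * 8192)))                # two identical chunks -> 1 new
print(len(dd.ingest(b"A" * 4096 + b"B" * 4096)))  # only the "B" chunk is new
```

This is why, as Richard notes, the replication stream roughly tracks the dedup/compression ratio: repeated data never leaves the box twice.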
Re: Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for primary pool
Richard, excellent comments! I will add that to TSM it is all just storage: TSM has no idea about the deduplication, compression, etc. that the DD performs, which makes it challenging to determine the actual storage utilization from an individual client and/or filespace perspective. Secondly, aside from the preformatted daily system report (autosupport), which is not customizable, getting reporting out of the DD can be a little challenging, to say the least.

~Rick

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Richard Rhodes
Sent: Thursday, September 29, 2011 9:18 AM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for primary pool
Re: Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for primary pool
The elephants in the room:

It is tempting, once DD gets in the door, to move all database backups (the typical TDP/RMAN and SQLLiteSpeed stuff) to go directly to the DD. (No TSM involved, so save money on licenses?)

Combinations that have more advanced communications with the back-end storage (OST / Boost / Avamar+DD) may be able to get hints about what is already stored on the dedupe device? It seems unlikely that TSM will gain any features like this any time soon. (NDMP? VTL? These features are pretty dated.)

Is TSM 6 not losing data via dedupe this week?

How problematic is many TB of data in a file class on file systems when it comes time to do a fsck after a system crash?

[RC]

On Sep 27, 2011, at 03:06 PM, Prather, Wanda wprat...@icfi.com wrote:

Actually I have more customers using Data Domains without the VTL license than with it. With a Windows TSM server, you can just write to it via TCP/IP using a CIFS share (an NFS mount with an AIX TSM server). If you have sufficient TCP/IP bandwidth for your load, no fibre connections are needed. From the TSM point of view, you configure it as a file pool. You get the benefits of dedup and (if you have a second one at your DR site) replication. Neither good nor bad, just different. Very simple setup; works great if it meets your throughput requirements.

W

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Daniel Sparrman
Sent: Tuesday, September 27, 2011 2:49 PM
To: ADSM-L@VM.MARIST.EDU
Subject: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for primary pool

The fact that you actually need to pay for a VTL license is just plain scary. When you bought it, did they think you were going to use it as a fileserver? I'm not too specialized in Data Domain, but aren't they marketed as backup hardware? So you get a disk, but if you want to use it for anything other than that, you need to pay a license? Sorry for sounding bitter, but I've always heard people referring to Data Domain as a VTL.
Daniel Sparrman
Exist i Stockholm AB
Tel: 08-754 98 00
Fax: 08-754 97 30
daniel.sparr...@exist.se
http://www.existgruppen.se
Posthusgatan 1 761 30 NORRTÄLJE

-----"ADSM: Dist Stor Manager" ADSM-L@VM.MARIST.EDU wrote: -----

To: ADSM-L@VM.MARIST.EDU
From: Allen S. Rout a...@ufl.edu
Sent by: ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU
Date: 09/27/2011 18:55
Subject: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for primary pool

On 09/27/2011 12:02 PM, Rick Adamson wrote:

> The bigger question I have is since the file based storage is native to TSM why exactly is using a file based storage not supported?

Not supported by what? If you've got a DD, then the simplest way to connect it to TSM is via files. Some backup apps require something that looks like a library, in which case you'd be buying the VTL license. FWIW, if you're already in DD space, you're paying a pretty penny. The VTL license isn't chicken feed, I agree, but it's not a major component of the total cost.

- Allen S. Rout
Re: vtl versus file systems for primary pool
Good questions. We are currently working on a project that is using ProtecTIER. The ProtecTIER 7650G does dedup; it looks like a TS3500 with LTO drives. We will be getting another 7650G at a second data center. The idea is to cross-replicate between the data centers.

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of robert_clark
Sent: Thursday, September 29, 2011 11:34 AM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for primary pool
Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for primary pool
The elephant has left the building. Do you get the same advanced features by just dumping data onto a DD as you do with the TSM TDP clients? Exmerge anyone? Or perhaps an SQL dump? Still have todo filebackups, or wait, why not just use robocopy and copy it onto the DD? Or what the heck, just place the fileserver on the DD. That way you dont have todo backups, the data is already on the DD. As for TSM loosing data, what tells you that the DD dedup algorithm never lost data? I bet I can prove you wrong. Well, when the DD hits the wall, at least you wont have todo a fsck, since there wont be anything left that needs an fsck. DD replication = not application aware = not detecting software-based discrepencies. That's why I'd never replace TSM's backup storage pool or the copypools feature with a replicated solution. If you're OK with replication, why dont you just mirror the solution (if you want the errors to hit both the boxes at the same time, make sure to use synchronous mirroring and not async, god knows, with async you might not get the error mirrored in time). It's ok to make it easy, but when the shit hits the fan, make sure you actually know what you sacrificed (having I destroyed a datacenter in your resume probably wont make it easier to find a new job).k Scary *schruggs* Daniel Sparrman Exist i Stockholm AB Växel: 08-754 98 00 Fax: 08-754 97 30 daniel.sparr...@exist.se http://www.existgruppen.se Posthusgatan 1 761 30 NORRTÄLJE -ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU skrev: - Till: ADSM-L@VM.MARIST.EDU Från: robert_clark robert_cl...@mac.com Sänt av: ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU Datum: 09/29/2011 19:34 Ärende: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool The elephants in the room: It is tempting, once DD gets in the door, to move all database backups (the typical TDP/RMAN and SQLLiteSpeed stuff) to go directly to DD. (No TSM involved, so save money on licenses?) 
Combinations that have more advanced communications with the back end storage (OST / Boost / Avarmar+DD) may be able to get hints about what is already stored on the dedupe device? Seems unlikey that TSM will gain any features like this any time soon. (NDMP? VTL? these feature are pretty dated.) Is TSM 6 not losing data via dedupe this week? How problematic is many TB of data on fileclass on file systems when it comes time to do a fsck after a system crash? [RC] On Sep 27, 2011, at 03:06 PM, Prather, Wanda wprat...@icfi.com wrote: Actually I have more customers using Data Domains without the VTL license than with it. With a Windows TSM server, you can just write to it via TCP/IP using a CIFS share(NFS mount with an AIX TSM server). If you have sufficient TCP/IP bandwidth for your load, no fibre connections needed. From the TSM point of view, you configure it as a file pool. You get the benefits of dedup and (if you have a 2nd one at your DR site) replication. Neither good or bad, just different. Very simple setup, works great if it meets your throughput requirements. W -Original Message- From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Daniel Sparrman Sent: Tuesday, September 27, 2011 2:49 PM To: ADSM-L@VM.MARIST.EDU Subject: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool The fact you actually need to pay a VTL license is just plain scary. When u bought it, did they think you're gonna use it as a fileserver? I'm not to specialized into Data Domain, but arent they marketed as backup hardware? So you get a disk, but if you want to use it for something else than that, you need to pay a license? Sorry for sounding bitter, but I've always heard people referring to Data Domain as a VTL. 
Daniel Sparrman
Exist i Stockholm AB
Tel: 08-754 98 00 Fax: 08-754 97 30
daniel.sparr...@exist.se
http://www.existgruppen.se
Posthusgatan 1 761 30 NORRTÄLJE

-ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU wrote: -
To: ADSM-L@VM.MARIST.EDU
From: Allen S. Rout a...@ufl.edu
Sent by: ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU
Date: 09/27/2011 18:55
Subject: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool

On 09/27/2011 12:02 PM, Rick Adamson wrote:

The bigger question I have is: since file-based storage is native to TSM, why exactly is using file-based storage not supported?

Not supported by what? If you've got a DD, then the simplest way to connect it to TSM is via files. Some backup apps require something that looks like a library, in which case you'd be buying the VTL license.

FWIW, if you're already in DD space, you're paying a pretty penny. The VTL license isn't chicken feed, I agree, but it's not a major component of the total cost.

- Allen S. Rout
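The file-pool setup Wanda and Allen describe can be sketched in TSM server commands. This is only a minimal illustration, not a tuned configuration: the mount path, device-class name, pool name, and sizing values are all hypothetical and would need to match your own DD share and workload.

```
/* Assumes the DD CIFS/NFS share is already mounted at /dd/tsmpool (hypothetical path) */
define devclass ddfile devtype=file maxcapacity=50g mountlimit=64 directory=/dd/tsmpool

/* Sequential-access pool backed by scratch FILE volumes on the DD share */
define stgpool ddpool ddfile maxscratch=500

/* Point policy or migration at DDPOOL as usual; the DD dedupes behind the scenes */
```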
Backing up EMC SourceOne
Anyone out there backing up a multiple-server SourceOne configuration? This is the replacement product for EmailXtender. There is a script to run as a preschedule command to set SourceOne up for backup. And then you also have to back up the other servers in the SourceOne configuration while this is 'Paused'. Any help would be greatly appreciated!

Bill Boyer
Free Tip: The F1 Key does NOT destroy your PC!
Re: Backing up EMC SourceOne
Bill,

We have one SourceOne server using a database on a separate server. We run the job sequence using BMC's Control-M scheduler.

SourceOne server:
1) Activity Suspend vbs
2) Native Archive Suspend vbs

Database server:
3) Database export
4) Archive of export

SourceOne server:
5) Archive of the SourceOneMessageCenter and SourceOneIndex directories
6) Native Archive Resume vbs
7) Activity Resume vbs

The entire process takes about 20 minutes. I don't know if this is what you're looking for, but I hope it helps.

Jim Schneider
United Stationers

-Original Message-
From: ADSM: Dist Stor Manager [mailto:ADSM-L@vm.marist.edu] On Behalf Of Bill Boyer
Sent: Thursday, September 29, 2011 2:15 PM
To: ADSM-L@vm.marist.edu
Subject: [ADSM-L] Backing up EMC SourceOne

Anyone out there backing up a multiple server SourceOne configuration? This is the replacement product for EmailXtender. There is a script to run preschedule to set SourceOne up for backup. And then you also have to backup the other servers in the SourceOne configuration while this is 'Paused'. Any help would be greatly appreciated!

Bill Boyer
Free Tip: The F1 Key does NOT destroy your PC!
Re: Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool
Hi Daniel,

My main point was to say that your previous posts seemed to suggest dedup storage pools were recommended to be at most 6 TB in size. It is my understanding that the 6 TB recommendation was a daily server throughput maximum design target when dedup is in use.

I agree, a processor at 100% is not good, and I have been adjusting the server design to reduce the load. I started re-hosting our backup service on v6 as soon as v6 was available. I started out deduping everything but quickly ran into performance problems. To solve them I started excluding classes of data from dedup - all Oracle backups, all Outlook PST files, and any other file larger than 1 GB. I also replaced all the disks I started with over a 12-month period and greatly expanded the total storage.

Where the Redbook says that expiration is much improved, that is only partly true. If dedup is involved, a hidden process starts after the visible expiration process is done and runs on for quite a while longer. This process has to check whether a chunk in an expired file can truly be removed from storage, because other files could be pointing to that chunk. You can see the process by entering 'show dedupdeleteinfo' after expiration completes. The thing about big files is that they are broken into lots of chunks. When a big file is expired, this hidden process will take a long time to complete and can bog down the system. This is the real reason I exclude some files from dedup.

As for SATA, I have been using some big arrays (20 2TB disks, RAID 6), 8 such arrays, for 18 months and have had only 1 disk fail. But I try not to abuse them. Backups first go onto JBOD disks - 15K rpm, 600 GB - and all the dedup activity is done there. The storage pools on those disks are then migrated to storage pools on the SATA arrays. It is a mostly sequential process.
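The chunk-reference checking described above can be illustrated with a toy reference-counting sketch (this is not TSM's actual implementation, just the general idea): a chunk's storage can only be reclaimed when no remaining file points to it, so expiring a file that was split into many chunks costs one lookup-and-update per chunk.

```python
# Toy model (NOT TSM's real code) of why expiring a large deduplicated
# file is slow: every chunk's reference count must be checked before
# its storage can be reclaimed, and big files have many chunks.
from collections import defaultdict

class DedupPool:
    def __init__(self):
        self.refcount = defaultdict(int)  # chunk hash -> number of files referencing it
        self.files = {}                   # file name -> list of chunk hashes

    def store(self, name, chunks):
        self.files[name] = list(chunks)
        for h in chunks:
            self.refcount[h] += 1

    def expire(self, name):
        """Return the chunks actually freed; shared chunks stay on disk."""
        freed = []
        for h in self.files.pop(name):
            self.refcount[h] -= 1         # one lookup + update per chunk
            if self.refcount[h] == 0:
                del self.refcount[h]
                freed.append(h)
        return freed

pool = DedupPool()
pool.store("big.db", ["c1", "c2", "c3"])
pool.store("copy.db", ["c2", "c3"])
print(pool.expire("big.db"))  # only 'c1' is freed; c2/c3 are still referenced
```

The same logic explains the hidden post-expiration pass: deleting the file's metadata is quick, but the per-chunk reclamation work runs on afterwards.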
I can only suggest that if your customer does storage pool backup from the SATA arrays after migration or reclamation, and the copypool is not deduplicated, then there would be a lot of random requests to the SATA storage pools to rehydrate the backups.

Regards,

Bill Colwell
Draper Lab

-Original Message-
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Daniel Sparrman
Sent: Thursday, September 29, 2011 1:24 AM
To: ADSM-L@VM.MARIST.EDU
Subject: Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool

Like it says in the document, it's a recommendation and not a technical limit. However, having the server running at 100% utilization all the time doesn't seem like a healthy scenario. Why aren't you deduplicating files larger than 1 GB? From my experience, datafiles from SQL, Exchange and such have a very large dedup ratio, while TSM's deduplication skips files smaller than 2 KB.

I have a customer up north who used this configuration on an HP EVA based box with SATA disks. The disks were breaking down so fast that the arrays within the box were in a constant rebuild phase. HP claimed it was TSM dedup that was breaking the disks (they actually claimed TSM was writing so often that the disks broke), a scenario I find very hard to believe.

Best Regards

Daniel

Daniel Sparrman
Exist i Stockholm AB
Tel: 08-754 98 00 Fax: 08-754 97 30
daniel.sparr...@exist.se
http://www.existgruppen.se
Posthusgatan 1 761 30 NORRTÄLJE

-ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU wrote: -
To: ADSM-L@VM.MARIST.EDU
From: Colwell, William F. bcolw...@draper.com
Sent by: ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU
Date: 09/28/2011 20:43
Subject: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool

Hi Daniel,

I remember hearing about a 6 TB limit for dedup in a webinar or conference call, but what I recall is that that was a daily throughput limit.
In the same section of the redbook as you quote is this paragraph -

Experienced administrators already know that Tivoli Storage Manager database expiration was one of the more processor-intensive activities on a Tivoli Storage Manager server. Expiration is still processor intensive, albeit less so in Tivoli Storage Manager V6.1, but this is now second to deduplication in terms of consumption of processor cycles. Calculating the MD5 hash for each object and the SHA1 hash for each chunk is a processor-intensive activity.

I can say this is absolutely correct; my processor is frequently running at or near 100%. I have gone way beyond 6 TB of storage for dedup storage pools, as this SQL shows for the 2 instances on my server -

select cast(stgpool_name as char(12)) as Stgpool, -
cast(sum(num_files) / 1024 / 1024 as decimal(4,1)) as Mil Files, -
cast(sum(physical_mb) / 1024 / 1024 as decimal(4,1)) as Physical_TB, -
cast(sum(logical_mb) / 1024 / 1024 as decimal(4,1)) as Logical_TB, -
cast(sum(reporting_mb) / 1024 / 1024 as decimal(4,1)) as Reporting_TB -
from occupancy -
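The per-object MD5 plus per-chunk SHA-1 work the redbook describes can be sketched as follows. Note this is only an illustration of the hashing cost: the fixed 256 KB chunk size here is an assumption for simplicity, whereas TSM actually uses variable-size chunking.

```python
# Illustration (assumptions: fixed 256 KB chunks; TSM really chunks
# variably) of the hashing work per stored object: one MD5 over the
# whole object, plus one SHA-1 per chunk. All of this burns CPU.
import hashlib
import os

CHUNK = 256 * 1024  # assumed fixed chunk size, for illustration only

def hash_object(data: bytes):
    md5 = hashlib.md5(data).hexdigest()  # whole-object identity hash
    chunk_hashes = [
        hashlib.sha1(data[i:i + CHUNK]).hexdigest()  # per-chunk fingerprint
        for i in range(0, len(data), CHUNK)
    ]
    return md5, chunk_hashes

data = os.urandom(1024 * 1024)  # a 1 MB test object
md5, chunks = hash_object(data)
print(len(chunks))  # -> 4 (four 256 KB chunks in 1 MB)
```

A multi-GB file yields thousands of chunks, each hashed individually, which is consistent with the observation that dedup now outweighs expiration in processor cycles.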