Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool

2011-09-29 Thread Daniel Sparrman
Like it says in the document, it's a recommendation, not a technical limit.

However, having the server running at 100% utilization all the time doesn't seem 
like a healthy scenario.

Why aren't you deduplicating files larger than 1 GB? In my experience, 
data files from SQL, Exchange and the like have a very high dedup ratio, while 
TSM's deduplication already skips files smaller than 2 KB.

I have a customer up north who used this configuration on an HP EVA-based box 
with SATA disks. The disks were breaking down so fast that the arrays within 
the box were in a constant rebuild phase. HP claimed it was TSM dedup that was 
breaking the disks (they actually claimed TSM was writing so often that the 
disks broke), a claim I find very hard to believe.

Best Regards

Daniel



Daniel Sparrman
Exist i Stockholm AB
Växel: 08-754 98 00
Fax: 08-754 97 30
daniel.sparr...@exist.se
http://www.existgruppen.se
Posthusgatan 1 761 30 NORRTÄLJE



-ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU wrote: -


To: ADSM-L@VM.MARIST.EDU
From: Colwell, William F. bcolw...@draper.com
Sent by: ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU
Date: 09/28/2011 20:43
Subject: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool

Hi Daniel,

 

I remember hearing about a 6 TB limit for dedup in a webinar or conference call,
but what I recall is that it was a daily throughput limit.  In the same section
of the redbook that you quote is this paragraph:

 

Experienced administrators already know that Tivoli Storage Manager database
expiration was one of the more processor-intensive activities on a Tivoli
Storage Manager server. Expiration is still processor intensive, albeit less so
in Tivoli Storage Manager V6.1, but this is now second to deduplication in
terms of consumption of processor cycles. Calculating the MD5 hash for each
object and the SHA1 hash for each chunk is a processor-intensive activity.
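The per-object/per-chunk hashing work the redbook describes can be sketched in Python. The fixed 256 KB chunk size here is an illustrative assumption; TSM actually uses variable-size chunks chosen by a fingerprinting algorithm.

```python
import hashlib

def dedup_hashes(data: bytes, chunk_size: int = 256 * 1024):
    """One MD5 hash for the whole object, one SHA-1 hash per chunk,
    mirroring the per-object/per-chunk work the redbook describes.
    Fixed-size chunking is a simplification of TSM's variable-size
    chunking."""
    whole_object_md5 = hashlib.md5(data).hexdigest()
    chunk_sha1s = [
        hashlib.sha1(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]
    return whole_object_md5, chunk_sha1s

# Every byte is hashed at least twice, which is why dedup competes with
# expiration for processor cycles on the server.
md5, sha1s = dedup_hashes(b"x" * (1024 * 1024))  # a 1 MB object -> 4 chunks
```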

 

I can say this is absolutely correct; my processor is frequently running at or
near 100%.

I have gone way beyond 6 TB of storage for dedup storage pools, as this SQL
shows for the 2 instances on my server:

 

select cast(stgpool_name as char(12)) as Stgpool, -
   cast(sum(num_files) / 1024 / 1024 as decimal(4,1)) as "Mil Files", -
   cast(sum(physical_mb) / 1024 / 1024 as decimal(4,1)) as Physical_TB, -
   cast(sum(logical_mb) / 1024 / 1024 as decimal(4,1)) as Logical_TB, -
   cast(sum(reporting_mb) / 1024 / 1024 as decimal(4,1)) as Reporting_TB -
from occupancy -
  where stgpool_name in (select stgpool_name from stgpools where deduplicate = 'YES') -
   group by stgpool_name

 

 

Stgpool       Mil Files  Physical_TB  Logical_TB  Reporting_TB
------------  ---------  -----------  ----------  ------------
BKP_2             368.0          0.0        30.0          95.8
BKP_2X            341.0          0.0        23.9          58.6


Stgpool       Mil Files  Physical_TB  Logical_TB  Reporting_TB
------------  ---------  -----------  ----------  ------------
BKP_2             224.0          0.0        35.7          74.1
BKP_FS_2           49.0          0.0        21.0          45.5
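Reading the occupancy numbers: reporting_mb is the data as the clients see it, while logical_mb is what actually occupies the deduplicated pool, so dividing the two gives the effective dedup savings. That interpretation of the columns is my reading of the query output, not an official definition; a quick sketch of the arithmetic:

```python
def dedup_ratio(reporting_tb: float, logical_tb: float) -> float:
    """Effective deduplication ratio: client-reported data divided by
    data actually stored in the pool."""
    return reporting_tb / logical_tb

# BKP_2 from the first instance: 95.8 TB reported vs 30.0 TB stored
ratio = dedup_ratio(95.8, 30.0)   # roughly 3.2 : 1
savings = 1 - 1 / ratio           # roughly 69% less disk used
```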

 

 

Also, I am not using any random disk pool; all the disk storage is
scratch-allocated file-class volumes.  There is also a tape library (LTO5) for
files larger than 1 GB, which are excluded from deduplication.

 

 

Regards,

 

Bill Colwell

Draper Lab

 

 

-Original Message-
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Daniel 
Sparrman
Sent: Wednesday, September 28, 2011 3:49 AM
To: ADSM-L@VM.MARIST.EDU
Subject: Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file 
systems for pirmary pool

 

To be honest, it doesn't really say. The information is from the Tivoli Storage
Manager Technical Guide:

 

Note: In terms of sizing Tivoli Storage Manager V6.1 deduplication, we
currently recommend using Tivoli Storage Manager to deduplicate up to 6 TB
total of storage pool space for the deduplicated pools. This is a rule of thumb
only and exists solely to give an indication of where to start investigating
VTL or filer deduplication. The reason that a particular figure is mentioned is
for guidance in typical scenarios on commodity hardware. If more than 6 TB of
real disk space is to be deduplicated, you can use either Tivoli Storage
Manager or a hardware deduplication device. The 6 TB is in addition to whatever
disk is required by non-deduplicated storage pools. This rule of thumb will
change as processor and disk technologies advance, because the recommendation
is not an architectural, support, or testing limit.

 

http://www.redbooks.ibm.com/redbooks/pdfs/sg247718.pdf

 

I'm guessing it's server-side, since client-side shouldn't use any resources at 
the server. I'm 

Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool

2011-09-29 Thread Daniel Sparrman
I'm not fully aware of how the DD replicates data, but if you have 15-20 TB/day 
being written to your main DD, and that data is then replicated to the off-site 
DD, how much data is actually replicated?
 
With a 1 Gbps connection, you could hit values up to 360 GB per hour (assuming 
100 MB/s, which should be theoretically possible, but it's usually lower than 
that on a 1 Gbps connection), which means about 8.6 TB per 24 hours. So the data 
is both deduplicated and compressed before you send it offsite?
 
Does the DD do the dedup within the same box, or does it require a separate box 
for dedup?
 
You're also running with the same risk as the previous poster: you're relying 
entirely on the fact that your DD setup won't break. Is this how the DD is sold? 
(Buy 2 DDs, replicate between them, and you're safe?) I know it's (like the 
previous poster stated) always a question of costs vs. mitigating risks, but 
if I got to choose, I'd rather have fast restores from my main site and slow 
restores from my offsite, as long as I can restore the data, instead of fast 
from main and fast from offsite with a chance that I might not be able to 
restore at all.

If DD claims they have data invulnerability, I'd really like to see how they 
hit 100% protection, since it would be the first system in the world to 
actually have managed to secure that last 0.0001% risk ;) RAID was usually 
secure until someone made an error, put in a blank disk and forgot to rebuild 
:)
 
Best Regards
 
Daniel



Daniel Sparrman
Exist i Stockholm AB
Växel: 08-754 98 00
Fax: 08-754 97 30
daniel.sparr...@exist.se
http://www.existgruppen.se
Posthusgatan 1 761 30 NORRTÄLJE



-ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU wrote: -


To: ADSM-L@VM.MARIST.EDU
From: Shawn Drew shawn.d...@americas.bnpparibas.com
Sent by: ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU
Date: 09/28/2011 22:26
Subject: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool

We average between 15-20 TB/day at our main site, and that goes directly to 
a single DD890 (no random pool): single pool, file devclass, NFS-mounted 
on 2x10Gb crossover connections. It replicates over a 1 Gb WAN link to another 
DD890.   (I spent all the money on the DD boxes; I didn't have enough left 
over for 10Gb switches!)

That other DD890 backs up another 7-10 TB/day, replicating to the main site 
(bi-directional replication). 

All with file devclasses, and there is not more than a one-hour lag in 
replication by the time I show up in the morning. TSM doesn't have to 
do replication or backup stgpools anymore, so I can actually afford to do 
full DB backups every day now.  (I was doing an incremental scheme before.)

IBM has a similar recommended configuration with their ProtecTIER 
solution, so they do support a single-pool, backend-replication solution.  
Data Domain also claims data invulnerability, which should catch any 
data corruption issue as soon as the data is written, and not later, when 
you try to restore. 


Regards, 
Shawn

Shawn Drew





From: daniel.sparr...@exist.se
Sent by: ADSM-L@VM.MARIST.EDU
Date: 09/28/2011 02:13 AM
Please respond to: ADSM-L@VM.MARIST.EDU
To: ADSM-L
Subject: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool

How many TB of data is common in this configuration? In a large 
environment, where databases are 5-10 TB each and you have a demand to 
back up 5-10-15-20 TB of data each night, this would require you to have 
10 Gbps for every host, something that would also cost a pretty penny. 
Especially since the DD needs to be configured to have the throughput to 
write all those TB within a limited amount of time.
 
Does the DD do dedup within the same box (meaning, can I have one box that 
handles normal storage and does dedup), or do I need a second box?
 
And the same issue also arises with the filepool: you're moving a lot of 
data around completely unnecessarily every day when you do reclaim. 
 
If I'm right, it also sounds like (from your description in the previous 
mails) you're not only using the DD for TSM storage. That sounds like 
putting all the eggs in the same basket.
 
Best Regards
 
Daniel



Daniel Sparrman
Exist i Stockholm AB
Växel: 08-754 98 00
Fax: 08-754 97 30
daniel.sparr...@exist.se
http://www.existgruppen.se
Posthusgatan 1 761 30 NORRTÄLJE



-ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU wrote: -


To: ADSM-L@VM.MARIST.EDU
From: Allen S. Rout a...@ufl.edu
Sent by: ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU
Date: 09/27/2011 18:55
Subject: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool

On 09/27/2011 12:02 PM, Rick Adamson wrote:


 The bigger question I have is: since file-based storage is
  native to TSM, why exactly is using file-based storage
  not supported?

Not supported by what?

If you've got a DD, then the simplest way to connect it to TSM is via
files.  Some backup apps require 

Re: vtl versus file systems for pirmary pool

2011-09-29 Thread Nick Laflamme
On Sep 29, 2011, at 12:30 AM, Daniel Sparrman wrote:

 I'm not fully aware of how the DD replicates data, but if you have 
 15-20TB/day being written to your main DD, and that data is then replicated 
 to the off-site DD, how much data is actually replicated?
 
 With a 1 Gbps connection, you could hit values up to 360 GB per hour (assuming 
 100 MB/s, which should be theoretically possible, but it's usually lower than 
 that on a 1 Gbps connection), which means about 8.6 TB per 24 hours. So the data 
 is both deduplicated and compressed before you send it offsite?

It's certainly de-duped before being replicated; it's probably compressed as 
well, but that's less obvious to me. 

 Does the DD do the dedup within the same box, or require a separate box for 
 dedup?

Same box, as an in-line process. They're very proud of that. 

Nick

 Daniel Sparrman
 Exist i Stockholm AB
 Växel: 08-754 98 00
 Fax: 08-754 97 30
 daniel.sparr...@exist.se
 http://www.existgruppen.se
 Posthusgatan 1 761 30 NORRTÄLJE
 


Preschedule problem with ESSbase backup.

2011-09-29 Thread Bo Krogholm Nielsen
Hi TSMers,

I have a preschedule command that runs two cmd files: the first puts an Essbase 
database in read-only mode and then dumps the database to a flat file. When the 
first command is finished, it must start another command file to do the same, 
just on a different database.
But when the first command finishes, the backup starts without the second cmd 
file ever being executed.
Can anyone tell me where the problem is?
Example:
startesscmd BackupBUD_EP.scr

BackupBUD_EP.scr:

LOG x x x;
BEGIN ARCHIVE BUD_EP JointVen BUD_JVResult;
SELECT 'BUD_EP' 'JointVen';
EXPORT bud_JV.txt 2 1;
BEGIN ARCHIVE BUD_EP Operator BUD_OPResult;
SELECT 'BUD_EP' 'Operator';
EXPORT bud_op.txt 2 1;
BEGIN ARCHIVE BUD_EP EP BUD_EPResult;
SELECT 'BUD_EP' 'EP';
EXPORT bud_ep.txt 2 1;
EXIT;
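If the symptom is that only the first script ever runs, one common cause is that the scheduler's preschedule option executes exactly one command before releasing the backup. A way around that, sketched here under the assumption that both dumps are driven by separate ESSCMD scripts (the wrapper name and the second script name are hypothetical), is to chain both runs inside a single wrapper so the backup cannot start until both have finished:

```
rem runboth.cmd -- hypothetical wrapper: the scheduler runs this ONE command
@echo off
call startesscmd BackupBUD_EP.scr
if errorlevel 1 exit /b 1
rem second script name is an assumption for the other database
call startesscmd BackupBUD_EP2.scr
```

Then point the client's PRESCHEDULECMD option (or the equivalent preschedule setting on the schedule) at c:\scripts\runboth.cmd instead of at the two scripts separately.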

Regards
Bo Nielsen
Senior Technology Consultant
UNIX Server SAN and Backup

DONG Energy
Klædemålet 9
2100 København Ø
Tlf. +45 99 55 54 34

bo...@dongenergy.dk
www.dongenergy.com


Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool

2011-09-29 Thread Daniel Sparrman
Yep, we have the same thing with our Sepaton; all deduplication is done 
inline. The reason I asked is that there seem to be other manufacturers who 
need a second box to do deduplication.

Regards

Daniel


Daniel Sparrman
Exist i Stockholm AB
Växel: 08-754 98 00
Fax: 08-754 97 30
daniel.sparr...@exist.se
http://www.existgruppen.se
Posthusgatan 1 761 30 NORRTÄLJE

-ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU wrote: -

To: ADSM-L@VM.MARIST.EDU
From: Nick Laflamme dplafla...@gmail.com
Sent by: ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU
Date: 09/29/2011 13:34
Subject: Re: [ADSM-L] vtl versus file systems for pirmary pool

On Sep 29, 2011, at 12:30 AM, Daniel Sparrman wrote:

 I'm not fully aware of how the DD replicates data, but if you have 
 15-20TB/day being written to your main DD, and that data is then replicated 
 to the off-site DD, how much data is actually replicated?
 
 With a 1 Gbps connection, you could hit values up to 360 GB per hour (assuming 
 100 MB/s, which should be theoretically possible, but it's usually lower than 
 that on a 1 Gbps connection), which means about 8.6 TB per 24 hours. So the data 
 is both deduplicated and compressed before you send it offsite?

It's certainly de-duped before being replicated; it's probably compressed as 
well, but that's less obvious to me. 

 Does the DD do the dedup within the same box, or require a separate box for 
 dedup?

Same box, as an in-line process. They're very proud of that. 

Nick

 Daniel Sparrman
 Exist i Stockholm AB
 Växel: 08-754 98 00
 Fax: 08-754 97 30
 daniel.sparr...@exist.se
 http://www.existgruppen.se
 Posthusgatan 1 761 30 NORRTÄLJE
 


Re: Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool

2011-09-29 Thread Richard Rhodes
 So the data is both deduplicated and compressed before you
 send it offsite?

Yes, that is how the DD handles replication.

DD is an inline dedup system.  When data comes into the DD
it is deduped, what is left is compressed, and then it 
is written to disk. Only the new unique data is replicated 
(yes, there must be metadata and new unique dedup hashes
sent somehow).  In general, the replication 
data stream reflects the dedup/compression ratio.
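So, to a first approximation, the bytes crossing the wire scale with the combined dedup and compression ratio. A rough model of that (the ratios below are illustrative assumptions, not DD specifications, and metadata overhead is ignored):

```python
def replicated_tb(ingest_tb: float, dedup_ratio: float,
                  compression_ratio: float) -> float:
    """Approximate post-dedup, post-compression volume that must be
    replicated offsite, ignoring metadata overhead."""
    return ingest_tb / (dedup_ratio * compression_ratio)

# 20 TB/day of ingest at an assumed 10:1 dedup and 2:1 local compression
# leaves about 1 TB/day to replicate -- comfortably inside the ~8.6 TB/day
# a 1 Gb link can move.
daily = replicated_tb(20.0, 10.0, 2.0)
```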

 
 Does the DD do the dedup within the same box, or require a separate 
 box for dedup?

A DD is nothing more than a powerful PC server with 
lots of memory, SATA disks, and a Linux OS.  The secret sauce is the code to 
handle dedup, compression, replication, NFS, CIFS, VTL,
the log-structured filesystem, snapshots, etc.
 
 You're also running with the same risk as the previous poster, 
 you're relying entirely on the fact that your DD setup won't break. 

There is a certain safety in tape's many pieces and parts. 
A drive can fail but the rest keep 
running.  A cartridge can get chewed up, but it's only one
cartridge.  (We have 2 DDs, but also still have two large
3584 libraries.) 

If a DD were to have a complete meltdown, all backups on it are gone.

This is true and something you have to come to grips with
if moving to any disk-based backup system.  As has been
mentioned, it's a question of risk and cost.  You could
have dual onsite DDs, with one for the primary pool and a second 
for a TSM copy pool, but that doubles your cost.  I will say
that from what I see of our DDs, DD put a lot of time and effort
into making the box highly reliable.

Now, we implemented ours with a front-end disk pool.  The
main reason is that we still wanted backups not to rely 
directly on the availability of the DD.  If the DD is down
for some reason (code upgrade, broken processor, etc.),
backups still run.

 Is this how the DD is sold? (Buy 2 DDs, replicate between them and 
 you're safe?) 

You can run two DDs and use their replication.  You can also use one
as just a primary pool with a normal copy pool on tape. 
A DD (or any dedup system) doesn't change TSM, but it makes
you think hard about how you configure and run TSM.
 
 If DD claims they have data invulnerability I'd really like to see 
 how they hit 100% protection, since it would be the first system in 
 the world to actually have managed to secure that last 0.0001% risk 
 ;) RAID usually was secure until someone made an error, put in a 
 blank disk and forgot to rebuild :)

Agreed.  Ask the vendors for their stats on data loss events!
Don't believe what they say, but ask anyway.

I have to say I am impressed with our DDs (ouch, that hurt! It
also shows that EMC didn't design it). 
It runs its own
log-based filesystem (new data is always appended at the end, 
not updated in place), which requires periodic (weekly) compactions.
It has snapshots.  It has checksums built in, and runs on RAID 6.
Remember that since it's inline dedup/compression, it doesn't get as high an
I/O load on the actual spindles as a straight filesystem would.
They truly did design it to make sure your data is safe.  Of 
course... all it takes is a firmware bug to destroy everything!

What we decided is that a major data loss event on the DD will trigger a
disaster situation for the TSM system.

Rick

 
 Best Regards
 
 Daniel
 
 
 
 Daniel Sparrman
 Exist i Stockholm AB
 Växel: 08-754 98 00
 Fax: 08-754 97 30
 daniel.sparr...@exist.se
 http://www.existgruppen.se
 Posthusgatan 1 761 30 NORRTÄLJE
 
 
 
 -ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU wrote: -
 
 
 To: ADSM-L@VM.MARIST.EDU
 From: Shawn Drew shawn.d...@americas.bnpparibas.com
 Sent by: ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU
 Date: 09/28/2011 22:26
 Subject: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus 
 file systems for pirmary pool
 
 We average between 15-20TB/day at our main site, and that goes directly 
to 
 a single DD890 (no random pool) .  single-pool, file devclass, NFS 
mounted 
 on 2x10GB crossover connections. Replicates over a 1gb WAN link to 
another 
 DD890.   (I spent all the money on the DD boxes, I didn't have enough 
left 
 over for 10GB switches!)
 
 That other DD890 backs up another 7-10TB/day, replicating to the main 
site 
(bi-directional replication). 
 
 All with file devclasses and there is not more than a one hour lag in 
 replication by the time I show up in the morning.TSM doesn't have to 

 do replication or backup stgpools anymore, so I can actually afford to 
do 
 full db backups every day now.  (I was doing an incremental scheme 
before)
 
 IBM has a similar recommended configuration with their Protectier 
 solution, so they do support a single pool, backend replication 
solution. 
 Data Domain also claims that data invulnerability which should catch 
any 
 data corruption issue as soon as the data is written, and not later, 
when 
 you try and restore. 
 
 
 Regards, 
 Shawn
 
 Shawn Drew
 
 
 
 

Re: Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool

2011-09-29 Thread Rick Adamson
Richard, excellent comments!

I will add that to TSM the DD is just storage; TSM has no idea about the 
deduplication, compression, etc. that the DD performs, which makes it 
challenging to determine the actual storage utilization from an individual 
client and/or filespace perspective. 

Secondly, aside from the preformatted daily system report (autosupport), which 
is not customizable, getting reporting out of the DD can be a little 
challenging, to say the least.


~Rick


-Original Message-
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of 
Richard Rhodes
Sent: Thursday, September 29, 2011 9:18 AM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl 
versus file systems for pirmary pool

 So the data is both deduplicated and compressed before you
 send it offsite?

Yes, that is how the DD handles replication.

DD is an inline dedup system.  When data comes into the DD
it is deduped, what is left is compressed, and then it 
is written to disk. Only the new unique data is replicated 
(yes, there must be metadata and new unique dedup hashes
sent somehow).  In general, the replication 
data stream reflects the dedup/compression ratio.

 
 Does the DD do the dedup within the same box, or require a separate 
 box for dedup?

A DD is nothing more than a powerful PC server with 
lots of memory, SATA disks, and a Linux OS.  The secret sauce is the code to 
handle dedup, compression, replication, NFS, CIFS, VTL,
the log-structured filesystem, snapshots, etc.
 
 You're also running with the same risk as the previous poster, 
 you're relying entirely on the fact that your DD setup won't break. 

There is a certain safety in tape's many pieces and parts. 
A drive can fail but the rest keep 
running.  A cartridge can get chewed up, but it's only one
cartridge.  (We have 2 DDs, but also still have two large
3584 libraries.) 

If a DD were to have a complete meltdown, all backups on it are gone.

This is true and something you have to come to grips with
if moving to any disk-based backup system.  As has been
mentioned, it's a question of risk and cost.  You could
have dual onsite DDs, with one for the primary pool and a second 
for a TSM copy pool, but that doubles your cost.  I will say
that from what I see of our DDs, DD put a lot of time and effort
into making the box highly reliable.

Now, we implemented ours with a front-end disk pool.  The
main reason is that we still wanted backups not to rely 
directly on the availability of the DD.  If the DD is down
for some reason (code upgrade, broken processor, etc.),
backups still run.

 Is this how the DD is sold? (Buy 2 DDs, replicate between them and 
 you're safe?) 

You can run two DDs and use their replication.  You can also use one
as just a primary pool with a normal copy pool on tape. 
A DD (or any dedup system) doesn't change TSM, but it makes
you think hard about how you configure and run TSM.
 
 If DD claims they have data invulnerability I'd really like to see 
 how they hit 100% protection, since it would be the first system in 
 the world to actually have managed to secure that last 0.0001% risk 
 ;) RAID usually was secure until someone made an error, put in a 
 blank disk and forgot to rebuild :)

Agreed.  Ask the vendors for their stats on data loss events!
Don't believe what they say, but ask anyway.

I have to say I am impressed with our DDs (ouch, that hurt! It
also shows that EMC didn't design it). 
It runs its own
log-based filesystem (new data is always appended at the end, 
not updated in place), which requires periodic (weekly) compactions.
It has snapshots.  It has checksums built in, and runs on RAID 6.
Remember that since it's inline dedup/compression, it doesn't get as high an
I/O load on the actual spindles as a straight filesystem would.
They truly did design it to make sure your data is safe.  Of 
course... all it takes is a firmware bug to destroy everything!

What we decided is that a major data loss event on the DD will trigger a
disaster situation for the TSM system.

Rick

 
 Best Regards
 
 Daniel
 
 
 
 Daniel Sparrman
 Exist i Stockholm AB
 Växel: 08-754 98 00
 Fax: 08-754 97 30
 daniel.sparr...@exist.se
 http://www.existgruppen.se
 Posthusgatan 1 761 30 NORRTÄLJE
 
 
 
 -ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU wrote: -
 
 
 To: ADSM-L@VM.MARIST.EDU
 From: Shawn Drew shawn.d...@americas.bnpparibas.com
 Sent by: ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU
 Date: 09/28/2011 22:26
 Subject: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus 
 file systems for pirmary pool
 
 We average between 15-20TB/day at our main site, and that goes directly 
to 
 a single DD890 (no random pool) .  single-pool, file devclass, NFS 
mounted 
 on 2x10GB crossover connections. Replicates over a 1gb WAN link to 
another 
 DD890.   (I spent all the money on the DD boxes, I didn't have enough 
left 
 over for 10GB switches!)
 
 That other DD890 backs up another 7-10TB/day, replicating to 

Re: Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool

2011-09-29 Thread robert_clark


The elephants in the room:

It is tempting, once DD gets in the door, to move all database backups (the 
typical TDP/RMAN and SQL LiteSpeed stuff) to go directly to the DD. (No TSM 
involved, so save money on licenses?)

Combinations that have more advanced communications with the back-end storage 
(OST / Boost / Avamar+DD) may be able to get hints about what is already 
stored on the dedupe device. It seems unlikely that TSM will gain any features 
like this any time soon. (NDMP? VTL? These features are pretty dated.)

Is TSM 6 not losing data via dedupe this week?

How problematic is many TB of data in a file class on file systems when it 
comes time to do an fsck after a system crash?

[RC]

On Sep 27, 2011, at 03:06 PM, Prather, Wanda wprat...@icfi.com wrote:

Actually I have more customers using Data Domains without the VTL license than 
with it.

With a Windows TSM server, you can just write to it via TCP/IP using a CIFS 
share (NFS mount with an AIX TSM server).
If you have sufficient TCP/IP bandwidth for your load, no fibre connections are 
needed.
From the TSM point of view, you configure it as a file pool.

You get the benefits of dedup and (if you have a 2nd one at your DR site) replication. 
Neither good nor bad, just different.

Very simple setup; works great if it meets your throughput requirements.

W

-Original Message-
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Daniel 
Sparrman
Sent: Tuesday, September 27, 2011 2:49 PM
To: ADSM-L@VM.MARIST.EDU
Subject: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems 
for pirmary pool

The fact that you actually need to pay for a VTL license is just plain scary.

When you bought it, did they think you were going to use it as a fileserver? 
I'm not too specialized in Data Domain, but aren't they marketed as backup 
hardware? So you get a disk, but if you want to use it for something other 
than that, you need to pay for a license?

Sorry for sounding bitter, but I've always heard people referring to Data 
Domain as a VTL.



Daniel Sparrman
Exist i Stockholm AB
Växel: 08-754 98 00
Fax: 08-754 97 30
daniel.sparr...@exist.se
http://www.existgruppen.se
Posthusgatan 1 761 30 NORRTÄLJE



-ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU wrote: -


To: ADSM-L@VM.MARIST.EDU
From: Allen S. Rout a...@ufl.edu
Sent by: ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU
Date: 09/27/2011 18:55
Subject: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool

On 09/27/2011 12:02 PM, Rick Adamson wrote:



The bigger question I have is: since file-based storage is
native to TSM, why exactly is using file-based storage not supported?


Not supported by what?

If you've got a DD, then the simplest way to connect it to TSM is via files. 
Some backup apps require something that looks like a library, in which case 
you'd be buying the VTL license.

FWIW, if you're already in DD space, you're paying a pretty penny. The VTL 
license isn't chicken feed, I agree, but it's not a major component of the 
total cost.


- Allen S. Rout


Re :vtl versus file systems for pirmary pool

2011-09-29 Thread James Choate
Good questions:

We are currently working on a project that is using ProtecTIER.  The 
ProtecTIER TS7650G does dedup.  It looks like a TS3500 with LTO drives.  We 
will be getting another TS7650G at a second data center.  The idea is to 
cross-replicate between the data centers.


-Original Message-
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of 
robert_clark
Sent: Thursday, September 29, 2011 11:34 AM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file 
systems for pirmary pool


The elephants in the room:

It is tempting, once DD gets in the door, to move all database backups (the 
typical TDP/RMAN and SQL LiteSpeed stuff) to go directly to the DD. (No TSM 
involved, so save money on licenses?)

Combinations that have more advanced communications with the back-end storage 
(OST / Boost / Avamar+DD) may be able to get hints about what is already 
stored on the dedupe device. It seems unlikely that TSM will gain any features 
like this any time soon. (NDMP? VTL? These features are pretty dated.)

Is TSM 6 not losing data via dedupe this week?

How problematic is many TB of data in a file class on file systems when it 
comes time to do an fsck after a system crash?

[RC]

On Sep 27, 2011, at 03:06 PM, Prather, Wanda wprat...@icfi.com wrote:

Actually I have more customers using Data Domains without the VTL license than 
with it.

With a Windows TSM server, you can just write to it via TCP/IP using a CIFS 
share (NFS mount with an AIX TSM server).
If you have sufficient TCP/IP bandwidth for your load, no fibre connections are 
needed.
From the TSM point of view, you configure it as a file pool.

You get the benefits of dedup and (if you have a 2nd one at your DR site) 
replication.
Neither good nor bad, just different.
Very simple setup; works great if it meets your throughput requirements.

W

-Original Message-
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Daniel 
Sparrman
Sent: Tuesday, September 27, 2011 2:49 PM
To: ADSM-L@VM.MARIST.EDU
Subject: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems 
for pirmary pool

The fact that you actually need to pay for a VTL license is just plain scary.

When you bought it, did they think you were going to use it as a fileserver? 
I'm not too specialized in Data Domain, but aren't they marketed as backup 
hardware? So you get a disk, but if you want to use it for something other 
than that, you need to pay for a license?

Sorry for sounding bitter, but I've always heard people referring to Data 
Domain as a VTL.



Daniel Sparrman
Exist i Stockholm AB
Växel: 08-754 98 00
Fax: 08-754 97 30
daniel.sparr...@exist.se
http://www.existgruppen.se
Posthusgatan 1 761 30 NORRTÄLJE



-ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU wrote: -


To: ADSM-L@VM.MARIST.EDU
From: Allen S. Rout a...@ufl.edu
Sent by: ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU
Date: 09/27/2011 18:55
Subject: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool

On 09/27/2011 12:02 PM, Rick Adamson wrote:


 The bigger question I have is: since file-based storage is
 native to TSM, why exactly is using file-based storage not supported?

Not supported by what?

If you've got a DD, then the simplest way to connect it to TSM is via files. 
Some backup apps require something that looks like a library, in which case 
you'd be buying the VTL license.

FWIW, if you're already in DD space, you're paying a pretty penny. The VTL 
license isn't chicken feed, I agree, but it's not a major component of the 
total cost.


- Allen S. Rout


Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool

2011-09-29 Thread Daniel Sparrman
The elephant has left the building.

Do you get the same advanced features by just dumping data onto a DD as you do 
with the TSM TDP clients? Exmerge, anyone? Or perhaps an SQL dump?

You still have to do file backups. Or wait, why not just use robocopy and copy 
them onto the DD? Or what the heck, just place the fileserver on the DD. That 
way you don't have to do backups; the data is already on the DD.

As for TSM losing data, what tells you that the DD dedup algorithm never lost 
data? I bet I can prove you wrong.

Well, when the DD hits the wall, at least you won't have to do an fsck, since 
there won't be anything left that needs an fsck.

DD replication = not application-aware = not detecting software-based 
discrepancies. That's why I'd never replace TSM's backup storage pool or 
copy pool features with a replicated solution.

If you're OK with replication, why don't you just mirror the solution? (If you 
want the errors to hit both boxes at the same time, make sure to use 
synchronous mirroring and not async; god knows, with async you might not get 
the error mirrored in time.)

It's OK to make it easy, but when the shit hits the fan, make sure you actually 
know what you sacrificed (having "I destroyed a datacenter" on your resume 
probably won't make it easier to find a new job).

Scary *shrugs*





Daniel Sparrman
Exist i Stockholm AB
Switchboard: 08-754 98 00
Fax: 08-754 97 30
daniel.sparr...@exist.se
http://www.existgruppen.se
Posthusgatan 1 761 30 NORRTÄLJE



-ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU wrote: -


To: ADSM-L@VM.MARIST.EDU
From: robert_clark robert_cl...@mac.com
Sent by: ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU
Date: 09/29/2011 19:34
Subject: Re: [ADSM-L] Re: [ADSM-L] Re: [ADSM-L] vtl versus file 
systems for primary pool


The elephants in the room:

It is tempting, once DD gets in the door, to move all database backups (the 
typical TDP/RMAN and SQL LiteSpeed stuff) to go directly to the DD. (No TSM 
involved, so save money on licenses?)

Combinations that have more advanced communication with the back-end storage 
(OST / Boost / Avamar+DD) may be able to get hints about what is already 
stored on the dedupe device? It seems unlikely that TSM will gain any features 
like this any time soon. (NDMP? VTL? These features are pretty dated.)

Is TSM 6 not losing data via dedupe this week?

How problematic is many TB of data in a file device class on file systems when 
it comes time to do an fsck after a system crash?

[RC]

On Sep 27, 2011, at 03:06 PM, Prather, Wanda wprat...@icfi.com wrote:

Actually I have more customers using Data Domains without the VTL license than 
with it.

With a Windows TSM server, you can just write to it via TCP/IP using a CIFS 
share (or an NFS mount with an AIX TSM server).
If you have sufficient TCP/IP bandwidth for your load, no fibre connections 
are needed.
From the TSM point of view, you configure it as a file pool.
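That file-pool setup can be sketched in TSM admin commands. The device class name, pool name, capacity, mount limit and mount path below are illustrative assumptions, not Wanda's actual configuration:

```
/* Device class over the CIFS/NFS mount of the Data Domain */
define devclass ddfile devtype=file maxcapacity=50g mountlimit=100 -
   directory=/mnt/datadomain
/* Sequential-access file pool on that device class */
define stgpool ddpool ddfile maxscratch=500
```

Point the backup copy group of the policy domain at the pool and TSM treats the DD share like any other FILE-class storage.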

You get the benefits of dedup and (if you have a second one at your DR site) 
replication.
Neither good nor bad, just different.
Very simple setup; it works great if it meets your throughput requirements.

W

-Original Message-
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Daniel 
Sparrman
Sent: Tuesday, September 27, 2011 2:49 PM
To: ADSM-L@VM.MARIST.EDU
Subject: [ADSM-L] Re: [ADSM-L] Re: [ADSM-L] vtl versus file systems 
for primary pool

The fact that you actually need to pay for a VTL license is just plain scary.

When you bought it, did they think you were going to use it as a fileserver? I'm 
not too specialized in Data Domain, but aren't they marketed as backup hardware? 
So you get a disk, but if you want to use it for something other than that, you 
need to pay for a license?

Sorry for sounding bitter, but I've always heard people refer to Data 
Domain as a VTL.



Daniel Sparrman
Exist i Stockholm AB
Switchboard: 08-754 98 00
Fax: 08-754 97 30
daniel.sparr...@exist.se
http://www.existgruppen.se
Posthusgatan 1 761 30 NORRTÄLJE



-ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU wrote: -


To: ADSM-L@VM.MARIST.EDU
From: Allen S. Rout a...@ufl.edu
Sent by: ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU
Date: 09/27/2011 18:55
Subject: Re: [ADSM-L] Re: [ADSM-L] vtl versus file systems for primary pool

On 09/27/2011 12:02 PM, Rick Adamson wrote:


 The bigger question I have is, since file-based storage is
 native to TSM, why exactly is using file-based storage not supported?

Not supported by what?

If you've got a DD, then the simplest way to connect it to TSM is via files. 
Some backup apps require something that looks like a library, in which case 
you'd be buying the VTL license.

FWIW, if you're already in DD space, you're paying a pretty penny. The VTL 
license isn't chicken feed, I agree, but it's not a major component of the 
total cost.


- Allen S. Rout

Backing up EMC SourceOne

2011-09-29 Thread Bill Boyer
Anyone out there backing up a multiple-server SourceOne configuration? This
is the replacement product for EmailXtender. There is a script to run
pre-schedule to set SourceOne up for backup. And then you also have to back up
the other servers in the SourceOne configuration while this is 'Paused'.



Any help would be greatly appreciated!



Bill Boyer

Free Tip: The F1 Key does NOT destroy your PC!


Re: Backing up EMC SourceOne

2011-09-29 Thread Schneider, Jim
Bill,

We have one SourceOne server using a database on a separate server.  We
run the job sequence using BMC's Control-M scheduler.

SourceOne server:  1) Activity Suspend vbs
                   2) Native Archive Suspend vbs
Database server:   3) Database export
                   4) Archive of export
SourceOne server:  5) Archive of the SourceOneMessageCenter and
                      SourceOneIndex directories
                   6) Native Archive Resume vbs
                   7) Activity Resume vbs
The entire process takes about 20 minutes.

I don't know if this is what you're looking for, but hope it helps.

Jim Schneider
United Stationers

-Original Message-
From: ADSM: Dist Stor Manager [mailto:ADSM-L@vm.marist.edu] On Behalf Of
Bill Boyer
Sent: Thursday, September 29, 2011 2:15 PM
To: ADSM-L@vm.marist.edu
Subject: [ADSM-L] Backing up EMC SourceOne

Anyone out there backing up a multiple-server SourceOne configuration? This
is the replacement product for EmailXtender. There is a script to run
pre-schedule to set SourceOne up for backup. And then you also have to back
up the other servers in the SourceOne configuration while this is 'Paused'.



Any help would be greatly appreciated!



Bill Boyer

Free Tip: The F1 Key does NOT destroy your PC!


Re: Re: [ADSM-L] Re: [ADSM-L] vtl versus file systems for primary pool

2011-09-29 Thread Colwell, William F.
Hi Daniel,

My main point was to say that your previous posts seemed to be saying that 
dedup storagepools
were recommended to be 6 TB in size at most.  It is my understanding that the 
6 TB recommendation was
a maximum daily server throughput design target when dedup is in use.

I agree, a processor at 100% is not good and I have been adjusting the server 
design to reduce
the load.

I started re-hosting our backup service on v6 as soon as v6 was available.  I 
started out
deduping everything but quickly ran into performance problems.  To solve them I 
started excluding
classes of data from dedup - all Oracle backups, all Outlook PST files and any 
other file larger
than 1 GB.  I also, over 12 months, replaced all the disks I started with and 
greatly expanded the
total storage.

Where the Redbook says that expiration is much improved, that is only partly 
true.  If dedup is involved,
a hidden process starts after the visible expiration process is done and runs 
on for quite a while longer.
This process has to check whether a chunk in an expired file can truly be 
removed from storage, because
other files may still be pointing to that chunk.  You can see the process by 
entering
'show dedupdeleteinfo' after expiration completes.
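The deferred cleanup described above is essentially reference counting over shared chunks: a chunk's storage can be freed only once no remaining file points at it. A minimal sketch, with hypothetical data structures (not TSM's internals):

```python
# Sketch of reference-counted chunk deletion in a dedup store.
# All names are hypothetical; TSM's real implementation differs.
from collections import defaultdict

class DedupStore:
    def __init__(self):
        self.refcount = defaultdict(int)  # chunk id -> number of files referencing it
        self.files = {}                   # file name -> ordered list of chunk ids

    def ingest(self, name, chunks):
        """Store a file as a list of (possibly shared) chunks."""
        self.files[name] = list(chunks)
        for c in chunks:
            self.refcount[c] += 1

    def expire(self, name):
        """Expire a file; return only the chunks whose storage can truly be freed."""
        freed = []
        for c in self.files.pop(name):
            self.refcount[c] -= 1
            if self.refcount[c] == 0:     # no other file points at this chunk
                del self.refcount[c]
                freed.append(c)
        return freed

store = DedupStore()
store.ingest("a.dat", ["c1", "c2"])
store.ingest("b.dat", ["c2", "c3"])
print(store.expire("a.dat"))   # ['c1'] -- c2 is still referenced by b.dat
```

A file with many chunks forces one reference-count check per chunk at expiration, which is why expiring large deduped files drags this hidden phase out.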

The thing about big files is that they are broken into lots of chunks.  When a 
big file is expired,
this hidden process will take a long time to complete and can bog down the 
system.  This is the
real reason I exclude some files from dedup.

As for SATA, I have been using some big arrays (20 2 TB disks, RAID 6), 8 such 
arrays, for 18 months
and have had only 1 disk fail.  But I try not to abuse them.  Backups first go 
onto JBOD
disks - 15K rpm, 600 GB - and all the dedup activity is done there.  The 
storagepools on those disks
are then migrated to storagepools on the SATA arrays.  It is a mostly 
sequential process.

I can only suggest that if your customer does storagepool backup from the SATA 
arrays after migration or
reclaim, and the copypool is not deduped, then there would be a lot of random 
requests to the SATA storagepools
to rehydrate the backups.

Regards,

Bill Colwell
Draper Lab

-Original Message-
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Daniel 
Sparrman
Sent: Thursday, September 29, 2011 1:24 AM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] Re: [ADSM-L] Re: [ADSM-L] vtl versus file 
systems for primary pool

Like it says in the document, it's a recommendation and not a technical limit.

However, having the server running at 100% utilization all the time doesn't 
seem like a healthy scenario.

Why aren't you deduplicating files larger than 1 GB? In my experience, 
data files from SQL, Exchange and the like have a very high dedup ratio, while 
TSM's deduplication skips files smaller than 2 KB?
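The size thresholds being traded here amount to a simple eligibility filter. A sketch using the figures from this thread; the 2 KB lower bound is the skip Daniel mentions, while the 1 GB upper bound is Bill's local exclusion policy, not a product limit:

```python
# Size-based dedup eligibility, using the thresholds discussed in this
# thread: skip files under 2 KB, and (as a site policy choice, not a
# TSM limit) exclude files over 1 GB to keep chunk cleanup fast.
MIN_DEDUP = 2 * 1024        # 2 KB lower bound
MAX_DEDUP = 1 * 1024 ** 3   # 1 GB upper bound (local policy)

def dedup_eligible(size_bytes: int) -> bool:
    return MIN_DEDUP <= size_bytes <= MAX_DEDUP

print(dedup_eligible(1024))           # False: below 2 KB
print(dedup_eligible(50 * 1024**2))   # True: a 50 MB file
print(dedup_eligible(2 * 1024**3))    # False: excluded by the 1 GB policy
```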

I have a customer up north who used this configuration on an HP EVA-based box 
with SATA disks. The disks were breaking down so fast that the arrays within 
the box were in a constant rebuild phase. HP claimed it was TSM dedup that was 
breaking the disks (they actually claimed TSM was writing so often that the 
disks broke), a scenario I find very hard to believe.

Best Regards

Daniel



Daniel Sparrman
Exist i Stockholm AB
Switchboard: 08-754 98 00
Fax: 08-754 97 30
daniel.sparr...@exist.se
http://www.existgruppen.se
Posthusgatan 1 761 30 NORRTÄLJE



-ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU wrote: -


To: ADSM-L@VM.MARIST.EDU
From: Colwell, William F. bcolw...@draper.com
Sent by: ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU
Date: 09/28/2011 20:43
Subject: Re: [ADSM-L] Re: [ADSM-L] Re: [ADSM-L] vtl versus file 
systems for primary pool

Hi Daniel,

 

I remember hearing about a 6 TB limit for dedup in a webinar or conference call,
but what I recall is that that was a daily throughput limit.  In the same 
section of the redbook as you quote is this paragraph -

Experienced administrators already know that Tivoli Storage Manager database 
expiration was one of the more processor-intensive activities on a Tivoli 
Storage Manager Server. Expiration is still processor intensive, albeit less 
so in Tivoli Storage Manager V6.1, but this is now second to deduplication in 
terms of consumption of processor cycles. Calculating the MD5 hash for each 
object and the SHA1 hash for each chunk is a processor intensive activity.

 

I can say this is absolutely correct; my processor is frequently running at or 
near 100%.

I have gone way beyond 6 TB of storage for dedup storagepools, as this SQL 
shows for the 2 instances on my server -
 

select cast(stgpool_name as char(12)) as Stgpool, -
   cast(sum(num_files) / 1024 / 1024 as decimal(4,1)) as "Mil Files", -
   cast(sum(physical_mb) / 1024 / 1024 as decimal(4,1)) as Physical_TB, -
   cast(sum(logical_mb) / 1024 / 1024 as decimal(4,1)) as Logical_TB, -
   cast(sum(reporting_mb) / 1024 / 1024 as decimal(4,1)) as Reporting_TB -
from occupancy -