Re: Tape Reliability Recommendations

2003-02-20 Thread PAC Brion Arnaud
Hi Peter,

I sincerely empathize, as I went through the same martyrdom some months
ago.
We had a brand new 3584 library, equipped with SCSI LTO drives, attached
to the server through 2108 SAN Data Gateways and Fibre Channel. After
months of investigation, upgrades of all kinds, drive exchanges, etc.,
we finally found that the fiber cables were a little too long, which
generated timeout errors and subsequent tape failures.
I don't know if this could be your case, but it is worth taking a look at,
if you have not already done so!
My 2 cents!

Arnaud  

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
| Arnaud Brion, Panalpina Management Ltd., IT Group |
| Viaduktstrasse 42, P.O. Box, 4002 Basel - Switzerland |
| Phone: +41 61 226 19 78 / Fax: +41 61 226 17 01   | 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=



Re: Tape Reliability Recommendations

2003-02-19 Thread Stapleton, Mark
-Original Message-
From: Kelly J. Lipp [mailto:[EMAIL PROTECTED]]
As for reliability.  That turns out to be a very mixed bag.
...snip...
For instance, we have a site with a large fiber channel and LTO
configuration.  No end to the problems so far and they are very serious
problems.  Is this a result of the tape technology?  I doubt it, but one
never knows, do one?

From: Peter Ford [mailto:[EMAIL PROTECTED]]
I would be very curious to hear what type of reliability
problems you have seen with LTO.  I have posted here before,
but we have been experiencing an incredibly high number of
read errors with our 3584 LTO library.  We regularly see
errors when trying to restore data from tapes.  We have been
auditing volumes recently and have seen errors on a tape
during one audit, and then audit again, with no errors.
There is no discernable pattern to these errors (across
multiple tapes and multiple drives).  Due to the nature of
the data we are backing up, the data does not change often
(and therefore the tapes are generally written to once, and
the data stays there), so over-used tapes should not be an issue.

Anything that you could share with the list, or me directly,
would be greatly appreciated.

A problem we've had several times up here in the Great Dry North this
winter has been an environmental one. The 3583 library is fairly
vulnerable to a lack of humidity. While the docs say that 20% is the
minimum required for proper operation, we've found that 40% is really
the minimum needed, particularly in server rooms that are not really
equipped as server rooms; i.e., carpet on the floor, no raised floor,
lots of foot traffic, etc. If you can scoot your feet around, touch the
outside of the library cabinet, and get **zapped**, you've got a
problem. (Your server room should be at 40% in any event; tape 'floats'
best across tape heads at that humidity.)

IBM found a workaround for the lack of humidity. At two sites I've been
to, they've taken one of those yellow-and-green grounding straps they
use to ground mainframe boxes, and attached the library's outside panel
to a decent ground. One customer went from multiple, daily, severe
problems to no problems at all in one day. (Many thanks go to Bryan
Hanson, IBM tape Top Gun, for the fix.)

The problem is that the 3583 is a complex machine that combines many
moving mechanical parts and electronics in a relatively small metal box.
When you send a charge through the box, its relatively small surface
area allows a substantial charge through the box, rather than
dissipating it across its surface. Larger libraries (like the 3584) can
dissipate the charge faster and are therefore less vulnerable.

Don't trust those Wal-Mart temperature/humidity meters. If you're having
3583 problems that can't seem to get fixed, and your environment looks
like the one I've described above, get a good meter and check your
server room.
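
As a rough illustration of the two thresholds mentioned above (20% per the
docs, 40% in practice), a reading from a decent meter could be classified
with a few lines of Python. This is only a sketch: the function name and the
wording of the three categories are assumptions made for the example, not
anything from IBM's documentation.

```python
# Thresholds taken from the message above; everything else is illustrative.
DOCUMENTED_MIN_RH = 20.0  # minimum relative humidity (%) according to the docs
FIELD_MIN_RH = 40.0       # minimum (%) found to be needed in practice


def classify_humidity(rh_percent):
    """Classify a relative-humidity reading (in percent) for the server room."""
    if not 0.0 <= rh_percent <= 100.0:
        raise ValueError("relative humidity must be between 0 and 100")
    if rh_percent < DOCUMENTED_MIN_RH:
        return "below documented minimum"
    if rh_percent < FIELD_MIN_RH:
        return "within spec, static risk likely"
    return "ok"


if __name__ == "__main__":
    for reading in (15.0, 30.0, 45.0):
        print(reading, classify_humidity(reading))
```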

--
Mark Stapleton ([EMAIL PROTECTED]) 



Re: Tape Reliability Recommendations

2003-02-19 Thread Prather, Wanda
Adding a bit of my own experience to Kelly's:

Even though we are all using TSM, we use the hardware differently.

At one time we had two DLT libraries, with libraries and drives provided by
the same vendor.
Identical hardware, manufacturer, media, and microcode, same level of TSM
running each.

With one we got perfect reliability; the other was a nightmare - constant read
and write errors, and many drive failures.  We couldn't stop the problems,
even after replacing all the drives, more than once.  After a LOT of time
spent with the vendor and hardware gurus, we finally learned:

Part of it is just the total load on the hardware, and the other part is the
TYPE of backups you run.

- If you are doing a LOT OF TINY files - for example, workstation/desktop
backups - you will get a tremendous amount of start/stop activity (or call
it backhitch, or repositioning, whatever) during migration to the tape, and
during reclaims.  TSM uses tape almost like a direct access device, and this
pushes the media and the drive to their max capability.  You need the best
drive mechanics and the best media you can buy.  And you will need to be on
GOOD TERMS with your vendor and stay on top of those microcode levels.

- If you are dumping a FEW BIG FILES daily - for example, huge databases -
you tend to write one or two big files on the tape and you're done.  The
tape sits in your vault for a while, then you do the same thing to the tape
again.  Even though you may be writing MORE GB PER DAY than with the
workstation model above, it's far less stress on the drives and the media.
You will probably get better reliability on your drives/media than someone
doing workstation backups with the same hardware.
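
To put rough numbers on the two workload types above, the sketch below
counts how many files (and therefore file boundaries) a given amount of data
contains. The data volumes and average file sizes are made-up values for
illustration, and treating every file boundary as a possible start/stop is a
deliberate simplification: real drives buffer data, so this is at most an
upper bound, not a count of actual repositions.

```python
GB = 1024 ** 3
KB = 1024


def file_boundaries(total_bytes, avg_file_bytes):
    """Number of files needed to hold total_bytes at a given average size."""
    if total_bytes < 0 or avg_file_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Integer ceiling division, so a partial last file still counts as a file.
    return -(-total_bytes // avg_file_bytes)


if __name__ == "__main__":
    # Hypothetical workstation workload: 100 GB made of 50 KB files.
    workstation = file_boundaries(100 * GB, 50 * KB)
    # Hypothetical database workload: 200 GB made of 20 GB dumps.
    database = file_boundaries(200 * GB, 20 * GB)
    print("workstation files:", workstation)
    print("database files:", database)
```

Under these assumed sizes the database workload moves twice the data with a
tiny fraction of the file boundaries, which is the point made above.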

My opinion and nobody else's,
Wanda Prather





Re: Tape Reliability Recommendations

2003-02-18 Thread Peter Ford
 -Original Message-
 From: Kelly J. Lipp [mailto:[EMAIL PROTECTED]]
 Sent: Tuesday, February 18, 2003 10:32 AM
 To: [EMAIL PROTECTED]
 Subject: Re: Tape Reliability Recommendations

 As for reliability.  That turns out to be a very mixed bag.  

...snip...

 For instance, we have a site with a large fiber channel and LTO
 configuration.  No end to the problems so far and they are very serious
 problems.  Is this a result of the tape technology?  I doubt it, but one
 never knows, do one?

I would be very curious to hear what type of reliability problems you have seen with 
LTO.  I have posted here before, but we have been experiencing an incredibly high 
number of read errors with our 3584 LTO library.  We regularly see errors when trying 
to restore data from tapes.  We have been auditing volumes recently and have seen 
errors on a tape during one audit, and then audit again, with no errors.  There is no 
discernable pattern to these errors (across multiple tapes and multiple drives).  Due 
to the nature of the data we are backing up, the data does not change often (and 
therefore the tapes are generally written to once, and the data stays there), so 
over-used tapes should not be an issue.  
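
One way to look for a pattern in audits like these is to record every audit
pass and flag the volumes that showed errors in one pass but came back clean
in another. The Python below is a minimal sketch of that bookkeeping; the
record layout (volume, drive, error count) and the sample volume and drive
names are hypothetical, and the data would have to be collected by hand or
from the server activity log.

```python
from collections import defaultdict


def intermittent_volumes(audit_runs):
    """Return volumes that had errors in some audit passes but not in others.

    audit_runs is an iterable of (volume, drive, error_count) tuples, one
    tuple per audit pass. This layout is an assumption made for the sketch.
    """
    passes_by_volume = defaultdict(list)
    for volume, drive, errors in audit_runs:
        passes_by_volume[volume].append((drive, errors))

    flagged = {}
    for volume, passes in passes_by_volume.items():
        had_errors = any(errors > 0 for _, errors in passes)
        had_clean = any(errors == 0 for _, errors in passes)
        if had_errors and had_clean:
            flagged[volume] = passes
    return flagged


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    runs = [
        ("VOL001", "drive1", 3),
        ("VOL001", "drive2", 0),
        ("VOL002", "drive1", 0),
        ("VOL003", "drive2", 5),
        ("VOL003", "drive1", 2),
    ]
    for volume, passes in intermittent_volumes(runs).items():
        print(volume, passes)
```

Keeping the drive alongside each pass makes it easier to see whether the
inconsistent results follow a particular drive or spread across all of them.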

Anything that you could share with the list, or me directly, would be greatly 
appreciated.

Thanks.
Peter

Peter Ford
System Engineer


Stentor, Inc.
 5000 Marina Blvd, 
 Brisbane, CA 94005-1811  
 Main Phone: 650-228-
 Fax: 650 228-5566 
 http://www.stentor.com
 [EMAIL PROTECTED] 

 



Re: Tape Reliability Recommendations

2003-02-18 Thread Peter Pijpelink - P.L.C.S. BV Storage Consultants

Peter Ford wrote:

 I would be very curious to hear what type of reliability problems you have
 seen with LTO.  I have posted here before, but we have been experiencing
 an incredibly high number of read errors with our 3584 LTO library.  We
 regularly see errors when trying to restore data from tapes.  We have been
 auditing volumes recently and have seen errors on a tape during one audit,
 and then audit again, with no errors.  There is no discernable pattern to
 these errors (across multiple tapes and multiple drives).  Due to the
 nature of the data we are backing up, the data does not change often (and
 therefore the tapes are generally written to once, and the data stays
 there), so over-used tapes should not be an issue.


Without much thinking, I can name 5 to 6 customers with either a 3583 or
3584 with lots of errors on tapes. For the 3583 this is due to the quality
of the library.
For the 3584, things have improved since microcode 25D4 was installed on the
(FC) drives. We keep the library on firmware 2460. Since then, the IBM
engineers no longer have to come onsite every week to replace drives with
tapes stuck inside.

We also found that it is not a matter of tape brand: IBM and Imation tapes
give an equal number of errors.

I think it can only be solved by continually checking and auditing volumes;
in some environments we even moved the data off the tape and removed that
cartridge.

Another reliability issue is the internal StorWatch Specialist, which is
running, or not: oh yes, it works; no, it does not. Even setting
everything to 10 Mbit half duplex did not solve this issue.

I would move the data from tape to tape every year to avoid tape problems.

Good luck

Peter



Re: Tape Reliability Recommendations

2003-02-18 Thread Greg Redell
We were getting a fair amount of 36 errors on our 3584 and they
corresponded to I/O errors I was seeing in TSM.  After talking to IBM and
our CE, we wound up updating our library firmware to 3060 and (knock on
wood) we haven't seen any errors since.

Library 3584
4 fiber attached LTO drives
Library firmware: 3060
Drive firmware: 25D4
TSM server 5.1.1.6

Greg Redell
Great-West Life & Annuity Insurance Co.
Phone: 314-525-5877
Email: [EMAIL PROTECTED]



Re: Tape Reliability Recommendations

2003-02-18 Thread Kelly J. Lipp
I have done a significant amount of testing and have quite a lot of
practical experience with what I will refer to as the Big Three tape
technologies:

AIT3
LTO1
SDLT320

Of the three, I know AIT the best.  It's always good to know where someone
sits before they tell you where they stand.

In a TSM environment, all three of these technologies perform very
similarly: within 10-15% of each other.  Don't let the manufacturer's
performance claims sway your decision.  Backup is generally not about the
hardware but more about the software.  TSM is quite powerful, but often
trades power for speed.  Remember, we have a sophisticated database running
here to track what's going on.

For all three drives, we are able to sustain between 35 and 45 GBytes per
hour during storage pool to storage pool operations, for instance,
migration from disk to tape or BACKUP STGPOOL from tape to tape.  In
addition, you can expect to see about the same performance when clients are
writing data directly to tape (or even to multiple tapes simultaneously when
using the storage pool parameter COPYSTGPOOLS).  When sizing an environment,
use the 35 GB/hour number and you won't be unhappy.
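
As a rough illustration of the sizing rule above, the arithmetic can be
sketched in Python. The 35 GB/hour figure is the conservative rate quoted
above, and the 135 GB and 300 GB daily volumes come from Colby's original
question in this thread. The function name and the assumption that
throughput scales linearly with the number of drives are illustrative only.

```python
def hours_needed(daily_gb, rate_gb_per_hour=35.0, drives=1):
    """Estimate the hours needed to move daily_gb at a sustained per-drive rate.

    Assumes throughput scales linearly with the number of drives, which is
    an optimistic simplification rather than a measured result.
    """
    if daily_gb < 0 or rate_gb_per_hour <= 0 or drives < 1:
        raise ValueError("daily_gb must be >= 0, rate > 0, drives >= 1")
    return daily_gb / (rate_gb_per_hour * drives)


if __name__ == "__main__":
    for gb in (135, 300):
        print(f"{gb} GB/day at 35 GB/hour: {hours_needed(gb):.1f} hours")
```

At the conservative rate, 300 GB works out to a bit under nine hours on a
single drive, so the upper end of that daily range would already fill a
typical overnight window.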

As for reliability.  That turns out to be a very mixed bag.  I have seen
sites with high volumes of data and no errors or problems with all three and
I have seen sites with numerous problems.  The problems seem to be mostly
related to drive firmware levels and tape batches.  Once the drive firmware
is correct and bad tapes are eliminated, most sites settle down nicely.  The
more complex the environment, the more likely the problems.  For instance,
we have a site with a large fiber channel and LTO configuration.  No end to
the problems so far and they are very serious problems.  Is this a result of
the tape technology?  I doubt it, but one never knows, do one?

Due to the nature of AIT3, I would suspect that overall reliability numbers
will be lower than for LTO and SDLT, but my hands-on experience doesn't show
that.

As for Automation.  There are gazillions of libraries for each technology.
Clear winners in my opinion are Qualstar and perhaps IBM.  I give the IBM
libraries a perhaps as we have had very good experience with the 349x
libraries and only limited experience with 3584.  These seem OK, but not
much experience.  The lower end IBM libraries are based on someone else's
technology so I would think one might get a better deal buying direct from
that manufacturer.

Compatibility with previous technology.  Some DLT bigots are SDLT bigots
because they believe in investment protection.  I think that's balderdash as
very few people would ever try to read a DLT tape with an SDLT drive anyway
so what difference does it make?  All three of these are relatively new
technologies and you are going to switch to one anyway, so investigate all
three.

The all important Kelly recommendation:

For value, AIT3 is unsurpassed: very good performance, relatively
inexpensive, great automation, manufactured by one company so technology is
first rate.

For openness (or perceived openness) LTO: excellent performance, reasonably
priced, so-so automation, standards based and built by more than one
manufacturer (but how many of us are going to buy from more than one anyway
and if you attend presentations by each one about their LTO product you come
away from each one in succession thinking you have found the best, i.e.,
they all lie equally convincingly (probably shouldn't have two ly words in
the same sentence)).

For perceived technical excellence, SDLT: Quantum has very neat technology
in their drives.  Does it matter much?  Probably not, but cool anyway.

So:

For the price conscious: AIT3 going to AIT4 when available.
If you're an Open kind of dude: LTO
If you believe in Quantum: SDLT.  They offer a very good product IMHO.

LTO and SDLT will be very close in price so go with your gut.

As always, study, study, study.  Get input from those you respect.  Choose
wisely and then get and stay behind your choice.  STORServer supports all
three technologies equally.

Views expressed here are my own.

Kelly J. Lipp
STORServer, Inc.
485-B Elkton Drive
Colorado Springs, CO 80907
[EMAIL PROTECTED] or [EMAIL PROTECTED]
www.storsol.com or www.storserver.com
(719)531-5926
Fax: (240)539-7175



Tape Reliability Recommendations

2003-02-13 Thread Colby Morgan
We are currently running TSM 5.1.5 on Win2k with an IBM Mammoth-2 drive for
offsite copypools.  We have had problems with both our onsite M2 and offsite
M2 drives at our disaster recovery center.  IBM has replaced the drive more
than a dozen times in the last two years and Exabyte has replaced countless
tapes.  Most recently we are experiencing a high rate of media write
failures on a newly replaced drive as well as media read failures in DR
testing, both using brand new 225m AME media.

Is anybody else out there running an IBM/Exabyte Mammoth-2 drive and if so
what kind of results do you see?

My real question is what are the most common/reliable removable tape
technologies for the Intel TSM environment?  We are considering switching
technologies and I wanted to solicit testimonies on other technologies (DLT,
LTO, SDLT, etc.).

We currently copy around 135GB to 300GB offsite daily.


Thanks,

Colby



Re: Tape Reliability Recommendations

2003-02-13 Thread Steve Bennett
Colby,

We have been using the Exabyte M2 drives and tapes for a couple of years
now. We did have lots of problems and wore out several drives before we
found out that running our four-drive library on one 29160
interface was causing the drives and tapes to wear out. We reconfigured
and now have only two drives on each 29160, and the problems have almost
completely ceased.
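
The reconfiguration described above amounts to spreading the drives over
enough adapters that no adapter carries more than two. A minimal Python
sketch of that grouping follows; the drive names are placeholders and the
limit of two per adapter is simply the figure that worked at this site, not
a general rule.

```python
def assign_drives(drives, max_per_adapter=2):
    """Group drives into per-adapter lists of at most max_per_adapter each."""
    if max_per_adapter < 1:
        raise ValueError("max_per_adapter must be at least 1")
    return [
        drives[i:i + max_per_adapter]
        for i in range(0, len(drives), max_per_adapter)
    ]


if __name__ == "__main__":
    # Placeholder names for the four drives in the library.
    layout = assign_drives(["drive1", "drive2", "drive3", "drive4"])
    for adapter_number, group in enumerate(layout, start=1):
        print(f"adapter {adapter_number}: {group}")
```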

We have two TSM server sites. Total primary pool storage use is about
3 TB. In site 1 we have one library with 80 slots and four M2 drives for
our primary tapepool. We also have two external M2 drives that we create
offsite copypool tapes with. The tapes from one copypool go across town
by courier and the other copypool tapes are sent DHL to site 2 in
another city. At site 2 we have a 3494 with two 3590E drives for our
primary tapepool and use the same drives for creating copypool tapes
that go across the street. Another set of external M2 drives create
copypool tapes that are sent DHL to site 1. Each site is the disaster
recovery site for the other.

I have run a complete recovery test using the media that was sent from
site 1 to site 2 and had no problems reading the M2 tapes.

The M2 drives are better than the M2 media. Heavy usage of the tapes
would require regular replacement. That is unlike the 3590 drives and tapes,
which we beat to death and which, as they say, take a lickin' and keep on
tickin'.

In my experience Exabyte support is marginal at best. It takes an act of
God for them to admit that one of their tapes might be defective.

Contact me offline if you want to discuss further.


--

Steve Bennett, (907) 465-5783
State of Alaska, Information Technology Group, Technical Services
Section