Re: [zfs-discuss] Using consumer drives in a zraid2

2009-08-26 Thread Tristan Ball
The intended use is NFS storage backing some VMware servers running a
range of different VMs, including Exchange, Lotus Domino, SQL Server
and Oracle. :-) It's a very random workload, and all the research I've
done points to mirroring as the better option for total IOP/s. The
server in question is limited in the amount of RAM it can take, so the
effectiveness of both the ARC and L2ARC will be somewhat limited; I
therefore believe mirroring to be the lower-risk option from a
performance point of view. I've also got mirrored X25-E's in there for
the zil, and an X25-M for the l2arc.
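For reference, a minimal sketch of that layout - the pool name and
device names are hypothetical, and the SSD placement follows the
description above:

    # Mirrored data vdevs for random IOP/s; device names are examples only.
    zpool create tank \
        mirror c0t0d0 c0t1d0 \
        mirror c0t2d0 c0t3d0 \
        mirror c0t4d0 c0t5d0
    zpool add tank log mirror c1t0d0 c1t1d0   # slog: mirrored X25-E's
    zpool add tank cache c1t2d0               # l2arc: single X25-M
    zpool status tank                         # verify the layout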

 

Obviously some of this situation is not ideal, nor of my choosing. I'd
like to have a newer, faster server in there, and a second JBOD for more
drives, and probably another couple of X25-M's. Actually, I'd like to
just grab one of the Sun 7000 series and drop that in. :-) The only way
I'll get approval for the extra expenditure is to show that the current
system is viable in an initial, limited proof-of-concept deployment, and
one of the ways I'm doing that is to ensure I get what I believe to be
the best possible performance from my existing hardware - and I think
mirroring will do that.

 

Performance was actually more important than capacity, and I wasn't
willing to bet in advance on the ARC's effectiveness. I believe the
current system will give me the requisite IOP/s even without the L2ARC;
I added it because, for the relative cost, I considered it silly not to.
For those periods when it is effective, it really makes a difference
too!

 

T.

 



From: Tim Cook [mailto:t...@cook.ms] 
Sent: Wednesday, 26 August 2009 3:48 PM
To: Tristan Ball
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Using consumer drives in a zraid2

 

 

On Wed, Aug 26, 2009 at 12:27 AM, Tristan Ball
tristan.b...@leica-microsystems.com wrote:

The remaining drive would only have been flagged as dodgy if the bad
sectors had been found, hence my comments (and general best practice)
about data scrubs being necessary. While I agree it's quite possible
that the enterprise drive would flag errors earlier, I wouldn't
necessarily bet on it. Just because a given sector has successfully been
read a number of times before doesn't guarantee that it will be read
successfully again, and again the enterprise drive doesn't try as hard.
In the absence of scrubs, resilvering can be the hardest thing the drive
does, and in my experience is likely to show up errors that haven't
occurred before. But you make a good point about retrying the resilver
until it works, presuming I don't hit a "too many errors, device
faulted" condition. :-)

 

I would have liked to go RaidZ2, but performance has dictated mirroring.
Physical, financial and capacity constraints have conspired together to
restrict me to 2-way mirroring rather than 3-way, which would have been
my next choice. :-)

 

 

Regards

Tristan

 

(Who is now going to spend the afternoon figuring out how to win lottery
by osmosis: http://en.wikipedia.org/wiki/Osmosis :-) )


My suggestion/question/whatever would be: why wouldn't raidz + an SSD
L2ARC meet both financial and performance requirements?  It would
literally be a first for me.

--Tim 





Re: [zfs-discuss] Using consumer drives in a zraid2

2009-08-26 Thread Ross
Here's one horror story of mine - ZFS taking over 20 minutes to flag a drive as 
faulty, with the entire pool responding so slowly during those 20 minutes that 
it crashed six virtual machines running off the pool:
http://www.opensolaris.org/jive/thread.jspa?messageID=369265#369265

There are some performance tweaks mentioned in that thread, but I haven't been 
able to test their effectiveness yet, and I'm still concerned.  A pool 
consisting of nothing but three way mirrors should not even break a sweat when 
faced with a single drive problem.

When you're implementing what's sold as the most advanced file system ever, 
billed as good enough to obsolete raid controllers, you don't expect to be 
doing manual tweaks just so it can cope with a drive failure without hanging 
the entire pool.

ZFS has its benefits, but if you're not running it on Sun hardware you need to 
do a *lot* of homework.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Using consumer drives in a zraid2

2009-08-26 Thread Michael Herf
I'm using the Caviar Green drives in a 5-disk config.

I downloaded the WDTLER utility and set all the drives to have a 7-second 
timeout, like the RE series have.

WDTLER boots as a small DOS app, and you have to hit a key for each
drive you adjust, so this might take time for a large raidz2.
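For SATA drives that expose SCT ERC, newer smartmontools builds can set
the same timeout from the OS without a DOS boot. A sketch, assuming a
drive and smartctl version that support it (the device path is an
example, and on many drives the setting reverts at power cycle):

    # query current error-recovery-control settings (read, write)
    smartctl -l scterc /dev/rdsk/c1t0d0
    # set a 7.0 second timeout; values are in tenths of a second
    smartctl -l scterc,70,70 /dev/rdsk/c1t0d0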


Re: [zfs-discuss] Using consumer drives in a zraid2

2009-08-26 Thread Thomas Burgess
I've been running ZFS under FreeBSD, which is experimental, and I've had
nothing but great luck... so I guess it depends on a number of things.  I
went with FreeBSD because the hardware I had wasn't supported in
Solaris... I expected problems but honestly, it's been rock solid... it's
survived all kinds of things... power failures, a drive failure, partial
drive failure and a power supply dying...
I'm running 3 raidz1 VDEVs, all with 1TB drives.   I've asked in the FreeBSD
forums for horror stories because I see SO many on this thread but so far,
no one has had ANYTHING bad to say about ZFS in FreeBSD.

I'm sure this is somewhat due to the fact that it's not as popular on
FreeBSD yet, but I still expected to hear SOME horror stories... From my
experience and the experiences of the other people in the FreeBSD forums,
it's been great running on both non-Sun hardware AND software.


On Wed, Aug 26, 2009 at 3:06 AM, Ross myxi...@googlemail.com wrote:

 Here's one horror story of mine - ZFS taking over 20 minutes to flag a
 drive as faulty, with the entire pool responding so slowly during those 20
 minutes that it crashed six virtual machines running off the pool:
 http://www.opensolaris.org/jive/thread.jspa?messageID=369265#369265

 There are some performance tweaks mentioned in that thread, but I haven't
 been able to test their effectiveness yet, and I'm still concerned.  A pool
 consisting of nothing but three way mirrors should not even break a sweat
 when faced with a single drive problem.

 When you're implementing what's sold as the most advanced file system ever,
 billed as good enough to obsolete raid controllers, you don't expect to be
 doing manual tweaks just so it can cope with a drive failure without hanging
 the entire pool.

 ZFS has its benefits, but if you're not running it on Sun hardware you need
 to do a *lot* of homework.



Re: [zfs-discuss] Using consumer drives in a zraid2

2009-08-26 Thread Bob Friesenhahn

On Wed, 26 Aug 2009, Tristan Ball wrote:


Complete disk failures are comparatively rare, while media or transient
errors are far more common. As a media I/O or transient error on the


It seems that this assumption is not always the case.  The expensive
small-capacity SCSI/SAS enterprise drives rarely experience media
errors, so total drive failure becomes a larger factor.  Large-capacity
SATA drives tend to report many more media failures, while their
whole-drive failure rate is perhaps not much worse than that of
enterprise SCSI/SAS drives.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/


Re: [zfs-discuss] Using consumer drives in a zraid2

2009-08-26 Thread Troels Nørgaard Nielsen

Hi Tim Cook.

If I was building my own system again, I would prefer not to go with
consumer harddrives.
I had a raidz pool containing eight drives on a snv108 system; after
rebooting, four of the eight drives were so broken they could not be
seen by format, let alone the zpool they belonged to.


This was with Samsung HD103UJ revision 1112 and 1113 disks. No kind of
hotspare, raidz2 or 3-way mirror would have saved me, so it was RMA the
drives, buy some new ones and restore from backup. The controller was an
LSI1068E - a cheap USB-to-SATA adapter could see the disks, but with
massive stalls and errors.
These disks were at the time the cheapest 1 TB disks available; I
understand why now.


But I'm still stuck with 6 of them in my system ;-(

Best regards
Troels Nørgaard

On 26/08/2009 at 07.46, Tim Cook wrote:


On Wed, Aug 26, 2009 at 12:22 AM, thomas tjohnso...@gmail.com wrote:
 I'll admit, I was cheap at first and my
 fileserver right now is consumer drives.  You
 can bet all my future purchases will be of the enterprise grade.  And
 guess what... none of the drives in my array are more than 5 years old,
 so even if they did die, and I had bought the enterprise versions,
 they'd be covered.

Anything particular happen that made you change your mind? I started  
with
enterprise grade because of similar information discussed in this  
thread.. but I've
also been wondering how zfs holds up with consumer level drives and  
if I could save
money by using them in the future. I guess I'm looking for horror  
stories that can be

attributed to them? ;)


When it comes to my ZFS project, I am currently lacking horror
stories.  When it comes to "what the hell, this drive literally
failed a week after the warranty was up", I unfortunately PERSONALLY
have 3 examples.  I'm guessing (hoping) it's just bad luck.  Perhaps
the luck wasn't SO bad though, as I had backups of all of those
(proof you should never rely on a single drive to last up to, or
beyond, its warranty period).


--Tim






Re: [zfs-discuss] Using consumer drives in a zraid2

2009-08-26 Thread Joseph L. Casale
If I was building my own system again, I would prefer not to go with
consumer harddrives. I had a raidz pool containing eight drives on a
snv108 system; after rebooting, four of the eight drives were so broken
they could not be seen by format, let alone the zpool they belonged to.

This was with Samsung HD103UJ revision 1112 and 1113 disks. No kind of
hotspare, raidz2 or 3-way mirror would have saved me, so it was RMA the
drives, buy some new ones and restore from backup. The controller was an
LSI1068E - a cheap USB-to-SATA adapter could see the disks, but with
massive stalls and errors. These disks were at the time the cheapest
1 TB disks available; I understand why now.

I can attest to the same experience with very nearly the same hardware.
My next non-critical system will not use consumer HD's either. The
stalling issue seemed to vary with different hardware but left me
chasing my tail endlessly. I am sure Seagate has a bulletin with my
name on it...

jlc


Re: [zfs-discuss] Using consumer drives in a zraid2

2009-08-26 Thread Adam Sherman
But the real question is whether the enterprise drives would have  
avoided your problem.


A.

--
Adam Sherman
+1.613.797.6819

On 2009-08-26, at 11:38, Troels Nørgaard Nielsen t...@t86.dk wrote:


Hi Tim Cook.

If I was building my own system again, I would prefer not to go with
consumer harddrives.
I had a raidz pool containing eight drives on a snv108 system; after
rebooting, four of the eight drives were so broken they could not be
seen by format, let alone the zpool they belonged to.


This was with Samsung HD103UJ revision 1112 and 1113 disks. No kind
of hotspare, raidz2 or 3-way mirror would have saved me, so it was
RMA the drives, buy some new ones and restore from backup. The
controller was an LSI1068E - a cheap USB-to-SATA adapter could see
the disks, but with massive stalls and errors.
These disks were at the time the cheapest 1 TB disks available; I
understand why now.


But I'm still stuck with 6 of them in my system ;-(

Best regards
Troels Nørgaard

On 26/08/2009 at 07.46, Tim Cook wrote:

On Wed, Aug 26, 2009 at 12:22 AM, thomas tjohnso...@gmail.com
wrote:

 I'll admit, I was cheap at first and my
 fileserver right now is consumer drives.  You
 can bet all my future purchases will be of the enterprise grade.  And
 guess what... none of the drives in my array are more than 5 years
 old, so even if they did die, and I had bought the enterprise
 versions, they'd be covered.

Anything particular happen that made you change your mind? I  
started with
enterprise grade because of similar information discussed in this  
thread.. but I've
also been wondering how zfs holds up with consumer level drives and  
if I could save
money by using them in the future. I guess I'm looking for horror  
stories that can be

attributed to them? ;)


When it comes to my ZFS project, I am currently lacking horror
stories.  When it comes to "what the hell, this drive literally
failed a week after the warranty was up", I unfortunately
PERSONALLY have 3 examples.  I'm guessing (hoping) it's just bad
luck.  Perhaps the luck wasn't SO bad though, as I had backups of
all of those (proof you should never rely on a single drive to
last up to, or beyond, its warranty period).


--Tim


Re: [zfs-discuss] Using consumer drives in a zraid2

2009-08-26 Thread Richard Elling

On Aug 25, 2009, at 9:38 PM, Tristan Ball wrote:

What I’m worried about is that time period where the pool is
resilvering to the hot spare. For example: one half of a mirror has
failed completely, and the mirror is being rebuilt onto the spare –
if I get a read error from the remaining half of the mirror, then
I’ve lost data. If the RE drives return an error for a request
that a consumer drive would have (eventually) returned, then in this
specific case I would have been better off with the consumer drive.


The difference is the error detection time. In general, you'd like
errors to be detected quickly.  The beef with the consumer drives and
the Solaris + ZFS architecture is that the drives do not return on
error, they just keep trying. So you have to wait for the sd (or other)
driver to time out the request. By default, this is on the order of
minutes. Meanwhile, ZFS is patiently awaiting a status on the request.
For enterprise class drives, there is a limited number of retries on
the disk before it reports an error. You can expect a response on the
order of 10 seconds or less. After the error is detected, ZFS can do
something about it.

All of this can be tuned, of course.  Sometimes the tuning is ok by
default, sometimes not. Until recently, the biggest gripes were against
the iscsi client, which had a hardwired 3 minute error detection. For
current builds you can tune these things without recompiling.
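As a rough sketch of the sort of tuning I mean (the exact tunables and
safe values vary by release - treat the number as an example and test
before relying on it):

    * /etc/system -- lower the per-command timeout used by the sd driver
    * (the default is 60 seconds; 10 here is an illustration, not advice)
    set sd:sd_io_time = 10

A reboot is required for /etc/system changes to take effect.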
 -- richard



Re: [zfs-discuss] Using consumer drives in a zraid2

2009-08-26 Thread Neal Pollack

On 08/25/09 10:46 PM, Tim Cook wrote:
On Wed, Aug 26, 2009 at 12:22 AM, thomas tjohnso...@gmail.com wrote:


  I'll admit, I was cheap at first and my
  fileserver right now is consumer drives.  You
  can bet all my future purchases will be of the enterprise grade.  And
  guess what... none of the drives in my array are more than 5 years
  old, so even if they did die, and I had bought the enterprise
  versions, they'd be covered.

Anything particular happen that made you change your mind? I started
with
enterprise grade because of similar information discussed in this
thread.. but I've
also been wondering how zfs holds up with consumer level drives and
if I could save
money by using them in the future. I guess I'm looking for horror
stories that can be
attributed to them? ;)



When it comes to my ZFS project, I am currently lacking horror stories.
When it comes to "what the hell, this drive literally failed a week
after the warranty was up", I unfortunately PERSONALLY have 3 examples.
I'm guessing (hoping) it's just bad luck.



Luck, or design/usage?
Let me explain: I've also had many drives fail over the last 25
years of working on computers, I.T., engineering, manufacturing,
and building my own PCs.

Drive life can be directly affected by heat.  Many home tower designs,
until the last year or two, had no cooling fans or air flow where
the drives mount.  I'd say over 80% of average desktop PCs do
not have any cooling or air flow for the drive.
(I've replaced many, many for friends.)
[HP small form factor desktops are the worst offenders
 in what I jokingly call "zero cooling" design :-)
Just look at the quantity of refurbished ones offered for sale.]

Once I started adding cooling fans for the drives in the
workstations I build, the rate of drive failures went
down by a lot.  The drive life went up by a lot.

You can still have random failures for a dozen reasons, but
heat is one of the big killers.  I did some experiments over
the last 5 years and found that ANY amount of air flow makes
a big difference.  If you run a 12 volt fan at 7 volts by
connecting its little red and black wires across the outside
of a disk drive power connector (the yellow and red wires, 12 and
5 volt; the difference is 7), then the fan is silent, moves a small
flow of air, and drops the disk drive temperature by a lot.
[Translation:  It can be as quiet as a Dell, but twice as good
since you built it :-) ]

That said, there are some garbage disk drive designs on the market.
But if a lot of yours fail early, close to warranty, they might
be getting abused or run near the max design temperature?

Neal


Perhaps the luck wasn't SO
bad though, as I had backups of all of those (proof you should never
rely on a single drive to last up to, or beyond, its warranty period).


--Tim






Re: [zfs-discuss] Using consumer drives in a zraid2

2009-08-26 Thread Tim Cook
On Wed, Aug 26, 2009 at 11:45 AM, Neal Pollack neal.poll...@sun.com wrote:


 Luck, or design/usage?
 Let me explain: I've also had many drives fail over the last 25
 years of working on computers, I.T., engineering, manufacturing,
 and building my own PCs.

 Drive life can be directly affected by heat.  Many home tower designs,
 until the last year or two, had no cooling fans or air flow where
 the drives mount.  I'd say over 80% of average desktop PCs do
 not have any cooling or air flow for the drive.
 (I've replaced many, many for friends.)
 [HP small form factor desktops are the worst offenders
  in what I jokingly call "zero cooling" design :-)
 Just look at the quantity of refurbished ones offered for sale.]

 Once I started adding cooling fans for the drives in the
 workstations I build, the rate of drive failures went
 down by a lot.  The drive life went up by a lot.

 You can still have random failures for a dozen reasons, but
 heat is one of the big killers.  I did some experiments over
 the last 5 years and found that ANY amount of air flow makes
 a big difference.  If you run a 12 volt fan at 7 volts by
 connecting its little red and black wires across the outside
 of a disk drive power connector (the yellow and red wires, 12 and
 5 volt; the difference is 7), then the fan is silent, moves a small
 flow of air, and drops the disk drive temperature by a lot.
 [Translation:  It can be as quiet as a Dell, but twice as good
 since you built it :-) ]

 That said, there are some garbage disk drive designs on the market.
 But if a lot of yours fail early, close to warranty, they might
 be getting abused or run near the max design temperature?

 Neal


I've always cooled my drives.  I just blame it on MAXTOR having horrible
designs.

Funny, everyone bagged on the 75GXP's from IBM, but I had a pair that I
bought when they first came out, used them for 5 years, then sold them to a
friend who got at least another 3 years out of them (heck, he might still be
using them for all I know).  Those Maxtors weren't worth the packaging they
came in.  I wasn't sad to see them bought up.

--Tim


Re: [zfs-discuss] Using consumer drives in a zraid2

2009-08-26 Thread thomas
Hi Richard,


 So you have to wait for the sd (or other) driver to
 time out the request. By default, this is on the order of minutes.
 Meanwhile, ZFS is patiently awaiting a status on the request. For
 enterprise class drives, there is a limited number of retries on
 the disk before it reports an error. You can expect a response on
 the order of 10 seconds or less. After the error is detected, ZFS
 can do something about it.

 All of this can be tuned, of course.  Sometimes the tuning is ok by
 default, sometimes not. Until recently, the biggest gripes were
 against the iscsi client, which had a hardwired 3 minute error
 detection. For current builds you can tune these things without
 recompiling.
   -- richard


So are you suggesting that tuning the sd driver's settings to time out
sooner when using consumer-class drives wouldn't be wise, for other
reasons?


Re: [zfs-discuss] Using consumer drives in a zraid2

2009-08-26 Thread Richard Elling

On Aug 26, 2009, at 1:17 PM, thomas wrote:


Hi Richard,



So you have to wait for the sd (or other) driver to
timeout the request. By
default, this is on the order of minutes. Meanwhile,
ZFS is patiently awaiting a status on the request. For
enterprise class drives, there is a limited number
of retries on the disk before it reports an error.
You can expect responses of success in the order of
10 seconds or less. After the error is detected, ZFS
can do something about it.

All of this can be tuned, of course.  Sometimes that
tuning is ok by default, sometimes not. Until recently, the
biggest gripes were against the iscsi client which had a
hardwired 3 minute error detection. For current
builds you can tune these things without recompiling.
 -- richard



So are you suggesting that tuning the sd driver's settings to time
out sooner when using consumer-class drives wouldn't be wise, for
other reasons?


Unfortunately, it requires skill and expertise :-(. Getting it
wrong can lead to an unstable system. For this reason, the
main users of such tuning are the large integrators or IHVs.

Note: this sort of thing often does not have an immediate or
obvious impact. But it can have an impact when a failure or
unanticipated event occurs. In other words, when the going
gets tough, serviceability can be negatively impacted. Murphy's
law implies that will happen at the worst possible moment.
 -- richard



Re: [zfs-discuss] Using consumer drives in a zraid2

2009-08-25 Thread Tristan Ball
I guess it depends on whether or not you class the various Raid
Edition drives as consumer? :-)

My one concern with these RE drives is that because they will return
errors early rather than retry, they may fault when a normal
consumer drive would have returned the data eventually. If the pool is
already degraded due to a bad device, that could mean faulting the
entire pool.

Regards,
Tristan

-Original Message-
From: zfs-discuss-boun...@opensolaris.org
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of thomas
Sent: Wednesday, 26 August 2009 12:55 PM
To: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Using consumer drives in a zraid2

Are there *any* consumer drives that don't respond for a long time
trying to recover from an error? In my experience they all behave this
way, which has been a nightmare on hardware raid controllers.



Re: [zfs-discuss] Using consumer drives in a zraid2

2009-08-25 Thread Tim Cook
On Tue, Aug 25, 2009 at 10:56 PM, Tristan Ball 
tristan.b...@leica-microsystems.com wrote:

 I guess it depends on whether or not you class the various Raid
 Edition drives as consumer? :-)

 My one concern with these RE drives is that because they will return
 errors early rather than retry, they may fault when a normal
 consumer drive would have returned the data eventually. If the pool is
 already degraded due to a bad device, that could mean faulting the
 entire pool.

 Regards,
Tristan



Having it return errors when they really exist isn't a bad thing.  What it
boils down to is: you need a hot spare.  If you're running raid-z and a
drive fails, the hot spare takes over.  Once it's done resilvering, you send
the *bad drive* back in for an RMA.  (An added bonus is that the REs have a
5-year instead of 3-year warranty.)
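A sketch of that spare handling, with hypothetical pool and device
names (ZFS activates a designated spare automatically when a drive
faults):

    zpool add tank spare c0t7d0        # designate a hot spare
    # after the failed drive is swapped for the RMA replacement:
    zpool replace tank c0t3d0 c0t8d0   # resilver onto the new drive
    zpool detach tank c0t7d0           # return the spare to standby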

You seem to be upset that the drive is more conservative about fail modes.
 To me, that's a good thing.  I'll admit, I was cheap at first and my
fileserver right now is consumer drives.  You can bet all my future
purchases will be of the enterprise grade.  And guess what... none of the
drives in my array are more than 5 years old, so even if they did die, and
I had bought the enterprise versions, they'd be covered.

--Tim


Re: [zfs-discuss] Using consumer drives in a zraid2

2009-08-25 Thread Tristan Ball
Not upset as such :-)

 

What I'm worried about is that time period where the pool is resilvering
to the hot spare. For example: one half of a mirror has failed completely,
and the mirror is being rebuilt onto the spare - if I get a read error
from the remaining half of the mirror, then I've lost data. If the RE
drives return an error for a request that a consumer drive would have
(eventually) returned, then in this specific case I would have been
better off with the consumer drive.

 

That said, my initial ZFS systems are built with consumer drives, not
Raid Editions, as much as anything because we got burned by some early RE
drives in some of our existing raid boxes here, so I had a general low
opinion of them. However, having done a little more reading about the
error recovery time stuff, I will also be putting in RE drives for the
production systems, and moving the consumer drives to the DR systems.

 

My logic is pretty straightforward:

 

Complete disk failures are comparatively rare, while media or transient
errors are far more common. As a media I/O or transient error on the
drive can affect the performance of the entire pool, I'm best off with
the RE drives to mitigate that. The risk of a double disk failure as
described above is partially mitigated by regular scrubs. The impact of
a double disk failure is mitigated by send/recv'ing to another box, and
catastrophic and human failures are partially mitigated by backing the
whole lot up to tape. :-)
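For what it's worth, those mitigation layers reduce to a handful of
commands - the pool, dataset, host and snapshot names below are made up:

    # regular scrubs surface latent media errors while redundancy is intact
    zpool scrub tank
    # incremental send/recv to the DR box mitigates a double disk failure
    zfs snapshot tank/vmstore@2009-08-26
    zfs send -i tank/vmstore@2009-08-25 tank/vmstore@2009-08-26 | \
        ssh drhost zfs recv -F drpool/vmstore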

 

Regards

Tristan.

 

 

 



From: Tim Cook [mailto:t...@cook.ms] 
Sent: Wednesday, 26 August 2009 2:08 PM
To: Tristan Ball
Cc: thomas; zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Using consumer drives in a zraid2

 

 

On Tue, Aug 25, 2009 at 10:56 PM, Tristan Ball
tristan.b...@leica-microsystems.com wrote:

I guess it depends on whether or not you class the various Raid
Edition drives as consumer? :-)

My one concern with these RE drives is that because they will return
errors early rather than retry, they may fault when a normal
consumer drive would have returned the data eventually. If the pool is
already degraded due to a bad device, that could mean faulting the
entire pool.

Regards,
   Tristan



Having it return errors when they really exist isn't a bad thing.  What
it boils down to is: you need a hot spare.  If you're running raid-z and
a drive fails, the hot spare takes over.  Once it's done resilvering,
you send the *bad drive* back in for an RMA.  (An added bonus is that
the REs have a 5-year instead of 3-year warranty.)

You seem to be upset that the drive is more conservative about fail
modes.  To me, that's a good thing.  I'll admit, I was cheap at first
and my fileserver right now is consumer drives.  You can bet all my
future purchases will be of the enterprise grade.  And guess what...
none of the drives in my array are more than 5 years old, so even if
they did die, and I had bought the enterprise versions, they'd be
covered.

--Tim





Re: [zfs-discuss] Using consumer drives in a zraid2

2009-08-25 Thread Tim Cook
On Tue, Aug 25, 2009 at 11:38 PM, Tristan Ball 
tristan.b...@leica-microsystems.com wrote:

  Not upset as such :-)



  What I’m worried about is that time period where the pool is resilvering to
 the hot spare. For example: one half of a mirror has failed completely, and
 the mirror is being rebuilt onto the spare – if I get a read error from the
 remaining half of the mirror, then I’ve lost data. If the RE drives return
 an error for a request that a consumer drive would have (eventually)
 returned, then in this specific case I would have been better off with the
 consumer drive.



 That said, my initial ZFS systems are built with consumer drives, not Raid
 Editions, as much as anything because we got burned by some early RE drives
 in some of our existing raid boxes here, so I had a general low opinion of
 them. However, having done a little more reading about the error recovery
 time stuff, I will also be putting in RE drives for the production systems,
 and moving the consumer drives to the DR systems.



 My logic is pretty straightforward:



 Complete disk failures are comparatively rare, while media or transient
 errors are far more common. As a media I/O or transient error on the drive
 can affect the performance of the entire pool, I’m best off with the RE
 drives to mitigate that. The risk of a double disk failure as described
 above is partially mitigated by regular scrubs. The impact of a double disk
 failure is mitigated by send/recv’ing to another box, and catastrophic and
 human failures are partially mitigated by backing the whole lot up to tape.
 :-)



 Regards

 Tristan.


The part you're missing is that the good drive should have been flagged
bad long, long before a consumer drive would have.  That being the case,
the odds are far, far less likely you'd get a bad read from an enterprise
drive than that you would get a good read from a consumer drive constantly
retrying.  That's ignoring the fact you could re-issue the resilver
repeatedly until you got the response you wanted from the good drive.
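In ZFS terms that retry is just clearing the errors and letting the
resilver run again - a sketch, with a hypothetical pool and device:

    zpool clear tank c0t1d0   # clear errors; a faulted device resilvers again
    zpool status -v tank      # watch for the resilver to complete cleanly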

In any case, unless performance is absolutely forcing you to do otherwise,
if you're that paranoid just do a raid-z2/3, and you won't have to worry
about it.  The odds of 4 drives not returning valid data are so slim (even
among RE drives) that you might as well stop working and live in a hole (as
your odds are better of being hit by a meteor or winning the lottery by
osmosis).

I KIIID.

--Tim


Re: [zfs-discuss] Using consumer drives in a zraid2

2009-08-25 Thread thomas
 I'll admit, I was cheap at first and my
 fileserver right now is consumer drives.  You
 can bet all my future purchases will be of the enterprise grade.  And
 guess what... none of the drives in my array are more than 5 years old,
 so even
 if they did die, and I had bought the enterprise versions, they'd be
 covered.

Anything particular happen that made you change your mind? I started with
enterprise grade because of similar information discussed in this thread.. 
but I've
also been wondering how zfs holds up with consumer level drives and if I could 
save
money by using them in the future. I guess I'm looking for horror stories that 
can be
attributed to them? ;)


Re: [zfs-discuss] Using consumer drives in a zraid2

2009-08-25 Thread Tristan Ball
The remaining drive would only have been flagged as dodgy if the bad
sectors had been found, hence my comments (and general best practice)
about data scrubs being necessary. While I agree it's quite possible
that the enterprise drive would flag errors earlier, I wouldn't
necessarily bet on it. Just because a given sector has successfully been
read a number of times before doesn't guarantee that it will be read
successfully again, and again the enterprise drive doesn't try as hard.
In the absence of scrubs, resilvering can be the hardest thing the drive
does, and in my experience is likely to show up errors that haven't
occurred before. But you make a good point about retrying the resilver
until it works, presuming I don't hit a "too many errors, device
faulted" condition. :-)
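In practice those regular scrubs are just a standing entry in root's
crontab - the schedule and pool name here are examples only:

    # weekly scrub, Sunday 3am: surfaces bad sectors while the pool is healthy
    0 3 * * 0 /usr/sbin/zpool scrub tank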

 

I would have liked to go RaidZ2, but performance has dictated mirroring.
Physical, financial and capacity constraints have conspired together to
restrict me to 2-way mirroring rather than 3-way, which would have been
my next choice. :-)

 

 

Regards

Tristan

 

(Who is now going to spend the afternoon figuring out how to win lottery
by osmosis: http://en.wikipedia.org/wiki/Osmosis :-) )

 



From: Tim Cook [mailto:t...@cook.ms] 
Sent: Wednesday, 26 August 2009 3:01 PM
To: Tristan Ball
Cc: thomas; zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Using consumer drives in a zraid2

 

 

On Tue, Aug 25, 2009 at 11:38 PM, Tristan Ball
tristan.b...@leica-microsystems.com wrote:

Not upset as such :-)

 

What I'm worried about is that time period where the pool is resilvering
to the hot spare. For example: one half of a mirror has failed completely,
and the mirror is being rebuilt onto the spare - if I get a read error
from the remaining half of the mirror, then I've lost data. If the RE
drives return an error for a request that a consumer drive would have
(eventually) returned, then in this specific case I would have been
better off with the consumer drive.

 

That said, my initial ZFS systems are built with consumer drives, not
Raid Editions, as much as anything because we got burned by some early RE
drives in some of our existing raid boxes here, so I had a general low
opinion of them. However, having done a little more reading about the
error recovery time stuff, I will also be putting in RE drives for the
production systems, and moving the consumer drives to the DR systems.

 

My logic is pretty straightforward:

 

Complete disk failures are comparatively rare, while media or transient
errors are far more common. As a media I/O or transient error on the
drive can affect the performance of the entire pool, I'm best off with
the RE drives to mitigate that. The risk of a double disk failure as
described above is partially mitigated by regular scrubs. The impact of
a double disk failure is mitigated by send/recv'ing to another box, and
catastrophic and human failures are partially mitigated by backing the
whole lot up to tape. :-)

 

Regards

Tristan.


The part you're missing is that the good drive should have been
flagged bad long, long before a consumer drive would have.  That being
the case, the odds are far, far less likely you'd get a bad read from an
enterprise drive than that you would get a good read from a consumer
drive constantly retrying.  That's ignoring the fact you could re-issue
the resilver repeatedly until you got the response you wanted from the
good drive.


In any case, unless performance is absolutely forcing you to do
otherwise, if you're that paranoid just do a raid-z2/3, and you won't
have to worry about it.  The odds of 4 drives not returning valid data
are so slim (even among RE drives) that you might as well stop working
and live in a hole (as your odds are better of being hit by a meteor or
winning the lottery by osmosis).

I KIIID.

--Tim 





Re: [zfs-discuss] Using consumer drives in a zraid2

2009-08-25 Thread Tim Cook
On Wed, Aug 26, 2009 at 12:22 AM, thomas tjohnso...@gmail.com wrote:

  I'll admit, I was cheap at first and my
  fileserver right now is consumer drives.  You
  can bet all my future purchases will be of the enterprise grade.  And
  guess what... none of the drives in my array are more than 5 years old,
  so even
  if they did die, and I had bought the enterprise versions, they'd be
  covered.

 Anything particular happen that made you change your mind? I started with
 enterprise grade because of similar information discussed in this
 thread.. but I've
 also been wondering how zfs holds up with consumer level drives and if I
 could save
 money by using them in the future. I guess I'm looking for horror stories
 that can be
 attributed to them? ;)



When it comes to my ZFS project, I am currently lacking horror stories.
When it comes to "what the hell, this drive literally failed a week after
the warranty was up", I unfortunately PERSONALLY have 3 examples.  I'm
guessing (hoping) it's just bad luck.  Perhaps the luck wasn't SO bad
though, as I had backups of all of those (proof you should never rely on a
single drive to last up to, or beyond, its warranty period).

--Tim


Re: [zfs-discuss] Using consumer drives in a zraid2

2009-08-25 Thread Tim Cook
On Wed, Aug 26, 2009 at 12:27 AM, Tristan Ball 
tristan.b...@leica-microsystems.com wrote:

  The remaining drive would only have been flagged as dodgy if the bad
 sectors had been found, hence my comments (and general best practice) about
 data scrubs being necessary. While I agree it’s quite possible that the
 enterprise drive would flag errors earlier, I wouldn’t necessarily bet on
 it. Just because a given sector has successfully been read a number of times
 before doesn’t guarantee that it will be read successfully again, and again
 the enterprise drive doesn’t try as hard.  In the absence of scrubs,
 resilvering can be the hardest thing the drive does, and in my experience is
 likely to show up errors that haven’t occurred before. But you make a good
 point about retrying the resilver until it works, presuming I don’t hit a
 “too many errors, device faulted” condition. :-)



 I would have liked to go RaidZ2, but performance has dictated mirroring.
 Physical, financial and capacity constraints have conspired together to
 restrict me to 2-way mirroring rather than 3-way, which would have been my
 next choice. :-)





 Regards

 Tristan



 (Who is now going to spend the afternoon figuring out how to win lottery by
 osmosis: http://en.wikipedia.org/wiki/Osmosis :-) )


My suggestion/question/whatever would be: why wouldn't raidz + an SSD
L2ARC meet both financial and performance requirements?  It would
literally be a first for me.

--Tim


[zfs-discuss] Using consumer drives in a zraid2

2009-08-24 Thread Ron Mexico
I'm putting together a 48 bay NAS for my company [24 drives to start]. My
manager has already ordered 24 2TB WD Caviar Green consumer drives -
should we send these back and order the 2TB WD RE4-GP enterprise drives
instead?

I'm tempted to try these out. First off, they're about $100 less per drive.
Second, my experience with so-called consumer drives in a raid controller
has been pretty good - only 2 drive failures in five years with an Apple
X-Raid.

Thoughts? Opinions? Flames? All input is appreciated. Thanks.


Re: [zfs-discuss] Using consumer drives in a zraid2

2009-08-24 Thread Richard Elling

On Aug 24, 2009, at 11:10 AM, Ron Mexico wrote:

I'm putting together a 48 bay NAS for my company [24 drives to start].
My manager has already ordered 24 2TB WD Caviar Green consumer drives -
should we send these back and order the 2TB WD RE4-GP enterprise drives
instead?


I would.

I'm tempted to try these out. First off, they're about $100 less per
drive. Second, my experience with so-called consumer drives in a raid
controller has been pretty good - only 2 drive failures in five years
with an Apple X-Raid.


Thoughts? Opinions? Flames? All input is appreciated. Thanks.


The enterprise class drives offer better vibration compensation, which
will be needed in a 48-drive bay.

From a RAS perspective, the enterprise class drives have what WD calls
"RAID-specific time-limited error recovery", which will be needed if you
have a high availability requirement.

I realize it is not easy to make a $ trade-off for some of these
seemingly intangible features. It is becoming more difficult as vendors
implement features such as IntelliPower, which makes performance
predictions difficult. But beware that the larger, 2 TB drives are
tending to be slower.

 -- richard



Re: [zfs-discuss] Using consumer drives in a zraid2

2009-08-24 Thread Ron Mexico
Is there a formula to determine the optimal size of dedicated cache
space (L2ARC) for zraid systems to improve speed?
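The closest thing I've seen to one is a back-of-envelope bound: every
L2ARC record costs some ARC (RAM) for its header - roughly a couple of
hundred bytes per record on current builds, a figure I've seen quoted
rather than measured - so RAM caps the useful cache size. For example:

    100 GB L2ARC / 8 KB average record size  ~ 12.5 million records
    12.5 million records x ~200 bytes/header ~ 2.5 GB of ARC consumed

Beyond that bound, size the cache to the working set, not the pool.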