Re[2]: [zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-06-01 Thread Robert Milkowski
Hello Richard,


RE> But I am curious as to why you believe 2x CF are necessary?
RE> I presume this is so that you can mirror.  But the remaining memory
RE> in such systems is not mirrored.  Comments and experiences are welcome.

I was thinking about mirroring - it's not clear to me from the comment
above why it wouldn't be needed.




-- 
Best regards,
 Robert                          mailto:[EMAIL PROTECTED]
   http://milek.blogspot.com



Re: [zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-30 Thread Roch - PAE

Torrey McMahon writes:
  Toby Thain wrote:
  
   On 25-May-07, at 1:22 AM, Torrey McMahon wrote:
  
   Toby Thain wrote:
  
   On 22-May-07, at 11:01 AM, Louwtjie Burger wrote:
  
   On 5/22/07, Pål Baltzersen [EMAIL PROTECTED] wrote:
   What if your HW-RAID-controller dies? in say 2 years or more..
   What will read your disks as a configured RAID? Do you know how to 
   (re)configure the controller or restore the config without 
   destroying your data? Do you know for sure that a spare-part and 
   firmware will be identical, or at least compatible? How good is 
   your service subscription? Maybe only scrapyards and museums will 
   have what you had. =o
  
   Be careful when talking about RAID controllers in general. They are
   not created equal! ...
   Hardware raid controllers have done the job for many years ...
  
   Not quite the same job as ZFS, which offers integrity guarantees 
   that RAID subsystems cannot.
  
    Depends on the guarantees. Some RAID systems have built in block 
    checksumming.
  
  
   Which still isn't the same. Sigh. 
  
  Yep ... you get what you pay for. Funny how ZFS is free to purchase, 
  isn't it?
  

With RAID-level block checksumming, if the data gets
corrupted on its way _to_ the array, that data is lost.

With ZFS and RAID-Z or mirroring, you will recover the
data.

-r
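A minimal Python sketch of that distinction (the array and mirror below are
toy dictionaries, not real interfaces):

  import hashlib

  def checksum(data: bytes) -> str:
      return hashlib.sha256(data).hexdigest()

  def corrupt(data: bytes) -> bytes:
      # Flip one bit "on the wire" between host and array.
      return bytes([data[0] ^ 0x01]) + data[1:]

  payload = b"important block"

  # Array-side block checksumming: the array checksums whatever arrives,
  # so a bit flipped in transit is checksummed after the damage and the
  # block reads back as "good".
  arrived = corrupt(payload)
  array_block = {"data": arrived, "sum": checksum(arrived)}
  assert checksum(array_block["data"]) == array_block["sum"]   # passes, silently wrong

  # Host-side (end-to-end) checksumming over a mirror: the host records
  # the checksum of what the application wrote, so a read that fails the
  # check is retried from the other side of the mirror.
  mirror = {"side0": corrupt(payload), "side1": payload}   # side0 damaged in transit
  block_pointer_sum = checksum(payload)                    # computed at the host

  def read_with_repair(mirror, expected_sum):
      for side in ("side0", "side1"):
          if checksum(mirror[side]) == expected_sum:
              return mirror[side]
      raise IOError("no good copy left")

  assert read_with_repair(mirror, block_pointer_sum) == payload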




Re: [zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-30 Thread Toby Thain


On 30-May-07, at 12:33 PM, Roch - PAE wrote:



Torrey McMahon writes:

Toby Thain wrote:


On 25-May-07, at 1:22 AM, Torrey McMahon wrote:


Toby Thain wrote:


On 22-May-07, at 11:01 AM, Louwtjie Burger wrote:


On 5/22/07, Pål Baltzersen [EMAIL PROTECTED] wrote:

What if your HW-RAID-controller dies? in say 2 years or more..
What will read your disks as a configured RAID? Do you know how to
(re)configure the controller or restore the config without
destroying your data? Do you know for sure that a spare-part and
firmware will be identical, or at least compatible? How good is
your service subscription? Maybe only scrapyards and museums will
have what you had. =o


Be careful when talking about RAID controllers in general. They are
not created equal! ...
Hardware raid controllers have done the job for many years ...


Not quite the same job as ZFS, which offers integrity guarantees
that RAID subsystems cannot.


Depends on the guarantees. Some RAID systems have built in block
checksumming.



Which still isn't the same. Sigh.


Yep ... you get what you pay for. Funny how ZFS is free to purchase,
isn't it?



With RAID-level block checksumming, if the data gets
corrupted on its way _to_ the array, that data is lost.


Or _from_. There's many a slip 'twixt cup and lip.

--T



With ZFS and RAID-Z or Mirroring, you will recover the
data.

-r





Re: [zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-29 Thread Richard Elling

Robert Milkowski wrote:

Hello Richard,

Thursday, May 24, 2007, 6:10:34 PM, you wrote:
RE> Incidentally, thumper field reliability is better than we expected.  This is causing
RE> me to do extra work, because I have to explain why.

I've got some thumpers and they're very reliable.
Even disks aren't failing that much - even less than I expected from
observing other arrays in the same environment.


Yes, our data is consistent with your observation.


The main problems with x4500+zfs are:

1. hot spare support in zfs - right now it is far from ideal


Agree.  The team is working on this, but I'm not sure of the current status.


2. raidz2 - resilver with lots of small files takes too long

3. SVM root disk mirror over jumpstart doesn't work with x4500 (bug
   opened)

4. I would like a future version of the x4500 to have 2x CF cards (or
   something similar) to boot the system from - so two disks won't be
   wasted just for the OS (2x 1TB in a few months).


The current version has a CF card slot, but AFAIK, it is not supported.
We have a number of servers which do support CF for boot, and more in
the pipeline (very popular with some deployment scenarios :-).

But I am curious as to why you believe 2x CF are necessary?
I presume this is so that you can mirror.  But the remaining memory
in such systems is not mirrored.  Comments and experiences are welcome.
 -- richard


Re: [zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-29 Thread Carson Gaspar

Richard Elling wrote:


But I am curious as to why you believe 2x CF are necessary?
I presume this is so that you can mirror.  But the remaining memory
in such systems is not mirrored.  Comments and experiences are welcome.


CF == bit-rot-prone disk, not RAM. You need to mirror it for all the 
same reasons you need to mirror hard disks, and then some.


--
Carson


RE: [zfs-discuss] Re: ZFS - Use h/w raid or not?Thoughts. Considerations.

2007-05-29 Thread Ellis, Mike
Also, the unmirrored memory for the rest of the system has ECC and
ChipKill, which provide at least SOME protection against random
bit-flips.

--

Question: It appears that CF and friends would make decent live-boot
(but don't run on me like I'm a disk) boot media, given the limited
write/re-write cycles of flash media (at least the non-exotic types of
flash media).

Would something like future zfs-booting on a pair of CF devices
reduce/lift that limitation? (Does the COW nature of ZFS automatically
spread WRITES across the entire CF device?) [[ Is tmp-fs/swap going to
remain a problem until zfs-swap adds some COW leveling to the swap
area? ]]

Thanks,

 -- MikeE

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Carson Gaspar
Sent: Tuesday, May 29, 2007 8:05 PM
To: Richard Elling
Cc: zfs-discuss@opensolaris.org; Anton B. Rang
Subject: Re: [zfs-discuss] Re: ZFS - Use h/w raid or not?Thoughts.
Considerations.


Richard Elling wrote:

 But I am curious as to why you believe 2x CF are necessary?
 I presume this is so that you can mirror.  But the remaining memory
 in such systems is not mirrored.  Comments and experiences are
welcome.

CF == bit-rot-prone disk, not RAM. You need to mirror it for all the 
same reasons you need to mirror hard disks, and then some.

-- 
Carson


Re: [zfs-discuss] Re: ZFS - Use h/w raid or not?Thoughts. Considerations.

2007-05-29 Thread Richard Elling

Ellis, Mike wrote:

Also the unmirrored memory for the rest of the system has ECC and
ChipKill, which provides at least SOME protection against random
bit-flips.


CF devices, at least the ones we'd be interested in, do have ECC as
well as spare sectors and write verification.

Note: flash memories do not suffer from the same radiation-based
bit-flip mechanisms as DRAMs or SRAMs.  The main failure mode we
worry about is write endurance.


Question: It appears that CF and friends would make decent live-boot
(but don't run on me like I'm a disk) boot media, given the limited
write/re-write cycles of flash media (at least the non-exotic types of
flash media).


Where we see current use is for boot devices, which have the expectation
of read-mostly workloads.  The devices also implement wear leveling.
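For a feel of the endurance question, a back-of-the-envelope estimate; every
number below (card size, cycle rating, daily write volume) is an assumed figure
for illustration, not a spec:

  # Illustrative write-endurance estimate under ideal wear leveling.
  card_gb = 8                   # assumed CF capacity
  erase_cycles = 100_000        # assumed rating per block
  writes_gb_per_day = 1         # assumed write rate for a read-mostly boot device

  total_writes_gb = card_gb * erase_cycles
  years = total_writes_gb / writes_gb_per_day / 365
  print(f"~{years:,.0f} years at {writes_gb_per_day} GB/day")   # ~2,192 years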


Would something like future zfs-booting on a pair of CF-devices
reduce/lift that limitation? (does the COW nature of ZFS automatically
spread WRITES across the entire CF device?) [[ is tmp-fs/swap going to
remain a problem till zfs-swap adds some COW leveling to the swap-area?
]]


The belief is that COW file systems which implement checksums and data
redundancy (eg, ZFS and the ZFS copies option) will be redundant over
CF's ECC and wear leveling *at the block level.*  We believe ZFS will
excel in this area, but has limited bootability today.  This will become
more interesting over time, especially when ZFS boot is ubiquitous.

As for swap, it is a good idea if you are sized such that you don't
need to physically use swap.  Most servers today are in this category.
Actually, most servers today have much more memory than would fit in a
reasonably priced CF, so it might be a good idea to swap elsewhere.

In other words, it is more difficult to build the (technical) case for
redundant CFs for boot than it is for disk drives.  Real data would be
greatly appreciated.
 -- richard


Re: [zfs-discuss] Re: ZFS - Use h/w raid or not?Thoughts. Considerations.

2007-05-29 Thread Bill Sommerfeld
On Tue, 2007-05-29 at 18:48 -0700, Richard Elling wrote:
 The belief is that COW file systems which implement checksums and data
 redundancy (eg, ZFS and the ZFS copies option) will be redundant over
 CF's ECC and wear leveling *at the block level.*  We believe ZFS will
 excel in this area, but has limited bootability today.  This will become
 more interesting over time, especially when ZFS boot is ubiquitous.

I suspect that an interesting config would put the boot archive and not
much else on the CF, and the live root and anything else that needs
regular updates in a main zfs pool on disk.  Something like the original
zfs_mountroot approach would be involved - that allowed the root to live
in a fully general pool, not one limited to the zfs boot config.

That would further reduce both the size of CF required and the frequency
of updates to it.

- Bill



Re[2]: [zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-27 Thread Robert Milkowski
Hello Richard,

Thursday, May 24, 2007, 6:10:34 PM, you wrote:


RE> Incidentally, thumper field reliability is better than we expected.  This is causing
RE> me to do extra work, because I have to explain why.

I've got some thumpers and they're very reliable.
Even disks aren't failing that much - even less than I expected from
observing other arrays in the same environment.

The main problems with x4500+zfs are:

1. hot spare support in zfs - right now it is far from ideal

2. raidz2 - resilver with lots of small files takes too long

3. SVM root disk mirror over jumpstart doesn't work with x4500 (bug
   opened)

4. I would like a future version of the x4500 to have 2x CF cards (or
   something similar) to boot the system from - so two disks won't be
   wasted just for the OS (2x 1TB in a few months).






-- 
Best regards,
 Robert                          mailto:[EMAIL PROTECTED]
   http://milek.blogspot.com



Re: [zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-25 Thread Casper . Dik


Depends on the guarantees. Some RAID systems have built in block 
checksumming.


But we all know that block checksums stored with the blocks do
not catch a number of common errors.

(Ghost writes, misdirected writes, misdirected reads)

Casper
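The same style of toy model shows why a checksum stored next to the block
misses a ghost write, while a checksum kept one level up in the parent block
pointer (as ZFS does) catches it; the dict-based disk is purely illustrative:

  import hashlib

  def checksum(data: bytes) -> str:
      return hashlib.sha256(data).hexdigest()

  # Block 7 holds old data; its colocated checksum matches that old data.
  disk = {7: {"data": b"old contents", "sum": checksum(b"old contents")}}

  # A "ghost write" of new contents is acknowledged but never reaches the media.
  new_data = b"new contents"
  parent_pointer_sum = checksum(new_data)   # kept in the parent block pointer
  # ... the actual write to disk[7] is silently dropped ...

  stored = disk[7]
  # Colocated checksum: still self-consistent, so the stale block looks fine.
  assert checksum(stored["data"]) == stored["sum"]
  # Checksum held in the parent: mismatch, so the stale read is detected.
  assert checksum(stored["data"]) != parent_pointer_sum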



Re: [zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-25 Thread Toby Thain


On 25-May-07, at 1:22 AM, Torrey McMahon wrote:


Toby Thain wrote:


On 22-May-07, at 11:01 AM, Louwtjie Burger wrote:


On 5/22/07, Pål Baltzersen [EMAIL PROTECTED] wrote:

What if your HW-RAID-controller dies? in say 2 years or more..
What will read your disks as a configured RAID? Do you know how  
to (re)configure the controller or restore the config without  
destroying your data? Do you know for sure that a spare-part  
and firmware will be identical, or at least compatible? How good  
is your service subscription? Maybe only scrapyards and museums  
will have what you had. =o


Be careful when talking about RAID controllers in general. They are
not created equal! ...
Hardware raid controllers have done the job for many years ...


Not quite the same job as ZFS, which offers integrity guarantees  
that RAID subsystems cannot.


Depends on the guarantees. Some RAID systems have built in block  
checksumming.




Which still isn't the same. Sigh.

--T


Re: [zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-25 Thread Torrey McMahon

Toby Thain wrote:


On 25-May-07, at 1:22 AM, Torrey McMahon wrote:


Toby Thain wrote:


On 22-May-07, at 11:01 AM, Louwtjie Burger wrote:


On 5/22/07, Pål Baltzersen [EMAIL PROTECTED] wrote:

What if your HW-RAID-controller dies? in say 2 years or more..
What will read your disks as a configured RAID? Do you know how to 
(re)configure the controller or restore the config without 
destroying your data? Do you know for sure that a spare-part and 
firmware will be identical, or at least compatible? How good is 
your service subscription? Maybe only scrapyards and museums will 
have what you had. =o


Be careful when talking about RAID controllers in general. They are
not created equal! ...
Hardware raid controllers have done the job for many years ...


Not quite the same job as ZFS, which offers integrity guarantees 
that RAID subsystems cannot.


Depends on the guarantees. Some RAID systems have built in block 
checksumming.




Which still isn't the same. Sigh. 


Yep ... you get what you pay for. Funny how ZFS is free to purchase, 
isn't it?




Re: [zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-24 Thread Richard Elling

Anton B. Rang wrote:

Thumper seems to be designed as a file server (but curiously, not for high 
availability).


hmmm... Often people think that because a system is not clustered, it is not
designed to be highly available.  Any system which provides a single view of
data (eg. a persistent storage device) must have at least one single point of
failure.  The 4 components in a system which break most often are: fans, power
supplies, disks, and DIMMs.  You will find that most servers, including
thumper, have redundancy to cover these failure modes.  We've done extensive
modelling and measuring of these systems and think that we have hit a pretty
good balance of availability and cost.
A thumper is not a STK9990V, nor does it cost nearly as much.

Incidentally, thumper field reliability is better than we expected.  This is
causing me to do extra work, because I have to explain why.

It's got plenty of I/O bandwidth. Mid-range and high-end servers, though, are starved of 
I/O bandwidth relative to their CPU & memory. This is particularly true for Sun's hardware.


Please tell us how many storage arrays are required to meet a theoretical I/O bandwidth of
244 GBytes/s?  Note: I have to say theoretical bandwidth here because no such system has
ever been built for testing, and such a system would be very, very expensive.
 -- richard


Re: [zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-24 Thread Dave Fisk


 Please tell us how many storage arrays are required to meet a 
theoretical I/O bandwidth of 244 GBytes/s?


Just considering disks, you need approximately 6,663 disks, all streaming 50 
MB/sec, with RAID-5 3+1 (for example).
That is assuming sustained large block sequential I/O.  If you have 8 KB 
random I/O you need somewhere between 284,281 and 426,421 disks, each 
delivering between 100 and 150 IOPS.
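Those figures can be reproduced with simple arithmetic; the only assumptions
added here are binary (GiB/MiB-style) units and a 4/3 capacity factor for
RAID-5 3+1:

  import math

  target = 244 * 1024          # 244 GBytes/s expressed in MBytes/s
  raid5_factor = 4 / 3         # 3 data + 1 parity disks per stripe

  # Sequential case: disks streaming 50 MB/s each.
  print(math.ceil(target / 50 * raid5_factor))            # 6663

  # Random case: 8 KB I/Os, 100-150 IOPS per disk.
  iops_needed = target * 1024 / 8 * raid5_factor          # 8 KB per I/O
  print(math.ceil(iops_needed / 150))                     # 284281
  print(math.ceil(iops_needed / 100))                     # 426421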


Dave

Richard Elling wrote:

Anton B. Rang wrote:
Thumper seems to be designed as a file server (but curiously, not for 
high availability).


hmmm... Often people think that because a system is not clustered, 
then it is not
designed to be highly available.  Any system which provides a single 
view of data
(eg. a persistent storage device) must have at least one single point 
of failure.
The 4 components in a system which break most often are: fans, power 
supplies, disks,
and DIMMs.  You will find that most servers, including thumper, has 
redundancy to
cover these failure modes.  We've done extensive modelling and 
measuring of these
systems and think that we have hit a pretty good balance of 
availability and cost.

A thumper is not a STK9990V, nor does it cost nearly as much.

Incidentally, thumper field reliability is better than we expected.  This is causing
me to do extra work, because I have to explain why.

It's got plenty of I/O bandwidth. Mid-range and high-end servers, 
though, are starved of I/O bandwidth relative to their CPU & memory. 
This is particularly true for Sun's hardware.


Please tell us how many storage arrays are required to meet a 
theoretical I/O bandwidth of
244 GBytes/s?  Note: I have to say theoretical bandwidth here because 
no such system has
ever been built for testing, and such a system would be very, very 
expensive.

 -- richard


--
Dave Fisk, ORtera Inc.
http://www.ORtera.com



[zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-24 Thread Anton B. Rang
Richard wrote:
 Any system which provides a single view of data (eg. a persistent storage
 device) must have at least one single point of failure.

Why?

Consider this simple case: A two-drive mirrored array.

Use two dual-ported drives, two controllers, two power supplies,
arranged roughly as follows:

  -- controller-A = disk A = controller-B --
           \                    /
            \                  /
             \     disk B     /

Remind us where the single point of failure is in this arrangement?

Seriously, I think it's pretty clear that high-end storage hardware is built to
eliminate single points of failure.  I don't think that NetApp, LSI Logic, IBM,
etc. would agree with your contention.  But maybe I'm missing something; is
there some more fundamental issue?  Do you mean that the entire system is a
single point of failure, if it's the only copy of said data?  That would be a
tautology...

I had written:

 Mid-range and high-end servers, though, are starved of I/O bandwidth
 relative to their CPU & memory. This is particularly true for Sun's hardware.

and Richard had asked (rhetorically?)

 Please tell us how many storage arrays are required to meet a
 theoretical I/O bandwidth of 244 GBytes/s?

My point is simply that, on most non-file-server hardware, the I/O bandwidth
available is not sufficient to keep all CPUs busy.  Host-based RAID can make
things worse since it takes away from the bandwidth available for user jobs.
Consider a Sun Fire 25K; the theoretical I/O bandwidth is 35 GB/sec (IIRC
that's the full-duplex number) while its 144 processors could do upwards of
259 GFlops.  That's 0.14 bytes/flop.

To answer your rhetorical question, the DSC9550 does 3 GB/second for reads
and writes (doing RAID 6 and with hardware parity checks on reads -- nice!),
so you'd need 82 arrays.  In real life (an actual file system), ASC Purple with
GPFS got 102 GB/sec using 416 arrays.
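For reference, the back-of-the-envelope arithmetic behind these ratios
(including the M9000 figure Richard cites further down the thread):

  import math

  # Sun Fire 25K: ~35 GB/s theoretical I/O vs ~259 GFlops peak.
  print(round(35 / 259, 2))      # 0.14 bytes/flop

  # M9000 figures cited below: 244 GB/s vs 1.228 TFlops.
  print(round(244 / 1228, 2))    # 0.2, i.e. about 0.19-0.2 bytes/flop

  # Arrays needed for 244 GB/s if each DSC9550 sustains ~3 GB/s.
  print(math.ceil(244 / 3))      # 82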

-- Anton
 
 


Re: [zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-24 Thread Darren J Moffat

Anton B. Rang wrote:

Richard wrote:

Any system which provides a single view of data (eg. a persistent storage
device) must have at least one single point of failure.


Why?

Consider this simple case: A two-drive mirrored array.

Use two dual-ported drives, two controllers, two power supplies,
arranged roughly as follows:

  -- controller-A = disk A = controller-B --
           \                    /
            \                  /
             \     disk B     /

Remind us where the single point of failure is in this arrangement?


The single instance of the operating system you are running if you 
aren't running in a cluster.


--
Darren J Moffat


Re: [zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-24 Thread Frank Fitch

Anton B. Rang wrote:

Richard wrote:

Any system which provides a single view of data (eg. a persistent storage
device) must have at least one single point of failure.


Why?

Consider this simple case: A two-drive mirrored array.

Use two dual-ported drives, two controllers, two power supplies,
arranged roughly as follows:

  -- controller-A = disk A = controller-B --
           \                    /
            \                  /
             \     disk B     /

Remind us where the single point of failure is in this arrangement?



disk backplane?

Regards,
-Frank



Re: [zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-24 Thread Richard Elling

Anton B. Rang wrote:

Richard wrote:

Any system which provides a single view of data (eg. a persistent storage
device) must have at least one single point of failure.


Why?

Consider this simple case: A two-drive mirrored array.

Use two dual-ported drives, two controllers, two power supplies,
arranged roughly as follows:

  -- controller-A = disk A = controller-B --
           \                    /
            \                  /
             \     disk B     /

Remind us where the single point of failure is in this arrangement?


The software which provides the single view of the data.


Seriously, I think it's pretty clear that high-end storage hardware is built to
eliminate single points of failure.  I don't think that NetApp, LSI Logic, IBM,
etc. would agree with your contention.  But maybe I'm missing something; is
there some more fundamental issue?  Do you mean that the entire system is a
single point of failure, if it's the only copy of said data?  That would be a
tautology


Does anyone believe that the software or firmware in these systems
is infallible?  For all possible failure modes in the system?


I had written:


Mid-range and high-end servers, though, are starved of I/O bandwidth
relative to their CPU & memory. This is particularly true for Sun's hardware.


and Richard had asked (rhetorically?)


Please tell us how many storage arrays are required to meet a
theoretical I/O bandwidth of 244 GBytes/s?


My point is simply that, on most non-file-server hardware, the I/O bandwidth
available is not sufficient to keep all CPUs busy.  Host-based RAID can make
things worse since it takes away from the bandwidth available for user jobs.
Consider a Sun Fire 25K; the theoretical I/O bandwidth is 35 GB/sec (IIRC
that's the full-duplex number) while its 144 processors could do upwards of
259 GFlops.  That's 0.14 bytes/flop.


Consider something more current.  The M9000 has 244 GBytes/s of theoretical
I/O bandwidth.  It's been measured at 1.228 TFlops (peak).  So we see a ratio
of 0.19 bytes/flop.  But this ratio doesn't mean much, since there doesn't
seem to be a storage system that big connected to a single OS instance -- yet :-)

When people make this claim of bandwidth limitation, we often find that the
inherent latency limitation is more problematic.  For example, we can get
good memory bandwidth from DDR2 DIMMs, which we collect into 8-wide banks.
But we can't get past the latency of DRAM access.  Similarly, we can get upwards
of 100 MBytes/s media bandwidth from a fast, large disk, but can't get past
the 4.5 ms seek or 4.1 ms rotational delay time.  It is this latency issue
which effectively killed software RAID-5 (read-modify-write).  Fortunately,
ZFS's raidz is designed to avoid the need to do a read-modify-write.
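A rough sketch of the latency argument, using the seek and rotational figures
above; the assumption that a RAID-5 small write needs a read pass before its
write pass is the classic 4-I/O read-modify-write model:

  # Per-I/O service time: 4.5 ms seek + 4.1 ms average rotational delay
  # (small-block transfer time ignored).
  service_ms = 4.5 + 4.1                   # ~8.6 ms per random disk I/O

  # RAID-5 read-modify-write of a small block: old data and old parity must
  # be read back before the new data and new parity can be written, so the
  # critical path is a read pass followed by a write pass.
  print(f"RAID-5 small write: ~{2 * service_ms:.1f} ms")        # ~17.2 ms

  # A copy-on-write full-stripe write (raidz) has no pre-reads: data and
  # parity are written out in parallel.
  print(f"raidz full-stripe write: ~{service_ms:.1f} ms")       # ~8.6 ms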


To answer your rhetorical question, the DSC9550 does 3 GB/second for reads
and writes (doing RAID 6 and with hardware parity checks on reads -- nice!),
so you'd need 82 arrays.  In real life (an actual file system), ASC Purple with
GPFS got 102 GB/sec using 416 arrays.


Yeah, this is impressive, but it is parallel (multi-system/multi-storage), so it is
really apples and oranges.
 -- richard


Re: [zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-24 Thread Torrey McMahon

Toby Thain wrote:


On 22-May-07, at 11:01 AM, Louwtjie Burger wrote:


On 5/22/07, Pål Baltzersen [EMAIL PROTECTED] wrote:

What if your HW-RAID-controller dies? in say 2 years or more..
What will read your disks as a configured RAID? Do you know how to 
(re)configure the controller or restore the config without 
destroying your data? Do you know for sure that a spare-part and 
firmware will be identical, or at least compatible? How good is your 
service subscription? Maybe only scrapyards and museums will have 
what you had. =o


Be careful when talking about RAID controllers in general. They are
not created equal! ...
Hardware raid controllers have done the job for many years ...


Not quite the same job as ZFS, which offers integrity guarantees that 
RAID subsystems cannot. 


Depends on the guarantees. Some RAID systems have built in block 
checksumming.




Re: [zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-24 Thread Torrey McMahon
I did say depends on the guarantees, right?  :-)  My point is that all 
hw raid systems are not created equally.


Nathan Kroenert wrote:
Which has little benefit if the HBA or the array internals change 
the meaning of the message...

That's the whole point of ZFS's checksumming - it's end to end...

Nathan.

Torrey McMahon wrote:

Toby Thain wrote:


On 22-May-07, at 11:01 AM, Louwtjie Burger wrote:


On 5/22/07, Pål Baltzersen [EMAIL PROTECTED] wrote:

What if your HW-RAID-controller dies? in say 2 years or more..
What will read your disks as a configured RAID? Do you know how to 
(re)configure the controller or restore the config without 
destroying your data? Do you know for sure that a spare-part and 
firmware will be identical, or at least compatible? How good is 
your service subscription? Maybe only scrapyards and museums will 
have what you had. =o


Be careful when talking about RAID controllers in general. They are
not created equal! ...
Hardware raid controllers have done the job for many years ...


Not quite the same job as ZFS, which offers integrity guarantees 
that RAID subsystems cannot. 


Depends on the guarantees. Some RAID systems have built in block 
checksumming.




Re: [zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-23 Thread Brad Plecs
  At the moment, I'm hearing that using h/w raid under my zfs may be
 better for some workloads and the h/w hot spare would be nice to
 have across multiple raid groups, but the checksum capabilities in
 zfs are basically nullified with single/multiple h/w lun's
 resulting in reduced protection.  Therefore, it sounds like I
 should be strongly leaning towards not using the hardware raid in
 external disk arrays and use them like a JBOD.

 The big reasons for continuing to use hw raid are speed, in some cases, 
 and heterogeneous environments where you can't farm out non-raid 
 protected LUNs and raid protected LUNs from the same storage array. In 
 some cases the array will require a raid protection setting, like the 
 99x0, before you can even start farming out storage.

Just a data point -- I've had miserable luck with ZFS JBOD drives
failing.  They consistently wedge my machines (Ultra-45, E450, V880,
using SATA, SCSI drives) when one of the drives fails.  The system
recovers okay and without data loss after a reboot, but a total drive
failure (when a drive stops talking to the system) is not handled
well.

Therefore I would recommend a hardware raid for high-availability
applications.

Note, it's not clear that this is a ZFS problem.  I suspect it's a
Solaris, hardware controller, or driver problem, so this may not be
an issue if you find a controller that doesn't freak on a drive
failure.

BP 

-- 
[EMAIL PROTECTED]


[zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-23 Thread Anton B. Rang
 If you've got the internal system bandwidth to drive all drives then RAID-Z 
 is definitely superior to HW RAID-5.  Same with mirroring.

You'll need twice as much I/O bandwidth as with a hardware controller, plus the 
redundancy, since the reconstruction is done by the host. For instance, to be 
equivalent to the performance of a mirrored array on a single 4 Gb FC channel, 
you need to use four 4 Gb FC channels, at least if you can't tolerate a 50% 
degradation during reconstruction; or two 4 Gb FC channels if you don't mind 
the performance loss during reconstruction.
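One way to arrive at those channel counts, assuming write traffic is the
limiting factor and that a reconstruction streams one full copy from the
surviving side to the replacement:

  # Bandwidth budget in units of one 4 Gb FC channel, per unit of
  # application write traffic.
  app_writes = 1

  # Hardware mirror: the array duplicates internally, so the host link
  # carries each block once.
  hw_mirror = app_writes                      # 1 channel

  # Host-based (ZFS) mirror: every block crosses the host's links twice.
  host_mirror = 2 * app_writes                # 2 channels

  # During reconstruction the host also reads the surviving copy and writes
  # the new copy -- roughly another 2 units if you refuse to slow down.
  host_mirror_resilver = host_mirror + 2      # 4 channels
  print(hw_mirror, host_mirror, host_mirror_resilver)   # 1 2 4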

RAID-Z also uses system CPU and memory bandwidth, which is fine for file 
servers since they're normally overprovisioned there anyway, but may be less 
appropriate for some other uses.

 HW RAID can offload some I/O bandwidth from the system, but new systems,
 like Thumper, should have more than enough bandwidth, so why bother with
 HW RAID?

Thumper seems to be designed as a file server (but curiously, not for high 
availability). It's got plenty of I/O bandwidth. Mid-range and high-end 
servers, though, are starved of I/O bandwidth relative to their CPU & memory. 
This is particularly true for Sun's hardware.

Anton
 
 


[zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-22 Thread Pål Baltzersen
What if your HW-RAID-controller dies? in say 2 years or more..
What will read your disks as a configured RAID? Do you know how to 
(re)configure the controller or restore the config without destroying your 
data? Do you know for sure that a spare-part and firmware will be identical, or 
at least compatible? How good is your service subscription? Maybe only 
scrapyards and museums will have what you had. =o
With ZFS/JBOD you will be safe; just get a new controller (or server) -- any 
kind that is protocol-compatible (and OS-compatible) you may have floating 
around (SATA2 | SCSI | FC..) -- and zpool import :) And you can safely buy the 
latest & greatest and come out with something better than you had.

With ZFS I prefer JBOD. For performance you may want external HW-RAIDs (>=2) 
and let ZFS mirror them as a virtual JBOD -- depends on where the bottleneck is: 
I/O or spindles.
I disable any RAID features on internal RAID chipsets (nForce etc.).
 
 


Re: [zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-22 Thread Louwtjie Burger

On 5/22/07, Pål Baltzersen [EMAIL PROTECTED] wrote:

What if your HW-RAID-controller dies? in say 2 years or more..
What will read your disks as a configured RAID? Do you know how to (re)configure the 
controller or restore the config without destroying your data? Do you know for sure 
that a spare-part and firmware will be identical, or at least compatible? How good 
is your service subscription? Maybe only scrapyards and museums will have what you 
had. =o


Be careful when talking about RAID controllers in general. They are
not created equal!

I can remove all of my disks from my RAID controller, reshuffle them
and put them back in a new random order.. my controller will continue
functioning correctly.

I can remove my RAID controller and replace with a similar firmware
(or higher), and my volumes will continue to live with correct
initiators blacklisted/not.

Hardware raid controllers have done the job for many years ... I'm a
little bit concerned about the new message (from some) out there that
they are "no good" anymore. Given that the code on those controllers is
probably not as elegant as zfs ... and given my personal preference for
being in control, I cannot dismiss the fact that some of these
storage units are fast as hell, especially when you start piling on
the pressure!

I'm also interested to see how Sun handles this phenomenon, and how
they position zfs so that it doesn't eat into their high-margin (be it
low turnover) Storagetek block storage. I'm also interested to see
whether they will release a product dedicated for a solaris/zfs
environment.

Interesting times...

PS: I've also noticed some perspiration on the heads of some Symantec
account managers.

:)


Re: [zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-22 Thread Toby Thain


On 22-May-07, at 11:01 AM, Louwtjie Burger wrote:


On 5/22/07, Pål Baltzersen [EMAIL PROTECTED] wrote:

What if your HW-RAID-controller dies? in say 2 years or more..
What will read your disks as a configured RAID? Do you know how to  
(re)configure the controller or restore the config without  
destroying your data? Do you know for sure that a spare-part and  
firmware will be identical, or at least compatible? How good is  
your service subscription? Maybe only scrapyards and museums will  
have what you had. =o


Be careful when talking about RAID controllers in general. They are
not created equal! ...
Hardware raid controllers have done the job for many years ...


Not quite the same job as ZFS, which offers integrity guarantees that  
RAID subsystems cannot.



I'm a
little bit concerned about the new message (from some) out there that
they are "no good" anymore. Given that the code on those controllers is
probably not as elegant as zfs ... and given my personal preference for
being in control, I cannot dismiss the fact that some of these
storage units are fast as hell, ...


Being in control may mean *avoiding* black box RAID hardware in  
favour of inspectable & maintainable open source software, which was  
the earlier poster's point.


--Toby


[zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-21 Thread Phillip Fiedler
Thanks for the input.  So, I'm trying to meld the two replies and come up with 
a direction for my case and maybe a rule of thumb that I can use in the 
future (i.e., near future until new features come out in zfs) when I have 
external storage arrays that have built-in RAID.

At the moment, I'm hearing that using h/w raid under my zfs may be better for 
some workloads and the h/w hot spare would be nice to have across multiple raid 
groups, but the checksum capabilities in zfs are basically nullified with 
single/multiple h/w LUNs, resulting in reduced protection.  Therefore, it 
sounds like I should be strongly leaning towards not using the hardware raid in 
external disk arrays and use them like a JBOD.

When will Sun have global hot spare capability?
 
 


[zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-21 Thread Paul Armstrong
There isn't a global hot spare, but you can add a hot spare to multiple pools.

Paul
 
 


[zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-21 Thread MC
 Personally I would go with ZFS entirely in most cases.

That's the rule of thumb :)  If you have a fast enough CPU and enough RAM, do 
everything with ZFS.  This sounds koolaid-induced, but you'll need nothing else 
because ZFS does it all.

My second personal rule of thumb concerns RAIDZ performance.  Benchmarks that 
were posted here in the past showed that RAIDZ worked best with no more than 4 
or 5 disks per array.  After that, certain types of performance dropped off 
pretty hard.  So if top performance matters and you can handle doing 4-5 disk 
arrays, that is a smart path to take.
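One common explanation for that drop-off, sketched under the assumption that a
raidz vdev delivers roughly the random-read IOPS of a single drive (every drive
participates in each stripe read):

  # 20 disks arranged as raidz vdevs of different widths.
  disks = 20
  per_disk_iops = 150          # assumed random-read IOPS of one drive

  for width in (4, 5, 10, 20):
      vdevs = disks // width
      # Random-read IOPS scale with the number of vdevs, not the number of disks.
      print(f"{vdevs:2d} x raidz-{width}: ~{vdevs * per_disk_iops} random-read IOPS")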
 
 


Re: [zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-21 Thread Richard Elling

More redundancy below...

Torrey McMahon wrote:

Phillip Fiedler wrote:
Thanks for the input.  So, I'm trying to meld the two replies and come 
up with a direction for my case and maybe a rule of thumb that I can 
use in the future (i.e., near future until new features come out in 
zfs) when I have external storage arrays that have built in RAID.


At the moment, I'm hearing that using h/w raid under my zfs may be 
better for some workloads and the h/w hot spare would be nice to have 
across multiple raid groups, but the checksum capabilities in zfs are 
basically nullified with single/multiple h/w lun's resulting in 
reduced protection.  Therefore, it sounds like I should be strongly 
leaning towards not using the hardware raid in external disk arrays 
and use them like a JBOD.


The bit ...

   the checksum capabilities in zfs are basically nullified with 
single/multiple h/w lun's resulting in reduced protection.

is not accurate. With one large LUN, then yes, you can only detect 
errors. With multiple LUNs in a mirror or RAIDZ{2} then you can correct 
errors.


You can also add redundancy with the ZFS filesystem copies parameter.  This is
similar to, but not the same as, mirroring.

The big reasons for continuing to use hw raid are speed, in some cases, 
and heterogeneous environments where you can't farm out non-raid 
protected LUNs and raid protected LUNs from the same storage array. In 
some cases the array will require a raid protection setting, like the 
99x0, before you can even start farming out storage.


Yes.  ZFS data protection builds on top of this.  You always gain a benefit
when the data protection is done as close to the application as possible -- as
opposed to implementing the data protection as close to the storage as possible.
 -- richard