Re: [zfs-discuss] Nice chassis for ZFS server

2008-03-17 Thread Jacob Ritorto
Hi all,
Did anyone ever confirm whether this SSR212 box, without the hardware RAID 
option, works reliably under OpenSolaris without fooling around with external 
drivers, etc.?  I need a box like this, but can't find a vendor that will give 
me a try & buy.  (Yes, I'm spoiled by Sun.)

thx
jake
 
 


Re: [zfs-discuss] Nice chassis for ZFS server

2007-12-15 Thread Frank Cusack
On December 13, 2007 10:12:52 PM -0800 can you guess? [EMAIL PROTECTED] wrote:
 On December 13, 2007 12:51:55 PM -0800 can you guess? [EMAIL PROTECTED] wrote:
  ...

  when the difference between an unrecoverable single bit error is not just
  1 bit but the entire file, or corruption of an entire database row (etc),
  those small and infrequent errors are an extremely big deal.

  You are confusing unrecoverable disk errors (which are rare but orders of
  magnitude more common) with otherwise *undetectable* errors (the occurrence
  of which is at most once in petabytes by the studies I've seen, rather than
  once in terabytes), despite my attempt to delineate the difference clearly.

 No I'm not.  I know exactly what you are talking about.

 Then you misspoke in your previous post by referring to an unrecoverable
 single bit error rather than to an undetected single-bit error, which
 I interpreted as a misunderstanding.

I did misspeak.  thanks.
-frank


Re: [zfs-discuss] Nice chassis for ZFS server

2007-12-14 Thread can you guess?
 On Dec 14, 2007 1:12 AM, can you guess? [EMAIL PROTECTED] wrote:
   yes.  far rarer and yet home users still see them.
 
  I'd need to see evidence of that for current hardware.
 What would constitute evidence?  Do anecdotal tales from home users
 qualify?  I have two disks (and one controller!) that generate several
 checksum errors per day each.

I assume that you're referring to ZFS checksum errors rather than to transfer 
errors caught by the CRC resulting in retries.

If so, then the next obvious question is, what is causing the ZFS checksum 
errors?  And (possibly of some help in answering that question) is the disk 
seeing CRC transfer errors (which show up in its SMART data)?

If the disk is not seeing CRC errors, then the likelihood that data is being 
'silently' corrupted as it crosses the wire is negligible (1 in 65,536 if 
you're using ATA disks, given your correction below, else 1 in 4.3 billion for 
SATA).  Controller or disk firmware bugs have been known to cause otherwise 
undetected errors (though I'm not familiar with any recent examples in normal 
desktop environments - e.g., the CERN study discussed earlier found a disk 
firmware bug that seemed only activated by the unusual demands placed on the 
disk by a RAID controller, and exacerbated by that controller's propensity just 
to ignore disk time-outs).  So, for that matter, have buggy file systems.  
Flaky RAM can result in ZFS checksum errors (the CERN study found correlations 
there when it used its own checksum mechanisms).
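
For the record, those two odds figures fall straight out of the CRC widths: a 
corrupted transfer slips past an n-bit CRC with probability of roughly 1 in 2^n, 
assuming the damage looks random to the CRC.  A quick sanity check in a 
bash/ksh93 shell:

# 16-bit CRC (older parallel ATA data transfers): about 1 in 65,536 escapes
echo $((2**16))
# 32-bit CRC (ATA/7 and SATA): about 1 in 4.3 billion escapes
echo $((2**32))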

 I've also seen intermittent checksum fails that go away once all the cables
 are wiggled.

Once again, a significant question is whether the checksum errors are 
accompanied by a lot of CRC transfer errors.  If not, that would strongly 
suggest that they're not coming from bad transfers (and while they could 
conceivably be the result of commands corrupted on the wire, so much more data 
is transferred compared to command bandwidth that you'd really expect to see 
data CRC errors too if commands were getting mangled).  When you wiggle the 
cables, other things wiggle as well (I assume you've checked that your RAM is 
solidly seated).

On the other hand, if you're getting a whole bunch of CRC errors, then with 
only a 16-bit CRC it's entirely conceivable that a few are sneaking by 
unnoticed.

 
  Unlikely, since transfers over those connections have been protected by
  32-bit CRCs since ATA busses went to 33 or 66 MB/sec. (SATA has even
  stronger protection)
 The ATA/7 spec specifies a 32-bit CRC (older ones used a 16-bit CRC) [1].

Yup - my error:  the CRC was indeed introduced in ATA-4 (33 MB/sec. version), 
but was only 16 bits wide back then.

 The serial ata protocol also specifies 32-bit CRCs beneath 8/10b coding
 (1.0a p. 159)[2].  That's not much stronger at all.

The extra strength comes more from its additional coverage (commands as well as 
data).

- bill
 
 


Re: [zfs-discuss] Nice chassis for ZFS server

2007-12-14 Thread Casper . Dik

...
though I'm not familiar with any recent examples in normal desktop environments



One example found during early use of zfs in Solaris engineering was
a system with a flaky power supply.

It seemed to work just fine with ufs, but when zfs was installed the
sata drives started to show many ZFS checksum errors.

After replacing the power supply, the system did not detect any more
errors.

Flaky power supplies are an important contributor to PC unreliability; they
also tend to fail a lot in various ways.

Casper



Re: [zfs-discuss] Nice chassis for ZFS server

2007-12-14 Thread can you guess?
 
 ...
 though I'm not familiar with any recent examples in normal desktop
 environments
 
 One example found during early use of zfs in Solaris engineering was
 a system with a flaky power supply.
 
 It seemed to work just fine with ufs, but when zfs was installed the
 sata drives started to show many ZFS checksum errors.
 
 After replacing the power supply, the system did not detect any more
 errors.
 
 Flaky power supplies are an important contributor to PC unreliability; they
 also tend to fail a lot in various ways.

Thanks - now that you mention it, I think I remember reading about that here 
somewhere.

But did anyone delve into these errors sufficiently to know that they were 
specifically due to controller or disk firmware bugs (since you seem to be 
suggesting by the construction of your response above that they were) rather 
than, say, to RAM errors (if the system in question didn't have ECC RAM, 
anyway) between checksum generation and disk access on either reads or writes 
(the CERN study found a correlation even using ECC RAM between detected RAM 
errors and silent data corruption)?

Not that the generation of such otherwise undetected errors due to a flaky PSU 
isn't interesting in its own right, but this specific sub-thread was about 
whether poor connections were a significant source of such errors (my comment 
about controller and disk firmware bugs having been a suggested potential 
alternative source) - so identifying the underlying mechanisms is of interest 
as well.

- bill
 
 


Re: [zfs-discuss] Nice chassis for ZFS server

2007-12-14 Thread Will Murnane
On Dec 14, 2007 4:23 AM, can you guess? [EMAIL PROTECTED] wrote:
 I assume that you're referring to ZFS checksum errors rather than to transfer 
 errors caught by the CRC resulting in retries.

Correct.

 If so, then the next obvious question is, what is causing the ZFS checksum 
 errors?  And (possibly of some help in answering that question) is the disk 
 seeing CRC transfer errors (which show up in its SMART data)?

The memory is ECC in this machine, and Memtest passed it for five
days.  The disk was indeed getting some pretty lousy SMART scores, but
that doesn't explain the controller issue.  This particular controller
is a SIIG-branded silicon image 0680 chipset (which is, apparently, a
piece of junk - if I'd done my homework I would've bought something
else)... but the premise stands.  I bought a piece of consumer-level
hardware off the shelf, it had corruption issues, and ZFS told me
about it when XFS had been silent.

 Once again, a significant question is whether the checksum errors are 
 accompanied by a lot of CRC transfer errors.  If not, that would strongly 
 suggest that they're not coming from bad transfers (and while they could 
 conceivably be the result of commands corrupted on the wire, so much more 
 data is transferred compared to command bandwidth that you'd really expect to 
 see data CRC errors too if commands were getting mangled).  When you wiggle 
 the cables, other things wiggle as well (I assume you've checked that your 
 RAM is solidly seated).

I don't remember offhand if I got CRC errors with the working
controller and drive and bad cabling, sorry.  RAM was solid, as
mentioned earlier.

 The extra strength comes more from its additional coverage (commands as well 
 as data).

Ah, that explains it.

Will


Re: [zfs-discuss] Nice chassis for ZFS server

2007-12-14 Thread can you guess?
  the next obvious question is, what is causing the ZFS checksum errors?
  And (possibly of some help in answering that question) is the disk seeing
  CRC transfer errors (which show up in its SMART data)?
 
 The memory is ECC in this machine, and Memtest passed it for five days.
 The disk was indeed getting some pretty lousy SMART scores,

Seagate ATA disks (if that's what you were using) are notorious for this in a 
couple of specific metrics:  they ship from the factory that way.  This does 
not appear to be indicative of any actual problem but rather of error 
tabulation which they perform differently than other vendors do (e.g., I could 
imagine that they did something unusual in their burn-in exercising that 
generated nominal errors, but that's not even speculation, just a random guess).

 but that doesn't explain the controller issue.  This particular controller
 is a SIIG-branded silicon image 0680 chipset (which is, apparently, a
 piece of junk - if I'd done my homework I would've bought something
 else)... but the premise stands.  I bought a piece of consumer-level
 hardware off the shelf, it had corruption issues, and ZFS told me
 about it when XFS had been silent.

Then we've been talking at cross-purposes.  Your original response was to my 
request for evidence that *platter errors that escape detection by the disk's 
ECC mechanisms* occurred sufficiently frequently to be a cause for concern - 
and that's why I asked specifically what was causing the errors you saw (to see 
whether they were in fact the kind for which I had requested evidence).

Not that detecting silent errors due to buggy firmware is useless:  it clearly 
saved you from continuing corruption in this case.  My impression is that in 
conventional consumer installations (typical consumers never crack open their 
case at all, let alone to add a RAID card) controller and disk firmware is 
sufficiently stable (especially for the limited set of functions demanded of 
it) that ZFS's added integrity checks may not count for a great deal (save 
perhaps peace of mind, but typical consumers aren't sufficiently aware of 
potential dangers to suffer from deficits in that area) - but your experience 
indicates that when you stray from that mold ZFS's added protection may 
sometimes be as significant as it was for Robert's mid-range array firmware 
bugs.

And since there indeed was a RAID card involved in the original hypothetical 
situation under discussion, the fact that I was specifically referring to 
undetectable *disk* errors was only implied by my subsequent discussion of disk 
error rates, rather than explicit.

The bottom line appears to be that introducing non-standard components into the 
path between RAM and disk has, at least for some specific subset of those 
components, the potential to introduce silent errors of the form that ZFS can 
catch - quite possibly in considerably greater numbers than the kinds of 
undetected disk errors that I was talking about ever would (that RAID card you 
were using has a relatively popular low-end chipset, and Robert's mid-range 
arrays were hardly fly-by-night).  So while I'm still not convinced that ZFS 
offers significant features in the reliability area compared with other 
open-source *software* solutions, the evidence that it may do so in more 
sophisticated (but not quite high-end) hardware environments is becoming more 
persuasive.

- bill
 
 


Re: [zfs-discuss] Nice chassis for ZFS server

2007-12-13 Thread MP
 this anti-raid-card movement is puzzling. 

I think you've misinterpreted my questions.
I queried the necessity of paying extra for a seemingly unnecessary RAID card 
for zfs. I didn't doubt that it could perform better.
Wasn't one of the design briefs of zfs that it would provide its feature set 
without expensive RAID hardware?
Of course, if you have the money then you can always go faster, but this is a 
zfs discussion thread (I know I've perpetuated the extravagant cross-posting of 
the OP).
Cheers.
 
 


Re: [zfs-discuss] Nice chassis for ZFS server

2007-12-13 Thread Richard Elling
MP wrote:
 this anti-raid-card movement is puzzling. 
 

 I think you've misinterpreted my questions.
 I queried the necessity of paying extra for a seemingly unnecessary RAID 
 card for zfs. I didn't doubt that it could perform better.
 Wasn't one of the design briefs of zfs that it would provide its feature 
 set without expensive RAID hardware?
   

In general, feature set != performance.  For example, a VIA x86-compatible
processor is not capable of beating the performance of a high-end Xeon,
though the feature sets are largely the same.  Additional examples abound.
 -- richard



Re: [zfs-discuss] Nice chassis for ZFS server

2007-12-13 Thread MP
 Additional examples abound.

Doubtless :)

More usefully, can you confirm whether Solaris works on this chassis without 
the RAID controller?
 
 


Re: [zfs-discuss] Nice chassis for ZFS server

2007-12-13 Thread can you guess?
 Are there benchmarks somewhere showing a RAID10 implemented on an LSI card
 with, say, 128MB of cache being beaten in terms of performance by a similar
 zraid configuration with no cache on the drive controller?
 
 Somehow I don't think they exist. I'm all for data scrubbing, but this
 anti-raid-card movement is puzzling.

Oh, for joy - a chance for me to say something *good* about ZFS, rather than 
just try to balance out excessive enthusiasm.

Save for speeding up synchronous writes (if it has enough on-board NVRAM to 
hold them until it's convenient to destage them to disk), a RAID-10 card should 
not enjoy any noticeable performance advantage over ZFS mirroring.

By contrast, consider corruption that goes undetected - either the extremely 
rare kind that nothing but ZFS's checksums can catch, or the considerably more 
common kind that the disk's ECC codes would catch *if* the data were ever 
accessed.  If the RAID card is used to mirror the data, there's a good chance 
that even ZFS's validation scans won't see the problem (because the card 
happens to access the good copy for the scan rather than the bad one) - in 
which case you'll lose that data if the disk with the good data fails.  And in 
the case of the (extremely rare) otherwise-undetectable corruption, if the card 
*does* return the bad copy then IIRC ZFS (not knowing that a good copy also 
exists) will just claim that the data is gone (though I don't know if it will 
then flag it such that you'll never have an opportunity to find the good copy).

If the RAID card scrubs its disks the difference (now limited to the extremely 
rare undetectable-via-disk-ECC corruption) becomes pretty negligible - but I'm 
not sure how many RAIDs below the near-enterprise category perform such scrubs.

In other words, if you *don't* otherwise scrub your disks then ZFS's 
checksums-plus-internal-scrubbing mechanisms assume greater importance:  it's 
only the contention that other solutions that *do* offer scrubbing can't 
compete with ZFS in effectively protecting your data that's somewhat over the 
top.
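
For anyone wondering what the ZFS side of that looks like in practice, here's a 
minimal sketch (the pool name 'tank' is just an example) of kicking off a scrub 
by hand and then checking whether it turned up any checksum errors:

# start a background scrub: reads every allocated block, verifies it against
# its checksum, and repairs from redundancy where a good copy exists
zpool scrub tank

# watch progress and the per-device read/write/checksum error counters
zpool status -v tank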

- bill
 
 


Re: [zfs-discuss] Nice chassis for ZFS server

2007-12-13 Thread Frank Cusack
On December 13, 2007 9:47:00 AM -0800 MP [EMAIL PROTECTED] wrote:
 Additional examples abound.

 Doubtless :)

 More usefully, can you confirm whether Solaris works on this chassis
 without the RAID controller?

way back, i had Solaris working with a promise j200s (jbod sas) chassis,
to the extent that the sas driver at the time worked.  i can't IMAGINE
why this chassis would be any different from Solaris' perspective.

-frank


Re: [zfs-discuss] Nice chassis for ZFS server

2007-12-13 Thread Frank Cusack
On December 13, 2007 11:34:54 AM -0800 can you guess? 
[EMAIL PROTECTED] wrote:
 By contrast, if extremely rare undetected and (other than via ZFS
 checksums) undetectable (or considerably more common undetected but
 detectable via disk ECC codes, *if* the data is accessed) corruption
 occurs, if the RAID card is used to mirror the data there's a good chance
 that even ZFS's validation scans won't see the problem (because the card
 happens to access the good copy for the scan rather than the bad one) -
 in which case you'll lose that data if the disk with the good data fails.
 And in the case of (extremely rare) otherwise-undetectable corruption, if
 the card *does* return the bad copy then IIRC ZFS (not knowing that a
 good copy also exists) will just claim that the data is gone (though I
 don't know if it will then flag it such that you'll never have an
 opportunity to find the good copy).

i like this answer, except for what you are implying by extremely rare.

 If the RAID card scrubs its disks the difference (now limited to the
 extremely rare undetectable-via-disk-ECC corruption) becomes pretty
 negligible - but I'm not sure how many RAIDs below the near-enterprise
 category perform such scrubs.

 In other words, if you *don't* otherwise scrub your disks then ZFS's
 checksums-plus-internal-scrubbing mechanisms assume greater importance:
 it's only the contention that other solutions that *do* offer scrubbing
 can't compete with ZFS in effectively protecting your data that's
 somewhat over the top.

the problem with your discounting of zfs checksums is that you aren't
taking into account that extremely rare is relative to the number of
transactions, which are extremely high.  in such a case even extremely
rare errors do happen, and not just to extremely few folks, but i would
say to all enterprises.  hell it happens to home users.

when the difference between an unrecoverable single bit error is not just
1 bit but the entire file, or corruption of an entire database row (etc),
those small and infrequent errors are an extremely big deal.

considering all the pieces, i would much rather run zfs on a jbod than
on a raid, wherever i could.  it gives better data protection, and it
is ostensibly cheaper.

-frank


Re: [zfs-discuss] Nice chassis for ZFS server

2007-12-13 Thread can you guess?
...

 when the difference between an unrecoverable single bit error is not just
 1 bit but the entire file, or corruption of an entire database row (etc),
 those small and infrequent errors are an extremely big deal.

You are confusing unrecoverable disk errors (which are rare but orders of 
magnitude more common) with otherwise *undetectable* errors (the occurrence of 
which is at most once in petabytes by the studies I've seen, rather than once 
in terabytes), despite my attempt to delineate the difference clearly.  
Conventional approaches using scrubbing provide as complete protection against 
unrecoverable disk errors as ZFS does:  it's only the far rarer otherwise 
*undetectable* errors that ZFS catches and they don't.

- bill
 
 


Re: [zfs-discuss] Nice chassis for ZFS server

2007-12-13 Thread can you guess?
...

  If the RAID card scrubs its disks
 
 A scrub without checksum puts a huge burden on disk firmware and
 error reporting paths :-)

Actually, a scrub without checksum places far less burden on the disks and 
their firmware than ZFS-style scrubbing does, because it merely has to scan the 
disk sectors sequentially rather than follow a tree path to each relatively 
small leaf block.  Thus it also compromises runtime operation a lot less as 
well (though in both cases doing it infrequently in the background should 
usually reduce any impact to acceptable levels).
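
To make the contrast concrete, the 'dumb' scrub described above amounts to 
little more than a sequential read of the raw device - for example (the device 
name below is only a placeholder; point it at the right disk before trying 
anything like this):

# read every sector in order and throw the data away; any sector the drive
# cannot return shows up as an I/O error (and in its SMART counters), with
# no filesystem tree-walking involved
dd if=/dev/rdsk/c0t0d0s2 of=/dev/null bs=1024k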

- bill
 
 


Re: [zfs-discuss] Nice chassis for ZFS server

2007-12-13 Thread Frank Cusack
On December 13, 2007 12:51:55 PM -0800 can you guess? 
[EMAIL PROTECTED] wrote:
 ...

 when the difference between an unrecoverable single bit error is not just
 1 bit but the entire file, or corruption of an entire database row (etc),
 those small and infrequent errors are an extremely big deal.

 You are confusing unrecoverable disk errors (which are rare but orders of
 magnitude more common) with otherwise *undetectable* errors (the
 occurrence of which is at most once in petabytes by the studies I've
 seen, rather than once in terabytes), despite my attempt to delineate the
 difference clearly.

No I'm not.  I know exactly what you are talking about.

  Conventional approaches using scrubbing provide as
 complete protection against unrecoverable disk errors as ZFS does:  it's
 only the far rarer otherwise *undetectable* errors that ZFS catches and
 they don't.

yes.  far rarer and yet home users still see them.

that the home user ever sees these extremely rare (undetectable) errors
may have more to do with poor connection (cables, etc) to the disk, and
less to do with disk media errors.  enterprise users probably have
better connectivity and see errors due to high i/o.  just thinking
out loud.

regardless, zfs on non-raid provides better protection than zfs on raid
(well, depending on raid configuration) so just from the data integrity
POV non-raid would generally be preferred.  the fact that the type of
error being prevented is rare doesn't change that and i was further
arguing that even though it's rare the impact can be high so you don't
want to write it off.

-frank


Re: [zfs-discuss] Nice chassis for ZFS server

2007-12-13 Thread Marion Hakanson
[EMAIL PROTECTED] said:
 You are confusing unrecoverable disk errors (which are rare but orders of
 magnitude more common) with otherwise *undetectable* errors (the occurrence
 of which is at most once in petabytes by the studies I've seen, rather than
 once in terabytes), despite my attempt to delineate the difference clearly.

I could use a little clarification on how these unrecoverable disk errors
behave -- or maybe a lot, depending on one's point of view.

So, when one of these once in around ten (or 100) terabytes read events
occurs, my understanding is that a read error is returned by the drive,
and the corresponding data is lost as far as the drive is concerned.
Maybe just a bit is gone, maybe a byte, maybe a disk sector, it probably
depends on the disk, OS, driver, and/or the rest of the I/O hardware
chain.  Am I doing OK so far?


 Conventional approaches using scrubbing provide as complete protection
 against unrecoverable disk errors as ZFS does:  it's only the far rarer
 otherwise *undetectable* errors that ZFS catches and they don't. 

I found it helpful to my own understanding to try restating the above
in my own words.  Maybe others will as well.

If my assumptions are correct about how these unrecoverable disk errors
are manifested, then a dumb scrubber will find such errors by simply
trying to read everything on disk -- no additional checksum is required.
Without some form of parity or replication, the data is lost, but at
least somebody will know about it.

Now it seems to me that without parity/replication, there's not much
point in doing the scrubbing, because you could just wait for the error
to be detected when someone tries to read the data for real.  It's
only if you can repair such an error (before the data is needed) that
such scrubbing is useful.

For those well-versed in this stuff, apologies for stating the obvious.

Regards,

Marion




Re: [zfs-discuss] Nice chassis for ZFS server

2007-12-13 Thread Anton B. Rang
 I could use a little clarification on how these unrecoverable disk errors
 behave -- or maybe a lot, depending on one's point of view.
 
 So, when one of these once in around ten (or 100) terabytes read events
 occurs, my understanding is that a read error is returned by the drive,
 and the corresponding data is lost as far as the drive is concerned.

Yes -- the data being one or more disk blocks.  (You can't lose a smaller
amount of data, from the drive's point of view, since the error correction
code covers the whole block.)

 If my assumptions are correct about how these unrecoverable disk errors
 are manifested, then a dumb scrubber will find such errors by simply
 trying to read everything on disk -- no additional checksum is required.
 Without some form of parity or replication, the data is lost, but at
 least somebody will know about it.

Right.  Generally if you have replication and scrubbing, then you'll also
re-write any data which was found to be unreadable, thus fixing the
problem (and protecting yourself against future loss of the second copy).

 Now it seems to me that without parity/replication, there's not much
 point in doing the scrubbing, because you could just wait for the error
 to be detected when someone tries to read the data for real.  It's
 only if you can repair such an error (before the data is needed) that
 such scrubbing is useful.

Pretty much, though if you're keeping backups, you could recover the
data from backup at this point. Of course, backups could be considered
a form of replication, but most of us in file systems don't think of them
that way.

Anton
 
 


Re: [zfs-discuss] Nice chassis for ZFS server

2007-12-13 Thread can you guess?
...

  Now it seems to me that without parity/replication, there's not much
  point in doing the scrubbing, because you could just wait for the error
  to be detected when someone tries to read the data for real.  It's
  only if you can repair such an error (before the data is needed) that
  such scrubbing is useful.
 
 Pretty much

I think I've read (possibly in the 'MAID' descriptions) the contention that at 
least some unreadable sectors get there in stages, such that if you catch them 
early they will be only difficult to read rather than completely unreadable.  
In such a case, scrubbing is worthwhile even without replication, because it 
finds the problem early enough that the disk itself (or higher-level mechanisms 
if the disk gives up but the higher level is more persistent) will revector the 
sector when it finds it difficult (but not impossible) to read.
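
If you want to watch for exactly that kind of early-stage decay, the drive's 
own SMART counters are where it tends to show up first.  A minimal sketch using 
smartmontools (the device path is only an example):

# a rising Current_Pending_Sector count means sectors the drive found hard
# to read and is waiting to revector; Reallocated_Sector_Ct counts sectors
# it has already remapped to spares
smartctl -A /dev/rdsk/c0t0d0 | egrep 'Reallocated_Sector_Ct|Current_Pending_Sector'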

- bill
 
 


Re: [zfs-discuss] Nice chassis for ZFS server

2007-12-13 Thread Will Murnane
On Dec 14, 2007 1:12 AM, can you guess? [EMAIL PROTECTED] wrote:
  yes.  far rarer and yet home users still see them.

 I'd need to see evidence of that for current hardware.
What would constitute evidence?  Do anecdotal tales from home users
qualify?  I have two disks (and one controller!) that generate several
checksum errors per day each.  I've also seen intermittent checksum
fails that go away once all the cables are wiggled.

 Unlikely, since transfers over those connections have been protected by 
 32-bit CRCs since ATA busses went to 33 or 66 MB/sec. (SATA has even stronger 
 protection)
The ATA/7 spec specifies a 32-bit CRC (older ones used a 16-bit CRC)
[1].  The serial ata protocol also specifies 32-bit CRCs beneath 8/10b
coding (1.0a p. 159)[2].  That's not much stronger at all.

Will

[1] http://www.t10.org/t13/project/d1532v3r4a-ATA-ATAPI-7.pdf
[2] http://www.ece.umd.edu/courses/enee759h.S2003/references/serialata10a.pdf


Re: [zfs-discuss] Nice chassis for ZFS server

2007-12-12 Thread Frank Cusack
On November 29, 2007 5:56:04 AM -0800 MP [EMAIL PROTECTED] wrote:
 Intel show a configuration of this chassis in the Hardware Technical
 Specification:

 http://download.intel.com/support/motherboards/server/ssr212mc2/sb/ssr212mc2_tps_12.pdf

 without the RAID controller. I assume that then the 4xSAS ports on the
 Blackford chipset are then used, rather than the 4xSAS on the RAID card.
 As Blackford is supported in Opensolaris, then this configuration would
 be the one to choose?

Makes no difference.  The host running Solaris, OpenSolaris or whatever
talks SAS to the enclosure.  The chipset used by the enclosure doesn't
make any difference to the host OS (bug workarounds excepted).

-frank


Re: [zfs-discuss] Nice chassis for ZFS server

2007-12-08 Thread Mick Russom
Are there benchmarks somewhere showing a RAID10 implemented on an LSI card 
with, say, 128MB of cache being beaten in terms of performance by a similar 
zraid configuration with no cache on the drive controller?

Somehow I don't think they exist. I'm all for data scrubbing, but this 
anti-raid-card movement is puzzling.
 
 


Re: [zfs-discuss] Nice chassis for ZFS server

2007-11-30 Thread John Martinez

On Nov 30, 2007, at 2:47 AM, MP wrote:

 I evaled one of these too. Worked great with ZFS.

 Was that with OpenSolaris and was that with or without the Intel  
 RAID controller?
 Cheers.

Solaris 10 8/07, it was with the built-in RAID controller.

-john



Re: [zfs-discuss] Nice chassis for ZFS server

2007-11-30 Thread MP
 I evaled one of these too. Worked great with ZFS.

Was that with OpenSolaris and was that with or without the Intel RAID 
controller?
Cheers.
 
 


Re: [zfs-discuss] Nice chassis for ZFS server

2007-11-29 Thread MP
Intel show a configuration of this chassis in the Hardware Technical 
Specification:

http://download.intel.com/support/motherboards/server/ssr212mc2/sb/ssr212mc2_tps_12.pdf

without the RAID controller. I assume that then the 4xSAS ports on the 
Blackford chipset are then used, rather than the 4xSAS on the RAID card.
As Blackford is supported in Opensolaris, then this configuration would be the 
one to choose?
Anyone tried this yet?
Many thanks.
 
 


Re: [zfs-discuss] Nice chassis for ZFS server

2007-11-29 Thread Ross
Rumours are that Dell are going to start supporting ZFS now they're shipping 
Solaris.  I'm waiting to see if there are going to be some nice little boxes 
from them :)
 
 


Re: [zfs-discuss] Nice chassis for ZFS server

2007-11-13 Thread Mick Russom
Sun did something like this with the v60 and v65 servers, and they should do it 
again with the SSR212MC2.

The heart of the SAS subsystem of the SSR212MC2 is the SRCSAS144E .

This card is interfacing with a Vitesse VSC410 SAS-expander and is plugged into 
a S5000PSL motherboard. 

This card is closely related to the MegaRAID SAS 8208ELP . 

All the drives, 12, in front are SAS/SATA-II hotswappable. 

http://www.intel.com/design/servers/storage/ssr212mc2/index.htm .

This is a pure Intel reference design. This is the most drives that fit into a 
2U **EVER**. This is the best storage product in existence today. 

The SRCSAS144E is a MegaRAID SAS controller. 

Sun's own v60 and Sun v65 were pure Intel reference servers that worked 
GREAT! 

Everything works in Linux, want to see?

cat /etc/redhat-release 
CentOS release 5 (Final)

uname -a
Linux localhost.localdomain 2.6.18-8.1.15.el5 #1 SMP Mon Oct 22 08:32:28 EDT 
2007 x86_64 x86_64 x86_64 GNU/Linux

megasas: 00.00.03.05 Mon Oct 02 11:21:32 PDT 2006
megasas: 0x1000:0x0411:0x8086:0x1003: bus 9:slot 14:func 0
ACPI: PCI Interrupt :09:0e.0[A] - GSI 18 (level, low) - IRQ 185
megasas: FW now in Ready state
scsi0 : LSI Logic SAS based MegaRAID driver
Vendor: Intel Model: SSR212MC Rev: 01A 
Type: Enclosure ANSI SCSI revision: 05
Vendor: INTEL Model: SRCSAS144E Rev: 1.03
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sda: 2919915520 512-byte hdwr sectors (1494997 MB)
sda: Write Protect is off
sda: Mode Sense: 1f 00 00 08
SCSI device sda: drive cache: write back
SCSI device sda: 2919915520 512-byte hdwr sectors (1494997 MB)
sda: Write Protect is off
sda: Mode Sense: 1f 00 00 08
SCSI device sda: drive cache: write back
sda: sda1 sda2 sda3
sd 0:2:0:0: Attached scsi disk sda
Fusion MPT base driver 3.04.02
Copyright (c) 1999-2005 LSI Logic Corporation
Fusion MPT SAS Host driver 3.04.02
ACPI: PCI Interrupt :04:00.0[A] - GSI 17 (level, low) - IRQ 177
mptbase: Initiating ioc0 bringup
ioc0: SAS1064E: Capabilities={Initiator}
PCI: Setting latency timer of device :04:00.0 to 64
scsi1 : ioc0: LSISAS1064E, FwRev=0110h, Ports=1, MaxQ=511, IRQ=177

09:0e.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS
Subsystem: Intel Corporation SRCSAS144E RAID Controller

09:0e.0 0104: 1000:0411
Subsystem: 8086:1003

I know that Google and Yahoo are buying these chassis in droves, and many of 
the other folks I know in the industry are seeing massive sales of this box.
 
 


Re: [zfs-discuss] Nice chassis for ZFS server

2007-11-13 Thread Mick Russom
Internal drives suck. If you go through the trouble of putting in a
drive, at least make it hot pluggable.

They are all hot-swappable/pluggable on the SSR212MC2. There are two 
additional internal 2.5" SAS bonus drives that aren't, but the front 12 are.

I for one think external enclosures are annoying. What's wrong with God-boxes 
like this? You will invariably use up more than 2U for every 12 3.5" drives 
with **all** other alternatives to this. 

argv! surely this is a clerical error?
No, it's annoying when the best platforms, especially from vendors like Intel 
who go a long way to support the product properly over long periods of time, do 
not land on the HCL. These are the best platforms to certify.

Hope this box lands on the HCL, it's a beaut.
 
 


Re: [zfs-discuss] Nice chassis for ZFS server

2007-11-13 Thread Richard Elling
Mick Russom wrote:
 Sun's own v60 and Sun v65 were pure Intel reference servers that worked 
 GREAT!

I'm glad they worked for you.  But I'll note that the critical deficiencies
in those platforms are solved by the newer Sun AMD/Intel/SPARC small form 
factor rackmount servers.  The new chassis are far superior to the V60/V65 
chassis, which were not data center class designs even though they were 
rack-mountable.
  -- richard


Re: [zfs-discuss] Nice chassis for ZFS server

2007-09-26 Thread Richard Elling
Nigel Smith wrote:
 It's a pity that Sun does not manufacture something like this.
 The x4500 Thumper, with 48 disks is way over the top for most companies,
 and too expensive.  And the new X4150 only has 8 disks.
 This Intel box with 12 hot-swap drives and two internal boot drives
 looks like the sweet-spot to me.

Internal drives suck.  If you go through the trouble of putting in a
drive, at least make it hot pluggable.

 The only problem is that Intel are not listing Solaris as a 
 supported operating system.

argv!  surely this is a clerical error?

 The question is how are all those SAS/SATA disks interfaced to the 
 motherboard. As far as I can see it's using some new chipset
 called 'Blackford'. Has Solaris got a driver for that chipset?

Blackford is the 2-socket bridge chip for the latest Intel quad cores.
We use the Blackford on the X4150.

 I don't think so, but I'd love to be wrong on that.

I love it when you're wrong :-)
  -- richard


Re: [zfs-discuss] Nice chassis for ZFS server

2007-09-26 Thread Tomas Ögren
On 26 September, 2007 - Nigel Smith sent me these 1,2K bytes:

 It's a pity that Sun does not manufacture something like this.
 The x4500 Thumper, with 48 disks is way over the top for most companies,
 and too expensive.  And the new X4150 only has 8 disks.
 This Intel box with 12 hot-swap drives and two internal boot drives
 looks like the sweet-spot to me.
 The only problem is that Intel are not listing Solaris as a 
 supported operating system.
 The question is how are all those SAS/SATA disks interfaced to the 
 motherboard. As far as I can see it's using some new chipset
 called 'Blackford'. Has Solaris got a driver for that chipset?
 I don't think so, but I'd love to be wrong on that.
 It looks like the chipset provide 4-ports on the motherboard,
 and then they use four lane SAS cables.
 
 Has anyone tried the HP ProLiant DL320s with Solaris  ZFS?
 http://h10010.www1.hp.com/wwpc/us/en/sm/WF05a/15351-15351-3328412-241644-241475-3232017.html
 It has a similar 12+2 drive bay arrangement, and I believe
 HP do support Solaris and have drivers for their disk interface cards.

Tried booting u3 (I think, could have been a sxcr ~50-60) on one of
those.  Required drivers for the HBA were additional stuff, but the USB
controller was not working under Solaris, so I couldn't stick it onto a
USB CD, and adding it to the miniroot required another x86 Solaris
machine to loopback mount the UFS file (I only have sparc, which is not
compatible).  Gave up due to time constraints (borrowed machine for this
test).

So, not straight out of the box, but maybe.

/Tomas
-- 
Tomas Ögren, [EMAIL PROTECTED], http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se