Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-03 Thread Simon Breden
So what's the consensus on checksum errors appearing within mirror vdevs? Is it caused by the same bug announced by Adam, or is something else causing it? If so, what's the bug id? Cheers, Simon -- This message posted from opensolaris.org ___ zfs-discuss

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-03 Thread Simon Breden
Thanks Gaëtan. What's the bug id for this iommu bug on Intel platforms? In my case, I have an AMD processor with ECC RAM, so probably not related to the Intel iommu bug. I'm seeing the checksum errors in a mirrored rpool using SSDs so maybe it could be something like cosmic rays causing

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-03 Thread Frank Middleton
It was someone from Sun that recently asked me to repost here about the checksum problem on mirrored drives. I was reluctant to do so because you and Bob might start flames again, and you did! You both sound very defensive, but of course I would never make an unsubstantiated speculation that you

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-02 Thread Nigel Smith
Adam, the 'OpenSolaris Development Release Packaging Repository' has recently been updated to release 121. http://mail.opensolaris.org/pipermail/opensolaris-announce/2009-August/001253.html http://pkg.opensolaris.org/dev/en/index.shtml Just to be totally clear, are you recommending that

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-02 Thread Daniel Carosone
Furthermore, this clarity needs to be posted somewhere much, much more visible than buried in some discussion thread.

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-02 Thread Henrik Johansson
Hi Adam, On Sep 2, 2009, at 1:54 AM, Adam Leventhal wrote: Hi James, After investigating this problem a bit I'd suggest avoiding deploying RAID-Z until this issue is resolved. I anticipate having it fixed in build 124. For those of us who have already upgraded and written data to our

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-02 Thread Frank Middleton
On 09/02/09 05:40 AM, Henrik Johansson wrote: For those of us who have already upgraded and written data to our raidz pools, are there any risks of inconsistency, wrong checksums in the pool? Is there a bug id? This may not be a new problem insofar as it may also affect mirrors. As part of

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-02 Thread Gaëtan Lehmann
On 2 Sep 2009, at 15:27, Frank Middleton wrote: On 09/02/09 05:40 AM, Henrik Johansson wrote: For those of us who have already upgraded and written data to our raidz pools, are there any risks of inconsistency, wrong checksums in the pool? Is there a bug id? This may not be a new

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-02 Thread Eric Sproul
Adam Leventhal wrote: Hi James, After investigating this problem a bit I'd suggest avoiding deploying RAID-Z until this issue is resolved. I anticipate having it fixed in build 124. Adam, Is it known approximately when this bug was introduced? I have a system running snv_111 with a large

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-02 Thread Frank Middleton
On 09/02/09 10:01 AM, Gaëtan Lehmann wrote: I see the same problem on a workstation with ECC RAM and disks in mirror. The host is a Dell T5500 with 2 CPUs and 24 GB of RAM. Would you know if it has ECC on the buses? I have no idea if or what Solaris does on x86 to check or correct bus errors,

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-02 Thread Simon Breden
I too see checksum errors occurring for the first time using OpenSolaris 2009.06 on the /dev package repository at version snv_121. I see the problem occur within a mirrored boot pool (rpool) using SSDs. Hardware is AMD BE-2350 (ECC) processor with 4GB ECC memory on MCP55 chipset, although SATA

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-02 Thread Markus Kovero
...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Simon Breden Sent: 2 September 2009 17:34 To: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool I too see checksum errors occurring for the first time using OpenSolaris

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-02 Thread Frank Middleton
On 09/02/09 10:34 AM, Simon Breden wrote: I too see checksum errors occurring for the first time using OpenSolaris 2009.06 on the /dev package repository at version snv_121. I see the problem occur within a mirrored boot pool (rpool) using SSDs. Hardware is AMD BE-2350 (ECC) processor with 4GB

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-02 Thread Simon Breden
Thanks Markus, I'll give that a try.

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-02 Thread Simon Breden
Cheers Frank, I'll give it a try... also, doesn't sound good if the problem goes back pre snv_100...

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-02 Thread Brent Jones
On Wed, Sep 2, 2009 at 6:27 AM, Frank Middleton f.middle...@apogeect.com wrote: On 09/02/09 05:40 AM, Henrik Johansson wrote: For those of us who have already upgraded and written data to our raidz pools, are there any risks of inconsistency, wrong checksums in the pool? Is there a bug id?

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-02 Thread Richard Elling
On Sep 2, 2009, at 2:38 AM, Daniel Carosone wrote: Furthermore, this clarity needs to be posted somewhere much, much more visible than buried in some discussion thread. I've added a note in the ZFS Troubleshooting Guide wiki. However, I could not find a public CR. If someone inside Sun

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-02 Thread Richard Elling
On Sep 2, 2009, at 6:27 AM, Frank Middleton wrote: On 09/02/09 05:40 AM, Henrik Johansson wrote: For those of us who have already upgraded and written data to our raidz pools, are there any risks of inconsistency, wrong checksums in the pool? Is there a bug id? This may not be a new

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-02 Thread Simon Breden
Hi Richard, I just took a look at that link and it only mentions problems with RAID-Z vdevs, but some people here, including myself, have checksum errors with mirrors too, so maybe the link could be updated with this info? Cheers, Simon

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-02 Thread Henrik Johansson
Hello all, I have backed down to snv_117; when scrubbing this pool I got my first checksum errors ever on any build except snv_121. I wonder if this is a coincidence or if bad checksums have been generated by snv_121? So I have been running for 10 months without any checksum errors, I

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-02 Thread Simon Breden
And in addition to which Solaris version people are using, is it relevant which ZFS level their pool is using?

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-02 Thread Frank Middleton
On 09/02/09 12:31 PM, Richard Elling wrote: I believe this is a different problem. Adam, was this introduced in b120? Doubtless you are correct as usual. However, if this is a new problem, how did it get through Sun's legendary testing process unless it is (as you have always maintained)

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-02 Thread Edward Pilatowicz
hey richard, so i just got a bunch of zfs checksum errors after replacing some mirrored disks on my desktop (u27). i originally blamed the new disks, until i saw this thread, at which point i started digging in bugster. i found the following related bugs (i'm not sure which one adam was

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-02 Thread Bob Friesenhahn
On Wed, 2 Sep 2009, Frank Middleton wrote: On 09/02/09 12:31 PM, Richard Elling wrote: I believe this is a different problem. Adam, was this introduced in b120? Doubtless you are correct as usual. However, if this is a new problem, how did it get through Sun's legendary testing process

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-02 Thread Jeff Victor
Bob Friesenhahn wrote: On Wed, 2 Sep 2009, Frank Middleton wrote: On 09/02/09 12:31 PM, Richard Elling wrote: I believe this is a different problem. Adam, was this introduced in b120? Doubtless you are correct as usual. However, if this is a new problem, how did it get through Sun's legendary

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-02 Thread Bob Friesenhahn
On Wed, 2 Sep 2009, Frank Middleton wrote: OK, I stand corrected. So the new snv121 checksum bug somehow made it through the simple sanity checks. Based on this thread, I wonder if it is still doing so (my intuition is that the problem still doesn't show up on Sun hardware). No doubt there's

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-02 Thread Adam Leventhal
Hey Bob, I have seen few people more prone to unsubstantiated conjecture than you. The raidz checksum code was recently reworked to add raidz3. It seems likely that a subtle bug was added at that time. That appears to be the case. I'm investigating the problem and hope to have an update

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-02 Thread Tim Cook
On Wed, Sep 2, 2009 at 3:02 PM, Frank Middleton f.middle...@apogeect.com wrote: On 09/02/09 02:17 PM, Jeff Victor wrote: Just to expand on that: there are now three levels of testing (and therefore stability) in [Open]Solaris: * Nevada builds - I don't know the details, but it's what BobF

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-02 Thread Chris Csanady
2009/9/2 Eric Sproul espr...@omniti.com: Adam, Is it known approximately when this bug was introduced? I have a system running snv_111 with a large raidz2 pool and I keep running into checksum errors though the drives are brand new. They are 2TB drives, but the pool is only about 14%

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-01 Thread Adam Leventhal
Hi James, After investigating this problem a bit I'd suggest avoiding deploying RAID-Z until this issue is resolved. I anticipate having it fixed in build 124. Apologies for the inconvenience. Adam On Aug 28, 2009, at 8:20 PM, James Lever wrote: On 28/08/2009, at 3:23 AM, Adam Leventhal

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-09-01 Thread James Lever
On 02/09/2009, at 9:54 AM, Adam Leventhal wrote: After investigating this problem a bit I'd suggest avoiding deploying RAID-Z until this issue is resolved. I anticipate having it fixed in build 124. Thanks for the status update on this Adam. cheers, James

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-08-28 Thread Gary Gendel
Alan, Super find. Thanks, I thought I was just going crazy until I rolled back to 110 and the errors disappeared. When you do work out a fix, please ping me to let me know when I can try an upgrade again. Gary

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-08-28 Thread James Lever
On 28/08/2009, at 3:23 AM, Adam Leventhal wrote: There appears to be a bug in the RAID-Z code that can generate spurious checksum errors. I'm looking into it now and hope to have it fixed in build 123 or 124. Apologies for the inconvenience. Are the errors being generated likely to cause

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-08-27 Thread Gary Gendel
It looks like it's definitely related to the snv_121 upgrade. I decided to roll back to snv_110 and the checksum errors have disappeared. I'd like to issue a bug report, but I don't have any information that might help track this down, just lots of checksum errors. Looks like I'm stuck at

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-08-27 Thread Albert Chin
On Thu, Aug 27, 2009 at 06:29:52AM -0700, Gary Gendel wrote: It looks like it's definitely related to the snv_121 upgrade. I decided to roll back to snv_110 and the checksum errors have disappeared. I'd like to issue a bug report, but I don't have any information that might help track this

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-08-27 Thread Casper . Dik
It looks like it's definitely related to the snv_121 upgrade. I decided to roll back to snv_110 and the checksum errors have disappeared. I'd like to issue a bug report, but I don't have any information that might help track this down, just lots of checksum errors. Looks like I'm stuck at

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-08-27 Thread Adam Leventhal
Hey Gary, There appears to be a bug in the RAID-Z code that can generate spurious checksum errors. I'm looking into it now and hope to have it fixed in build 123 or 124. Apologies for the inconvenience. Adam On Aug 25, 2009, at 5:29 AM, Gary Gendel wrote: I have a 5-500GB disk Raid-Z

[zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-08-25 Thread Gary Gendel
I have a 5-500GB disk Raid-Z pool that has been producing checksum errors right after upgrading SXCE to build 121. They seem to be randomly occurring on all 5 disks, so it doesn't look like a disk failure situation. Repeatedly running a scrub on the pools randomly repairs between 20 and a few

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-08-25 Thread Henrik Johansson
Hello, On 25 Aug 2009, at 14:29, Gary Gendel g...@genashor.com wrote: I have a 5-500GB disk Raid-Z pool that has been producing checksum errors right after upgrading SXCE to build 121. They seem to be randomly occurring on all 5 disks, so it doesn't look like a disk failure situation.

Re: [zfs-discuss] snv_110 - snv_121 produces checksum errors on Raid-Z pool

2009-08-25 Thread Neal Pollack
On 08/25/09 05:29 AM, Gary Gendel wrote: I have a 5-500GB disk Raid-Z pool that has been producing checksum errors right after upgrading SXCE to build 121. They seem to be randomly occurring on all 5 disks, so it doesn't look like a disk failure situation. Repeatedly running a scrub on the