Re: [zfs-discuss] Finding corrupted files

2010-10-20 Thread Stephan Budach
On 19.10.2010 at 22:36, Tuomas Leikola tuomas.leik...@gmail.com wrote: On Mon, Oct 18, 2010 at 4:55 PM, Edward Ned Harvey sh...@nedharvey.com wrote: Thank you, but, the original question was whether a scrub would identify just corrupt blocks, or if it would be able to map corrupt blocks to

Re: [zfs-discuss] Finding corrupted files

2010-10-20 Thread Edward Ned Harvey
From: Stephan Budach [mailto:stephan.bud...@jvm.de] Just in case this wasn't already clear. After scrub sees read or checksum errors, zpool status -v will list filenames that are affected. At least in my experience. -- - Tuomas That didn't do it for me. I used scrub and

Re: [zfs-discuss] Finding corrupted files

2010-10-20 Thread Darren J Moffat
On 20/10/2010 12:20, Edward Ned Harvey wrote: It's one of the big selling points, reasons for ZFS to exist. You should always give ZFS JBOD devices to work on, so ZFS is able to scrub both of the redundant sides of the data, and when a checksum error occurs, ZFS is able to detect *and* correct

Re: [zfs-discuss] Finding corrupted files

2010-10-20 Thread Stephan Budach
From: Stephan Budach [mailto:stephan.bud...@jvm.de] Just in case this wasn't already clear. After scrub sees read or checksum errors, zpool status -v will list filenames that are affected. At least in my experience. -- - Tuomas That didn't do it for me. I used scrub and afterwards

Re: [zfs-discuss] Finding corrupted files

2010-10-20 Thread Edward Ned Harvey
From: Stephan Budach [mailto:stephan.bud...@jvm.de] Although, I have to say that I do have exactly 3 files that are corrupt in each snapshot until I finally deleted them and restored them from their original source. zfs send will abort when trying to send them, while scrub doesn't notice

Re: [zfs-discuss] Finding corrupted files

2010-10-20 Thread Edward Ned Harvey
-Original Message- From: Darren J Moffat [mailto:darr...@opensolaris.org] It's one of the big selling points, reasons for ZFS to exist. You should always give ZFS JBOD devices to work on, so ZFS is able to scrub both of the redundant sides of the data, and when a checksum error

Re: [zfs-discuss] Finding corrupted files

2010-10-20 Thread Stephan Budach
On 20.10.10 15:11, Edward Ned Harvey wrote: From: Stephan Budach [mailto:stephan.bud...@jvm.de] Although, I have to say that I do have exactly 3 files that are corrupt in each snapshot until I finally deleted them and restored them from their original source. zfs send will abort when trying

Re: [zfs-discuss] Finding corrupted files

2010-10-19 Thread Tuomas Leikola
On Mon, Oct 18, 2010 at 4:55 PM, Edward Ned Harvey sh...@nedharvey.com wrote: Thank you, but, the original question was whether a scrub would identify just corrupt blocks, or if it would be able to map corrupt blocks to a list of corrupt files. Just in case this wasn't already clear. After

Re: [zfs-discuss] Finding corrupted files

2010-10-18 Thread Edward Ned Harvey
From: Richard Elling [mailto:richard.ell...@gmail.com] On Oct 17, 2010, at 6:17 AM, Edward Ned Harvey wrote: From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Edward Ned Harvey If scrub is operating at a block-level (and I think it

Re: [zfs-discuss] Finding corrupted files

2010-10-17 Thread Orvar Korvar
budy, here are some links. Remember, the reason you get corrupted files, is because ZFS detects it. Probably, you got corruption earlier as well, but your hardware did not notice it. This is called Silent Corruption. But ZFS is designed to detect and correct Silent Corruption. Which no normal

Re: [zfs-discuss] Finding corrupted files

2010-10-17 Thread Kees Nuyt
On Sun, 17 Oct 2010 03:05:34 PDT, Orvar Korvar knatte_fnatte_tja...@yahoo.com wrote: here are some links. Wow, that's a great overview, thanks! -- ( Kees Nuyt ) c[_]

Re: [zfs-discuss] Finding corrupted files

2010-10-17 Thread Edward Ned Harvey
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Edward Ned Harvey If scrub is operating at a block-level (and I think it is), then how can checksum failures be mapped to file names? For example, this is a long-requested feature of zfs

Re: [zfs-discuss] Finding corrupted files

2010-10-17 Thread Richard Elling
On Oct 17, 2010, at 6:17 AM, Edward Ned Harvey wrote: From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Edward Ned Harvey If scrub is operating at a block-level (and I think it is), then how can checksum failures be mapped to file names?

Re: [zfs-discuss] Finding corrupted files

2010-10-16 Thread Richard Elling
On Oct 15, 2010, at 6:18 AM, Stephan Budach wrote: So, what would you suggest, if I wanted to create really big pools? Say in the 100 TB range? That would be quite a number of single drives then, especially when you want to go with zpool raid-1. For 100 TB, the methods change dramatically.

Re: [zfs-discuss] Finding corrupted files

2010-10-16 Thread Pasi Kärkkäinen
On Sat, Oct 16, 2010 at 08:38:28AM -0700, Richard Elling wrote: On Oct 15, 2010, at 6:18 AM, Stephan Budach wrote: So, what would you suggest, if I wanted to create really big pools? Say in the 100 TB range? That would be quite a number of single drives then, especially

Re: [zfs-discuss] Finding corrupted files

2010-10-16 Thread Richard Elling
On Oct 16, 2010, at 4:13 PM, Pasi Kärkkäinen wrote: On Sat, Oct 16, 2010 at 08:38:28AM -0700, Richard Elling wrote: On Oct 15, 2010, at 6:18 AM, Stephan Budach wrote: So, what would you suggest, if I wanted to create really big pools? Say in the 100 TB range? That would be quite a

Re: [zfs-discuss] Finding corrupted files

2010-10-15 Thread Stephan Budach
On 14.10.10 17:48, Edward Ned Harvey wrote: From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Toby Thain I don't want to heat up the discussion about ZFS managed discs vs. HW raids, but if RAID5/6 would be that bad, no one would use it

Re: [zfs-discuss] Finding corrupted files

2010-10-15 Thread Ross Walker
On Oct 15, 2010, at 9:18 AM, Stephan Budach stephan.bud...@jvm.de wrote: On 14.10.10 17:48, Edward Ned Harvey wrote: From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Toby Thain I don't want to heat up the discussion about ZFS managed

Re: [zfs-discuss] Finding corrupted files

2010-10-15 Thread Stephan Budach
On 12.10.10 14:21, Edward Ned Harvey wrote: From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Stephan Budach c3t211378AC0253d0 ONLINE 0 0 0 How many disks are there inside of c3t211378AC0253d0? How are they

Re: [zfs-discuss] Finding corrupted files

2010-10-15 Thread Edward Ned Harvey
From: Stephan Budach [mailto:stephan.bud...@jvm.de] Point taken! So, what would you suggest, if I wanted to create really big pools? Say in the 100 TB range? That would be quite a number of single drives then, especially when you want to go with zpool raid-1. You have a lot of disks.

Re: [zfs-discuss] Finding corrupted files

2010-10-14 Thread Stephan Budach
I'd like to see those docs as well. As all HW raids are driven by software, of course - and software can be buggy. I don't want to heat up the discussion about ZFS managed discs vs. HW raids, but if RAID5/6 would be that bad, no one would use it anymore. So… just post the link and I will take a

Re: [zfs-discuss] Finding corrupted files

2010-10-14 Thread Toby Thain
On 14-Oct-10, at 3:27 AM, Stephan Budach wrote: I'd like to see those docs as well. As all HW raids are driven by software, of course - and software can be buggy. It's not that the software 'can be buggy' - that's not the point here. The point being made is that conventional RAID just

Re: [zfs-discuss] Finding corrupted files

2010-10-14 Thread Edward Ned Harvey
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Toby Thain I don't want to heat up the discussion about ZFS managed discs vs. HW raids, but if RAID5/6 would be that bad, no one would use it anymore. It is. And there's no reason not

Re: [zfs-discuss] Finding corrupted files

2010-10-13 Thread Orvar Korvar
Budy, if you are using raid-5 or raid-6 underneath ZFS, then you should know that raid-5/6 might corrupt data. See here for lots of technical articles why raid-5 is bad: http://www.baarf.com/ raid-6 is not better. I can show you links about raid-6 being not safe. It is a good thing you run ZFS,

Re: [zfs-discuss] Finding corrupted files

2010-10-13 Thread Richard Elling
On Oct 13, 2010, at 12:59 PM, Orvar Korvar wrote: On the other hand, ZFS is safe. There are research papers showing that ZFS detects and corrects all errors. You want to see them? I would. URLs please? -- richard -- OpenStorage Summit, October 25-27, Palo Alto, CA

Re: [zfs-discuss] Finding corrupted files

2010-10-12 Thread Stephan Budach
You are implying that the issues resulted from the H/W raid(s) and I don't think that this is appropriate. I configured a striped pool using two raids - this is exactly the same as using two single hard drives without mirroring them. I simply cannot see what zfs would be able to do in case of

Re: [zfs-discuss] Finding corrupted files

2010-10-12 Thread Tuomas Leikola
On Tue, Oct 12, 2010 at 9:39 AM, Stephan Budach stephan.bud...@jvm.de wrote: You are implying that the issues resulted from the H/W raid(s) and I don't think that this is appropriate. Not exactly. Because the raid is managed in hardware, and not by zfs, is the reason why zfs cannot fix these

Re: [zfs-discuss] Finding corrupted files

2010-10-12 Thread Stephan Budach
If the case is, as speculated, that one mirror has bad data and one has good, scrub or any IO has a 50% chance of seeing the corruption. scrub does verify checksums. Yes, if the vdev would be a mirrored one, which it wasn't. There weren't any mirrors set up. Plus, if the checksums would have been

Re: [zfs-discuss] Finding corrupted files

2010-10-12 Thread Edward Ned Harvey
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Stephan Budach You are implying that the issues resulted from the H/W raid(s) and I don't think that this is appropriate. Please quote originals when you reply. If you don't - then it's

Re: [zfs-discuss] Finding corrupted files

2010-10-12 Thread Edward Ned Harvey
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Stephan Budach c3t211378AC0253d0 ONLINE 0 0 0 How many disks are there inside of c3t211378AC0253d0? How are they configured? Hardware raid 5? A mirror of

Re: [zfs-discuss] Finding corrupted files

2010-10-12 Thread Ross Walker
On Oct 12, 2010, at 8:21 AM, Edward Ned Harvey sh...@nedharvey.com wrote: From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Stephan Budach c3t211378AC0253d0 ONLINE 0 0 0 How many disks are there inside of

Re: [zfs-discuss] Finding corrupted files

2010-10-12 Thread Edward Ned Harvey
From: Stephan Budach [mailto:stephan.bud...@jvm.de] I now also got what you meant by good half but I don't dare to say whether or not this is also the case in a raid6 setup. The same concept applies to raid5 or raid6. When you read the device, you never know if you're actually reading the

Re: [zfs-discuss] Finding corrupted files

2010-10-11 Thread David Dyer-Bennet
On Fri, October 8, 2010 04:47, Stephan Budach wrote: So, I decided to give tar a whirl, after zfs send encountered the next corrupted file, resulting in an I/O error, even though scrub ran successfully w/o any errors. I must say that this concept of scrub running w/o error when corrupted

Re: [zfs-discuss] Finding corrupted files

2010-10-11 Thread Richard Elling
On Oct 6, 2010, at 1:26 PM, Stephan Budach wrote: Hi Cindy, thanks for bringing that to my attention. I checked fmdump and found a lot of these entries Okt 06 2010 17:52:12.862812483 ereport.io.scsi.cmd.disk.tran ... Okt 06 2010 17:52:12.862813713 ereport.io.scsi.cmd.disk.recovered

Re: [zfs-discuss] Finding corrupted files

2010-10-11 Thread Edward Ned Harvey
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of David Dyer-Bennet I must say that this concept of scrub running w/o error when corrupted files, detectable to zfs send, apparently exist, is very disturbing. As previously mentioned, the OP

Re: [zfs-discuss] Finding corrupted files

2010-10-11 Thread Stephan Budach
I think one has to accept that zfs send apparently is able to detect such errors while scrub is not. scrub operates only on the block level and makes sure that each block can be read and is in line with its checksum. However, zfs send seems to have detected some errors in the file system

Re: [zfs-discuss] Finding corrupted files

2010-10-08 Thread Stephan Budach
So, I decided to give tar a whirl, after zfs send encountered the next corrupted file, resulting in an I/O error, even though scrub ran successfully w/o any errors. I then issued a /usr/gnu/bin/tar -cf /dev/null /obelixData/…/.zfs/snapshot/actual snapshot/DTP which finished without any issue

Re: [zfs-discuss] Finding corrupted files

2010-10-08 Thread Stephan Budach
So - after 10 hrs and 21 mins. the incremental zfs send/recv finished without a problem. ;) Seems that using tar for checking all files is an appropriate action. Cheers, budy
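For anyone who wants the same "read everything and see what breaks" check without tar, here is a minimal Python sketch. It is only an illustration of the approach described above, not something posted in the thread: the function name `find_unreadable_files` and the chunk size are made up for this example, and it assumes that a corrupted file surfaces as an I/O error (an `OSError`) when it is read.

```python
import os


def find_unreadable_files(root, chunk_size=1024 * 1024):
    """Read every regular file under root and collect the ones that fail.

    Returns a list of (path, error message) pairs. This mimics what
    `tar -cf /dev/null <dir>` does: it forces a full read of each file.
    """
    failures = []

    def on_walk_error(err):
        # Directories that cannot be listed are reported too.
        failures.append((err.filename, str(err)))

    for dirpath, _dirnames, filenames in os.walk(root, onerror=on_walk_error):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files (FIFOs, devices) so the scan
            # neither follows links nor blocks on a pipe.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as handle:
                    # Read and discard the data in chunks.
                    while handle.read(chunk_size):
                        pass
            except OSError as err:
                failures.append((path, str(err)))
    return failures
```

Usage would look like `for path, msg in find_unreadable_files("/tank/.zfs/snapshot/example"): print(path, msg)`, where the snapshot path is a hypothetical placeholder. As with tar, this reads every byte, so expect it to take about as long as a full copy.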

Re: [zfs-discuss] Finding corrupted files

2010-10-07 Thread Stephan Budach
Hi Edward, well that was exactly my point, when I raised this question. If zfs send is able to identify corrupted files while it transfers a snapshot, why shouldn't scrub be able to do the same? ZFS send quit with an I/O error and zpool status -v showed me the file that indeed had problems.

Re: [zfs-discuss] Finding corrupted files

2010-10-07 Thread Ian Collins
On 10/ 7/10 06:22 PM, Stephan Budach wrote: Hi Edward, these are interesting points. I have considered a couple of them, when I started playing around with ZFS. I am not sure whether I disagree with all of your points, but I conducted a couple of tests, where I configured my raids as jbods

Re: [zfs-discuss] Finding corrupted files

2010-10-07 Thread Stephan Budach
Ian, I know - and I will address this, by upgrading the vdevs to mirrors, but there're a lot of other SPOFs around. So I started out by reducing the most common failures and I have found that to be the disc drives, not the chassis. The beauty is: one can work their way up until the point of

Re: [zfs-discuss] Finding corrupted files

2010-10-07 Thread Edward Ned Harvey
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Stephan Budach I conducted a couple of tests, where I configured my raids as jbods and mapped each drive out as a separate LUN and I couldn't notice a difference in performance in any way.

Re: [zfs-discuss] Finding corrupted files

2010-10-07 Thread Edward Ned Harvey
From: edmud...@mail.bounceswoosh.org [mailto:edmud...@mail.bounceswoosh.org] On Behalf Of Eric D. Mudama On Wed, Oct 6 at 22:04, Edward Ned Harvey wrote: * Because ZFS automatically buffers writes in ram in order to aggregate as previously mentioned, the hardware WB cache is not

Re: [zfs-discuss] Finding corrupted files

2010-10-07 Thread Cindy Swearingen
I would not discount the performance issue... Depending on your workload, you might find that performance increases with ZFS on your hardware RAID in JBOD mode. Cindy On 10/07/10 06:26, Edward Ned Harvey wrote: From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-

Re: [zfs-discuss] Finding corrupted files

2010-10-07 Thread Toby Thain
On 7-Oct-10, at 1:22 AM, Stephan Budach wrote: Hi Edward, these are interesting points. I have considered a couple of them, when I started playing around with ZFS. I am not sure whether I disagree with all of your points, but I conducted a couple of tests, where I configured my raids as

Re: [zfs-discuss] Finding corrupted files

2010-10-07 Thread Edward Ned Harvey
From: Cindy Swearingen [mailto:cindy.swearin...@oracle.com] I would not discount the performance issue... Depending on your workload, you might find that performance increases with ZFS on your hardware RAID in JBOD mode. Depends on the raid card you're comparing to. I've certainly seen

[zfs-discuss] Finding corrupted files

2010-10-06 Thread Stephan Budach
Hi, I recently discovered some - or at least one corrupted file on one of my ZFS datasets, which caused an I/O error when trying to send a ZFS snapshot to another host: zpool status -v obelixData pool: obelixData state: ONLINE status: One or more devices has experienced an error resulting

Re: [zfs-discuss] Finding corrupted files

2010-10-06 Thread Tomas Ögren
On 06 October, 2010 - Stephan Budach sent me these 2,1K bytes: Hi, I recently discovered some - or at least one corrupted file on one of my ZFS datasets, which caused an I/O error when trying to send a ZFS snapshot to another host: zpool status -v obelixData pool: obelixData

Re: [zfs-discuss] Finding corrupted files

2010-10-06 Thread Stephan Budach
No - not a trick question, but maybe I didn't make myself clear. Is there a way to discover such bad files other than trying to actually read from them one by one, say using cp or by sending a snapshot elsewhere? I am well aware that the file shown in zpool status -v is damaged and I have

Re: [zfs-discuss] Finding corrupted files

2010-10-06 Thread Scott Meilicke
Scrub? On Oct 6, 2010, at 6:48 AM, Stephan Budach wrote: No - not a trick question, but maybe I didn't make myself clear. Is there a way to discover such bad files other than trying to actually read from them one by one, say using cp or by sending a snapshot elsewhere? I am well aware

Re: [zfs-discuss] Finding corrupted files

2010-10-06 Thread Jim Dunham
Budy, No - not a trick question, but maybe I didn't make myself clear. Is there a way to discover such bad files other than trying to actually read from them one by one, say using cp or by sending a snapshot elsewhere? As noted by your original email, ZFS reports on any corruption using the

Re: [zfs-discuss] Finding corrupted files

2010-10-06 Thread Stephan Budach
Well, I think that answers my question then: after a successful scrub, zpool status -v should then list all damaged files on an entire zpool. I only asked, because I read a thread in this forum that one guy had a problem with different files, even after a successful scrub. Thanks, budy
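If you want that file list in a script rather than on screen, a small parser along these lines might do. It is a sketch under assumptions, not part of the original posts: the function names are invented for this example, and it assumes the status output introduces the list with the line "Permanent errors have been detected in the following files:" followed by one indented entry per line. The exact wording may differ between ZFS releases, and entries are not always plain paths (metadata objects and snapshot entries can be shown in other forms), so treat the result as raw strings.

```python
import subprocess

# Assumed marker line; check it against the output of your own ZFS release.
MARKER = "Permanent errors have been detected in the following files:"


def parse_damaged_files(status_text):
    """Extract the entries listed after the 'Permanent errors' marker."""
    files = []
    in_list = False
    seen_entry = False
    for line in status_text.splitlines():
        if MARKER in line:
            in_list = True
            seen_entry = False
            continue
        if not in_list:
            continue
        entry = line.strip()
        if entry:
            files.append(entry)
            seen_entry = True
        elif seen_entry:
            # A blank line after at least one entry ends this list.
            in_list = False
    return files


def damaged_files(pool):
    """Run `zpool status -v <pool>` and return the listed entries."""
    result = subprocess.run(
        ["zpool", "status", "-v", pool],
        capture_output=True, text=True, check=True,
    )
    return parse_damaged_files(result.stdout)
```

Calling `damaged_files("obelixData")` after a scrub has finished would then return whatever the pool currently reports. Keep in mind the point raised elsewhere in this thread: an empty list only means nothing has been flagged so far.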

Re: [zfs-discuss] Finding corrupted files

2010-10-06 Thread Cindy Swearingen
Budy, Your previous zpool status output shows a non-redundant pool with data corruption. You should use the fmdump -eV command to find out the underlying cause of this corruption. You can review the hardware-level monitoring tools, here:

Re: [zfs-discuss] Finding corrupted files

2010-10-06 Thread Ian Collins
On 10/ 6/10 09:52 PM, Stephan Budach wrote: Hi, I recently discovered some - or at least one corrupted file on one of my ZFS datasets, which caused an I/O error when trying to send a ZFS snapshot to another host: zpool status -v obelixData pool: obelixData state: ONLINE status: One or

Re: [zfs-discuss] Finding corrupted files

2010-10-06 Thread Stephan Budach
Hi Cindy, thanks for bringing that to my attention. I checked fmdump and found a lot of these entries: Okt 06 2010 17:52:12.862812483 ereport.io.scsi.cmd.disk.tran nvlist version: 0 class = ereport.io.scsi.cmd.disk.tran ena = 0x514dc67d57e1 detector = (embedded
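When fmdump returns "a lot of these entries", a quick tally per ereport class can show which kind of event dominates. The sketch below is an illustration only: the function name `count_ereport_classes` is made up, and it assumes each event has a header line whose last whitespace-separated field is the class name, as in the entries quoted above. Lines of the form `class = ...` from the verbose output are skipped so events are not counted twice.

```python
from collections import Counter


def count_ereport_classes(lines):
    """Count events per ereport class from fmdump -e / -eV style text lines."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 2 or "=" in fields:
            # Skip blank lines and verbose 'name = value' detail lines.
            continue
        if fields[-1].startswith("ereport."):
            counts[fields[-1]] += 1
    return counts
```

Feeding it the saved output, for example `count_ereport_classes(open("fmdump.txt"))` with `fmdump.txt` being a hypothetical capture file, gives a `Counter` whose `most_common()` lists the noisiest classes first.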

Re: [zfs-discuss] Finding corrupted files

2010-10-06 Thread Stephan Budach
Ian, yes, although these vdevs are FC raids themselves, so the risk is… uhm… calculated. Unfortunately, one of the devices seems to have some issues, as stated in my previous post. I will, nevertheless, add redundancy to my pool asap. Thanks, budy

Re: [zfs-discuss] Finding corrupted files

2010-10-06 Thread Edward Ned Harvey
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Stephan Budach Ian, yes, although these vdevs are FC raids themselves, so the risk is… uhm… calculated. Whenever possible, you should always JBOD the storage and let ZFS manage the raid,

Re: [zfs-discuss] Finding corrupted files

2010-10-06 Thread Edward Ned Harvey
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Stephan Budach Now, scrub would reveal corrupted blocks on the devices, but is there a way to identify damaged files as well? I saw a lot of people offering the same knee-jerk reaction that

Re: [zfs-discuss] Finding corrupted files

2010-10-06 Thread Stephan Budach
Hi Edward, these are interesting points. I have considered a couple of them, when I started playing around with ZFS. I am not sure whether I disagree with all of your points, but I conducted a couple of tests, where I configured my raids as jbods and mapped each drive out as a separate LUN

Re: [zfs-discuss] Finding corrupted files

2010-10-06 Thread Eric D. Mudama
On Wed, Oct 6 at 22:04, Edward Ned Harvey wrote: * Because ZFS automatically buffers writes in ram in order to aggregate as previously mentioned, the hardware WB cache is not beneficial. There is one exception. If you are doing sync writes to spindle disks, and you don't have a dedicated log