Re: [zfs-discuss] zpool scrub bad block list

2011-11-08 Thread Paul Kraus
On Tue, Nov 8, 2011 at 9:14 AM, Didier Rebeix wrote:

> Very interesting... I didn't know disk firmware was responsible for
> automagically relocating bad blocks. Knowing this, it makes no sense for
> a filesystem to try to deal with this kind of error.

In the dark ages, hard drives came with "bad block" lists taped to
them so you could load them into the device driver for that drive. New
bad blocks would be mapped out by the device driver. All that
functionality was moved into the drive a long time ago (at least 10-15
years).

Under Solaris, you can see the size of the bad block lists through the
format utility: FORMAT -> DEFECT -> PRIMARY will give you the size of
the list from the factory, and FORMAT -> DEFECT -> GROWN will give you
those added since the drive left the factory. I tend to open a support
case to have a drive replaced if the GROWN list is much above 0 or is
growing.
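
As a rough illustration of that rule of thumb, here is a minimal Python
sketch. It assumes you have already recorded the GROWN list size by hand
at a few points in time; the function name, the tolerated threshold, and
its default of 5 are hypothetical choices for the example, not anything
the format utility provides.

```python
def should_replace(grown_counts, tolerated=5):
    """Decide whether a drive looks like a replacement candidate.

    grown_counts: GROWN defect list sizes noted over time, oldest first.
    tolerated:    hypothetical cut-off for "much above 0".
    """
    if not grown_counts:
        return False  # no observations yet, nothing to decide on
    latest = grown_counts[-1]
    # "Growing" here means the latest reading exceeds the previous one.
    growing = len(grown_counts) >= 2 and latest > grown_counts[-2]
    return latest > tolerated or growing
```

With these assumptions, a drive that has sat at 2 grown defects for
months is left alone, while one that moves from 2 to 3 is flagged.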

Keep in mind that any type of hardware RAID should report back 0
for both to the OS.

-- 
{1-2-3-4-5-6-7-}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company (
http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool scrub bad block list

2011-11-08 Thread Didier Rebeix
Very interesting... I didn't know disk firmware was responsible for
automagically relocating bad blocks. Knowing this, it makes no sense for
a filesystem to try to deal with this kind of error.

For now, any disk with read/write errors detected will be discarded
from my filers and replaced...

Thanks!

On Tue, 08 Nov 2011 13:03:57 +,
"Andrew Gabriel" wrote:

> ZFS detects far more errors than traditional filesystems, which will
> simply miss them. This means that many of the possible causes for those
> errors will be something other than a real bad block on the disk. As
> Edward said, the disk firmware should automatically remap real bad
> blocks, so if ZFS blacklisted the block too, we would stop using the
> remapped block, which is probably fine. For other errors, there's
> nothing wrong with the real block on the disk - it's going to be
> firmware, driver, cache corruption, or something else, so blacklisting
> the block will not solve the issue. Also, with some types of disk (SSD),
> block numbers are moved around to achieve wear leveling, so blacklisting
> a block number won't stop you reusing that real block.
> 


-- 
Didier REBEIX
Universite de Bourgogne
Direction des Systèmes d'Information
BP 27877
21078 Dijon Cedex
Tel: +33 380395205




Re: [zfs-discuss] zpool scrub bad block list

2011-11-08 Thread Andrew Gabriel
ZFS detects far more errors than traditional filesystems, which will
simply miss them. This means that many of the possible causes for those
errors will be something other than a real bad block on the disk. As
Edward said, the disk firmware should automatically remap real bad
blocks, so if ZFS blacklisted the block too, we would stop using the
remapped block, which is probably fine. For other errors, there's nothing
wrong with the real block on the disk - it's going to be firmware,
driver, cache corruption, or something else, so blacklisting the block
will not solve the issue. Also, with some types of disk (SSD), block
numbers are moved around to achieve wear leveling, so blacklisting a
block number won't stop you reusing that real block.
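
To make the wear-leveling point concrete, here is a toy Python model of
a flash translation layer. It is only a sketch under simplifying
assumptions: the ToyFTL class, its first-free allocation policy, and the
trim step are all invented for illustration, and real SSD firmware is
far more involved.

```python
from collections import deque


class ToyFTL:
    """Toy flash translation layer: logical block address -> physical block."""

    def __init__(self, physical_blocks):
        self.free = deque(range(physical_blocks))  # physical blocks ready for writes
        self.mapping = {}                          # LBA -> physical block

    def write(self, lba):
        """Out-of-place write: take a free physical block, recycle the old one."""
        new_phys = self.free.popleft()
        old_phys = self.mapping.get(lba)
        if old_phys is not None:
            self.free.append(old_phys)
        self.mapping[lba] = new_phys
        return new_phys

    def trim(self, lba):
        """Host discards the LBA; its physical block returns to the free pool."""
        old_phys = self.mapping.pop(lba, None)
        if old_phys is not None:
            self.free.append(old_phys)


def demo():
    ftl = ToyFTL(physical_blocks=4)
    suspect_phys = ftl.write(0)  # LBA 0 lands on some physical block
    ftl.trim(0)                  # host "blacklists" LBA 0 and never writes it again
    for lba in (1, 2, 3):        # ordinary writes to other LBAs
        ftl.write(lba)
    reused_phys = ftl.write(1)   # rewriting a healthy LBA needs a free block
    return suspect_phys, reused_phys
```

In this model demo() returns the same physical block number twice: the
block that once sat behind the blacklisted LBA 0 ends up holding the data
for LBA 1, even though the host never touched LBA 0 again.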


--
Andrew Gabriel (from mobile)

--- Original message ---
From: Edward Ned Harvey
To: didier.reb...@u-bourgogne.fr, zfs-discuss@opensolaris.org
Sent: 8.11.'11,  12:50

> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Didier Rebeix
>
> from ZFS documentation it appears unclear to me if a "zpool
> scrub" will black list any found bad blocks so they won't be used
> anymore.

If there are any physically bad blocks, such that the hardware (hard disk)
will return an error every time that block is used, then the disk should be
replaced.  All disks have a certain amount of error detection/correction
built in, and remap bad blocks internally and secretly behind the scenes,
transparent to the OS.  So if there are any blocks regularly reporting bad
to the OS, then it means there is a growing problem inside the disk.
Offline the disk and replace it.

It is ok to get an occasional cksum error.  Say, once a year.  Because the
occasional cksum error will be re-read and as long as the data is correct
the second time, no problem.



Re: [zfs-discuss] zpool scrub bad block list

2011-11-08 Thread Edward Ned Harvey
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Didier Rebeix
> 
>   from ZFS documentation it appears unclear to me if a "zpool
> scrub" will black list any found bad blocks so they won't be used
> anymore.

If there are any physically bad blocks, such that the hardware (hard disk)
will return an error every time that block is used, then the disk should be
replaced.  All disks have a certain amount of error detection/correction
built in, and remap bad blocks internally and secretly behind the scenes,
transparent to the OS.  So if there are any blocks regularly reporting bad
to the OS, then it means there is a growing problem inside the disk.
Offline the disk and replace it.

It is OK to get an occasional cksum error, say once a year. The affected
block will be re-read, and as long as the data is correct the second
time, there is no problem.
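
The re-read idea can be sketched in a few lines of Python. This is only
an illustration of "verify, then read again", not how ZFS is implemented
(ZFS can also repair a block from a redundant copy, which this sketch
does not model). The names read_verified and flaky_reader, the use of
SHA-256, and the two-attempt limit are assumptions made for the example.

```python
import hashlib


def read_verified(read_block, expected_sha256, max_attempts=2):
    """Read a block and re-read it if its checksum does not match.

    read_block:      callable returning the block contents as bytes.
    expected_sha256: hex digest stored separately from the block itself.
    Returns (data, attempt_number) on success, raises IOError otherwise.
    """
    for attempt in range(1, max_attempts + 1):
        data = read_block()
        if hashlib.sha256(data).hexdigest() == expected_sha256:
            return data, attempt
    raise IOError("checksum mismatch persisted after %d attempts" % max_attempts)


def flaky_reader(good, bad_reads):
    """Hypothetical device: returns corrupted data for the first bad_reads calls."""
    state = {"calls": 0}

    def read_block():
        state["calls"] += 1
        return b"garbage" if state["calls"] <= bad_reads else good

    return read_block
```

A reader that glitches once succeeds on the second attempt, which is the
harmless case described above; a reader that keeps returning bad data
exhausts its attempts and raises, which is the case where the disk
deserves a closer look.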



[zfs-discuss] zpool scrub bad block list

2011-11-08 Thread Didier Rebeix
Hi list,

From the ZFS documentation, it is unclear to me whether a "zpool
scrub" will blacklist any bad blocks it finds so they won't be used
anymore.

I know NetApp's WAFL scrub does reallocate bad blocks and mark them as
unusable. Does ZFS have this kind of strategy?

Thanks.

-- 
Didier