[zfs-discuss] ZFS fsck?

2010-07-06 Thread Roy Sigurd Karlsbakk
Hi all

With several messages in here about troublesome zpools, would there be a good 
reason to be able to fsck a pool? As in, check the whole thing instead of 
having to boot into live CDs and whatnot?
 
Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. It is 
an elementary imperative for all pedagogues to avoid excessive use of idioms of 
foreign origin. In most cases adequate and relevant synonyms exist in Norwegian.


Re: [zfs-discuss] ZFS fsck?

2010-07-06 Thread iMx
- Original Message -
 From: Roy Sigurd Karlsbakk r...@karlsbakk.net
 To: OpenSolaris ZFS discuss zfs-discuss@opensolaris.org
 Sent: Tuesday, 6 July, 2010 6:35:51 PM
 Subject: [zfs-discuss] ZFS fsck?
 Hi all
 
 With several messages in here about troublesome zpools, would there be
 a good reason to be able to fsck a pool? As in, check the whole thing
 instead of having to boot into live CDs and whatnot?
 
 Vennlige hilsener / Best regards
 
 roy

Scrub? :)


Re: [zfs-discuss] ZFS fsck?

2010-07-06 Thread Mark J Musante

On Tue, 6 Jul 2010, Roy Sigurd Karlsbakk wrote:


Hi all

With several messages in here about troublesome zpools, would there be a 
good reason to be able to fsck a pool? As in, check the whole thing 
instead of having to boot into live CDs and whatnot?


You can do this with zpool scrub.  It visits every allocated block and 
verifies that everything is correct.  It's not the same as fsck in that 
scrub can detect and repair problems with the pool still online and all 
datasets mounted, whereas fsck cannot handle mounted filesystems.


If you really want to use it on an exported pool, you can use zdb, 
although it might take some time.  Here's an example on a small empty 
pool:


# zpool create -f mypool raidz c4t1d0s0 c4t2d0s0 c4t3d0s0 c4t4d0s0 c4t5d0s0
# zpool list mypool
NAME     SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
mypool   484M   280K   484M     0%  1.00x  ONLINE  -
# zpool export mypool
# zdb -ebcc mypool

Traversing all blocks to verify checksums and verify nothing leaked ...

No leaks (block sum matches space maps exactly)

bp count:              48
bp logical:        378368      avg:   7882
bp physical:        39424      avg:    821     compression:   9.60
bp allocated:      185344      avg:   3861     compression:   2.04
bp deduped:             0    ref>1:      0   deduplication:   1.00
SPA allocated:     185344     used:  0.04%

#


Re: [zfs-discuss] ZFS fsck?

2010-07-06 Thread Roy Sigurd Karlsbakk
 You can do this with zpool scrub. It visits every allocated block and
 verifies that everything is correct. It's not the same as fsck in that
 scrub can detect and repair problems with the pool still online and all
 datasets mounted, whereas fsck cannot handle mounted filesystems.
 
 If you really want to use it on an exported pool, you can use zdb,
 although it might take some time. Here's an example on a small empty
 pool:
 
 # zpool create -f mypool raidz c4t1d0s0 c4t2d0s0 c4t3d0s0 c4t4d0s0
 c4t5d0s0
 # zpool list mypool
 NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT
 mypool 484M 280K 484M 0% 1.00x ONLINE -
 # zpool export mypool
 # zdb -ebcc mypool
...

What I'm saying is that there are several posts in here where the only 
solution is to boot onto a live CD and then do an import, due to metadata 
corruption. This should be doable from the installed system.

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. It is 
an elementary imperative for all pedagogues to avoid excessive use of idioms of 
foreign origin. In most cases adequate and relevant synonyms exist in Norwegian.


Re: [zfs-discuss] ZFS fsck?

2010-07-06 Thread Mark J Musante

On Tue, 6 Jul 2010, Roy Sigurd Karlsbakk wrote:

what I'm saying is that there are several posts in here where the only 
solution is to boot onto a live cd and then do an import, due to 
metadata corruption. This should be doable from the installed system


Ah, I understand now.

A couple of things worth noting:

- if the root filesystem in a boot pool cannot be mounted, it's 
problematic to access the tools necessary to repair it.  So going to a 
live CD (or a network boot, for that matter) is the best way forward.

- if the tools available in the failsafe boot environment are insufficient 
to repair a pool, then booting off a live CD or the network is the only 
way forward.


It is also worth pointing out here that the 134a build has the pool 
recovery code built-in.  The -F option to zpool import only became 
available after build 128 or 129.
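
For reference, on a build that does have the recovery code, the rewind-style 
recovery is driven through zpool import itself. A minimal sketch, with a 
hypothetical pool name:

# zpool import -nF tank
# zpool import -F tank

The first form is a dry run: it reports whether the pool could be made 
importable again without actually rewinding anything; the second performs the 
recovery import and may discard the last few transactions.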



Re: [zfs-discuss] ZFS + fsck

2009-11-10 Thread Joerg Moellenkamp
Hi, 

 *everybody* is interested in the flag days page. Including me.
 Asking me to raise the priority is not helpful.
 
 From my perspective, it's a surprise that 'everybody' is interested, as I'm
 not seeing a lot of people complaining that the flag day page is not updating.
 Only a couple of people on this list, and one of those is me!
 Perhaps I'm looking in the wrong places.

I used this page frequently, too. But now I'm just using the Twitter account 
fed by onnv-notify. You can look at it at http://twitter.com/codenews

Regards
 Joerg



Re: [zfs-discuss] ZFS + fsck

2009-11-09 Thread Nigel Smith
On Thu Nov 5 14:38:13 PST 2009, Gary Mills wrote:
 It would be nice to see this information at:
 http://hub.opensolaris.org/bin/view/Community+Group+on/126-130
 but it hasn't changed since 23 October.

Well it seems we have an answer:

http://mail.opensolaris.org/pipermail/zfs-discuss/2009-November/033672.html

On Mon Nov 9 14:26:54 PST 2009, James C. McPherson wrote:
 The flag days page has not been updated since the switch
 to XWiki, it's on my todo list but I don't have an ETA
 for when it'll be done.

Perhaps anyone interested in seeing the flag days page
resurrected can petition James to raise the priority on
his todo list.
Thanks
Nigel Smith


Re: [zfs-discuss] ZFS + fsck

2009-11-09 Thread James C. McPherson

Nigel Smith wrote:

On Thu Nov 5 14:38:13 PST 2009, Gary Mills wrote:

It would be nice to see this information at:
http://hub.opensolaris.org/bin/view/Community+Group+on/126-130
but it hasn't changed since 23 October.


Well it seems we have an answer:

http://mail.opensolaris.org/pipermail/zfs-discuss/2009-November/033672.html

On Mon Nov 9 14:26:54 PST 2009, James C. McPherson wrote:

The flag days page has not been updated since the switch
to XWiki, it's on my todo list but I don't have an ETA
for when it'll be done.


Perhaps anyone interested in seeing the flags days page
resurrected can petition James to raise the priority on
his todo list.


Nigel,
*everybody* is interested in the flag days page. Including me.

Asking me to raise the priority is not helpful.


James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp   http://www.jmcp.homeunix.com/blog


Re: [zfs-discuss] ZFS + fsck

2009-11-08 Thread Robert Milkowski


fyi

Robert Milkowski wrote:

XXX wrote:

| Have you actually tried to roll-back to previous uberblocks when you
| hit the issue?  I'm asking as I haven't yet heard about any case
| of the issue witch was not solved by rolling back to a previous
| uberblock. The problem though was that the way to do it was hackish.

 Until recently I didn't even know that this was possible or a likely
solution to 'pool panics system on import' and similar pool destruction,
and I don't have any tools to do it. (Since we run Solaris 10, we won't
have official support for it for quite some time.)
  
I wouldn't be that surprised if this particular feature actually got 
backported to S10 soon. At the least you may raise a CR asking for it - 
maybe you will get access to an IDR first (I'm not saying there is or 
isn't already one).



 If there are (public) tools for doing this, I will give them a try
the next time I get a test pool into this situation.
  


IIRC someone sent one to the zfs-discuss list some time ago.
Usually you will also need to poke at the pool with zdb.
A sketchy and unsupported procedure was discussed on the list as well;
look at the archives.
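
(For anyone curious what the poking with zdb looks like: a common first step 
is to dump the vdev labels on the pool's devices to see what txg they were 
last written at - the device path below is only an example:

# zdb -l /dev/dsk/c0t0d0s0

The actual uberblock rollback steps are the sketchy, archive-only part 
referred to above.)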


| The bugs which prevented importing a pool in some circumstances were
| really annoying but lets face it - it was bound to happen and they
| are just bugs which are getting fixed. ZFS is still young after all.
| And when you google for data loss on other filesystems I'm sure you
| will find lots of user testimonies - be it ufs, ext3, raiserfs or your
| favourite one.

 The difference between ZFS and those other filesystems is that with
a few exceptions (XFS, ReiserFS), which sysadmins in the field didn't
like either, those filesystems didn't generally lose *all* your data
when something went wrong. Their official repair tools could usually
put things back together to at least some extent.
  
Generally they didn't, although I've seen situations where entire ext2 
and ufs filesystems were lost and fsck was not able to get them even 
mountable (kernel panics right after mounting them). On another occasion 
fsck was crashing the box, and in yet another fsck claimed everything 
was OK but the system then crashed while doing a backup (fsck can't 
really properly fix filesystem state - it is more a matter of guessing, 
and sometimes it goes terribly wrong).


But I agree that generally with other filesystems you can recover most 
or all data just fine.
And generally that is the case with ZFS too - there were probably more 
bugs in ZFS, as it is a much younger filesystem, but most of them were 
fixed very quickly. As for the uberblock one - I 100% agree that when 
you hit the issue and didn't know about the manual recovery method it 
was very bad - but it has finally been fixed.



(Just as importantly, when they couldn't put things back together you
could honestly tell management and the users 'we ran the recovery tools
and this is all they could get back'. At the moment, we would have
to tell users and management 'well, there are no (official) recovery
tools...', unless Sun Support came through for once.)
  


But these tools are built into ZFS and run automatically, with virtually 
100% confidence that if something can be fixed it is fixed correctly, 
and that if something is wrong it will be detected - thanks to 
end-to-end checksumming of data and metadata. The problem *was* that the 
one scenario where rolling back to a previous uberblock is required was 
not covered by built-in support and needed a complicated and 
undocumented procedure. It wasn't high priority for Sun, as it was very 
rare and didn't affect enterprise customers much; and although the 
procedure is complicated, it does exist and was used successfully on 
many occasions, even for non-paying customers, thanks to people like 
Victor on the zfs mailing list who helped users in such situations.


But you didn't know about it, and it seems Sun's support service was of 
no use to you - which is really a shame.
In your case I would probably point that out to them and at least get 
a good deal as compensation or something...


But what is most important is that a fully supported, built-in and 
easy-to-use procedure is finally available to recover from such 
situations. As time progresses and more bugs are fixed, ZFS will behave 
much better in many corner cases, as it already does in OpenSolaris - 
the last 6 months or so have been very productive in fixing many bugs 
like that.


| However the whole point of the discussion is that zfs really doesn't
| need a fsck tool.
| All the problems encountered so far were bugs and most of them are
| already fixed. One missing feature was a built-in support for
| rolling-back uberblock which just has been integrated. But I'm sure
| there are more bugs to be found..


 I disagree strongly. Fsck tools have multiple purposes; ZFS obsoletes
some of them but not all. One thing fsck is there for is to recover as
much as possible after things happen that are supposed to be impossible,
like 

Re: [zfs-discuss] ZFS + fsck

2009-11-08 Thread Jason King
On Sun, Nov 8, 2009 at 7:55 AM, Robert Milkowski mi...@task.gda.pl wrote:

 fyi

 Robert Milkowski wrote:

 XXX wrote:

 | Have you actually tried to roll-back to previous uberblocks when you
 | hit the issue?  I'm asking as I haven't yet heard about any case
 | of the issue witch was not solved by rolling back to a previous
 | uberblock. The problem though was that the way to do it was hackish.

  Until recently I didn't even know that this was possible or a likely
 solution to 'pool panics system on import' and similar pool destruction,
 and I don't have any tools to do it. (Since we run Solaris 10, we won't
 have official support for it for quite some time.)


 I wouldn't be that surprised if this particular feature would actually be
 backported to S10 soon. At least you may raise a CR asking for it - maybe
 you will get an access to IDR first (I'm not saying there is or isn't
 already one).

  If there are (public) tools for doing this, I will give them a try
 the next time I get a test pool into this situation.


 IIRC someone send one to the zfs-discuss list some time ago.
 Then usually you will also need to poke with zdb.
 A sketchy and unsupported procedure was discussed on the list as well.
 Look at the archives.

 | The bugs which prevented importing a pool in some circumstances were
 | really annoying but lets face it - it was bound to happen and they
 | are just bugs which are getting fixed. ZFS is still young after all.
 | And when you google for data loss on other filesystems I'm sure you
 | will find lots of user testimonies - be it ufs, ext3, raiserfs or your
 | favourite one.

  The difference between ZFS and those other filesystems is that with
 a few exceptions (XFS, ReiserFS), which sysadmins in the field didn't
 like either, those filesystems didn't generally lose *all* your data
 when something went wrong. Their official repair tools could usually
 put things back together to at least some extent.


 Generally they didn't although I've seen situation when entire ext2 and
 ufs were lost and fsck was not able to get them even mounted (kernel panics
 right after mounting them). In other occassion fsck was crashing the box in
 yet another one fsck claimed everything was ok but then when doing backup
 system was crashing (fsck can't really properly fix filesystem state - it is
 more of guessing and sometimes it goes terribly wrong).

 But I agrre that generally with other file systems you can recover most or
 all data just fine.
 And generally it is the case with zfs - there were probably more bugs in
 ZFS as it is much younger filesystem, but most of them were very quickly
 fixed. And the uberblock one - I 100% agree then when you hit the issue and
 didn't know about manual method to recover it was very bad - but it has
 finally been fixed.

 (Just as importantly, when they couldn't put things back together you
 could honestly tell management and the users 'we ran the recovery tools
 and this is all they could get back'. At the moment, we would have
 to tell users and management 'well, there are no (official) recovery
 tools...', unless Sun Support came through for once.)


 But these tools are built-in into zfs and are happening automatically and
 with virtually 100% confidence that if something can be fixed it is fixed
 correctly and if something is wrong it will be detected - thanks to
 end-to-end checksumming of data and meta-data. The problem *was* that one
 case scenario when rolling back to previous uberblock is required was not
 implemented and required a complicated and undocumented procedure to follow.
 It wasn't high priority for Sun as it was very rare , wasn't affecting much
 enterprise customers and although complicated the procedure is there is one
 and was successfully used on many occasions even for non paying customers
 thanks to guys like Victor on the zfs mailing list who helped some people in
 such a situations.

 But you didn't know about it and it seems like Sun's support service was
 no use for you - which is really a shame.
 In your case I would probably point that out to them and at least get some
 good deal as a compensation or something...

 But what is most important is that finally fully supported, built in and
 easy to use procedure is available to recover from such situations. As time
 will progress and more bugs will be fixed ZFS will behave much better under
 many corner cases as it does already in Open Solaris - last 6 months or so
 were really very productive in fixing many bugs like that.

 | However the whole point of the discussion is that zfs really doesn't
 | need a fsck tool.
 | All the problems encountered so far were bugs and most of them are
 | already fixed. One missing feature was a built-in support for
 | rolling-back uberblock which just has been integrated. But I'm sure
 | there are more bugs to be found..

  I disagree strongly. Fsck tools have multiple purposes; ZFS obsoletes
 some of them but not all. One thing fsck is there for is to 

Re: [zfs-discuss] ZFS + fsck

2009-11-05 Thread Tim Haley

Orvar Korvar wrote:

Does this putback mean that I have to upgrade my zpool, or is it a zfs tool? If 
I missed upgrading my zpool I am smoked?


The putback did not bump zpool or zfs versions.  You shouldn't have to upgrade 
your pool.
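
(If you want to double-check, running the upgrade commands with no arguments 
only reports on-disk versions; it does not change anything:

# zpool upgrade
# zfs upgrade
)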


-tim



Re: [zfs-discuss] ZFS + fsck

2009-11-05 Thread Miles Nordin
 csb == Craig S Bell cb...@standard.com writes:

   csb Two: If you lost data with another filesystem, you may have
   csb overlooked it and blamed the OS or the application,

yeah, but with ZFS you often lose the whole pool in certain classes of
repeatable real-world failures, like hotswap disks with flakey power
or SAN's without NVRAM where the target reboots and the initiator does
not.  Losing the whole pool is relevantly different to corrupting the
insides of a few files.  Yes, I know, the red-eyed screaming ZFS rats
will come out of the walls screaming ``that 1 bit could have been
critical Banking Data on which millions of lives depend and nuclear
reactors and spaceships too!  Wouldn't you rather KNOW, even if ZFS
decides to inform with zpool_self-destruct_condescending-error()?''
Maybe, sometimes, yes, but USUALLY, **NO**!

I've no objection to deciding how much recovery tools are needed based
on experience rather than wide-eyed kool-aid ranting or presumptions
from earlier filesystems, but so far experience says the recovery work
was really needed, so I can't agree with the bloggers rehashing each
other's zealotry.

It would be nice to isolate and fix the underlying problems, though.
That is the spirit in all these ``we don't need no fsck because we are
perfect'' blogs with which I do agree.  Their overoptimism isn't as
honest as I'd like about the way ZFS's error messages do not do enough to
lead us toward the real cause in the case of SAN problems because they
are all designed presuming spatially-clustered, temporally-spread,
disk-based failures rather than temporally-clustered interconnect
failures, so rather the error detection becomes no more than
``printf(simon sez u will not blame me, blame someone else.  these
aren't the droids you're looking for.  move along.);'' but, yeah,
the blogger's point of banging on the whole stack until it works
rather than concealing errors, is a good one.  Unfortunately I don't
think that's what will actually happen with these dropped-write SAN
failures.  I think people will just use the new recovery bits, which
conceal errors just like earlier filesystems and fsck tools, and
shrug.




Re: [zfs-discuss] ZFS + fsck

2009-11-05 Thread Miles Nordin
 rm == Robert Milkowski mi...@task.gda.pl writes:

rm Personally I don't blame Sun that implementing the CR took so
rm long as it mostly affected home users with cheap hardware from
rm BestBuy like sources 

no, many of the reports were FC SAN's.

rm and even then it was relatively rare.

no, they said they were losing way more zpools than they ever lost
vxfs's in the same environment.

rm called enterprise customers were affected even less and then
rm either they had enough expertise or called Sun's support
rm organization to get a pool manually reverted to its previous
rm uberblock.

which is probably why the tool exists.  but, great!




Re: [zfs-discuss] ZFS + fsck

2009-11-05 Thread Robert Milkowski

Miles Nordin wrote:

csb == Craig S Bell cb...@standard.com writes:
   csb Two: If you lost data with another filesystem, you may have
   csb overlooked it and blamed the OS or the application,

yeah, but with ZFS you often lose the whole pool in certain classes of
repeatable real-world failures, like hotswap disks with flakey power
or SAN's without NVRAM where the target reboots and the initiator does
not.  Losing the whole pool is relevantly different to corrupting the
insides of a few files. 
I think that most people, including the ZFS developers, agree with you 
that losing access to an entire pool is not acceptable. And this has 
been fixed in snv_126, so now in those rare circumstances you should be 
able to import the pool. And generally you will end up in a much better 
situation than with legacy filesystems + fsck.



--
Robert Milkowski
http://milek.blogspot.com



Re: [zfs-discuss] ZFS + fsck

2009-11-05 Thread Robert Milkowski

Miles Nordin wrote:

rm == Robert Milkowski mi...@task.gda.pl writes:



rm Personally I don't blame Sun that implementing the CR took so
rm long as it mostly affected home users with cheap hardware from
rm BestBuy like sources 


no, many of the reports were FC SAN's.

rm and even then it was relatively rare.

no, they said they were losing way more zpools than they ever lost
vxfs's in the same environment.

  
Well, who are 'they'? I've been deploying ZFS for years on many 
different platforms - from low-end and JBODs through midrange and SAN to 
high-end disk arrays - and I have yet to lose a pool (hopefully never).
That doesn't mean some other people did not have problems or did not 
lose their pools - in most if not all such cases, almost all data could 
probably have been recovered by following the manual and hackish 
procedure to roll back to a previous uberblock. Now it is integrated 
into ZFS and no special knowledge is required to do so in such 
circumstances.


Then there might have been other bugs... that's life; no software is without them.


rm called enterprise customers were affected even less and then
rm either they had enough expertise or called Sun's support
rm organization to get a pool manually reverted to its previous
rm uberblock.

which is probably why the tool exists.  but, great!
  
The point is that you don't need the tool now, as it is built into zfs 
starting with snv_126.




Re: [zfs-discuss] ZFS + fsck

2009-11-05 Thread Nigel Smith
Hi Robert
I think you mean snv_128 not 126 :-)

  6667683  need a way to rollback to an uberblock from a previous txg 
  http://bugs.opensolaris.org/view_bug.do?bug_id=6667683

  http://hg.genunix.org/onnv-gate.hg/rev/8aac17999e4d

Regards
Nigel Smith


Re: [zfs-discuss] ZFS + fsck

2009-11-05 Thread Tim Haley

Robert Milkowski wrote:

Miles Nordin wrote:

csb == Craig S Bell cb...@standard.com writes:
   csb Two: If you lost data with another filesystem, you may have
   csb overlooked it and blamed the OS or the application,

yeah, but with ZFS you often lose the whole pool in certain classes of
repeatable real-world failures, like hotswap disks with flakey power
or SAN's without NVRAM where the target reboots and the initiator does
not.  Losing the whole pool is relevantly different to corrupting the
insides of a few files. 
I think that most people including ZFS developers agree with you that 
losing an access to entire pool is not acceptable. And this has been 
fixed in snv_126 so now in those rare circumstances you should be able 
to import a pool. And generally you will end-up in a much better 
situation than with legacy filesystems + fsck.



Just a slight correction.  The current build in-process is 128 and that's the 
build into which the changes were pushed.


-tim



Re: [zfs-discuss] ZFS + fsck

2009-11-05 Thread Nigel Smith
Hi Gary
I will let 'website-discuss' know about this problem.
They normally fix issues like that.
Those pages always seemed to just update automatically.
I guess it's related to the website transition.
Thanks
Nigel Smith


Re: [zfs-discuss] ZFS + fsck

2009-11-05 Thread Dave Koelmeyer
Thanks for taking the time to write this - very useful info :)


Re: [zfs-discuss] ZFS + fsck

2009-11-04 Thread Kevin Walker
Hi all,

Just subscribed to the list after a debate on our helpdesk led me to the 
posting about ZFS corruption and the need for a fsck repair tool of some 
kind...

Has there been any update on this?



Kind regards,
 
Kevin Walker
Coreix Limited
 
DDI: (+44) 0207 183 1725 ext 90
Mobile: (+44) 07960 967818
Fax: (+44) 0208 53 44 111

*
This message is intended solely for the use of the individual or organisation 
to whom it is addressed. It may contain privileged or confidential information. 
If you are not the intended recipient, you should not use, copy, alter, or 
disclose the contents of this message
*


Re: [zfs-discuss] ZFS + fsck

2009-11-04 Thread Rob Warner
ZFS scrub will detect many types of error in your data or the filesystem 
metadata.

If you have sufficient redundancy in your pool and the errors were not due to 
dropped or misordered writes, then they can often be automatically corrected 
during the scrub.

If ZFS detects an error from which it cannot automatically recover, it will 
often instantly lock your entire pool to prevent any read or write access, 
informing you only that you must destroy it and restore from backups to get 
your data back.

Your only recourse in such situations is to do exactly that, or enlist the help 
of Victor Latushkin to attempt to recover your pool using painstaking manual 
manipulation.

Recent putbacks seem to indicate that future releases will provide a mechanism 
to allow mere mortals to recover from some of the errors caused by dropped 
writes.

cheers,

Rob


Re: [zfs-discuss] ZFS + fsck

2009-11-04 Thread Orvar Korvar
Such functionality is in the ZFS code now. It will be available to us later:
http://c0t0d0s0.org/archives/6067-PSARC-2009479-zpool-recovery-support.html


Re: [zfs-discuss] ZFS + fsck

2009-11-04 Thread Craig S. Bell
Joerg just posted a lengthy answer to the fsck question:

http://www.c0t0d0s0.org/archives/6071-No,-ZFS-really-doesnt-need-a-fsck.html

Good stuff.  I see two answers to "nobody complained about lying hardware 
before ZFS".

One:  The user has never tried another filesystem that tests for end-to-end 
data integrity, so ZFS notices more problems, and sooner.

Two: If you lost data with another filesystem, you may have overlooked it and 
blamed the OS or the application, instead of the inexpensive hardware.


Re: [zfs-discuss] ZFS + fsck

2009-11-04 Thread Robert Milkowski

Kevin Walker wrote:

Hi all,

Just subscribed to the list after a debate on our helpdesk lead me to the 
posting about  ZFS corruption and the need for a fsck repair tool of some 
kind...

Has there been any update on this?

  


I guess the discussion started after someone read an article on OSNEWS.

The way ZFS works, you basically get an fsck equivalent while using the 
pool. ZFS verifies checksums for all metadata and user data as it reads 
them. All metadata also use ditto blocks to provide two or three copies 
(totally independent of any pool redundancy), depending on the type of 
metadata. If a block is corrupted, a second (or third) copy is used, so 
correct data is returned and the corrupted block is automatically 
repaired. The ability to repair a block containing user data depends on 
whether the pool is configured with redundancy. But even if the pool is 
non-redundant (let's say a single disk drive), ZFS will still be able to 
detect corruption and tell you which files are affected, while metadata 
will be correct in most cases (unless the corruption is so large and so 
widespread that it affects all copies of a block in the pool). You will 
still be able to read all other files and the other parts of the 
affected file.
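
(Side note: the same idea can be extended to user data, even on a 
non-redundant pool, via the copies property, which stores two or three 
copies of each block - the dataset name below is just an example, and it 
only applies to data written after the property is set:

# zfs set copies=2 tank/home
)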


So an fsck effectively happens while you are accessing your data, and it 
is even better than fsck on most other filesystems: thanks to the 
checksumming of all data and metadata, ZFS knows exactly when something 
is wrong and in most cases can even fix it on the fly. If you want to 
scan the entire pool, including all redundant copies, and have anything 
that doesn't checksum repaired, you can schedule a pool scrub (while 
your applications are still using the pool!). This forces ZFS to read 
all blocks from all copies, check their checksums, correct the data 
where needed and possible, and report the results to the user. Legacy 
fsck is not even close to this.
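
A minimal illustration of that workflow, with a hypothetical pool name:

# zpool scrub tank
# zpool status -v tank

zpool status -v shows scrub progress and results, per-device read, write 
and checksum error counters, and - if anything could not be repaired - a 
list of the affected files.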



I think the perceived need for an fsck for ZFS comes partly from a lack 
of understanding of how ZFS works, and partly from some frustrated users 
who, under very unlikely and rare circumstances, were left unable to 
import a pool due to data corruption - and therefore unable to access 
any data at all - even though the corruption might have affected only a 
relatively small amount of data. Most other filesystems will let you 
access most of the data after fsck in such a situation (probably with 
some data loss), while ZFS left the user with no access to the data at 
all. In such a case the problem lies with the ZFS uberblock, and the 
remedy is to revert the pool to its previous uberblock version (or an 
even earlier one). In almost all cases this will make the pool 
importable again, and then the mechanisms described in the first 
paragraph above kick in. The problem is (was) that the procedure to 
revert a pool to one of its previous uberblocks was neither documented 
nor automatic, and was definitely far from sysadmin-friendly. But thanks 
to some community members (most notably Victor, I think), some users 
affected by the issue were given a hand and were able to recover most or 
all of their data. Others were probably assisted by Sun's support 
service, I guess.


Fortunately, a much more user-friendly mechanism has finally been 
implemented and integrated into OpenSolaris build 126, which allows a 
user to import a pool and force it back to one of the previous versions 
of its uberblock if necessary. See 
http://c0t0d0s0.org/archives/6067-PSARC-2009479-zpool-recovery-support.html 
for more details.


There is another CR (I don't have its number at hand) about implementing 
delayed re-use of just-freed blocks, which should allow more data to be 
recovered in a case like the one above. I'm not sure whether it has been 
implemented yet, though.


IMHO, with the above CR implemented, in most cases ZFS currently provides 
a *much* better answer to random data corruption than any other 
filesystem+fsck on the market.


Personally I don't blame Sun that implementing the CR took so long, as it 
mostly affected home users with cheap hardware from BestBuy-like sources, 
and even then it was relatively rare. So-called enterprise customers were 
affected even less, and they either had enough expertise or called Sun's 
support organization to get a pool manually reverted to its previous 
uberblock. So from Sun's perspective the issue was far from top priority, 
and resources are limited as usual. Still, IIRC it was some vocal users 
here complaining about the issue who convinced the ZFS developers to get 
it expedited... :)


PS: sorry for a chaotic email, but lack of time is my friend as usual :)

--
Robert Milkowski
http://milek.blogspot.com



Re: [zfs-discuss] ZFS + fsck

2009-11-04 Thread Tim Haley

Robert Milkowski wrote:

Kevin Walker wrote:

Hi all,

Just subscribed to the list after a debate on our helpdesk lead me to 
the posting about  ZFS corruption and the need for a fsck repair tool 
of some kind...


Has there been any update on this?

  


I guess the discussion started after someone read an article on OSNEWS.

The way zfs works is that basically you get an fsck equivalent while 
using a pool.
ZFS checks checksums for all metadata and user data while reading it. 
Then all metadata are using ditto blocks to provide 2 or three copies of 
it (totally independent from any pool redundancy) depends on type of 
metadata. If it is corrupted a second (or third) copy will be used so 
correct data is returned and a corrupted block is automatically 
repaired. The ability to repair a block containing a user data depends 
on if you have a pool configured with or without redundancy. But even if 
pool is non-redundant (lets say a single disk drive) zfs still will be 
able to detect corruption and will be able to tell you what files are 
affected while metadata will be correct in most cases (unless corruption 
is so large and not localized so it affected all copies of a block in a 
pool). You will be able to read all other files and other parts of the 
file.


So fsck actually happens while you are accessing your data and it is 
even better than fsck on most other filesystems as thanks to 
checksumming of all data and metadata zfs knows exactly when/if 
something is wrong and in most cases is even able to fix it on the fly. 
If you want to scan entire pool including all redundant copies and get 
them fix if something doesn't checksum then you can schedule the pool 
scrubbing (while your applications are still using the pool!). This will 
force zfs to read all blocks from all copies to be read, their checksum 
checked and if needed data corrected if possible and the fact reported 
to user. Legacy fsck is not even close to it.



I think that the perceived need for fsck for ZFS probably comes from 
lack of understanding how ZFS works and from some frustrated users where 
under a very unlikely and rare circumstances due to data corruption a 
user might be in a position of not being able to import the pool 
therefore not being able to access any data at all while a corruption 
might have affected only a relatively small amount of data. Most other 
filesystem will allow you to access most of the data after fsck in such 
a situation (probably with some data loss) while zfs left user with no 
access to data at all. In such a case the problem lies with zfs 
uberblock and the remedy is to revert a pool to its previous uberblock 
version (or even an earlier one). In almost all the cases this will 
render a pool importable and then the mechanisms described in the first 
paragraph above will kick-in. The problem is (was) that the procedure to 
revert a pool to one of its previous uberblock is not documented nor is 
automatic and is definitely far from being sys-admin friendly. But 
thanks to some community members (most notably mr. Victor I think) some 
users affected by the issue were given a hand and were able to recover 
most/all their data. Others were probably assisted by Sun's support 
service I guess.


Fortunately a much more user-friendly mechanism has been finally 
implemented and inegrated into Open Solaris build 126 which allows a 
user to import a pool and force it to on of the previous versions of its 
uberblock if necessary. See 
http://c0t0d0s0.org/archives/6067-PSARC-2009479-zpool-recovery-support.html 
for more details.


There is another CR (don't have its number at hand) which is about 
implementing a delayed re-use on just freed blocks which should allow 
for more data to be recovered in such a case as above. Although I'm not 
sure if it has been implemented yet.


IMHO with the above CR implemented, in most cases ZFS currently provides 
*much* better solution to random data corruption than any other 
filesystem+fsck in the market.


The code for the putback of 2009/479 allows reverting to an earlier uberblock 
AND defers the re-use of blocks for a short time to make this rewind safer.


-tim


Re: [zfs-discuss] ZFS + fsck

2009-11-04 Thread Robert Milkowski

Tim Haley wrote:

Robert Milkowski wrote:


There is another CR (don't have its number at hand) which is about 
implementing a delayed re-use on just freed blocks which should allow 
for more data to be recovered in such a case as above. Although I'm 
not sure if it has been implemented yet.


IMHO with the above CR implemented, in most cases ZFS currently 
provides *much* better solution to random data corruption than any 
other filesystem+fsck in the market.


The code for the putback of 2009/479 allows reverting to an earlier 
uberblock AND defers the re-use of blocks for a short time to make 
this rewind safer.




Excellent! Thank you for the information.

--
Robert Milkowski
http://milek.blogspot.com
