Re: [zfs-discuss] about btrfs and zfs

2011-11-14 Thread Edward Ned Harvey
> From: Nico Williams [mailto:n...@cryptonector.com]
> 
> > B-trees should be logarithmic time, which is the best O() you can possibly
> > achieve.  So if HFS+ is dog slow, it's an implementation detail and not a
> > general fault of b-trees.
> 
> Hash tables can do much better than O(log N) for searching: O(1) for
> best case, and O(n) for the worst case.

You're right to challenge my claim that O(log n) is the best you can possibly
achieve.  The assumption I was making is that the worst case is what matters, and
that's not always true.

Which is better?  An algorithm whose best case and worst case are both O(log
n), or an algorithm that takes O(1) in the best case and O(n) in the worst case?

The answer is subjective - and the question might be completely irrelevant, as 
it doesn't necessarily relate to any of the filesystems we're talking about 
anyway.  ;-)
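
The tradeoff being weighed here, a guaranteed O(log n) versus an O(1) that can degrade to O(n), can be sketched in a few lines of illustrative Python (a toy model, unrelated to any of the filesystems discussed):

```python
class OneBucketHashTable:
    """Hash table whose hash function maps every key to the same bucket:
    the adversarial worst case, where lookup degrades to a linear scan."""
    def __init__(self):
        self.bucket = []

    def insert(self, key):
        self.bucket.append(key)

    def probes_to_find(self, key):
        # Walk the collision chain and count comparisons: O(n) worst case.
        for probes, k in enumerate(self.bucket, start=1):
            if k == key:
                return probes
        raise KeyError(key)

def binary_search_probes(sorted_keys, key):
    """Comparisons needed by binary search: O(log n) in every case."""
    lo, hi, probes = 0, len(sorted_keys), 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_keys[mid] == key:
            return probes
        elif sorted_keys[mid] < key:
            lo = mid + 1
        else:
            hi = mid
    raise KeyError(key)

n = 100_000
table = OneBucketHashTable()
for k in range(n):
    table.insert(k)

worst_hash = table.probes_to_find(n - 1)            # last key in the chain
worst_tree = binary_search_probes(list(range(n)), n - 1)
print(worst_hash, worst_tree)  # 100000 comparisons vs. fewer than 20
```

With a reasonable hash function the chains stay near length one and the hash table wins; the point is only that its guarantee is probabilistic, while the balanced tree's bound is absolute.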

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] about btrfs and zfs

2011-11-14 Thread Nico Williams
On Mon, Nov 14, 2011 at 8:33 AM, Edward Ned Harvey
 wrote:
>> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
>> boun...@opensolaris.org] On Behalf Of Paul Kraus
>>
>> Is it really B-Tree based? Apple's HFS+ is B-Tree based and falls
>> apart (in terms of performance) when you get too many objects in one
>> FS, which is specifically what drove us to ZFS. We had 4.5 TB of data
>
> According to Wikipedia, btrfs is b-tree based.
> I know that in ZFS the DDT is an AVL tree, but what about the rest of the
> filesystem?

ZFS directories are hashed.  Aside from that, the filesystem (and
volume) has a tree structure, but that's not what's interesting here
-- what's interesting is how directories are indexed.

> B-trees should be logarithmic time, which is the best O() you can possibly
> achieve.  So if HFS+ is dog slow, it's an implementation detail and not a
> general fault of b-trees.

Hash tables can do much better than O(log N) for searching: O(1) for
best case, and O(n) for the worst case.

Also, b-trees are O(log_b N), where b is the number of entries
per node.  6e7 entries/directory probably works out to 2-5 reads
(assuming a 0% cache hit rate), depending on the size of each directory
entry and the size of the b-tree blocks.
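
That arithmetic is easy to check (a small Python sketch; the fanouts below are assumptions for illustration, not measured ZFS or HFS+ node sizes):

```python
import math

def btree_depth(num_entries, fanout):
    """Number of levels (hence uncached reads) to reach a leaf in a
    b-tree holding num_entries with the given entries-per-node fanout."""
    return math.ceil(math.log(num_entries, fanout))

entries = 60_000_000  # the ~6e7 objects from Paul's HFS+ story

# Assumed fanouts, e.g. 4 KB to 256 KB nodes holding small fixed entries:
for fanout in (64, 256, 1024, 4096):
    print(fanout, btree_depth(entries, fanout))
# 64 -> 5, 256 -> 4, 1024 -> 3, 4096 -> 3: the "2-5 reads" range above
```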

Nico
--


Re: [zfs-discuss] about btrfs and zfs

2011-11-14 Thread Michael Schuster
On Mon, Nov 14, 2011 at 14:40, Paul Kraus  wrote:
> On Fri, Nov 11, 2011 at 9:25 PM, Edward Ned Harvey
>  wrote:
>
>> LOL.  Well, for what it's worth, there are three common pronunciations for
>> btrfs.  Butterfs, Betterfs, and B-Tree FS (because it's based on b-trees.)
>> Check Wikipedia.  (This isn't really true, but I like to joke, after saying
>> something like that, I wrote the Wikipedia page just now.)   ;-)
>
> Is it really B-Tree based? Apple's HFS+ is B-Tree based and falls
> apart (in terms of performance) when you get too many objects in one
> FS, which is specifically what drove us to ZFS. We had 4.5 TB of data
> in about 60 million files/directories on an Apple X-Serve and X-RAID
> and the overall response was terrible. We moved the data to ZFS and
> the performance was limited by the Windows client at that point.
>
>> Speaking of which. zettabyte filesystem.   ;-)  Is it just a dumb filesystem
>> with a lot of address bits?  Or is it something that offers functionality
>> that other filesystems don't have?      ;-)
>
> The stories I have heard indicate that the name came after the TLA.
> "zfs" came first and "zettabyte" later.

As Jeff told it (IIRC), the "expanded" version of zfs underwent
several changes during the development phase, until it was decided one
day to attach none of them to "zfs" and just have it be "the last word
in filesystems".  (Perhaps he even replied to a similar message on this
list ... check the archives. :-)

regards
-- 
Michael Schuster
http://recursiveramblings.wordpress.com/


Re: [zfs-discuss] about btrfs and zfs

2011-11-14 Thread Edward Ned Harvey
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Paul Kraus
>
> Is it really B-Tree based? Apple's HFS+ is B-Tree based and falls
> apart (in terms of performance) when you get too many objects in one
> FS, which is specifically what drove us to ZFS. We had 4.5 TB of data

According to Wikipedia, btrfs is b-tree based.
I know that in ZFS the DDT is an AVL tree, but what about the rest of the
filesystem?

B-trees should be logarithmic time, which is the best O() you can possibly
achieve.  So if HFS+ is dog slow, it's an implementation detail and not a
general fault of b-trees.



Re: [zfs-discuss] about btrfs and zfs

2011-11-14 Thread Paul Kraus
On Fri, Nov 11, 2011 at 9:25 PM, Edward Ned Harvey
 wrote:

> LOL.  Well, for what it's worth, there are three common pronunciations for
> btrfs.  Butterfs, Betterfs, and B-Tree FS (because it's based on b-trees.)
> Check Wikipedia.  (This isn't really true, but I like to joke, after saying
> something like that, I wrote the Wikipedia page just now.)   ;-)

Is it really B-Tree based? Apple's HFS+ is B-Tree based and falls
apart (in terms of performance) when you get too many objects in one
FS, which is specifically what drove us to ZFS. We had 4.5 TB of data
in about 60 million files/directories on an Apple X-Serve and X-RAID
and the overall response was terrible. We moved the data to ZFS and
the performance was limited by the Windows client at that point.

> Speaking of which. zettabyte filesystem.   ;-)  Is it just a dumb filesystem
> with a lot of address bits?  Or is it something that offers functionality
> that other filesystems don't have?      ;-)

The stories I have heard indicate that the name came after the TLA.
"zfs" came first and "zettabyte" later.

-- 
{1-2-3-4-5-6-7-}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company (
http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players


Re: [zfs-discuss] about btrfs and zfs

2011-11-13 Thread Edward Ned Harvey
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Jeff Liu
> 
> Why not give Btrfs some tolerance? You can kindly drop an email to
> its mailing list about any issue you are not satisfied with.
> Satire or lampooning doesn't help any open source project.

Agreed.  Not only that, but most people who use ZFS would probably also be
interested in btrfs and actually like it.  It's not like posting an anti-MS
email on a pro-Apple mailing list or something...

ZFS is more mature and btrfs is comparatively lacking some important features,
but the same is true in both directions: each is better in its own way.
Still, for most things, right now, ZFS is better in most ways, due to maturity.



Re: [zfs-discuss] about btrfs and zfs

2011-11-13 Thread Jeff Liu
On 11/13/2011 05:18 PM, Nomen Nescio wrote:

>> LOL.  Well, for what it's worth, there are three common pronunciations for
>> btrfs.  Butterfs, Betterfs, and B-Tree FS (because it's based on b-trees.)
>> Check Wikipedia.  (This isn't really true, but I like to joke, after
>> saying something like that, I wrote the Wikipedia page just now.)   ;-)
> 
> You forget Broken Tree File System, Badly Trashed File System, etc. Follow
> the newsgroup and you'll get plenty more ideas for names ;-)

Why not give Btrfs some tolerance? You can kindly drop an email to
its mailing list about any issue you are not satisfied with.
Satire or lampooning doesn't help any open source project.


Thanks,
-Jeff





Re: [zfs-discuss] about btrfs and zfs

2011-11-13 Thread Nomen Nescio
> LOL.  Well, for what it's worth, there are three common pronunciations for
> btrfs.  Butterfs, Betterfs, and B-Tree FS (because it's based on b-trees.)
> Check Wikipedia.  (This isn't really true, but I like to joke, after
> saying something like that, I wrote the Wikipedia page just now.)   ;-)

You forget Broken Tree File System, Badly Trashed File System, etc. Follow
the newsgroup and you'll get plenty more ideas for names ;-)


Re: [zfs-discuss] about btrfs and zfs

2011-11-11 Thread Fajar A. Nugraha
On Sat, Nov 12, 2011 at 9:25 AM, Edward Ned Harvey
 wrote:
>> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
>> boun...@opensolaris.org] On Behalf Of Linder, Doug
>>
>> All technical reasons aside, I can tell you one huge reason I love ZFS, and
>> it's one that is clearly being completely ignored by btrfs: ease of use.  The
>> zfs command set is wonderful and very English-like (for a unix command set).
>> It's simple, clear, and logical.  The grammar makes sense.  I almost never
>> have to refer to the man page.  The last time I looked, the commands for btrfs
>> were the usual incomprehensible gibberish with a thousand squiggles and
>> numbers.  It looked like a real freaking headache, to be honest.
>
> Maybe you're doing different things from me.  btrfs subvol create, delete,
> snapshot, mkfs, ...
> For me, both ZFS and BTRFS have "normal" user interfaces and/or command
> syntax.

The grammatically correct syntax would be "btrfs create subvolume", but
the current tool/syntax is an improvement over the old ones (btrfsctl,
btrfs-vol, etc.).

>
>
>> 1) Change the stupid name.   "Btrfs" is neither a pronounceable word nor a
>> good acronym.  "ButterFS" sounds stupid.  Just call it "BFS" or something,
>> please.
>
> LOL.  Well, for what it's worth, there are three common pronunciations for
> btrfs.  Butterfs, Betterfs, and B-Tree FS (because it's based on b-trees.)

... as long as you don't call it BiTterly bRoken FS :)

-- 
Fajar


Re: [zfs-discuss] about btrfs and zfs

2011-11-11 Thread Edward Ned Harvey
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Linder, Doug
> 
> All technical reasons aside, I can tell you one huge reason I love ZFS, and
> it's one that is clearly being completely ignored by btrfs: ease of use.  The
> zfs command set is wonderful and very English-like (for a unix command set).
> It's simple, clear, and logical.  The grammar makes sense.  I almost never
> have to refer to the man page.  The last time I looked, the commands for btrfs
> were the usual incomprehensible gibberish with a thousand squiggles and
> numbers.  It looked like a real freaking headache, to be honest.

Maybe you're doing different things from me.  btrfs subvol create, delete,
snapshot, mkfs, ...
For me, both ZFS and BTRFS have "normal" user interfaces and/or command
syntax.


> 1) Change the stupid name.   "Btrfs" is neither a pronounceable word nor a
> good acronym.  "ButterFS" sounds stupid.  Just call it "BFS" or something,
> please.

LOL.  Well, for what it's worth, there are three common pronunciations for
btrfs.  Butterfs, Betterfs, and B-Tree FS (because it's based on b-trees.)
Check Wikipedia.  (This isn't really true, but I like to joke, after saying
something like that, I wrote the Wikipedia page just now.)   ;-)

Speaking of which. zettabyte filesystem.   ;-)  Is it just a dumb filesystem
with a lot of address bits?  Or is it something that offers functionality
that other filesystems don't have?      ;-)



Re: [zfs-discuss] about btrfs and zfs

2011-11-11 Thread Nico Williams
On Fri, Nov 11, 2011 at 4:27 PM, Paul Kraus  wrote:
> The command syntax paradigm of zfs (command sub-command object
> parameters) is not unique to zfs, but seems to have been the "way of
> doing things" in Solaris 10. The _new_ functions of Solaris 10 were
> all this way (to the best of my knowledge)...
>
> zonecfg
> zoneadm
> svcadm
> svccfg
> ... and many others are written this way. To boot the zone named foo
> you use the command "zoneadm -z foo boot", to disable the service
> named sendmail, "svcadm disable sendmail", etc. Someone at Sun was
> thinking :-)

I'd have preferred "zoneadm boot foo".  The -z zone command thing is a
bit of a sore point, IMO.

But yes, all these new *adm(1M) and *cfg(1M) commands in S10 are
wonderful, especially when compared to past and present alternatives
in the Unix/Linux world.

Nico
--


Re: [zfs-discuss] about btrfs and zfs

2011-11-11 Thread Paul Kraus
On Fri, Nov 11, 2011 at 1:39 PM, Linder, Doug
 wrote:
> Paul Kraus wrote:
>
>>> My main reasons for using zfs are pretty basic compared to some here
>>
>> What are they ? (the reasons for using ZFS)
>
> All technical reasons aside, I can tell you one huge reason I love ZFS, and 
> it's one that is clearly being completely ignored by btrfs: ease of use.  The 
> zfs command set is wonderful and very English-like (for a unix command set).  
> It's simple, clear, and logical.  The grammar makes sense.  I almost never 
> have to refer to the man page.  The last time I looked, the commands for 
> btrfs were the usual incomprehensible gibberish with a thousand squiggles and 
> numbers.  It looked like a real freaking headache, to be honest.
>

The command syntax paradigm of zfs (command sub-command object
parameters) is not unique to zfs, but seems to have been the "way of
doing things" in Solaris 10. The _new_ functions of Solaris 10 were
all this way (to the best of my knowledge)...

zonecfg
zoneadm
svcadm
svccfg
... and many others are written this way. To boot the zone named foo
you use the command "zoneadm -z foo boot", to disable the service
named sendmail, "svcadm disable sendmail", etc. Someone at Sun was
thinking :-)
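
The "command sub-command object parameters" pattern Paul describes is straightforward to reproduce; here is a hypothetical Python sketch of a zoneadm-like tool (the tool name and verbs are invented for illustration, not Solaris code):

```python
import argparse

def build_cli():
    """Toy CLI in the zoneadm/svcadm style: verb first, then the object
    it acts on.  'zadm' and its verbs are illustrative only."""
    parser = argparse.ArgumentParser(prog="zadm")
    sub = parser.add_subparsers(dest="verb", required=True)

    boot = sub.add_parser("boot", help="boot the named zone")
    boot.add_argument("zone")

    halt = sub.add_parser("halt", help="halt the named zone")
    halt.add_argument("zone")

    return parser

# "zadm boot foo": the verb-object order, rather than "-z foo boot".
args = build_cli().parse_args(["boot", "foo"])
print(args.verb, args.zone)  # boot foo
```

Subcommand-style parsers like this are why the syntax reads almost like English: the grammar is encoded once, and every new verb slots into the same sentence shape.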

-- 
{1-2-3-4-5-6-7-}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company (
http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players


Re: [zfs-discuss] about btrfs and zfs

2011-11-11 Thread Linder, Doug
Paul Kraus wrote:

>> My main reasons for using zfs are pretty basic compared to some here
>
> What are they ? (the reasons for using ZFS)

All technical reasons aside, I can tell you one huge reason I love ZFS, and 
it's one that is clearly being completely ignored by btrfs: ease of use.  The 
zfs command set is wonderful and very English-like (for a unix command set).  
It's simple, clear, and logical.  The grammar makes sense.  I almost never have 
to refer to the man page.  The last time I looked, the commands for btrfs were 
the usual incomprehensible gibberish with a thousand squiggles and numbers.  It 
looked like a real freaking headache, to be honest.

With zfs I can do really complex operations off the top of my head.  It's very 
clear to me that someone spent a lot of time making the commands work that way, 
and that the commands have a lot of intelligence behind the scenes.  After many 
years spent poring over manuals for SVM and VxFS and writing meter-long 
commands with a thousand fiddly little parameters, it is SUCH a relief.  It's a 
pleasure to use.  Like swimming in crystal clear water after years in murky 
soup.

I haven't used btrfs.  But just from what I've heard, I have two suggestions 
for it:

1) Change the stupid name.   "Btrfs" is neither a pronounceable word nor a good 
acronym.  "ButterFS" sounds stupid.  Just call it "BFS" or something, please.

2) After renaming it BFS, steal the entire ZFS command set and change the "z"s 
to "b"s.  Have 'bpool' and 'bfs' commands, and exactly copy their syntax.  The 
source code underneath may be copyrighted, but I doubt you can copyright 
command names, and probably even Oracle wouldn't be petty enough to raise a 
legal stink (though you never know with them).

It would be nice if, for once, people writing software actually took usability 
into account, and the ulcers of sysadmins.  Kudos to ZFS for bucking the 
horrible trend of impossibly complex syntax.



Re: [zfs-discuss] about btrfs and zfs

2011-10-19 Thread Pawel Jakub Dawidek
On Wed, Oct 19, 2011 at 10:13:56AM -0400, David Magda wrote:
> On Wed, October 19, 2011 08:15, Pawel Jakub Dawidek wrote:
> 
> > Fsck can only fix known file system inconsistencies in file system
> > structures. Because there is no atomicity of operations in UFS and other
> > file systems, it is possible that when you remove a file, your system can
> > crash between removing the directory entry and freeing the inode or blocks.
> > This is expected with UFS; that's why there is fsck, to verify that no
> > such thing happened.
> 
> Slightly OT, but this non-atomic delay between meta-data updates and
> writes to the disk is exploited by "soft updates" with FreeBSD's UFS:
> 
> http://www.freebsd.org/doc/en/books/handbook/configtuning-disk.html#SOFT-UPDATES
> 
> It may be of some interest to the file system geeks on the list.

Well, thanks to careful ordering of operations, soft updates allow the
file system to be mounted even in an inconsistent state and fsck run in
the background, because the only inconsistencies are resource leaks: a
directory entry will never point at an unallocated inode, and an inode
will never point at an unallocated block, etc.  This is still not atomic.

With recent versions of FreeBSD, soft-updates were extended to journal
those resource leaks, so background fsck is not needed anymore.
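
The ordering discipline described here, where a crash may leak resources but never leave a dangling reference, can be modeled in a few lines (toy Python with in-memory stand-ins for metadata writes; not FreeBSD's soft-updates code):

```python
class ToyFS:
    """Toy model of soft-updates ordering: a reference is written only
    after its target exists, and removed before its target is freed, so
    a crash between the two steps leaks a resource but never dangles."""
    def __init__(self):
        self.inodes = set()    # allocated inode numbers
        self.dirents = {}      # name -> inode number

    def create(self, name, ino):
        self.inodes.add(ino)          # 1) allocate the inode first...
        self.dirents[name] = ino      # 2) ...then write the dirent
        # crash between 1 and 2: leaked inode, no dangling dirent

    def remove(self, name):
        ino = self.dirents.pop(name)  # 1) drop the dirent first...
        self.inodes.discard(ino)      # 2) ...then free the inode
        # crash between 1 and 2: leaked inode, no dangling dirent

fs = ToyFS()
fs.create("a.txt", 7)
fs.remove("a.txt")
# Invariant: every dirent always points at an allocated inode.
assert all(ino in fs.inodes for ino in fs.dirents.values())
print(fs.inodes, fs.dirents)  # set() {}
```

Reversing either ordering would make the opposite failure possible: a dirent pointing at a freed inode, which is the corruption fsck exists to repair.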

-- 
Pawel Jakub Dawidek   http://www.wheelsystems.com
FreeBSD committer http://www.FreeBSD.org
Am I Evil? Yes, I Am! http://yomoli.com




Re: [zfs-discuss] about btrfs and zfs

2011-10-19 Thread Nico Williams
On Wed, Oct 19, 2011 at 7:24 AM, Garrett D'Amore
 wrote:
> I'd argue that from a *developer* point of view, an fsck tool for ZFS might 
> well be useful.  Isn't that what zdb is for? :-)
>
> But ordinary administrative users should never need something like this, 
> unless they have encountered a bug in ZFS itself.  (And bugs are as likely to 
> exist in the checker tool as in the filesystem. ;-)

zdb can be useful for admins -- say, to gather stats not reported by
the system, to explore the fs/vol layout, for educational purposes,
and so on.

Nico
--


Re: [zfs-discuss] about btrfs and zfs

2011-10-19 Thread Garrett D'Amore
I'd argue that from a *developer* point of view, an fsck tool for ZFS might 
well be useful.  Isn't that what zdb is for? :-)

But ordinary administrative users should never need something like this, unless 
they have encountered a bug in ZFS itself.  (And bugs are as likely to exist in 
the checker tool as in the filesystem. ;-)

- Garrett


On Oct 19, 2011, at 2:15 PM, Pawel Jakub Dawidek wrote:

> On Wed, Oct 19, 2011 at 08:40:59AM +1100, Peter Jeremy wrote:
>> fsck verifies the logical consistency of a filesystem.  For UFS, this
>> includes: used data blocks are allocated to exactly one file,
>> directory entries point to valid inodes, allocated inodes have at
>> least one link, the number of links in an inode exactly matches the
>> number of directory entries pointing to that inode, directories form a
>> single tree without loops, file sizes are consistent with the number
>> of allocated blocks, unallocated data/inodes blocks are in the
>> relevant free bitmaps, redundant superblock data is consistent.  It
>> can't verify data.
> 
> Well said. I'd add that people who insist on ZFS having a fsck are
> missing the whole point of the ZFS transactional model and copy-on-write
> design.
> 
> Fsck can only fix known file system inconsistencies in file system
> structures. Because there is no atomicity of operations in UFS and other
> file systems, it is possible that when you remove a file, your system can
> crash between removing the directory entry and freeing the inode or blocks.
> This is expected with UFS; that's why there is fsck, to verify that no
> such thing happened.
> 
> In ZFS, on the other hand, there are no inconsistencies like that. If all
> blocks match their checksums and you find a directory loop or something
> like that, it is a bug in ZFS, not an expected inconsistency. It should be
> fixed in ZFS, not worked around with some fsck for ZFS.
> 
> -- 
> Pawel Jakub Dawidek   http://www.wheelsystems.com
> FreeBSD committer http://www.FreeBSD.org
> Am I Evil? Yes, I Am! http://yomoli.com



Re: [zfs-discuss] about btrfs and zfs

2011-10-19 Thread Brian Wilson

On 10/18/11 03:31 PM, Tim Cook wrote:



On Tue, Oct 18, 2011 at 3:27 PM, Peter Tribble <peter.trib...@gmail.com> wrote:


On Tue, Oct 18, 2011 at 9:12 PM, Tim Cook <t...@cook.ms> wrote:
>
>
> On Tue, Oct 18, 2011 at 3:06 PM, Peter Tribble <peter.trib...@gmail.com>
> wrote:
>>
>> On Tue, Oct 18, 2011 at 8:52 PM, Tim Cook <t...@cook.ms> wrote:
>> >
>> > Every scrub I've ever done that has found an error required manual
>> > fixing.  Every pool I've ever created has been raid-z or raid-z2, so
>> > the silent healing, while a great story, has never actually happened
>> > in practice in any environment I've used ZFS in.
>>
>> You have, of course, reported each such failure, because if that
>> was indeed the case then it's a clear and obvious bug?
>>
>> For what it's worth, I've had ZFS repair data corruption on
>> several occasions - both during normal operation and as a
>> result of a scrub, and I've never had to intervene manually.
>>
>> --
>> -Peter Tribble
>> http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
>
>
> Given that there are guides on how to manually fix the corruption, I don't
> see any need to report it.  It's considered acceptable and expected behavior
> from everyone I've talked to at Sun...
> http://dlc.sun.com/osol/docs/content/ZFSADMIN/gbbwl.html

If you have adequate redundancy, ZFS will - and does -
repair errors. The document you quote is for the case
where you don't actually have adequate redundancy: ZFS
will refuse to make up data for you, and report back where
the problem was. Exactly as designed.

(And yes, I've come across systems without redundant
storage, or had multiple simultaneous failures. The original
statement was that if you have redundant copies of the data
or, in the case of raidz, enough information to reconstruct
it, then ZFS will repair it for you. Which has been exactly in
accord with my experience.)

--
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/




I had and have redundant storage, and it has *NEVER* automatically fixed
anything.  You're the first person I've heard of who has had it automatically
fix things.
Per the page, "or an unlikely series of events conspired to corrupt
multiple copies of a piece of data."


That unlikely series of events, which goes unnamed, is not so
unlikely in my experience.


--Tim
Just another 2 cents towards a euro/dollar/yen.  I've only had data 
redundancy in ZFS via mirrors (not that it should matter, as long as 
there's redundancy), and in every case I've had it repair data 
automatically via a scrub.  The one case where it didn't was when the 
disk controller both drives happened to share (bad design, yes) started 
erroring and corrupting writes to both disks in parallel, so there was 
no good data to fix it with.  I was still happy to be using ZFS, as a 
filesystem without a scrub/scan of some sort wouldn't even have noticed, 
in my experience - I suspect btrfs would have, if its scan works similarly.


cheers,
Brian







--
---
Brian Wilson, Solaris SE, UW-Madison DoIT
Room 3114 CS&S, 608-263-8047
brian.wilson(a)doit.wisc.edu
'I try to save a life a day. Usually it's my own.' - John Crichton
---



Re: [zfs-discuss] about btrfs and zfs

2011-10-19 Thread David Magda
On Wed, October 19, 2011 08:15, Pawel Jakub Dawidek wrote:

> Fsck can only fix known file system inconsistencies in file system
> structures. Because there is no atomicity of operations in UFS and other
> file systems, it is possible that when you remove a file, your system can
> crash between removing the directory entry and freeing the inode or blocks.
> This is expected with UFS; that's why there is fsck, to verify that no
> such thing happened.

Slightly OT, but this non-atomic delay between meta-data updates and
writes to the disk is exploited by "soft updates" with FreeBSD's UFS:

http://www.freebsd.org/doc/en/books/handbook/configtuning-disk.html#SOFT-UPDATES

It may be of some interest to the file system geeks on the list.




Re: [zfs-discuss] about btrfs and zfs

2011-10-19 Thread Paul Kraus
Thank you. The following is the best "layman's" explanation as to
_why_ ZFS does not have an fsck equivalent (or even need one). On the
other hand, there are situations where you really do need to force ZFS
to do something that may not be a "good idea", but is the best of a bad
set of choices; hence zpool import -F (and other such tools
available via zdb). While the ZFS data may not be corrupt, it is
possible to corrupt the ZFS metadata, uberblock, and labels in such a
way that force is necessary.
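
The recovery Paul mentions, stepping back to an earlier transaction state the way zpool import -F does, can be pictured with a toy copy-on-write model (illustrative Python; ZFS's actual uberblock handling is more involved):

```python
class ToyPool:
    """Toy copy-on-write pool: each transaction group leaves the previous
    root intact, so import -F style recovery can step back to an older
    root.  Illustrative only; not how ZFS stores its uberblocks."""
    def __init__(self):
        self.roots = [{}]   # history of immutable root snapshots

    def commit(self, **updates):
        # Copy-on-write: build a new root; never modify the old one.
        new_root = {**self.roots[-1], **updates}
        self.roots.append(new_root)

    def rollback(self, steps=1):
        """Discard the newest root(s), as a last-resort forced recovery.
        Data written in the discarded transactions is lost."""
        del self.roots[-steps:]
        return self.roots[-1]

pool = ToyPool()
pool.commit(a="v1")
pool.commit(a="CORRUPT")    # pretend this transaction group was damaged
good = pool.rollback()
print(good)  # {'a': 'v1'}
```

Because old roots are never overwritten in place, "force" here means choosing an older consistent view, not patching structures the way fsck does.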

On Wed, Oct 19, 2011 at 8:15 AM, Pawel Jakub Dawidek  wrote:

> Well said. I'd add that people who insist on ZFS having a fsck are
> missing the whole point of the ZFS transactional model and copy-on-write
> design.
>
> Fsck can only fix known file system inconsistencies in file system
> structures. Because there is no atomicity of operations in UFS and other
> file systems, it is possible that when you remove a file, your system can
> crash between removing the directory entry and freeing the inode or blocks.
> This is expected with UFS; that's why there is fsck, to verify that no
> such thing happened.
>
> In ZFS, on the other hand, there are no inconsistencies like that. If all
> blocks match their checksums and you find a directory loop or something
> like that, it is a bug in ZFS, not an expected inconsistency. It should be
> fixed in ZFS, not worked around with some fsck for ZFS.
>
> --
> Pawel Jakub Dawidek                       http://www.wheelsystems.com
> FreeBSD committer                         http://www.FreeBSD.org
> Am I Evil? Yes, I Am!                     http://yomoli.com
>



-- 
{1-2-3-4-5-6-7-}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company (
http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players


Re: [zfs-discuss] about btrfs and zfs

2011-10-19 Thread Pawel Jakub Dawidek
On Wed, Oct 19, 2011 at 08:40:59AM +1100, Peter Jeremy wrote:
> fsck verifies the logical consistency of a filesystem.  For UFS, this
> includes: used data blocks are allocated to exactly one file,
> directory entries point to valid inodes, allocated inodes have at
> least one link, the number of links in an inode exactly matches the
> number of directory entries pointing to that inode, directories form a
> single tree without loops, file sizes are consistent with the number
> of allocated blocks, unallocated data/inodes blocks are in the
> relevant free bitmaps, redundant superblock data is consistent.  It
> can't verify data.

Well said. I'd add that people who insist on ZFS having a fsck are
missing the whole point of the ZFS transactional model and copy-on-write
design.

Fsck can only fix known file system inconsistencies in file system
structures. Because there is no atomicity of operations in UFS and other
file systems, it is possible that when you remove a file, your system can
crash between removing the directory entry and freeing the inode or blocks.
This is expected with UFS; that's why there is fsck, to verify that no
such thing happened.

In ZFS, on the other hand, there are no inconsistencies like that. If all
blocks match their checksums and you find a directory loop or something
like that, it is a bug in ZFS, not an expected inconsistency. It should be
fixed in ZFS, not worked around with some fsck for ZFS.
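
The invariants Peter lists, for instance that an inode's link count must match the number of directory entries referencing it, are exactly what a fsck pass verifies; a toy Python check (illustrative, not real fsck or ZFS code):

```python
from collections import Counter

def check_link_counts(inodes, dirents):
    """Toy fsck pass: verify each allocated inode's link count matches
    the number of directory entries referencing it, and that every
    directory entry points at an allocated inode (two of the UFS
    invariants quoted above).  Illustrative only.

    inodes:  {inode_number: link_count}
    dirents: [(name, inode_number), ...]
    """
    refs = Counter(ino for _, ino in dirents)
    problems = []
    for ino, links in sorted(inodes.items()):
        if refs[ino] != links:
            problems.append(("bad link count", ino, links, refs[ino]))
    for ino in sorted(set(refs) - set(inodes)):
        problems.append(("dirent to unallocated inode", ino, None, refs[ino]))
    return problems

# A crash between "remove directory entry" and "decrement link count"
# leaves inode 7 claiming 2 links while only one entry points at it:
inodes = {7: 2, 9: 1}
dirents = [("a.txt", 7), ("b.txt", 9)]
print(check_link_counts(inodes, dirents))
# [('bad link count', 7, 2, 1)]
```

Note what the check cannot do: it confirms the structures agree with each other, but says nothing about whether the file contents are intact, which is Pawel's point about checksums.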

-- 
Pawel Jakub Dawidek   http://www.wheelsystems.com
FreeBSD committer http://www.FreeBSD.org
Am I Evil? Yes, I Am! http://yomoli.com




Re: [zfs-discuss] about btrfs and zfs

2011-10-19 Thread Richard Elling
On Oct 18, 2011, at 6:35 PM, David Magda wrote:

> If we've found one bad disk, what are our options?

Live with it or replace it :-)
 -- richard

-- 

ZFS and performance consulting
http://www.RichardElling.com
VMworld Copenhagen, October 17-20
OpenStorage Summit, San Jose, CA, October 24-27
LISA '11, Boston, MA, December 4-9


Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread David Magda
On Oct 18, 2011, at 10:35, Brian Wilson wrote:

> Where ZFS doesn't have an fsck command - and that really used to bug me - it 
> does now have a -F option on zpool import.  To me it's the same functionality 
> for my environment - the ability to try to roll back to a 'hopefully' good 
> state and get the filesystem mounted up, leaving the corrupted data objects 
> corrupted.  So that if the 10-1000 files and objects that went missing aren't 
> required for my 24x7 5+ 9s application to run (e.g. log files), I can get it 
> rolling again without them quickly, and then get those files recovered from 
> backup afterwards as needed, without having to recover the entire pool from 
> backup.

To a certain extent fsck provides a false sense of security: while the utility
has walked the file system and fixed some data structures (and perhaps put some
stuff in lost+found), what guarantee does that actually give you that you don't
have corrupted files from incomplete, in-flight operations?

Without checksums you're assuming everything is fine. Faith may be fine for 
some aspects of life, but not necessarily for others. :)



Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread David Magda
On Oct 18, 2011, at 20:35, Edward Ned Harvey wrote:

> In fact, I saw, actual work started on this task about a month ago.  So it's
> not just planned, it's really in the works.  Now we're talking open source
> timelines here, which means, "you'll get it when it's ready," and nobody
> knows when that will be.  As mentioned elsewhere in this thread, there are
> some other major features that have been "ready in 2 weeks" for like 2 years
> now...  YMMV. 

To be fair, we've been waiting for bp* rewrite for a while as well. :)



Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread David Magda
On Oct 18, 2011, at 20:26, Edward Ned Harvey wrote:

> Yes, but when scrub encounters uncorrectable errors, it doesn't attempt to
> correct them.  Fsck will do things like recover lost files into the
> lost+found directory, and stuff like that...

You say "recover lost files" like you know that they're actually recovered 
properly. :) Fsck does place things in lost+found, but there is no guarantee of 
their usefulness.

I recently had to redeploy a VM because the hosting machine's NIC was 
corrupting data, and so the underlying disk image became completely hosed. The 
Linux guest instance merrily went trying to run even though large parts of the 
Ext3 file system were a mess. After first noticing the problem we did an fsck 
and lost+found had several thousand entries. It was simpler to redeploy from 
scratch than wade through the 'recovered' files.



Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Edward Ned Harvey
> From: Fajar A. Nugraha [mailto:w...@fajar.net]
> Sent: Tuesday, October 18, 2011 7:46 PM
> 
> > * In btrfs, there is no equivalent or alternative to "zfs send | zfs
> > receive"
> 
> Planned. No actual working implementation yet.

In fact, I saw that actual work started on this task about a month ago.  So it's
not just planned, it's really in the works.  Now we're talking open source
timelines here, which means, "you'll get it when it's ready," and nobody
knows when that will be.  As mentioned elsewhere in this thread, there are
some other major features that have been "ready in 2 weeks" for like 2 years
now...  YMMV.  

But to me personally, zfs send is one of the HUGEST winning characteristics,
so I'm really eager for btrfs send to exist...  That's one of the biggest
missing characteristics that make btrfs seriously less attractive than ZFS
for me right now.

But I'll sure tell you, building a time machine server (mac) using the
latest netatalk on ubuntu beta is sure a HECK of a lot easier than doing the
same thing on solaris right now.   ;-)  Not to mention, I'm happy to run
ubuntu on dell servers where solaris was formerly a crash & burn.  So I'm
using btrfs anywhere that linux is required, and using ZFS anywhere that is
OS agnostic (or solaris advantaged) and I just need a filesystem.



Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Edward Ned Harvey
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Bob Friesenhahn
> 
> On Wed, 19 Oct 2011, Peter Jeremy wrote:
> >> Doesn't a scrub do more than what 'fsck' does?
> >
> > It does different things.  I'm not sure about "more".
> 
> Zfs scrub validates user data while 'fsck' does not.  I consider that
> as being definitely "more".

Yes, but when scrub encounters uncorrectable errors, it doesn't attempt to
correct them.  Fsck will do things like recover lost files into the
lost+found directory, and stuff like that...

So, scrub does more of one thing, and fsck does more of a different thing...
Which one you call "more" is a matter of perspective.  I would just call
them different, and each one "better" in its own way, depending on your
needs.



Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Edward Ned Harvey
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Tim Cook
>  
> I had and have redundant storage, it has *NEVER* automatically fixed
> it.  You're the first person I've heard that has had it automatically fix
it.

That's probably just because it's normal and expected behavior to
automatically fix it - I always have redundancy, and every cksum error I
ever find is always automatically fixed.  I never tell anyone here because
it's normal and expected.

If you have redundancy, and cksum errors, and it's not automatically fixed,
then you should report the bug.

I do have a few suggestions, possible ways that you may think you have
redundancy and still have such an error...

If you're using hardware raid, then ZFS will only see one virtual aggregate
device.  There's no interface to tell the hardware "go read the other copy,
because this one was bad."  You have to present the individual JBOD disks to
the OS, and let ZFS assemble a raid volume out of it.  Then ZFS will manage
the redundant copies.

If your cksum error happened in memory, or in the bus or something, then
even fetching new copies from the (actually good) disks might still be
received corrupted in memory and result in a cksum error.
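The self-healing read path being described can be sketched roughly like this (a toy Python model, not ZFS internals; `self_healing_read` and the in-memory "mirror" list are invented for illustration). With ZFS-managed redundancy, a copy whose checksum fails is re-read from a sibling and the bad copy is rewritten; with hardware raid presenting one virtual device there is no "other copy" to ask for, which is the point above:

```python
import hashlib

def cksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def self_healing_read(mirror, expected):
    """mirror: list of per-disk copies of one block (mutated in place)."""
    for copy in mirror:
        if cksum(copy) == expected:
            for j in range(len(mirror)):       # heal any bad siblings
                if cksum(mirror[j]) != expected:
                    mirror[j] = copy           # rewrite the bad copy
            return copy
    raise IOError("uncorrectable: no copy matches its checksum")

good = b"important data"
expected = cksum(good)
mirror = [b"bit-rotted!!!", good]              # disk 0 silently corrupted
data = self_healing_read(mirror, expected)     # reads good copy, repairs disk 0
```

Note the last paragraph's caveat still applies to any real implementation: if the corruption happens in RAM after the read, even a good on-disk copy arrives damaged.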



Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Edward Ned Harvey
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Paul Kraus
> 
> I have done a "poor man's" rebalance by copying data after adding
> devices. I know this is not a substitute for a real online rebalance,
> but it gets the job done (if you can take the data offline, I do it a
> small chunk at a time).

I have done the same thing.  It's uncomfortable.  It was like this...

I want to rebalance, or add compression to existing data, or one of the other
reasons somebody might want to do this.  I find a directory that is temporarily
static, and I do this:

  (cd workdir ; sudo tar cpf - .) | (mkdir workdir2 ; cd workdir2 ; sudo tar xpf -)
  sudo mv workdir trash
  sudo mv workdir2 workdir
  sudo rm -rf trash

Unfortunately that failed.  The idea was to reconstruct the data without
anybody noticing, and then perform an instantaneous "mv" operation to put it
into place.  Unfortunately, if anything is being used at all in the old dir,
then the mv fails, and I end up with workdir/workdir2 and two copies on disk.

In practice, I only found this to work:

  sudo rm -rf workdir ; mkdir workdir
  (cd /blah/snapshot/mysnap ; sudo tar cpf - .) | (cd workdir ; sudo tar xpf -)

Hence, I say, it's uncomfortable.
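For what it's worth, the copy-then-swap trick can be expressed without tar (an illustrative Python sketch; `rewrite_in_place` is a made-up helper, and rename semantics for busy directories vary by OS and filesystem). Re-copying every file is what forces the blocks to be reallocated, which is the whole point of the poor man's rebalance:

```python
import os
import shutil
import tempfile

def rewrite_in_place(workdir):
    """Copy a directory tree, then swap the fresh copy into place."""
    parent = os.path.dirname(os.path.abspath(workdir))
    tmp = tempfile.mkdtemp(dir=parent)
    new = os.path.join(tmp, "workdir2")
    shutil.copytree(workdir, new)     # rewrites every block (rebalances)
    trash = os.path.join(tmp, "trash")
    os.rename(workdir, trash)         # the "mv" step; may fail if in use
    os.rename(new, workdir)           # swap fresh copy into place
    shutil.rmtree(tmp)                # discard the old copy

base = tempfile.mkdtemp()
wd = os.path.join(base, "workdir")
os.makedirs(wd)
with open(os.path.join(wd, "f.txt"), "w") as f:
    f.write("hello")
rewrite_in_place(wd)                  # contents survive, blocks rewritten
```

The two renames are the fragile part, just as with the shell version: nothing guarantees readers see the swap atomically, and an open handle on the old directory can make the whole scheme fall over.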



Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Fajar A. Nugraha
On Tue, Oct 18, 2011 at 7:18 PM, Edward Ned Harvey
 wrote:
> I recently put my first btrfs system into production.  Here are the
> similarities/differences I noticed different between btrfs and zfs:
>
> Differences:
> * Obviously, one is meant for linux and the other solaris (etc)
> * In btrfs, there is only raid1.  They don't have raid5, 6, etc yet.
> * In btrfs, snapshots are read-write.  Cannot be made read-only without
> quotas, which aren't implemented yet.

Minor correction: btrfs supports read-only snapshots. They're available on
vanilla linux, though IIRC it requires an (unofficial) updated btrfs-progs
(which basically tracks patches sent but not yet integrated into the
official tree), but it works.

> * zfs supports quotas.  Also, by default creates snapshots read-only but
> could be made read-write by cloning.

There are proposed patches for btrfs quota support, but the kernel
part has not been accepted upstream.

> * In btrfs, there is no equivalent or alternative to "zfs send | zfs
> receive"

Planned. No actual working implementation yet.

> * In zfs, you have the hidden ".zfs" subdir that contains your snapshots.
> * In btrfs, your snapshots need to be mounted somewhere, inside the same
> filesystem.  So in btrfs, you do something like this...  Create a
> filesystem, then create a subvol called "@" and use it to store all your
> work.  Later when you create snapshots, you essentially duplicate that
> subvol "@2011-10-18-07-40-00" or something.

Yes, basically btrfs treats a subvolume and a snapshot in the same way.

> * Both do compression.  By default zfs compression is fast but you could use
> zlib if you want.  By default btrfs uses zlib, but you could opt for fast if
> you want.

lzo is planned to be the default in the future.

-- 
Fajar


Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Fajar A. Nugraha
On Tue, Oct 18, 2011 at 8:38 PM, Gregory Shaw  wrote:
> I came to the conclusion that btrfs isn't ready for prime time.  I'll 
> re-evaluate as development continues and the missing portions are provided.

For someone with an @oracle.com email address, you could probably arrive
at that conclusion faster by asking Chris Mason directly :)

>
> I'm seriously thinking about converting the Linux system in question into a 
> FreeBSD system so that I can use ZFS.

FreeBSD? Not Solaris? Hmmm ... :)

Anyway, the way I see it, Linux now has more choices. You can try out
either btrfs or zfs (even without a separate /boot) with a few tweaks.
Neither is labeled production-ready, but that doesn't stop some people
(who, presumably, know what they're doing) from putting them in
production.

I'm still hoping oracle will release source updates to zfs soon so
other OSes can also use its new features (e.g. encryption).

-- 
Fajar


Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Bob Friesenhahn

On Wed, 19 Oct 2011, Peter Jeremy wrote:

>> Doesn't a scrub do more than what 'fsck' does?
>
> It does different things.  I'm not sure about "more".

Zfs scrub validates user data while 'fsck' does not.  I consider that
as being definitely "more".

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Peter Jeremy
On 2011-Oct-18 23:18:02 +1100, Edward Ned Harvey 
 wrote:
>I recently put my first btrfs system into production.  Here are the
>similarities/differences I noticed different between btrfs and zfs:

Thanks for that.

>* zfs has storage tiering.  (cache & log devices, such as SSD's to
>accelerate performance.)  btrfs doesn't have this yet.

I'd call that "multi-level caching and journalling".  To me, storage
tiering means something like HSM - something that lets me push rarely
used data to near-line storage (eg big green SATA drives that are spun
down most of the time) whilst retaining the ability to transparently
access it.

On 2011-Oct-19 03:46:30 +1100, Mark Sandrock  wrote:
>Doesn't a scrub do more than what 'fsck' does?

It does different things.  I'm not sure about "more".

fsck verifies the logical consistency of a filesystem.  For UFS, this
includes: used data blocks are allocated to exactly one file,
directory entries point to valid inodes, allocated inodes have at
least one link, the number of links in an inode exactly matches the
number of directory entries pointing to that inode, directories form a
single tree without loops, file sizes are consistent with the number
of allocated blocks, unallocated data/inodes blocks are in the
relevant free bitmaps, redundant superblock data is consistent.  It
can't verify data.

scrub uses checksums to verify the contents of all blocks and attempts
to correct errors using redundant copies of blocks.  This implicitly
detects some types of logical errors.  I don't know if scrub includes
explicit logic to detect things like directory loops, missing free
blocks, unreachable allocated blocks, multiply allocated blocks, etc.
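A miniature version of two of the checks listed above (every directory entry points at a valid inode; each inode's link count matches the number of entries referencing it) might look like this — a toy Python illustration, nothing like real fsck code:

```python
from collections import Counter

def fsck_check(dirents, link_counts):
    """dirents: {name: inode}; link_counts: {inode: claimed_links}.
    Returns a list of human-readable inconsistencies found."""
    problems = []
    refs = Counter(dirents.values())   # actual references per inode
    for name, ino in dirents.items():
        if ino not in link_counts:
            problems.append(f"dirent {name!r} points at invalid inode {ino}")
    for ino, claimed in link_counts.items():
        if claimed != refs[ino]:
            problems.append(
                f"inode {ino}: link count {claimed}, but {refs[ino]} dirent(s)")
    return problems

# A consistent filesystem: two names hard-linked to inode 1, count matches.
assert fsck_check({"a": 1, "b": 1}, {1: 2}) == []
# A crash left inode 2 allocated with a link nothing references:
assert fsck_check({"a": 1}, {1: 1, 2: 1}) != []
```

As the message says, all of this is logical consistency only; none of these checks can tell whether the data blocks themselves hold the right bytes, which is what checksummed scrub adds.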

>IIRC, fsck was seldom needed at
>my former site once UFS journalling
>became available. Sweet update.

Whilst Solaris very rarely insists we run fsck, we have had a number
of cases where we have found files corrupted following a crash - even
with UFS journalling enabled.  Unfortunately, this isn't the sort of
thing that fsck could detect.

-- 
Peter Jeremy




Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Ian Collins

 On 10/19/11 09:31 AM, Tim Cook wrote:

> I had and have redundant storage, it has *NEVER* automatically fixed
> it.  You're the first person I've heard that has had it automatically
> fix it.

I'm another; I have had many cases of ZFS fixing corrupted data on a
number of different pool configurations.

> Per the page "or an unlikely series of events conspired to corrupt
> multiple copies of a piece of data."
>
> Their unlikely series of events, that goes unnamed, is not that
> unlikely in my experience.

The only one I've seen where ZFS reported, but was unable to repair, was
data corruption caused by bad memory.  I haven't seen any of those since
adopting a "no ZFS without ECC" rule.

I would probably still be blissfully unaware of the corruption if I
weren't using ZFS...

--
Ian.



Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Cyril Plisko
On Tue, Oct 18, 2011 at 10:31 PM, Tim Cook  wrote:
>
>
> I had and have redundant storage, it has *NEVER* automatically fixed it.
>  You're the first person I've heard that has had it automatically fix it.

Well, here comes another person - I have ZFS automatically fixing
corrupted data on a number of raidz pools. Moreover, my laptop (single
drive) with copies=2 experienced a number of corruptions that were
fixed automatically due to extra copy of the relevant data.

I am pretty sure there are much more people with similar experience...


-- 
Regards,
        Cyril


Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Paul Kraus
On Tue, Oct 18, 2011 at 4:31 PM, Tim Cook  wrote:

> I had and have redundant storage, it has *NEVER* automatically fixed it.
>  You're the first person I've heard that has had it automatically fix it.

I have had ZFS automatically repair corrupted raw data when one
component of the redundancy failed, just as DiskSuite (SLVM) will
resync a failed mirror.

   I think you may be using different definitions of "corrupt". In my
case, the backend storage / drive that was part of a redundant zpool
failed (or became unreliable). Once the issue was resolved, a resilver
operation rewrote the data that had been corrupted on the failing
component. No corrupt data was ever presented to the application.

-- 
{1-2-3-4-5-6-7-}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company (
http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players


Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Tim Cook
On Tue, Oct 18, 2011 at 3:27 PM, Peter Tribble wrote:

> On Tue, Oct 18, 2011 at 9:12 PM, Tim Cook  wrote:
> >
> >
> > On Tue, Oct 18, 2011 at 3:06 PM, Peter Tribble 
> > wrote:
> >>
> >> On Tue, Oct 18, 2011 at 8:52 PM, Tim Cook  wrote:
> >> >
> >> > Every scrub I've ever done that has found an error required manual
> >> > fixing.
> >> >  Every pool I've ever created has been raid-z or raid-z2, so the
> silent
> >> > healing, while a great story, has never actually happened in practice
> in
> >> > any
> >> > environment I've used ZFS in.
> >>
> >> You have, of course, reported each such failure, because if that
> >> was indeed the case then it's a clear and obvious bug?
> >>
> >> For what it's worth, I've had ZFS repair data corruption on
> >> several occasions - both during normal operation and as a
> >> result of a scrub, and I've never had to intervene manually.
> >>
> >> --
> >> -Peter Tribble
> >> http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
> >
> >
> > Given that there  are guides on how to manually fix the corruption, I
> don't
> > see any need to report it.  It's considered acceptable and expected
> behavior
> > from everyone I've talked to at Sun...
> > http://dlc.sun.com/osol/docs/content/ZFSADMIN/gbbwl.html
>
> If you have adequate redundancy, ZFS will - and does -
> repair errors. The document you quote is for the case
> where you don't actually have adequate redundancy: ZFS
> will refuse to make up data for you, and report back where
> the problem was. Exactly as designed.
>
> (And yes, I've come across systems without redundant
> storage, or had multiple simultaneous failures. The original
> statement was that if you have redundant copies of the data
> or, in the case of raidz, enough information to reconstruct
> it, then ZFS will repair it for you. Which has been exactly in
> accord with my experience.)
>
> --
> -Peter Tribble
> http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
>



I had and have redundant storage, it has *NEVER* automatically fixed it.
 You're the first person I've heard that has had it automatically fix it.
Per the page "or an unlikely series of events conspired to corrupt multiple
copies of a piece of data."

Their unlikely series of events, that goes unnamed, is not that unlikely in
my experience.

--Tim


Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Peter Tribble
On Tue, Oct 18, 2011 at 9:12 PM, Tim Cook  wrote:
>
>
> On Tue, Oct 18, 2011 at 3:06 PM, Peter Tribble 
> wrote:
>>
>> On Tue, Oct 18, 2011 at 8:52 PM, Tim Cook  wrote:
>> >
>> > Every scrub I've ever done that has found an error required manual
>> > fixing.
>> >  Every pool I've ever created has been raid-z or raid-z2, so the silent
>> > healing, while a great story, has never actually happened in practice in
>> > any
>> > environment I've used ZFS in.
>>
>> You have, of course, reported each such failure, because if that
>> was indeed the case then it's a clear and obvious bug?
>>
>> For what it's worth, I've had ZFS repair data corruption on
>> several occasions - both during normal operation and as a
>> result of a scrub, and I've never had to intervene manually.
>>
>> --
>> -Peter Tribble
>> http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
>
>
> Given that there  are guides on how to manually fix the corruption, I don't
> see any need to report it.  It's considered acceptable and expected behavior
> from everyone I've talked to at Sun...
> http://dlc.sun.com/osol/docs/content/ZFSADMIN/gbbwl.html

If you have adequate redundancy, ZFS will - and does -
repair errors. The document you quote is for the case
where you don't actually have adequate redundancy: ZFS
will refuse to make up data for you, and report back where
the problem was. Exactly as designed.

(And yes, I've come across systems without redundant
storage, or had multiple simultaneous failures. The original
statement was that if you have redundant copies of the data
or, in the case of raidz, enough information to reconstruct
it, then ZFS will repair it for you. Which has been exactly in
accord with my experience.)

-- 
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/


Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Tim Cook
On Tue, Oct 18, 2011 at 3:06 PM, Peter Tribble wrote:

> On Tue, Oct 18, 2011 at 8:52 PM, Tim Cook  wrote:
> >
> > Every scrub I've ever done that has found an error required manual
> fixing.
> >  Every pool I've ever created has been raid-z or raid-z2, so the silent
> > healing, while a great story, has never actually happened in practice in
> any
> > environment I've used ZFS in.
>
> You have, of course, reported each such failure, because if that
> was indeed the case then it's a clear and obvious bug?
>
> For what it's worth, I've had ZFS repair data corruption on
> several occasions - both during normal operation and as a
> result of a scrub, and I've never had to intervene manually.
>
> --
> -Peter Tribble
> http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
>


Given that there  are guides on how to manually fix the corruption, I don't
see any need to report it.  It's considered acceptable and expected behavior
from everyone I've talked to at Sun...
http://dlc.sun.com/osol/docs/content/ZFSADMIN/gbbwl.html

--Tim


Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Peter Tribble
On Tue, Oct 18, 2011 at 8:52 PM, Tim Cook  wrote:
>
> Every scrub I've ever done that has found an error required manual fixing.
>  Every pool I've ever created has been raid-z or raid-z2, so the silent
> healing, while a great story, has never actually happened in practice in any
> environment I've used ZFS in.

You have, of course, reported each such failure, because if that
was indeed the case then it's a clear and obvious bug?

For what it's worth, I've had ZFS repair data corruption on
several occasions - both during normal operation and as a
result of a scrub, and I've never had to intervene manually.

-- 
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/


Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Tim Cook
On Tue, Oct 18, 2011 at 2:41 PM, Kees Nuyt  wrote:

> On Tue, 18 Oct 2011 12:05:29 -0500, Tim Cook  wrote:
>
> >> Doesn't a scrub do more than what
> >> 'fsck' does?
> >>
> > Not really.  fsck will work on an offline filesystem to correct errors
> and
> > bring it back online.  Scrub won't even work until the filesystem is
> already
> > imported and online. If it's corrupted you can't even import it, hence
> the
> > -F flag addition.  Plus, IIRC, scrub won't actually correct any errors,
> it
> > will only flag them.  Manually fixing what scrub finds can be a giant
> pain.
>
> IIRC Scrub will correct errors if the pool has sufficient
> redundancy. So will any read of a corrupted block.
>
> http://hub.opensolaris.org/bin/view/Community+Group+zfs/selfheal
> --
>  (  Kees Nuyt
>  )
> c[_]
>
>

Every scrub I've ever done that has found an error required manual fixing.
 Every pool I've ever created has been raid-z or raid-z2, so the silent
healing, while a great story, has never actually happened in practice in any
environment I've used ZFS in.

--Tim


Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Kees Nuyt
On Tue, 18 Oct 2011 12:05:29 -0500, Tim Cook  wrote:

>> Doesn't a scrub do more than what
>> 'fsck' does?
>>
> Not really.  fsck will work on an offline filesystem to correct errors and
> bring it back online.  Scrub won't even work until the filesystem is already
> imported and online. If it's corrupted you can't even import it, hence the
> -F flag addition.  Plus, IIRC, scrub won't actually correct any errors, it
> will only flag them.  Manually fixing what scrub finds can be a giant pain.

IIRC Scrub will correct errors if the pool has sufficient
redundancy. So will any read of a corrupted block.

http://hub.opensolaris.org/bin/view/Community+Group+zfs/selfheal
-- 
  (  Kees Nuyt
  )
c[_]


Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Ian Collins

 On 10/19/11 01:18 AM, Edward Ned Harvey wrote:

> I recently put my first btrfs system into production.  Here are the
> similarities/differences I noticed different between btrfs and zfs:
>
> Differences:
> * Obviously, one is meant for linux and the other solaris (etc)
> * In btrfs, there is only raid1.  They don't have raid5, 6, etc yet.
> * In btrfs, snapshots are read-write.  Cannot be made read-only without
> quotas, which aren't implemented yet.
> * zfs supports quotas.  Also, by default creates snapshots read-only but
> could be made read-write by cloning.
> * In btrfs, there is no equivalent or alternative to "zfs send | zfs
> receive"
> * In zfs, you have the hidden ".zfs" subdir that contains your snapshots.
> * In btrfs, your snapshots need to be mounted somewhere, inside the same
> filesystem.  So in btrfs, you do something like this...  Create a
> filesystem, then create a subvol called "@" and use it to store all your
> work.  Later when you create snapshots, you essentially duplicate that
> subvol "@2011-10-18-07-40-00" or something.
> * btrfs is able to shrink.  zfs is not able to shrink.
> * btrfs is able to defrag.  zfs doesn't have defrag yet.
> * btrfs is able to balance.  (after adding new blank devices, rebalance, so
> the data & workload are distributed across all the devices.)  zfs is not
> able to do this yet.
> * zfs has storage tiering.  (cache & log devices, such as SSD's to
> accelerate performance.)  btrfs doesn't have this yet.

So does it suffer the same performance issues as zfs (without a log
device) when serving over NFS?

> * btrfs has no dedup yet. They are planning to do offline dedup.  ZFS has
> online dedup.  I wouldn't recommend zfs dedup yet until performance issues
> are resolved, which seems like never.  But when and if zfs dedup performance
> issues are resolved, online dedup should greatly outperform offline dedup,
> both in terms of speed and disk usage.
> * zfs has the concept of a zvol, you can export iscsi or format with any
> filesystem you like. If you want to do the same in btrfs, you have to create
> a file and use it loopback.  This accomplishes the same thing, but the
> creation time is much longer (zero time versus linear time, could literally
> be called "infinitely" longer) ... so this is an advantage for zfs.
> * zfs has filesystem property inheritance and recursion of commands like
> "snapshot" and "send."  Btrfs doesn't.
> * zfs has permissions - allow users or groups to create/destroy snapshots
> and stuff like that.  In btrfs you'll have to kludge something through sudo
> or whatever.
>
> Similarities:
> * Both are able to grow.  (Add devices & storage)
> * Neither one has a fsck.  They both have scrub.  (btrfs calls it "scan" and
> zfs calls it "scrub.")  (Correction ... In the latest btrfs beta, I see
> there exists btrfsck, but I don't know if it's a full fledged fsck.  Maybe
> it's just a frontend for scan? People are still saying there is no fsck.)
> * Both do compression.  By default zfs compression is fast but you could use
> zlib if you want.  By default btrfs uses zlib, but you could opt for fast if
> you want.

Good input, thanks.

Does btrfs have NFSv4 ACL support?

--
Ian.



Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Ian Collins

 On 10/19/11 03:12 AM, Paul Kraus wrote:

> On Tue, Oct 18, 2011 at 9:13 AM, Darren J Moffat
>  wrote:
>
>> On 10/18/11 14:04, Jim Klimov wrote:
>>
>>> 2011-10-18 16:26, Darren J Moffat wrote:
>>>
>>>> ZFS slightly biases new vdevs for new writes so that we will get
>>>> to a more even spread. It doesn't go and move already written blocks
>>>> onto the new vdevs though. So while there isn't an admin interface to
>>>> rebalancing, ZFS does do something in this area.
>>>>
>>>> This is implemented in metaslab_alloc_dva()
>>>>
>>>> http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/metaslab.c
>>>>
>>>> See lines 1356-1378
>>>
>>> And the admin interface would be what exactly?..
>>
>> As I said there isn't one, because that isn't how it works today; it is
>> all automatic and only for new writes.
>>
>> I was pointing out that ZFS does do 'something', not that it had an
>> exactly matching feature.
>
> I have done a "poor man's" rebalance by copying data after adding
> devices. I know this is not a substitute for a real online rebalance,
> but it gets the job done (if you can take the data offline, I do it a
> small chunk at a time).

I do the same.

Whether you do the balance by hand, or the filesystem does it, the data
still has to be moved around, which can be resource intensive.  I'd
rather do that at a time of my choosing.

--
Ian.



Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Tim Cook
On Tue, Oct 18, 2011 at 11:46 AM, Mark Sandrock wrote:

>
> On Oct 18, 2011, at 11:09 AM, Nico Williams wrote:
>
> > On Tue, Oct 18, 2011 at 9:35 AM, Brian Wilson wrote:
> >> I just wanted to add something on fsck on ZFS - because for me that used
> to
> >> make ZFS 'not ready for prime-time' in 24x7 5+ 9s uptime environments.
> >> Where ZFS doesn't have an fsck command - and that really used to bug me
> - it
> >> does now have a -F option on zpool import.  To me it's the same
> >> functionality for my environment - the ability to try to roll back to a
> >> 'hopefully' good state and get the filesystem mounted up, leaving the
> >> corrupted data objects corrupted.  [...]
> >
> > Yes, that's exactly what it is.  There's no point calling it fsck
> > because fsck fixes individual filesystems, while ZFS fixups need to
> > happen at the volume level (at volume import time).
> >
> > It's true that this should have been in ZFS from the word go.  But
> > it's there now, and that's what matters, IMO.
>
> Doesn't a scrub do more than what
> 'fsck' does?
>
>
Not really.  fsck will work on an offline filesystem to correct errors and
bring it back online.  Scrub won't even work until the filesystem is already
imported and online. If it's corrupted you can't even import it, hence the
-F flag addition.  Plus, IIRC, scrub won't actually correct any errors, it
will only flag them.  Manually fixing what scrub finds can be a giant pain.



> >
> > It's also true that this was never necessary with hardware that
> > doesn't lie, but it's good to have it anyways, and is critical for
> > personal systems such as laptops.
>
> IIRC, fsck was seldom needed at
> my former site once UFS journalling
> became available. Sweet update.
>
> Mark
>
>

We all hope to never have to run fsck, but not having it at all is a bit of
a non-starter in most environments.


--Tim


Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Brian Wilson

On 10/18/11 11:46 AM, Mark Sandrock wrote:

On Oct 18, 2011, at 11:09 AM, Nico Williams wrote:


On Tue, Oct 18, 2011 at 9:35 AM, Brian Wilson wrote:

I just wanted to add something on fsck on ZFS - because for me that used to
make ZFS 'not ready for prime-time' in 24x7 5+ 9s uptime environments.
Where ZFS doesn't have an fsck command - and that really used to bug me - it
does now have a -F option on zpool import.  To me it's the same
functionality for my environment - the ability to try to roll back to a
'hopefully' good state and get the filesystem mounted up, leaving the
corrupted data objects corrupted.  [...]

Yes, that's exactly what it is.  There's no point calling it fsck
because fsck fixes individual filesystems, while ZFS fixups need to
happen at the volume level (at volume import time).

It's true that this should have been in ZFS from the word go.  But
it's there now, and that's what matters, IMO.

Doesn't a scrub do more than what
'fsck' does?
Oh yes, I wasn't trying to talk about scrub, in comparison with 'fsck' - 
I was talking about zpool import -F.  I believe scrub does a lot more.



It's also true that this was never necessary with hardware that
doesn't lie, but it's good to have it anyways, and is critical for
personal systems such as laptops.

IIRC, fsck was seldom needed at
my former site once UFS journalling
became available. Sweet update.

Mark



--
---
Brian Wilson, Solaris SE, UW-Madison DoIT
Room 3114 CS&S, 608-263-8047
brian.wilson(a)doit.wisc.edu
'I try to save a life a day. Usually it's my own.' - John Crichton
---



Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Mark Sandrock

On Oct 18, 2011, at 11:09 AM, Nico Williams wrote:

> On Tue, Oct 18, 2011 at 9:35 AM, Brian Wilson wrote:
>> I just wanted to add something on fsck on ZFS - because for me that used to
>> make ZFS 'not ready for prime-time' in 24x7 5+ 9s uptime environments.
>> Where ZFS doesn't have an fsck command - and that really used to bug me - it
>> does now have a -F option on zpool import.  To me it's the same
>> functionality for my environment - the ability to try to roll back to a
>> 'hopefully' good state and get the filesystem mounted up, leaving the
>> corrupted data objects corrupted.  [...]
> 
> Yes, that's exactly what it is.  There's no point calling it fsck
> because fsck fixes individual filesystems, while ZFS fixups need to
> happen at the volume level (at volume import time).
> 
> It's true that this should have been in ZFS from the word go.  But
> it's there now, and that's what matters, IMO.

Doesn't a scrub do more than what
'fsck' does?

> 
> It's also true that this was never necessary with hardware that
> doesn't lie, but it's good to have it anyways, and is critical for
> personal systems such as laptops.

IIRC, fsck was seldom needed at
my former site once UFS journalling
became available. Sweet update.

Mark


Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Nico Williams
On Tue, Oct 18, 2011 at 9:35 AM, Brian Wilson  wrote:
> I just wanted to add something on fsck on ZFS - because for me that used to
> make ZFS 'not ready for prime-time' in 24x7 5+ 9s uptime environments.
> Where ZFS doesn't have an fsck command - and that really used to bug me - it
> does now have a -F option on zpool import.  To me it's the same
> functionality for my environment - the ability to try to roll back to a
> 'hopefully' good state and get the filesystem mounted up, leaving the
> corrupted data objects corrupted.  [...]

Yes, that's exactly what it is.  There's no point calling it fsck
because fsck fixes individual filesystems, while ZFS fixups need to
happen at the volume level (at volume import time).

It's true that this should have been in ZFS from the word go.  But
it's there now, and that's what matters, IMO.

It's also true that this was never necessary with hardware that
doesn't lie, but it's good to have it anyways, and is critical for
personal systems such as laptops.

Nico
--


Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Brian Wilson

On 10/18/11 07:18 AM, Edward Ned Harvey wrote:

From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
boun...@opensolaris.org] On Behalf Of Harry Putnam

As a common slob who isn't very skilled, I like to see some commentary
from some of the pros here as to any comparison of zfs against btrfs.

* Neither one has a fsck.  They both have scrub.  (btrfs calls it "scan" and
zfs calls it "scrub.")  (Correction ... In the latest btrfs beta, I see
there exists btrfsck, but I don't know if it's a full fledged fsck.  Maybe
it's just a frontend for scan? People are still saying there is no fsck.)
I just wanted to add something on fsck on ZFS - because for me that used 
to make ZFS 'not ready for prime-time' in 24x7 5+ 9s uptime environments.
Where ZFS doesn't have an fsck command - and that really used to bug me 
- it does now have a -F option on zpool import.  To me it's the same 
functionality for my environment - the ability to try to roll back to a 
'hopefully' good state and get the filesystem mounted up, leaving the 
corrupted data objects corrupted.  So that if the 10-1000 files and 
objects that went missing aren't required for my 24x7 5+ 9s application 
to run (e.g. log files), I can get it rolling again without them 
quickly, and then get those files recovered from backup afterwards as 
needed, without having to recover the entire pool from backup.
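For the archives, that recovery path looks roughly like this (the pool name "tank" is hypothetical; -n asks whether a rewind would succeed without changing anything, while -F actually discards the last few transaction groups to reach an importable state):

```shell
# Dry run: check whether discarding recent transactions would make
# the pool importable, without modifying anything on disk.
zpool import -F -n tank

# Roll back to the last consistent transaction group and import;
# the most recently written data may be lost.
zpool import -F tank

# Afterwards, see what (if anything) is damaged, then restore those
# files from backup rather than restoring the whole pool.
zpool status -v tank
```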


cheers,
Brian


--


---
Brian Wilson, Solaris SE, UW-Madison DoIT
Room 3114 CS&S, 608-263-8047
brian.wilson(a)doit.wisc.edu
'I try to save a life a day. Usually it's my own.' - John Crichton
---



Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Bob Friesenhahn

On Tue, 18 Oct 2011, Gregory Shaw wrote:


I'm seriously thinking about converting the Linux system in question 
into a FreeBSD system so that I can use ZFS.


FreeBSD is a wonderfully stable, coherent, and well-documented system 
which has stood the test of time and has an excellent development 
team.  ZFS v28 is fairly new to FreeBSD, but there is every reason to 
believe that it will be close to "production" grade when FreeBSD 9.0 
is released.


The main shortcoming of zfs in FreeBSD is that kernel memory 
allocation is not yet coherent/shared as it is in Solaris.  If you 
install enough memory, then this becomes a non-issue.


If you are planning to build an NFS server, then it is good to know 
that Solaris does NFS better than Linux or FreeBSD.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/


Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Harry Putnam
Gregory Shaw  writes:

> I looked into btrfs some time ago for the same reasons.  I had a Linux
> system that I wanted to do more intelligent things with storage.

Great details, thanks.



Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Paul Kraus
On Tue, Oct 18, 2011 at 9:13 AM, Darren J Moffat
 wrote:
> On 10/18/11 14:04, Jim Klimov wrote:
>>
>> 2011-10-18 16:26, Darren J Moffat writes:

>>>
>>> ZFS does slightly bias new vdevs for new writes so that we will get
>>> to a more even spread. It doesn't go and move already written blocks
>>> onto the new vdevs though. So while there isn't an admin interface for
>>> rebalancing, ZFS does do something in this area.
>>>
>>> This is implemented in metaslab_alloc_dva()
>>>
>>>
>>> http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/metaslab.c
>>>
>>>
>>> See lines 1356-1378
>>>
>>
>> And the admin interface would be what exactly?..
>
> As I said there isn't one because that isn't how it works today it is all
> automatic and only for new writes.
>
> I was pointing out that ZFS does do 'something' not that it had an exactly
> matching feature.

I have done a "poor man's" rebalance by copying data after adding
devices. I know this is not a substitute for a real online rebalance,
but it gets the job done (if you can take the data offline; I do it a
small chunk at a time).
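That "poor man's" rebalance can be sketched as follows (the pool, device, and dataset names are made up; the point is that rewriting the data forces new allocations, which the allocator spreads across all vdevs, including the newly added one):

```shell
# Add the new vdev first.
zpool add tank mirror da4 da5

# Rewrite a dataset so its blocks are reallocated across all vdevs;
# send/receive preserves properties and snapshots.
zfs snapshot -r tank/data@rebalance
zfs send -R tank/data@rebalance | zfs receive tank/data.new

# After verifying the copy, swap the datasets and drop the old one.
zfs rename tank/data tank/data.old
zfs rename tank/data.new tank/data
zfs destroy -r tank/data.old
```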

-- 
{1-2-3-4-5-6-7-}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company (
http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players


Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Harry Putnam
Edward Ned Harvey 
writes:

> I recently put my first btrfs system into production.  Here are the
> similarities/differences I noticed different between btrfs and zfs:

Great input.. thanks for the details.



Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Gregory Shaw
I looked into btrfs some time ago for the same reasons.   I had a Linux system 
that I wanted to do more intelligent things with storage.

However, I reverted to Ext3/4 and MD because of the portions of btrfs that 
haven't been completed.   It seems that btrfs development is very slow, which 
doesn't give me confidence that a bug I find (or even the missing fsck tool) 
will be fixed/provided.

Another item that made me nervous was my experience with ZFS.  Even when called 
'ready for production', a number of bugs were found that were pretty nasty.   
They've since been fixed (years ago), but there were some surprises there that 
I'd rather not encounter on a Linux system.

While I like to try the latest thing, I've spent quite a bit of time 
generating/collecting my data.  I really don't want to lose it if I can avoid 
it.  :-)

I came to the conclusion that btrfs isn't ready for prime time.  I'll 
re-evaluate as development continues and the missing portions are provided.

I'm seriously thinking about converting the Linux system in question into a 
FreeBSD system so that I can use ZFS.

On Oct 17, 2011, at 9:29 AM, Harry Putnam wrote:

> This subject may have been ridden to death... I missed it if so.
> 
> Not wanting to start a flame fest or whatever but
> 
> As a common slob who isn't very skilled, I like to see some commentary
> from some of the pros here as to any comparison of zfs against btrfs.
> 
> I realize btrfs is a lot less `finished' but I see it is starting to
> show up as an option on some Linux install routines... Debian and
> Ubuntu I noticed, and probably many others.
> 
> My main reasons for using zfs are pretty basic compared to some here
> and I wondered how btrfs stacks up on the basic qualities.
> 
> 
> 

-
Gregory Shaw, Enterprise IT Architect
Phone: (303) 246-5411
Oracle Global IT Service Design Group
500 Eldorado Blvd, UBRM02-157   greg.s...@oracle.com (work)
Broomfield, CO 80021  gr...@fmsoft.com (home)
Hoping the problem magically goes away by ignoring it is the "microsoft 
approach to programming" and should never be allowed. (Linus Torvalds)  
  





Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Darren J Moffat

On 10/18/11 14:04, Jim Klimov wrote:

2011-10-18 16:26, Darren J Moffat writes:

On 10/18/11 13:18, Edward Ned Harvey wrote:

* btrfs is able to balance. (after adding new blank devices,
rebalance, so
the data& workload are distributed across all the devices.) zfs is not
able to do this yet.


ZFS does slightly bias new vdevs for new writes so that we will get
to a more even spread. It doesn't go and move already written blocks
onto the new vdevs though. So while there isn't an admin interface for
rebalancing, ZFS does do something in this area.

This is implemented in metaslab_alloc_dva()

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/metaslab.c


See lines 1356-1378



And the admin interface would be what exactly?..


As I said there isn't one because that isn't how it works today it is 
all automatic and only for new writes.


I was pointing out that ZFS does do 'something' not that it had an 
exactly matching feature.


--
Darren J Moffat


Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Jim Klimov

2011-10-18 16:26, Darren J Moffat writes:

On 10/18/11 13:18, Edward Ned Harvey wrote:
* btrfs is able to balance. (after adding new blank devices, 
rebalance, so

the data& workload are distributed across all the devices.) zfs is not
able to do this yet.


ZFS does slightly bias new vdevs for new writes so that we will get 
to a more even spread. It doesn't go and move already written blocks 
onto the new vdevs though. So while there isn't an admin interface for 
rebalancing, ZFS does do something in this area.


This is implemented in metaslab_alloc_dva()

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/metaslab.c 



See lines 1356-1378



And the admin interface would be what exactly?..

After adding a device, I'd kick off a "go rewrite old
data, including snapshots and clones, so it's written
in a balanced manner anew" pass. Kind of like send-recv
within the same pool. Why is it not done yet? ;)

//Jim



Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Darren J Moffat

On 10/18/11 13:18, Edward Ned Harvey wrote:

* btrfs is able to balance.  (after adding new blank devices, rebalance, so
the data&  workload are distributed across all the devices.)  zfs is not
able to do this yet.


ZFS does slightly bias new vdevs for new writes so that we will get to 
a more even spread.  It doesn't go and move already written blocks onto 
the new vdevs though.  So while there isn't an admin interface for 
rebalancing, ZFS does do something in this area.


This is implemented in metaslab_alloc_dva()

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/metaslab.c

See lines 1356-1378

--
Darren J Moffat


Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Edward Ned Harvey
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Harry Putnam
> 
> FreeNAS and freebsd.
> 
> Maybe you can give a little synopsis of those too.  I mean when it
> comes to utilizing zfs; is it much the same as if running it on
> solaris?

For somebody who didn't want to start a flame war, you sure picked the wrong
question.  ;-)

I personally will say:  I personally use only solaris.  I have reasons for
that, but there are a lot of other people here who use other systems.



Re: [zfs-discuss] about btrfs and zfs

2011-10-18 Thread Edward Ned Harvey
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Harry Putnam
> 
> As a common slob who isn't very skilled, I like to see some commentary
> from some of the pros here as to any comparison of zfs against btrfs.

I recently put my first btrfs system into production.  Here are the
similarities/differences I noticed different between btrfs and zfs:

Differences:
* Obviously, one is meant for linux and the other solaris (etc)
* In btrfs, there is only raid1.  They don't have raid5, 6, etc yet.
* In btrfs, snapshots are read-write.  Cannot be made read-only without
quotas, which aren't implemented yet.
* zfs supports quotas.  Also, by default creates snapshots read-only but
could be made read-write by cloning.
* In btrfs, there is no equivalent or alternative to "zfs send | zfs
receive"
* In zfs, you have the hidden ".zfs" subdir that contains your snapshots.  
* In btrfs, your snapshots need to be mounted somewhere, inside the same
filesystem.  So in btrfs, you do something like this...  Create a
filesystem, then create a subvol called "@" and use it to store all your
work.  Later when you create snapshots, you essentially duplicate that
subvol "@2011-10-18-07-40-00" or something.
* btrfs is able to shrink.  zfs is not able to shrink.
* btrfs is able to defrag.  zfs doesn't have defrag yet.
* btrfs is able to balance.  (after adding new blank devices, rebalance, so
the data & workload are distributed across all the devices.)  zfs is not
able to do this yet.
* zfs has storage tiering.  (cache & log devices, such as SSD's to
accelerate performance.)  btrfs doesn't have this yet.
* btrfs has no dedup yet. They are planning to do offline dedup.  ZFS has
online dedup.  I wouldn't recommend zfs dedup yet until performance issues
are resolved, which seems like never.  But when and if zfs dedup performance
issues are resolved, online dedup should greatly outperform offline dedup,
both in terms of speed and disk usage.
* zfs has the concept of a zvol, you can export iscsi or format with any
filesystem you like. If you want to do the same in btrfs, you have to create
a file and use it loopback.  This accomplishes the same thing, but the
creation time is much longer (zero time versus linear time, could literally
be called "infinitely" longer) ... so this is an advantage for zfs.
* zfs has filesystem property inheritance and recursion of commands like
"snapshot" and "send."  Btrfs doesn't.
* zfs has permissions - allow users or groups to create/destroy snapshots
and stuff like that.  In btrfs you'll have to kludge something through sudo
or whatever.

Similarities:
* Both are able to grow.  (Add devices & storage)
* Neither one has a fsck.  They both have scrub.  (btrfs calls it "scan" and
zfs calls it "scrub.")  (Correction ... In the latest btrfs beta, I see
there exists btrfsck, but I don't know if it's a full fledged fsck.  Maybe
it's just a frontend for scan? People are still saying there is no fsck.)
* Both do compression.  By default zfs compression is fast but you could use
zlib if you want.  By default btrfs uses zlib, but you could opt for fast if
you want.
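To make the snapshot differences above concrete, here is roughly how the two sides look (the pool, subvolume, and host names are invented; the btrfs syntax is that of the btrfs-progs of this era):

```shell
# ZFS: snapshots are read-only, show up under the hidden .zfs
# directory, and can be replicated with send/receive.
zfs snapshot tank/home@2011-10-18
ls /tank/home/.zfs/snapshot/2011-10-18
zfs send tank/home@2011-10-18 | ssh backuphost zfs receive pool/home

# btrfs: a snapshot is a writable subvolume inside the same
# filesystem, and there is no send/receive equivalent (yet).
btrfs subvolume snapshot /mnt/@ /mnt/@2011-10-18
```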



Re: [zfs-discuss] about btrfs and zfs

2011-10-17 Thread Freddie Cash
On Mon, Oct 17, 2011 at 10:50 AM, Harry Putnam  wrote:

> Freddie Cash  writes:
>
> > If you only want RAID0 or RAID1, then btrfs is okay.  There's no support
> for
> > RAID5+ as yet, and it's been "in development" for a couple of years now.
>
> [...] snipped excellent information
>
> Thanks much, I'm very appreciative of the good information.  Much
> better to hear from actual users than poring through webpages to get a
> picture.
>
> I'm googling on the citations you posted:
>
> FreeNAS and freebsd.
>
> Maybe you can give a little synopsis of those too.  I mean when it
> comes to utilizing zfs; is it much the same as if running it on
> solaris?
>
> FreeBSD 8-STABLE (what will become 8.3) and 9.0-RELEASE (will be released
hopefully this month) both include ZFSv28, the latest open-source version of
ZFS.  This includes raidz3 and dedupe support, same as OpenSolaris, Illumos,
and other OSol-based distros.  Not sure what the latest version of ZFS is in
Solaris 10.

The ZFS bits work the same as on Solaris with only 2 small differences:
  - sharenfs property just writes data to /etc/zfs/exports, which is read by
the standard NFS daemons (it's easier to just use /etc/exports to share ZFS
filesystems)
  - sharesmb property doesn't do anything; you have to use Samba to share
ZFS filesystems
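In other words, on FreeBSD the sharing itself is done by the stock NFS daemons; a minimal sketch (the dataset, path, and network are examples):

```shell
# Option 1: let ZFS write the export options to /etc/zfs/exports.
zfs set sharenfs="-network 192.168.1.0 -mask 255.255.255.0" tank/home

# Option 2 (often simpler): skip the property and list the
# mountpoint in /etc/exports, then tell mountd to re-read it.
echo '/tank/home -network 192.168.1.0 -mask 255.255.255.0' >> /etc/exports
service mountd reload
```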

The only real differences are how the OSes themselves work.  If you are
fluent in Solaris, then FreeBSD will seem strange (and vice-versa).  If you
are fluent in Linux, then FreeBSD will be similar (but a lot more cohesive
and "put-together").


> I knew freebsd had a port, but assumed it would stack up kind of sorry
> compared to Solaris zfs.
>
> Maybe something on the order of the linux fuse/zfs adaptation in usability.
>
> Is that assumption wrong?
>
> Absolutely, completely, and utterly false.  :)  The FreeBSD port of ZFS is
pretty much on par with ZFS on OpenSolaris.  The Linux port of ZFS is just
barely usable.  No comparison at all.  :)


> I actually have some experience with Freebsd, (long before there was a
> zfs port), and it is very linux like in many ways.
>
> That's like saying that OpenIndiana is very Linux-like in many ways.  :)


-- 
Freddie Cash
fjwc...@gmail.com


Re: [zfs-discuss] about btrfs and zfs

2011-10-17 Thread Harry Putnam
Freddie Cash  writes:

> If you only want RAID0 or RAID1, then btrfs is okay.  There's no support for
> RAID5+ as yet, and it's been "in development" for a couple of years now.

[...] snipped excellent information 

Thanks much, I'm very appreciative of the good information.  Much
better to hear from actual users than poring through webpages to get a
picture. 

I'm googling on the citations you posted:

FreeNAS and freebsd.

Maybe you can give a little synopsis of those too.  I mean when it
comes to utilizing zfs; is it much the same as if running it on
solaris?

I knew freebsd had a port, but assumed it would stack up kind of sorry
compared to Solaris zfs. 

Maybe something on the order of the linux fuse/zfs adaptation in usability.

Is that assumption wrong?

I actually have some experience with Freebsd, (long before there was a
zfs port), and it is very linux like in many ways.



Re: [zfs-discuss] about btrfs and zfs

2011-10-17 Thread Michael DeMan
Or, if you absolutely must run linux for the operating system, see: 
http://zfsonlinux.org/

On Oct 17, 2011, at 8:55 AM, Freddie Cash wrote:

> If you absolutely must run Linux on your storage server, for whatever reason, 
> then you probably won't be running ZFS.  For the next year or two, it would 
> probably be safer to run software RAID (md), with LVM on top, with XFS or 
> Ext4 on top.  It's not the easiest setup to manage, but it would be safer 
> than btrfs.



Re: [zfs-discuss] about btrfs and zfs

2011-10-17 Thread Paul Kraus
On Mon, Oct 17, 2011 at 11:29 AM, Harry Putnam  wrote:

> My main reasons for using zfs are pretty basic compared to some here

What are they ? (the reasons for using ZFS)

> and I wondered how btrfs stacks up on the basic qualities.

I use ZFS @ work because it is the only FS we have been able to find
that scales to what we need (hundreds of millions of small files in
ONE filesystem).

I use ZFS @ home because I really can't afford to have my data
corrupted and I can't afford Enterprise grade hardware.

-- 
{1-2-3-4-5-6-7-}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company (
http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players


Re: [zfs-discuss] about btrfs and zfs

2011-10-17 Thread Freddie Cash
On Mon, Oct 17, 2011 at 8:29 AM, Harry Putnam  wrote:

> This subject may have been ridden to death... I missed it if so.
>
> Not wanting to start a flame fest or whatever but
>
> As a common slob who isn't very skilled, I like to see some commentary
> from some of the pros here as to any comparison of zfs against btrfs.
>
> I realize btrfs is a lot less `finished' but I see it is starting to
> show up as an option on some Linux install routines... Debian and
> Ubuntu I noticed, and probably many others.
>
> My main reasons for using zfs are pretty basic compared to some here
> and I wondered how btrfs stacks up on the basic qualities.
>

If you only want RAID0 or RAID1, then btrfs is okay.  There's no support for
RAID5+ as yet, and it's been "in development" for a couple of years now.

There's no working fsck tool for btrfs.  It's been "in development" and
"released in two weeks" for over a year now.  Don't put any data you need
onto btrfs.  It's extremely brittle in the face of power loss.

My biggest gripe with btrfs is that they have come up with all new
terminology that only applies to them.  Filesystem now means "a collection
of block devices grouped together".  While "sub-volume" is what we'd
normally call a "filesystem".  And there's a few other weird terms thrown in
as well.

From all that I've read on the btrfs mailing list, and news sites around the
web, btrfs is not ready for production use on any system with data that you
can't afford to lose.

If you absolutely must run Linux on your storage server, for whatever
reason, then you probably won't be running ZFS.  For the next year or two,
it would probably be safer to run software RAID (md), with LVM on top, with
XFS or Ext4 on top.  It's not the easiest setup to manage, but it would be
safer than btrfs.

If you don't need to run Linux on your storage server, then definitely give
ZFS a try.  There are many options, depending on your level of expertise:
 FreeNAS for plug-n-play simplicity with a web GUI, FreeBSD for a simpler OS
that runs well on x86/amd64 systems, any of the OpenSolaris-based distros,
or even Solaris if you have the money.

With ZFS you get:
  - working single, dual, triple parity raidz (RAID5, RAID6, "RAID7"
equivalence)
  - n-way mirroring
  - end-to-end checksums for all data/metadata blocks
  - unlimited snapshots
  - pooled storage
  - unlimited filesystems
  - send/recv capabilities
  - built-in compression
  - built-in dedupe
  - built-in encryption (in ZFSv31, which is currently only in Solaris 11)
  - built-in CIFS/NFS sharing (on Solaris-based systems; FreeBSD uses normal
nfsd and Samba for this)
  - automatic hot-spares (on Solaris-based systems; FreeBSD only supports
manual spares)
  - and more

Maybe in another 5 years or so, Btrfs will be up to the point where ZFS is
today.  Just imagine where ZFS will be in 5 years or so.  :)

-- 
Freddie Cash
fjwc...@gmail.com


[zfs-discuss] about btrfs and zfs

2011-10-17 Thread Harry Putnam
This subject may have been ridden to death... I missed it if so.

Not wanting to start a flame fest or whatever but

As a common slob who isn't very skilled, I like to see some commentary
from some of the pros here as to any comparison of zfs against btrfs.

I realize btrfs is a lot less `finished' but I see it is starting to
show up as an option on some Linux install routines... Debian and
Ubuntu I noticed, and probably many others.

My main reasons for using zfs are pretty basic compared to some here
and I wondered how btrfs stacks up on the basic qualities.


