[zfs-discuss] Terminology question on ZFS COW

2012-06-05 Thread Jim Klimov

Hello all,

  I recently heard an argument from a colleague that ZFS mis-uses
the term COW (Copy-On-Write). According to him, the original term
was introduced by some vendors and was to be taken literally: that
is, whenever a new write comes to update an existing logical block
in the storage, the block's old contents are first copied away to
another physical location (i.e. to be used for snapshotting or for
recovery of untimely poweroff/panic), then the original on-disk
location is rewritten with the new data.
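
To make the literal reading concrete, here is a minimal Python sketch of the copy-before-overwrite scheme described above. It is a toy model only: the class and method names (CopyBeforeOverwriteVolume, snapshot, write, read_snapshot) are invented for illustration and do not correspond to any vendor's implementation, and blocks are plain Python values rather than disk sectors.

```python
class CopyBeforeOverwriteVolume:
    """Toy model: live blocks never move; old contents are copied aside."""

    def __init__(self, blocks):
        self.live = list(blocks)   # live data at fixed "physical" positions
        self.snapshots = []        # one dict per snapshot: block index -> old contents

    def snapshot(self):
        # A new snapshot starts empty; it fills up as live blocks get overwritten.
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, index, data):
        old = self.live[index]
        # Step 1: copy the old contents away for every snapshot that
        # has not yet preserved this block.
        for saved in self.snapshots:
            if index not in saved:
                saved[index] = old
        # Step 2: overwrite the original location in place.
        self.live[index] = data

    def read_snapshot(self, snap_id, index):
        # Blocks never overwritten since the snapshot are read from the live copy.
        return self.snapshots[snap_id].get(index, self.live[index])


vol = CopyBeforeOverwriteVolume(["a", "b", "c"])
snap = vol.snapshot()
vol.write(1, "B")
print(vol.live, vol.read_snapshot(snap, 1))
```

Note that the live data keeps its original layout, and that the first overwrite of a block after a snapshot costs an extra copy. This is the "pay now" side of the trade-off.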

  Arguably, while this incurs a hit when rewriting existing data,
this combats fragmentation and speeds up reads (i.e. all pieces of
the file's live version are stored as contiguously as possible).
This may be important for large objects randomly updated inside,
like VM disk images and iSCSI backing stores, precreated database
table files, maybe swapfiles, etc.

  I understand why ZFS does what it does, and how, but such subtle
differences in terminology could cause misunderstanding between
people of the same trade. At least, I'd keep this possibility in
mind when talking to non-Solaris storage admins ;)

  I wonder whether this use of the term (making a copy of the old
data upon a new write) is indeed the more valid one, and whether any
vendors actually implemented the procedure outlined above.

Thanks,
//Jim Klimov
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Terminology question on ZFS COW

2012-06-05 Thread Edward Ned Harvey
> From: zfs-discuss-boun...@opensolaris.org
> [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Jim Klimov
>
> I recently heard an argument from a colleague that ZFS mis-uses
> the term COW (Copy-On-Write). According to him, the original term
> was introduced by some vendors and was to be taken literally: that
> is, whenever a new write comes to update an existing logical block
> in the storage, the block's old contents are first copied away to
> another physical location (i.e. to be used for snapshotting or for
> recovery of untimely poweroff/panic), then the original on-disk
> location is rewritten with the new data.

What you described (actually copying the disk sectors away when a request
arrives to overwrite them) is what Microsoft does.  It may seem more
intuitive to call this COW from a file perspective, but COW is a computer
science term that was used for memory before it was ever used for disk.  The
ZFS behavior follows the traditional meaning of COW in memory management.

http://en.wikipedia.org/wiki/Copy-on-write


> Arguably, while this incurs a hit when rewriting existing data,
> this combats fragmentation and speeds up reads (i.e. all pieces of
> the file's live version are stored as contiguously as possible).
> This may be important for large objects randomly updated inside,
> like VM disk images and iSCSI backing stores, precreated database
> table files, maybe swapfiles, etc.

Correct.  Pay now or pay later.  In some cases, pay now is better for the
long run, and in some cases, pay later is better for the long run.



Re: [zfs-discuss] Terminology question on ZFS COW

2012-06-05 Thread Paul Kraus
On Tue, Jun 5, 2012 at 6:32 AM, Jim Klimov <jimkli...@cos.ru> wrote:

> I recently heard an argument from a colleague that ZFS mis-uses
> the term COW (Copy-On-Write). According to him, the original term
> was introduced by some vendors and was to be taken literally: that
> is, whenever a new write comes to update an existing logical block
> in the storage, the block's old contents are first copied away to
> another physical location (i.e. to be used for snapshotting or for
> recovery of untimely poweroff/panic), then the original on-disk
> location is rewritten with the new data.

This is what I have seen traditional filesystems (UFS, VxFS) do
when dealing with snapshots. Once a snapshot is taken, for any data
that is being re-written, a copy of the original must be made before
committing the write.

> Arguably, while this incurs a hit when rewriting existing data,

The hit to write performance can be substantial and the space to
store each snapshot's data can also be large. This is one of the big
differences between ZFS and others. The cost (both write performance
and space) for snapshots in ZFS is minimal while for traditional
filesystems it can be huge (depending on the number of snapshots).

> this combats fragmentation and speeds up reads (i.e. all pieces of
> the file's live version are stored as contiguously as possible).

As long as the file has not grown beyond the original allocation
segment. Once you grow out of that you are (usually) fragmented.

> This may be important for large objects randomly updated inside,
> like VM disk images and iSCSI backing stores, precreated database
> table files, maybe swapfiles, etc.

-- 
{1-2-3-4-5-6-7-}
Paul Kraus
- Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
- Assistant Technical Director, LoneStarCon 3 (http://lonestarcon3.org/)
- Sound Coordinator, Schenectady Light Opera Company (
http://www.sloctheater.org/ )
- Technical Advisor, Troy Civic Theatre Company
- Technical Advisor, RPI Players


[zfs-discuss] snapshot size

2012-06-05 Thread Albert Shih
Hi all,

Two questions from a newbie.

1/ What does REFER mean in zfs list?

2/ How can I find out the total size of all snapshots for a partition?
(OK, I can add up the output of zfs list -t snapshot)

Regards.

JAS


-- 
Albert SHIH
DIO bâtiment 15
Observatoire de Paris
5 Place Jules Janssen
92195 Meudon Cedex
Téléphone : 01 45 07 76 26/06 86 69 95 71
xmpp: j...@jabber.obspm.fr
Heure local/Local time:
mar 5 jui 2012 16:57:38 CEST


Re: [zfs-discuss] snapshot size

2012-06-05 Thread Stefan Ring
> Two questions from a newbie.
>
> 1/ What does REFER mean in zfs list?

The amount of data that is reachable from the file system root. It's
just what I would call the contents of the file system.

> 2/ How can I find out the total size of all snapshots for a partition?
> (OK, I can add up the output of zfs list -t snapshot)

zfs get usedbysnapshots zfs-name

Or if you have a recent enough system, have a look at the written
property: http://blog.delphix.com/matt/files/2011/11/oss.pdf (pg 8).


Re: [zfs-discuss] snapshot size

2012-06-05 Thread Albert Shih
On 05/06/2012 at 17:08:51+0200, Stefan Ring wrote:
>> Two questions from a newbie.
>>
>> 1/ What does REFER mean in zfs list?
>
> The amount of data that is reachable from the file system root. It's
> just what I would call the contents of the file system.

OK thanks. 

 
>> 2/ How can I find out the total size of all snapshots for a partition?
>> (OK, I can add up the output of zfs list -t snapshot)
>
> zfs get usedbysnapshots zfs-name

Thanks.

Can I say that

USED - REFER = snapshot size?


Regards.

JAS
-- 
Albert SHIH
DIO bâtiment 15
Observatoire de Paris
5 Place Jules Janssen
92195 Meudon Cedex
Téléphone : 01 45 07 76 26/06 86 69 95 71
xmpp: j...@jabber.obspm.fr
Heure local/Local time:
mar 5 jui 2012 17:16:07 CEST


Re: [zfs-discuss] snapshot size

2012-06-05 Thread Stefan Ring
> Can I say
>
>         USED - REFER = snapshot size?

No. For a snapshot, USED is the space that would be freed if you
destroyed that snapshot _right now_. This can change (and usually
does) if you destroy previous snapshots.


Re: [zfs-discuss] Terminology question on ZFS COW

2012-06-05 Thread Nico Williams
COW goes back at least to the early days of virtual memory and fork().
On fork() the kernel would arrange for writable pages in the parent
process to be made read-only so that writes to them could be caught,
and then the page fault handler would copy the page (and restore write
access) so the parent and child each have their own private copies.
COW as used in ZFS is not the same, but the concept was introduced
very early also, IIRC in the mid-80s, and certainly no later than
4.4BSD's log-structured filesystem (which ZFS resembles in many ways).
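
The fork() mechanism described above can be sketched in Python as a simulation. This is not kernel code: the Page and Process classes, the reference counter and the _fault method are all hypothetical stand-ins for page frames, page tables and the page fault handler.

```python
class Page:
    """A physical page frame with a count of the mappings that point at it."""

    def __init__(self, data):
        self.data = bytearray(data)
        self.refs = 1


class Process:
    def __init__(self, pages):
        # Page table: one [page, writable] entry per virtual page.
        self.table = [[page, True] for page in pages]

    def fork(self):
        child = Process([])
        for entry in self.table:
            page = entry[0]
            page.refs += 1
            entry[1] = False                    # parent mapping becomes read-only
            child.table.append([page, False])   # child shares the same frame
        return child

    def write(self, page_no, offset, value):
        entry = self.table[page_no]
        if not entry[1]:
            self._fault(entry)                  # the write is "caught"
        entry[0].data[offset] = value

    def _fault(self, entry):
        page = entry[0]
        if page.refs > 1:
            # Still shared: give this process its own private copy.
            page.refs -= 1
            entry[0] = Page(page.data)
        entry[1] = True                         # restore write access

    def read(self, page_no):
        return bytes(self.table[page_no][0].data)


parent = Process([Page(b"aaaa")])
child = parent.fork()
child.write(0, 0, ord("b"))
print(parent.read(0), child.read(0))
```

Until the first write, parent and child share one frame, and the copy happens only for pages that are actually modified.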

So, is COW a misnomer?  Yes and no, and anyway, it's irrelevant.  The
important thing is that when you say COW people understand that you're
not saving a copy of the old thing but rather writing the new thing to
a new location.  (The old version of whatever was copied-on-write is
stranded, unless, of course, you have references left to it from
things like snapshots.)
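
As a rough illustration of "writing the new thing to a new location", here is a toy Python model. It is a sketch under simplifying assumptions and not how ZFS lays out data on disk: the RedirectOnWriteVolume class, its flat block_map and the stranded helper are invented for this example, and a snapshot is modelled as a saved copy of the map.

```python
class RedirectOnWriteVolume:
    """Toy model: new data always lands in a fresh physical block."""

    def __init__(self, blocks):
        self.disk = list(blocks)                    # physical blocks, append-only here
        self.block_map = list(range(len(blocks)))   # logical index -> physical index
        self.snapshots = []                         # saved copies of block_map

    def snapshot(self):
        # Taking a snapshot only records the references; no data is copied.
        self.snapshots.append(list(self.block_map))
        return len(self.snapshots) - 1

    def write(self, index, data):
        self.disk.append(data)                      # old block is left untouched
        self.block_map[index] = len(self.disk) - 1

    def read(self, index):
        return self.disk[self.block_map[index]]

    def read_snapshot(self, snap_id, index):
        return self.disk[self.snapshots[snap_id][index]]

    def stranded(self):
        # Physical blocks that nothing references any more.
        referenced = set(self.block_map)
        for saved_map in self.snapshots:
            referenced.update(saved_map)
        return [p for p in range(len(self.disk)) if p not in referenced]


vol = RedirectOnWriteVolume(["a", "b", "c"])
vol.write(1, "B")
print(vol.block_map, vol.stranded())
```

After the write the logical blocks are no longer in physical order, which is the fragmentation cost raised earlier in the thread. If snapshot() is called before the write, the old block stays referenced and stranded() returns an empty list.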

Nico