Re: [zfs-discuss] Question about ZFS Incremental Send/Receive

2009-04-29 Thread Darren J Moffat

Mattias Pantzare wrote:

>> I feel like I understand what tar is doing, but I'm curious about what
>> is it that ZFS is looking at that makes it a successful incremental send?
>> That is, not send the entire file again. Does it have to do with how the
>> application (tar in this example) does a file open, fopen(), and what mode
>> is used? i.e. open for read, open for write, open for append. Or is it
>> looking at a file system header, or checksum? I'm just trying to explain
>> some observed behavior we're seeing during our testing.
>>
>> My proof of concept is to remote replicate these container files, which
>> are created by a 3rd party application.


> ZFS knows which blocks were written since the first snapshot was taken.
>
> Filenames or type of open is not important.
>
> If you open a file and rewrite all blocks in that file with the same
> content, all those blocks will be sent. If you rewrite 5 blocks, only 5
> blocks are sent (plus the metadata that was updated).


Provided the application does exactly what you said, and not what a
lot of apps (particularly editors) do, which is write to a tmp file,
unlink the original, and rename. In that case the file consists
entirely of newly written blocks, so all of it is sent.
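
To see the difference between the two save styles, here is a minimal Python sketch. It is not ZFS-specific, and the file names and helper functions are made up for illustration. It compares the inode number before and after an in-place rewrite with the inode number before and after a write-to-temp-then-rename save. A changed inode means the path now refers to a brand-new file, so every one of its blocks was written after the snapshot.

```python
import os
import tempfile

def save_in_place(path, data):
    """Overwrite the start of an existing file without replacing the file."""
    with open(path, "r+b") as f:
        f.write(data)

def save_via_rename(path, data):
    """Editor-style save: write a temp file, then rename it over the original."""
    tmp = path + ".tmp"  # hypothetical temp-file naming
    with open(tmp, "wb") as f:
        f.write(data)
    os.replace(tmp, path)  # the original file is dropped, the new one takes its name

def inode_before_after(save, data=b"new contents"):
    """Return (inode before, inode after) for the given save strategy."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "container.dat")
        with open(path, "wb") as f:
            f.write(b"old contents")
        before = os.stat(path).st_ino
        save(path, data)
        return before, os.stat(path).st_ino

if __name__ == "__main__":
    print("in place:  ", inode_before_after(save_in_place))
    print("via rename:", inode_before_after(save_via_rename))
```

On a copy-on-write file system the in-place case still writes new blocks, but only for the ranges that were actually rewritten.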



> The way it works is that all blocks have a time stamp. Blocks with a time
> stamp newer than the first snapshot will be sent.


Not really a time stamp but a transaction group number.
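
As a rough illustration of the idea (a toy model, not ZFS's actual on-disk format or code), each block can be thought of as recording the transaction group (txg) in which it was last written, and an incremental send as picking out the blocks born after the snapshot's txg. The class, method, and snapshot names below are invented for this sketch, and it ignores metadata and partially filled blocks.

```python
class ToyDataset:
    """Toy model: every block remembers the txg in which it was last written."""

    def __init__(self):
        self.txg = 1          # current transaction group number
        self.blocks = {}      # (file name, block index) -> (birth txg, data)
        self.snapshots = {}   # snapshot name -> txg at which it was taken

    def write(self, name, index, data):
        # Any write gives the block a new birth txg, even if the data is unchanged.
        self.blocks[(name, index)] = (self.txg, data)

    def snapshot(self, snap):
        self.snapshots[snap] = self.txg
        self.txg += 1         # later writes land in a newer txg

    def incremental(self, from_snap):
        """Blocks born after from_snap, i.e. what an incremental would carry."""
        cutoff = self.snapshots[from_snap]
        return sorted(k for k, (born, _) in self.blocks.items() if born > cutoff)


def demo_append():
    ds = ToyDataset()
    for i in range(10):
        ds.write("container01.tar", i, b"image%d" % i)
    ds.snapshot("snap1")
    for i in range(10, 13):   # like tar -u: only new blocks are written
        ds.write("container01.tar", i, b"image%d" % i)
    return ds.incremental("snap1")


def demo_rewrite():
    ds = ToyDataset()
    for i in range(10):
        ds.write("container01.tar", i, b"image%d" % i)
    ds.snapshot("snap1")
    for i in range(13):       # like tar -c: every block is written again
        ds.write("container01.tar", i, b"image%d" % i)
    return ds.incremental("snap1")


if __name__ == "__main__":
    print(len(demo_append()), "blocks in the incremental after an append")
    print(len(demo_rewrite()), "blocks in the incremental after a full rewrite")
```

In this model the append case yields 3 blocks and the full rewrite yields 13, even though the first 10 blocks hold the same data in both runs. The content is never compared, only the birth txg.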

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Question about ZFS Incremental Send/Receive

2009-04-28 Thread Patrick Pinchera
I'm using ZFS snapshots and send and receive for a proof of concept, and 
I'd like to better understand how the incremental feature works.


Consider this example:

  1. Create a tar file of 10 image files using tar -cvf.
  2. ZFS snapshot the filesystem that contains this tar file.
  3. Use ZFS send and receive over ssh to replicate this file system on
     another (remote) system.
  4. Add 3 more image files to the tar file using tar -uvf.
  5. ZFS snapshot the same file system.
  6. Repeat step 3 above, but this time do an incremental zfs send.
  7. Observing the network traffic (iftop), I see that only the
     incremental data is transferred between the systems. This is my
     goal: to NOT have to resend the entire tar, or container file,
     over the network for each incremental.
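
For reference, steps 2 to 6 could be scripted along these lines. This is only a sketch: the dataset name tank/images, the host backuphost, and the snapshot names are placeholders that do not come from the test setup, the helper functions are hypothetical, and they only build the command strings rather than run them. It also assumes the receiving side uses the same dataset name.

```python
import shlex

def snapshot_cmd(dataset, snap):
    """Command for steps 2 and 5: snapshot the file system."""
    return "zfs snapshot " + shlex.quote(f"{dataset}@{snap}")

def send_cmd(dataset, snap, host, prev_snap=None):
    """Command for steps 3 and 6: pipe zfs send through ssh into zfs receive.

    With prev_snap set, 'zfs send -i' produces an incremental stream that
    carries only what changed between prev_snap and snap.
    """
    send = ["zfs", "send"]
    if prev_snap is not None:
        send += ["-i", f"{dataset}@{prev_snap}"]
    send.append(f"{dataset}@{snap}")
    recv = ["ssh", host, "zfs", "receive", dataset]
    return " ".join(map(shlex.quote, send)) + " | " + " ".join(map(shlex.quote, recv))

if __name__ == "__main__":
    # Placeholder names, for illustration only.
    ds, host = "tank/images", "backuphost"
    print(snapshot_cmd(ds, "snap1"))                       # step 2
    print(send_cmd(ds, "snap1", host))                     # step 3 (full stream)
    print(snapshot_cmd(ds, "snap2"))                       # step 5
    print(send_cmd(ds, "snap2", host, prev_snap="snap1"))  # step 6 (incremental)
```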

If I repeat the above experiment, but instead do a tar cvf at step 4,
and just add more image files each time, i.e.

   step 1:  tar cvf container01.tar file01 file02 file03 file04 file05
   step 4:  tar cvf container01.tar file01 file02 file03 file04 file05 \
            file06 file07 file08

I see an amount of data equivalent to the entire container01.tar get
transferred over the network. This is not the behavior I want.


In the second experiment above, what is it about ZFS that's catching
the fact that it is a new file?

I used tar in my experiments just because I'm familiar with it and it's
on my Solaris 10 VMs. Does the tar uvf do an open() with the append
flag, so ZFS somehow knows about that? What got changed when I did the
tar cvf the second time, writing to the same file name, but
with more files?


I feel like I understand what tar is doing, but I'm curious about what 
is it that ZFS is looking at that makes it a successful incremental 
send? That is, not send the entire file again. Does it have to do with 
how the application (tar in this example) does a file open, fopen(), and 
what mode is used? i.e. open for read, open for write, open for append. 
Or is it looking at a file system header, or checksum? I'm just trying 
to explain some observed behavior we're seeing during our testing.


My proof of concept is to remote replicate these container files, 
which are created by a 3rd party application.


Thanks in advance,
Pat



Re: [zfs-discuss] Question about ZFS Incremental Send/Receive

2009-04-28 Thread Mattias Pantzare
> I feel like I understand what tar is doing, but I'm curious about what
> is it that ZFS is looking at that makes it a successful incremental send?
> That is, not send the entire file again. Does it have to do with how the
> application (tar in this example) does a file open, fopen(), and what mode
> is used? i.e. open for read, open for write, open for append. Or is it
> looking at a file system header, or checksum? I'm just trying to explain
> some observed behavior we're seeing during our testing.
>
> My proof of concept is to remote replicate these container files, which
> are created by a 3rd party application.

ZFS knows which blocks were written since the first snapshot was taken.

Filenames or type of open is not important.

If you open a file and rewrite all blocks in that file with the same
content, all those blocks will be sent. If you rewrite 5 blocks, only 5
blocks are sent (plus the metadata that was updated).

The way it works is that all blocks have a time stamp. Blocks with a time
stamp newer than the first snapshot will be sent.