Re: [zfs-discuss] zfs diff performance disappointing

2011-09-26 Thread Nico Williams
On Mon, Sep 26, 2011 at 1:55 PM, Jesus Cea j...@jcea.es wrote:
> I just upgraded to Solaris 10 Update 10, and one of the improvements
> is zfs diff.
>
> Using the birthtime of the sectors, I would expect very high
> performance. The actual performance doesn't seem better than a
> standard rdiff, though. Quite disappointing...
>
> Should I disable atime to improve zfs diff performance? (most data
> doesn't change, but atime of most files would change).

atime has nothing to do with it.

How much work zfs diff has to do depends on how much has changed
between snapshots.

Nico
--
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs diff performance disappointing

2011-09-26 Thread David Magda
On Mon, September 26, 2011 14:55, Jesus Cea wrote:
[...]
> real    10m0.272s
> user    0m0.809s
> sys     2m6.693s
>
> 10 minutes to diff 7.55 GB is... disappointing.
>
> This machine uses a 2-mirror configuration, and there is no other
> activity going on in the machine. ZPOOL version 29, ZFS version 5.
>
> Am I missing anything?
[...]

Talking about 7.55 GB is mostly useless as well. If it's a dozen video
files then stat()ing them all will be done very quickly by just running
find(1). If however the 7.55 GB is made up of 7,550,000 files then going
through them would take quite a long time.

How long would it take for (say) rsync to walk two file systems (or
snapshot directories) to come up with the same list?  Ten minutes may seem
like a lot in 'absolute' terms, but if something like rsync takes an hour
or two to stat() every file, then it's a big improvement.

So the question is: by what metric are you comparing that you came up with
the disappointing conclusion? Why is ten minutes disappointing? What
would /not/ be disappointing to you? 8m? 5m? 3.14 seconds?





Re: [zfs-discuss] zfs diff performance disappointing

2011-09-26 Thread Jesus Cea
On 26/09/11 22:29, David Magda wrote:
> Talking about 7.55 GB is mostly useless as well. If it's a dozen
> video files then stat()ing them all will be done very quickly by
> just running find(1). If however the 7.55 GB is made up of
> 7,550,000 files then going through them would take quite a long
> time.

Point taken, although zfs diff time is (or should be) proportional to
changes, not to number of files.

> How long would it take for (say) rsync to walk two file systems
> (or snapshot directories) to come up with the same list?  Ten
> minutes may seem like a lot in 'absolute' terms, but if something
> like rsync takes an hour or two to stat() every file, then it's a
> big improvement.

rsync takes a bit less than 7 minutes. So zfs diff is actually
slower!

> So the question is: by what metric are you comparing that you came
> up with the disappointing conclusion? Why is ten minutes
> disappointing? What would /not/ be disappointing to you? 8m? 5m?
> 3.14 seconds?

If I change 10 files in a dataset with a trillion files, I would expect
it to take less than a couple of seconds. Given that the tree walk is
pruned by block birth time, I think this is reasonable (you skip over
entire on-disk branches if there are no changes under them).
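
The pruning idea described above can be sketched with a toy model. This is only an illustration under stated assumptions, not how zfs diff is actually implemented: the `Block` class, the fanout of 16 and the transaction-group numbers are all hypothetical. Every block records the transaction group (txg) in which it was last written, a copy-on-write update rewrites the leaf and all its ancestors, and the diff skips any subtree whose root was born at or before the snapshot.

```python
class Block:
    """Toy block: remembers the txg in which it was last written."""

    def __init__(self, birth_txg, children=(), name=None):
        self.birth_txg = birth_txg
        self.children = list(children)
        self.name = name          # only leaves (files) carry a name here
        self.parent = None
        for child in self.children:
            child.parent = self


def build_tree(n_leaves, fanout, txg):
    """Build a balanced tree whose blocks were all born in `txg`."""
    leaves = [Block(txg, name=f"file{i}") for i in range(n_leaves)]
    level = leaves
    while len(level) > 1:
        level = [Block(txg, level[i:i + fanout])
                 for i in range(0, len(level), fanout)]
    return level[0], leaves


def write(leaf, txg):
    """Copy-on-write: rewriting a leaf rewrites every ancestor too."""
    node = leaf
    while node is not None:
        node.birth_txg = txg
        node = node.parent


def diff(root, snap_txg):
    """Return (sorted changed leaf names, number of blocks visited)."""
    changed, visited = [], 0
    stack = [root]
    while stack:
        node = stack.pop()
        visited += 1
        if node.birth_txg <= snap_txg:
            continue              # prune: nothing below is newer than the snapshot
        if node.children:
            stack.extend(node.children)
        else:
            changed.append(node.name)
    return sorted(changed), visited


if __name__ == "__main__":
    root, leaves = build_tree(4096, 16, txg=100)   # snapshot taken at txg 100
    for i in range(10):
        write(leaves[i * 400], txg=101)            # change 10 files afterwards
    changed, visited = diff(root, snap_txg=100)
    print(len(changed), "changed files found after visiting", visited, "blocks")
```

In this model the work is proportional to the number of changed paths, not to the number of files, which is the behaviour expected of zfs diff here. If every leaf is rewritten after the snapshot, nothing can be pruned and the walk visits the whole tree.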

-- 
Jesus Cea Avion
j...@jcea.es - http://www.jcea.es/
jabber / xmpp:j...@jabber.org
"Things are not so easy"
"My name is Dump, Core Dump"
"Love is placing your happiness in the happiness of another" - Leibniz


Re: [zfs-discuss] zfs diff performance disappointing

2011-09-26 Thread Jesus Cea
On 26/09/11 21:31, Nico Williams wrote:
> atime has nothing to do with it.
>
> How much work zfs diff has to do depends on how much has changed
> between snapshots.

That is what I thought, but look at my example: fewer than 20 changes
and more than 10 minutes to locate them...

Technically, if a dataset has atime enabled, the live filesystem
diverges from its snapshot even if the data is not changed.

I just took a snapshot right after another one, with no changes in
between. zfs diff finishes immediately with no changes, as it should.
But doing a zfs diff of /usr/local/ takes a lot of time, even without
changes. I really think that atime is playing a role.

In my situation, I am doing zfs diff between snapshots taken on the
receive side of an rdiff --inplace. I would say that rdiff is
modifying the atime of ALL files in the receiving dataset, and
although that is not shown in zfs diff, it is breaking the tree
pruning by birth time.

I just disabled atime on this particular dataset and did a new rdiff
--inplace onto it (as the destination). After that, zfs diff takes 12
seconds instead of the initial 10 minutes. A big improvement.

So, yes, atime seems to be harmful. Badly.

PS: I saw something similar with zfs send too.



Re: [zfs-discuss] zfs diff performance disappointing

2011-09-26 Thread Jesus Cea
On 26/09/11 22:54, Jesus Cea wrote:
> On 26/09/11 22:29, David Magda wrote:
>> Talking about 7.55 GB is mostly useless as well. If it's a
>> dozen video files then stat()ing them all will be done very
>> quickly by just running find(1). If however the 7.55 GB is made
>> up of 7,550,000 files then going through them would take quite a
>> long time.
>
> Point taken, although zfs diff time is (or should be) proportional to
> changes, not to number of files.

For reference, the USED column in zfs list for these snapshots, which
gives the difference between adjacent snapshots, is around 30 MB
(with atime active). 10 minutes to dig through 30 MB...



Re: [zfs-discuss] zfs diff performance disappointing

2011-09-26 Thread Bob Friesenhahn

On Mon, 26 Sep 2011, Jesus Cea wrote:


> rsync takes a bit less than 7 minutes. So zfs diff is actually
> slower!


It is important to define what is meant by rsync.  For example, a 
common rsync operating mode is to simply compare whole-file timestamps 
and file size in order to determine that a file has changed. 
However, zfs surely works at the zfs block level so it does more work 
due to files being comprised of multiple blocks.


Rsync may be executed in a mode (--checksum) by which it compares 
blocks of data.  This mode would be considerably slower.
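
As a rough illustration of the cheaper of the two modes, here is a minimal sketch that compares two directory trees (for example two snapshot directories) using only file size and modification time. It is written in the spirit of rsync's default quick check but is not rsync's code; the function name `quick_check_diff` is made up for this example. Note that it still has to stat() every file in both trees, so its cost grows with the number of files rather than with the number of changes.

```python
import os


def quick_check_diff(old_root, new_root):
    """Compare two trees by (size, mtime) only; file contents are never read."""

    def scan(root):
        entries = {}
        for dirpath, _dirnames, filenames in os.walk(root):
            for filename in filenames:
                path = os.path.join(dirpath, filename)
                st = os.lstat(path)
                entries[os.path.relpath(path, root)] = (st.st_size, st.st_mtime_ns)
        return entries

    old, new = scan(old_root), scan(new_root)
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    modified = sorted(p for p in old.keys() & new.keys() if old[p] != new[p])
    return added, removed, modified


if __name__ == "__main__":
    import sys
    added, removed, modified = quick_check_diff(sys.argv[1], sys.argv[2])
    for label, paths in (("+", added), ("-", removed), ("M", modified)):
        for path in paths:
            print(label, path)
```

A checksum-style comparison would additionally have to read the contents of every file, which is why that mode is considerably slower.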


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [zfs-discuss] zfs diff performance disappointing

2011-09-26 Thread Bill Sommerfeld
On 09/26/11 12:31, Nico Williams wrote:
> On Mon, Sep 26, 2011 at 1:55 PM, Jesus Cea j...@jcea.es wrote:
>> Should I disable atime to improve zfs diff performance? (most data
>> doesn't change, but atime of most files would change).
>
> atime has nothing to do with it.

Based on my experience with time-based snapshots and atime on a server which
had cron-driven file tree walks running every night, I can easily believe
atime has a lot to do with it: the atime updates associated with a tree walk
will mean that much of a filesystem's metadata will diverge between the
writeable filesystem and its last snapshot.
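
A back-of-the-envelope sketch of that divergence, under assumptions that are purely illustrative: per-file metadata is packed 32 entries to a metadata block, files are numbered consecutively, and touching any entry rewrites its whole block. The helper name and the numbers are hypothetical.

```python
def dirtied_metadata_blocks(touched_objects, entries_per_block=32):
    """Count distinct metadata blocks rewritten when these objects are touched.

    Assumes (for illustration only) that object number N lives in metadata
    block N // entries_per_block.
    """
    return len({obj // entries_per_block for obj in touched_objects})


if __name__ == "__main__":
    n_files = 1_000_000
    few_real_changes = range(0, n_files, 50_000)   # 20 scattered files modified
    nightly_tree_walk = range(n_files)             # atime updated on every file
    print("blocks dirtied by 20 changes:", dirtied_metadata_blocks(few_real_changes))
    print("blocks dirtied by a tree walk:", dirtied_metadata_blocks(nightly_tree_walk))
```

Under these assumptions a handful of real changes dirties a handful of metadata blocks, while a single tree walk with atime enabled dirties all of them, so a later comparison against the snapshot has nothing left to skip.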

- Bill


Re: [zfs-discuss] zfs diff performance disappointing

2011-09-26 Thread Nico Williams
Ah yes, of course.  I'd misread your original post.  Yes, disabling
atime updates will reduce the number of superfluous transactions.
It's *all* transactions that count, not just the ones the app
explicitly caused, and atime implies lots of transactions.

Nico
--


Re: [zfs-discuss] zfs diff performance disappointing

2011-09-26 Thread Ian Collins

 On 09/27/11 07:55 AM, Jesus Cea wrote:

> I just upgraded to Solaris 10 Update 10, and one of the improvements
> is zfs diff.
>
> Using the birthtime of the sectors, I would expect very high
> performance. The actual performance doesn't seem better than a
> standard rdiff, though. Quite disappointing...
>
> Should I disable atime to improve zfs diff performance? (most data
> doesn't change, but atime of most files would change).

I tend to disable atime in the root filesystem and only enable it on a 
filesystem if required.  So far, it has never been required on any of 
the systems I look after!


--
Ian.



Re: [zfs-discuss] zfs diff performance disappointing

2011-09-26 Thread Tomas Forsman
On 27 September, 2011 - Ian Collins sent me these 0,8K bytes:

> On 09/27/11 07:55 AM, Jesus Cea wrote:
>> I just upgraded to Solaris 10 Update 10, and one of the improvements
>> is zfs diff.
>>
>> Using the birthtime of the sectors, I would expect very high
>> performance. The actual performance doesn't seem better than a
>> standard rdiff, though. Quite disappointing...
>>
>> Should I disable atime to improve zfs diff performance? (most data
>> doesn't change, but atime of most files would change).
>
> I tend to disable atime in the root filesystem and only enable it on a
> filesystem if required.  So far, it has never been required on any of
> the systems I look after!

I've found it useful time after time.. do things and then check atime
to see whatever files it looked at..
(yes, I know about truss and dtrace)

/Tomas
-- 
Tomas Forsman, st...@acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se


Re: [zfs-discuss] zfs diff performance disappointing

2011-09-26 Thread Ian Collins

 On 09/27/11 10:59 AM, Tomas Forsman wrote:

> On 27 September, 2011 - Ian Collins sent me these 0,8K bytes:
>
>> On 09/27/11 07:55 AM, Jesus Cea wrote:
>>> I just upgraded to Solaris 10 Update 10, and one of the improvements
>>> is zfs diff.
>>>
>>> Using the birthtime of the sectors, I would expect very high
>>> performance. The actual performance doesn't seem better than a
>>> standard rdiff, though. Quite disappointing...
>>>
>>> Should I disable atime to improve zfs diff performance? (most data
>>> doesn't change, but atime of most files would change).
>>
>> I tend to disable atime in the root filesystem and only enable it on a
>> filesystem if required.  So far, it has never been required on any of
>> the systems I look after!
>
> I've found it useful time after time.. do things and then check atime
> to see whatever files it looked at..
> (yes, I know about truss and dtrace)

It can be useful, but unless you really want the functionality, it 
generates a lot of unnecessary writes.


--
Ian.
