Re: [zfs-discuss] How do I determine dedupe effectiveness?

2009-12-19 Thread Colin Raven
On Sat, Dec 19, 2009 at 05:25, Ian Collins i...@ianshome.com wrote:

 Stacy Maydew wrote:

 The commands zpool list and zpool get dedup pool both show a ratio
 of 1.10.
 So thanks for that answer.  I'm a bit confused, though: if dedup is
 applied per zfs filesystem, not per zpool, why can I only see the dedup ratio
 on a per-pool basis rather than for each zfs filesystem?

 Seems to me there should be a way to get this information for a given zfs
 filesystem?



 The information, if present, would probably be meaningless.  Consider which
 filesystem holds the block and which the dupe?  What happens if the original
 is removed?


AHA - "original/copy": I fell into the same trap.

This is the question I had back in November. Michael Schuster
(http://blogs.sun.com/recursion) helped me out, and that's my reference point.

Here was my scenario:

in /home/fred there's a photo collection
 another collection exists in /home/janet
 at some point in the past, fred sent janet a party picture, let's call
 it DSC4456.JPG
 In the dataset, there are now two copies of the file, which are
 genuinely identical.

 So then:
 - When you de-dupe, which copy of the file gets flung?


Michael provided the following really illuminating explanation:

dedup (IIRC) operates at block level, not file level, so the question, as it
 stands, has no answer. what happens - again, from what I read in Jeff's blog
 - is this: zfs detects that a copy of a block with the same hash is being
 created, so instead of storing the block again, it just increments the
 reference count and makes sure whatever thing references this piece of
 data points to the old data.

 In that sense, you could probably argue that the new copy never gets
 created.


(Jeff's blog referred to above is here:
http://blogs.sun.com/bonwick/entry/zfs_dedup)
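
By the way, you can watch the effect Michael describes on a scratch dataset.
A rough sketch - the dataset and file names are invented, and you need a build
with dedup (snv_128 or later):

# zfs create -o dedup=on tank/photos
# cp /media/camera/DSC4456.JPG /tank/photos/fred.jpg
# cp /media/camera/DSC4456.JPG /tank/photos/janet.jpg
# sync
# zpool list tank

Once the duplicate blocks hit disk, the DEDUP column climbs above 1.00x even
though ls shows two full-sized files.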

OK, fair enough, but I still couldn't quite get my head around what's actually
happening, so I posed this follow-up question in order to cement the idea in
my silly head (because I still wasn't focused on "the new copy never gets
created"):

Fred has an image (DSC4456.JPG in my example) in his home directory, and he's
sent it to Janet. Arguably - when Janet pulled the attachment out of the
email and saved it to her $HOME - that copy never got written! Instead, the
reference count was incremented by one. Fair enough, but what does Janet
see when she does an ls and greps for that image:
- a symlink?
- an apparition of some kind?
She sees the file, it's there, but what exactly is she seeing?

Michael stepped in and described this:

they're going to see the same file (the blocks of which now have a ref.
 counter that is one more than it was before).

 think posix-style hard links: two directory entries pointing to the same
 inode - both files are actually one, but as long as you don't change it,
 it doesn't matter. when you remove one (by removing the name), the other
 remains, the ref. count in the inode is decremented by one.
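
(Incidentally, the hard-link behaviour he describes is easy to watch on any
POSIX filesystem - file names made up:

$ echo 'party picture' > /tmp/DSC4456.JPG
$ ln /tmp/DSC4456.JPG /tmp/janets-copy.JPG
$ ls -li /tmp/DSC4456.JPG /tmp/janets-copy.JPG
$ rm /tmp/DSC4456.JPG
$ ls -li /tmp/janets-copy.JPG

ls -li first shows both names with the same inode number and a link count of 2;
after the rm the count drops back to 1 and the data is still there.)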


So, coming around full circle to your question - "What happens if the
original is removed?" - it can be answered this way:

There is no original, there is no copy. There is one block with reference
counters.

- Fred can rm his file (because clearly it isn't a file, it's a filename
and that's all)
- result: the reference count is decremented by one - the data remains on
disk.
OR
- Janet can rm her filename
- result: the reference count is decremented by one - the data remains on
disk
OR
- both can rm the filename; the reference count is now decremented by two -
but there were only two, so now it's really REALLY gone.

Or is it really REALLY gone? Nope - if you snapshotted the pool, it isn't! :)
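
That part is easy to see for yourself, too - names made up again:

# zfs snapshot tank/home@party
# rm /tank/home/fred/DSC4456.JPG
# zfs list -t snapshot -o name,used,refer tank/home@party

The snapshot still references the removed blocks (its USED value grows by
roughly the file's size), so the space only really comes back once the
snapshot itself is destroyed.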

For me, the POSIX hard link reference at the core of the explanation somehow
tipped the scales and made me understand, but we all have mental
hooks into different parts of an explanation (the "aha" moment), so YMMV :)

Dedup is fascinating. I hope you don't mind me sharing this little
list anecdote, because it honestly made a huge difference to my understanding
of the concept.

Once again, many thanks to Michael Schuster at Sun for having the patience
to walk a n00b through the steps towards enlightenment.
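
P.S. To the first part of Stacy's question: as far as I know the dedup ratio is
only tracked pool-wide today, so the best you can do is look at the pool
properties and at the dedup table (DDT) itself. A rough sketch - the pool name
tank is just a placeholder, and the zdb output varies by build:

# zpool list tank
# zpool get dedupratio tank
# zdb -DD tank

zpool list shows the ratio in its DEDUP column, dedupratio is the same number
as a pool property, and zdb -DD prints a histogram of how many blocks in the
DDT are referenced once, twice, four times and so on.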

--
-Me
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] slog / log recovery is here!

2009-12-19 Thread James Risner
devzero:  when you have an exported pool with no log disk and you want to mount 
the pool.

Here are the changes to make it compile on dev-129:
--- logfix.c.2009-04-26 2009-12-18 11:39:40.917435361 -0800
+++ logfix.c    2009-12-18 12:19:27.507337246 -0800
@@ -20,6 +20,7 @@
 #include <stddef.h>

 #include <sys/vdev_impl.h>
+#include <sys/zio_checksum.h>

 /*
  *  * Write a label block with a ZBT checksum.
@@ -58,16 +59,19 @@

uint64_t guid;  // ZPOOL_CONFIG_GUID
uint64_t is_log;// ZPOOL_CONFIG_IS_LOG
+   uint64_t id;// ZPOOL_CONFIG_ID
nvlist_t *vdev_tree;// ZPOOL_CONFIG_VDEV_TREE

char *buf;
size_t buflen;

-   VERIFY(argc == 4);
+   VERIFY(argc == 5);
VERIFY((fd_pool = open(argv[1], O_RDWR)) != -1);
VERIFY((fd_log = open(argv[2], O_RDWR)) != -1);
VERIFY(sscanf(argv[3], "%" SCNu64, &guid) == 1);
//guid = 9851295902337437618ULL;
+   VERIFY(sscanf(argv[4], "%" SCNu64, &id) == 1);
+   //id = 10;

VERIFY(pread64(fd_pool, &vl_pool, sizeof (vdev_label_t), 0) ==
sizeof (vdev_label_t));
@@ -86,6 +90,10 @@
VERIFY(nvlist_remove_all(vdev_tree, ZPOOL_CONFIG_GUID) == 0);
VERIFY(nvlist_add_uint64(vdev_tree, ZPOOL_CONFIG_GUID, guid) == 0);

+   // fix id for vdev_log
+   VERIFY(nvlist_remove_all(vdev_tree, ZPOOL_CONFIG_ID) == 0);
+   VERIFY(nvlist_add_uint64(vdev_tree, ZPOOL_CONFIG_ID, id) == 0);
+
// remove what we are going to replace on config_pool
VERIFY(nvlist_remove_all(config_pool, ZPOOL_CONFIG_TOP_GUID) == 0);
VERIFY(nvlist_remove_all(config_pool, ZPOOL_CONFIG_GUID) == 0);
@@ -94,6 +102,7 @@
// add back what we want
VERIFY(nvlist_add_uint64(config_pool, ZPOOL_CONFIG_TOP_GUID, guid) == 0);
VERIFY(nvlist_add_uint64(config_pool, ZPOOL_CONFIG_GUID, guid) == 0);
+   VERIFY(nvlist_add_uint64(config_pool, ZPOOL_CONFIG_ID, id) == 0);
VERIFY(nvlist_add_uint64(config_pool, ZPOOL_CONFIG_IS_LOG, is_log) == 0);
VERIFY(nvlist_add_nvlist(config_pool, ZPOOL_CONFIG_VDEV_TREE, vdev_tree) == 0);

This also fixes a bug: the ID must also be unique, and the existing code didn't 
work for me because I had 10 disks.  If another disk with the same ID is 
present, it will get masked (since this marked-up label has a newer date).

If pjjw doesn't host a new binary, I'll put up a link to one.

There is also a second bug (with current OpenSolaris code) that prevents the 
mounting if the labels don't match.  Using a label created with the same size as 
pjjw's suggested junk disk, you can use the commands below to copy that label 
into the other label positions:

# cd /tmp
# dd if=/dev/zero of=junk bs=1024k count=64
# dd if=/dev/zero of=junk.log bs=1024k count=64
# zpool create junkpool /tmp/junk log /tmp/junk.log
# zpool export junkpool

** Fix it up! **
For file based log devices (this will be slow -- get the data off...):
# ./logfix /dev/rdsk/disk_from_your_pool /tmp/junk.log your_old_log_disk_guid your_old_log_disk_id

** Copy labels
# dd bs=256k count=1 if=/tmp/junk.log > /tmp/1
# (cat /tmp/1 /tmp/1 ; dd if=/tmp/junk.log bs=256k skip=2 count=252 ; cat /tmp/1 /tmp/1) > /tmp/my.log
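
(For anyone following along: the 64MB file is 256 blocks of 256k, and a vdev
keeps four 256k labels - two at the front and two at the end - so the pipeline
above writes the good label into blocks 0-1 and 254-255 and passes the 252
blocks in between through untouched.)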

** Make the log disk a device so it can be found
if this is the first lofi, it will be /dev/{r}lofi/1
# lofiadm -a /tmp/my.log

** Since my pool came from FreeBSD and had different device names, I needed to 
make matching entries.  I created a directory with symbolic links to the Solaris 
c5t?d? devices under their FreeBSD names.  So add this log device to that directory:
# mkdir /tmp/fbsd; cd /tmp/fbsd
# ln -s /dev/dsk/c5t1d0s4 ad4p5
...
# ln -s /dev/lofi/1 slog0

** Import your pool **
# zpool import -d /tmp/fbsd/ -f tank
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS receive -dFv creates an extra e subdirectory..

2009-12-19 Thread Steven Sim




Hi;


After some very hairy testing, I came up with the following procedure
for sending a zfs send datastream to a gzip staging file and later
"receiving" it back to the same filesystem in the same pool.


The above was to enable the filesystem data to be deduplicated.


However, after the final zfs receive, I noticed that the ZFS
filesystem's mountpoint had changed by itself, gaining an extra "e"
subdirectory.

Here is the procedure

Firstly, the zfs file system in question has the following children..


o...@sunlight:/root# zfs list -t all -r myplace/Docs

NAME USED AVAIL REFER MOUNTPOINT

myplace/Docs 3.37G 1.05T 3.33G /export/home/admin/Docs   <-- NOTE ORIGINAL MOUNTPOINT (see later bug below)

myplace/d...@scriptsnap2 43.0M - 3.33G -

myplace/d...@scriptsnap3 0 - 3.33G -   <-- latest snapshot

myplace/d...@scriptsnap1 0 - 3.33G -


As root, I did:


r...@sunlight:/root# zfs send -R myplace/d...@scriptsnap3 | gzip -9c > /var/tmp/myplace-Docs.snapshot.gz
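
(As an aside, if your build has the zstreamdump utility, you can sanity-check
what the stream will try to create before receiving it:

ad...@sunlight:/var/tmp$ gzip -cd /var/tmp/myplace-Docs.snapshot.gz | zstreamdump

It prints the stream's BEGIN records, including the snapshot names carried in
the stream.)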


Then I attempted to test a zfs receive by using the "-n" option...


ad...@sunlight:/var/tmp$ gzip -cd /var/tmp/myplace-Docs.snapshot.gz |
zfs receive -dnv myplace

cannot receive new filesystem stream: destination 'myplace/Docs' exists

must specify -F to overwrite it


Ok...let's specify -F...


ad...@sunlight:/var/tmp$ gzip -cd /var/tmp/myplace-Docs.snapshot.gz |
zfs receive -dFnv myplace

cannot receive new filesystem stream: destination has snapshots (eg.
myplace/d...@scriptsnap1)

must destroy them to overwrite it


Ok fine...let's destroy the existing snapshots for myplace/Docs...


ad...@sunlight:/var/tmp$ zfs list -t snapshot -r myplace/Docs

NAME USED AVAIL REFER MOUNTPOINT

myplace/d...@scriptsnap2 43.0M - 3.33G -

myplace/d...@scriptsnap3 0 - 3.33G -

myplace/d...@scriptsnap1 0 - 3.33G -


r...@sunlight:/root# zfs destroy myplace/d...@scriptsnap2

r...@sunlight:/root# zfs destroy myplace/d...@scriptsnap1

r...@sunlight:/root# zfs destroy myplace/d...@scriptsnap3


Checking...


r...@sunlight:/root# zfs list -t all -r myplace/Docs

NAME USED AVAIL REFER MOUNTPOINT

myplace/Docs 3.33G 1.05T 3.33G /export/home/admin/Docs


Ok...no more snapshots, just the parent myplace/Docs and no children...


Let's try the zfs receive command yet again with a "-n"


r...@sunlight:/root# gzip -cd /var/tmp/myplace-Docs.snapshot.gz | zfs
receive -dFnv myplace

would receive full stream of myplace/d...@scriptsnap2 into
myplace/d...@scriptsnap2

would receive incremental stream of myplace/d...@scriptsnap3 into
myplace/d...@scriptsnap3


Looks great! OK...let's go for the real thing...


r...@sunlight:/root# gzip -cd /var/tmp/myplace-Docs.snapshot.gz | zfs
receive -dFv myplace

receiving full stream of myplace/d...@scriptsnap2 into
myplace/d...@scriptsnap2

received 3.35GB stream in 207 seconds (16.6MB/sec)

receiving incremental stream of myplace/d...@scriptsnap3 into
myplace/d...@scriptsnap3

received 47.6MB stream in 6 seconds (7.93MB/sec)


Yah...looks good!


BUT...


From a zfs list of myplace/Docs I get...


r...@sunlight:/root# zfs list -r myplace/Docs

NAME USED AVAIL REFER MOUNTPOINT

myplace/Docs 3.37G 1.05T 3.33G /export/home/admin/Docs/e/Docs   <--- *** Here is the extra "e/Docs"..

r...@sunlight:/root# zfs set mountpoint=/export/home/admin/Docs
myplace/Docs

cannot mount '/export/home/admin/Docs': directory is not empty

property may be set but unable to remount filesystem


Ok...


I then went and removed the e/Docs directory under
/export/home/admin/Docs, leaving just
/export/home/admin/Docs...


Then..


r...@sunlight:/root# zfs set mountpoint=/export/home/admin/Docs
myplace/Docs


And all is well again..


Where did the "e/Docs" come from?


Did I do something wrong?


Warmest Regards

Steven Sim






___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZIL corrupt, not recoverable even with logfix

2009-12-19 Thread James Risner
Written by jktorn:
Have you tried build 128 which includes pool recovery support?

This is because the FreeBSD hostname (and hostid?) is recorded in the
labels along with the active pool state.

It does not work that way at the moment, though a readonly import is
quite a useful option that can be tried.

Yes, I tried 128a and 129.  Neither worked; all of them failed just like the 
127 version.  Specifically, they all reported the pool was in use by another 
system and ignored the -f option to import anyway.

Things I had tried before I programmed myself a solution include:
With or without the pool name (zpool import tank vs. plain zpool import)
With or without the -f option
With or without -V (the undocumented "do it anyway" option)
With or without -F (the "lose data" option)
With or without -FX (the "lose massive data" option)

I finally decided to install an OpenSolaris machine and compile a fixed version 
of logfix.c, which I detailed here:
http://opensolaris.org/jive/thread.jspa?threadID=62831&tstart=0

Short summary of the problems preventing me from mounting this forged pool:
1) The pool had been accidentally exported on the host FreeBSD system.

2) The pool had 10 drives, and logfix was written to assume only one vdev for 
data and one for the log.  The guid needed to be changed, and the generic 
sequential id also needed to be set to a unique value.  This is why my 
da4/da5 mirror disks disappeared whenever I used logfix to mark up the log 
device (it had the same id - in my case, 1).

3) The marked-up log device had 1 label matching my pool tank and the other 3 
labels matching a scratch pool named junkpool.  These labels had to be removed 
to prevent it from reporting this error:
Assertion failed: rn->rn_nozpool == B_FALSE, file ../common/libzfs_import.c, 
line 1078, function zpool_open_func

4) The pool was last used on FreeBSD, and the FreeBSD device names differed from 
OpenSolaris; for some reason the devices could not be properly detected.  So I had to 
make a directory and use -d /tmp/fbsd to point zpool at the directory to find 
the original device names.

5) The pool had lost 84 seconds of data on the log disk that could not be 
recovered (which required me to use -F to discard them and mount).

6) Since the log device I made to repair this pool is a file, I needed to use 
lofiadm to make a block/character device for it.

Happy!
# pfexec zpool import -d /tmp/fbsd/ -f -F tank
Pool tank returned to its state as of November 13, 2009 10:50:11 AM PST.
Discarded approximately 25 seconds of transactions.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS filesystems not mounted on reboot with Solaris 10 10/09

2009-12-19 Thread Gary Mills
I have a system that was recently upgraded to Solaris 10 10/09.  It
has a UFS root on local disk and a separate zpool on Iscsi disk.
After a reboot, the ZFS filesystems were not mounted, although the
zpool had been imported.  `zfs mount' showed nothing.  `zfs mount -a'
mounted them nicely.  The `canmount' property is `on'.  Why would they
not be mounted at boot?  This used to work with earlier releases of
Solaris 10.

The `zfs mount -a' at boot is run by the /system/filesystem/local:default
service.  It didn't record any errors on the console or in the log:

[ Dec 19 08:09:11 Executing start method (/lib/svc/method/fs-local) ]
[ Dec 19 08:09:12 Method start exited with status 0 ]

Is a dependency missing?
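
One thing I still plan to check - service names from memory here, adjust to
whatever svcs reports on your system:

# svcs -x
# svcs -d svc:/system/filesystem/local:default
# svcs '*iscsi*'

svcs -d lists what filesystem/local waits for; if the iSCSI initiator service
isn't among those dependencies, filesystem/local may be running before the
iSCSI devices are fully available, which would fit the symptoms.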

-- 
-Gary Mills--Unix Group--Computer and Network Services-
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How do I determine dedupe effectiveness?

2009-12-19 Thread Bob Friesenhahn

On Sat, 19 Dec 2009, Colin Raven wrote:


There is no original, there is no copy. There is one block with reference 
counters.

- Fred can rm his file (because clearly it isn't a file, it's a filename and 
that's all)
- result: the reference count is decremented by one - the data remains on disk.


While the similarity to hard links is a good analogy, there really is 
a unique file in this case.  If Fred does a 'rm' on the file then 
the reference count on all the file blocks is reduced by one, and the 
block is freed if the reference count goes to zero.  Behavior is 
similar to the case where a snapshot references the file block.  If 
Janet updates a block in the file, then that updated block becomes 
unique to her copy of the file (and the reference count on the 
original is reduced by one) and it remains unique unless it happens to 
match a block in some other existing file (or snapshot of a file).


When we are children, we are told that sharing is good.  In the case 
of references, sharing is usually good, but if there is a huge amount 
of sharing, then it can take longer to delete a set of files since the 
mutual references create a hot spot which must be updated 
sequentially.  Files are usually created slowly so we don't notice 
much impact from this sharing, but we expect (hope) that files will be 
deleted almost instantaneously.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How do I determine dedupe effectiveness?

2009-12-19 Thread Colin Raven
On Sat, Dec 19, 2009 at 17:20, Bob Friesenhahn bfrie...@simple.dallas.tx.us
 wrote:

 On Sat, 19 Dec 2009, Colin Raven wrote:


 There is no original, there is no copy. There is one block with reference
 counters.

 - Fred can rm his file (because clearly it isn't a file, it's a filename
 and that's all)
 - result: the reference count is decremented by one - the data remains on
 disk.


 While the similarity to hard links is a good analogy, there really is a
 unique file in this case.  If Fred does a 'rm' on the file then the
 reference count on all the file blocks is reduced by one, and the block is
 freed if the reference count goes to zero.  Behavior is similar to the case
 where a snapshot references the file block.  If Janet updates a block in the
 file, then that updated block becomes unique to her copy of the file (and
 the reference count on the original is reduced by one) and it remains unique
 unless it happens to match a block in some other existing file (or snapshot
 of a file).


Wait...whoah, hold on.
If snapshots reside within the confines of the pool, are you saying that
dedup will also count what's contained inside the snapshots? I'm not sure
why, but that thought is vaguely disturbing on some level.

Then again (not sure how gurus feel on this point) but I have this probably
naive and foolish belief that snapshots (mostly) oughtta reside on a
separate physical box/disk_array...someplace else anyway. I say mostly
because I s'pose keeping 15 minute snapshots on board is perfectly OK - and
in fact handy. Hourly...ummm, maybe the same - but Daily/Monthly should
reside elsewhere.


 When we are children, we are told that sharing is good.  In the case of
 references, sharing is usually good, but if there is a huge amount of
 sharing, then it can take longer to delete a set of files since the mutual
 references create a hot spot which must be updated sequentially.


Y'know, that is a GREAT point. Taking this one step further then - does that
also imply that there's one hot spot physically on a disk that keeps
getting read/written to? if so then your point has even greater merit for
more reasons...disk wear for starters, and other stuff too, no doubt.


 Files are usually created slowly so we don't notice much impact from this
 sharing, but we expect (hope) that files will be deleted almost
 instantaneously.

Indeed, that is completely logical. Also, something most of us don't spend
time thinking about.

Bob, thanks. Your thoughts and insights are always interesting - and usually
most revealing!
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How do I determine dedupe effectiveness?

2009-12-19 Thread Bob Friesenhahn

On Sat, 19 Dec 2009, Colin Raven wrote:

 
Wait...whoah, hold on.
If snapshots reside within the confines of the pool, are you saying that dedup 
will also count
what's contained inside the snapshots? I'm not sure why, but that thought is 
vaguely disturbing on
some level.


Yes, of course.  Any block in the pool which came from a filesystem 
participating in dedup is a candidate for deduplication.  This 
includes snapshots.  In fact, the block in the snapshot may already 
have been deduped before the snapshot was even taken.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How do I determine dedupe effectiveness?

2009-12-19 Thread Andrey Kuzmin
On Sat, Dec 19, 2009 at 7:20 PM, Bob Friesenhahn
bfrie...@simple.dallas.tx.us wrote:
 On Sat, 19 Dec 2009, Colin Raven wrote:

 There is no original, there is no copy. There is one block with reference
 counters.

 - Fred can rm his file (because clearly it isn't a file, it's a filename
 and that's all)
 - result: the reference count is decremented by one - the data remains on
 disk.

 While the similarity to hard links is a good analogy, there really is a
 unique file in this case.  If Fred does a 'rm' on the file then the
 reference count on all the file blocks is reduced by one, and the block is
 freed if the reference count goes to zero.  Behavior is similar to the case
 where a snapshot references the file block.  If Janet updates a block in the
 file, then that updated block becomes unique to her copy of the file (and
 the reference count on the original is reduced by one) and it remains unique
 unless it happens to match a block in some other existing file (or snapshot
 of a file).

 When we are children, we are told that sharing is good.  In the case of
 references, sharing is usually good, but if there is a huge amount of
 sharing, then it can take longer to delete a set of files since the mutual
 references create a hot spot which must be updated sequentially.  Files
 are usually created slowly so we don't notice much impact from this sharing,
 but we expect (hope) that files will be deleted almost instantaneously.

I believe this has been taken care of in the space map design
(http://blogs.sun.com/bonwick/entry/space_maps provides a nice
overview).

Regards,
Andrey


 Bob
 --
 Bob Friesenhahn
 bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
 GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How do I determine dedupe effectiveness?

2009-12-19 Thread Toby Thain


On 19-Dec-09, at 4:35 AM, Colin Raven wrote:


...
There is no original, there is no copy. There is one block with  
reference counters.


Many blocks, potentially shared, make up a de-dup'd file. Not sure  
why you write "one" here.




- Fred can rm his file (because clearly it isn't a file, it's a  
filename and that's all)
- result: the reference count is decremented by one - the data  
remains on disk.

OR
- Janet can rm her filename
- result: the reference count is decremented by one - the data  
remains on disk

OR
-both can rm the filename the reference count is now decremented by  
two - but there were only two so now it's really REALLY gone.


That explanation describes hard links.

--Toby
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How do I determine dedupe effectiveness?

2009-12-19 Thread Toby Thain


On 19-Dec-09, at 11:34 AM, Colin Raven wrote:



...
Wait...whoah, hold on.
If snapshots reside within the confines of the pool, are you saying  
that dedup will also count what's contained inside the snapshots?


Snapshots themselves are only references, so yes.

I'm not sure why, but that thought is vaguely disturbing on some  
level.


Then again (not sure how gurus feel on this point) but I have this  
probably naive and foolish belief that snapshots (mostly) oughtta  
reside on a separate physical box/disk_array...



That is not possible, except in the case of a mirror, where one side  
is recoverable separately. You seem to be confusing snapshots with  
backup.



someplace else anyway. I say mostly because I s'pose keeping 15  
minute snapshots on board is perfectly OK - and in fact handy.  
Hourly...ummm, maybe the same - but Daily/Monthly should reside  
elsewhere.


When we are children, we are told that sharing is good.  In the  
case of references, sharing is usually good, but if there is a huge  
amount of sharing, then it can take longer to delete a set of files  
since the mutual references create a hot spot which must be  
updated sequentially.


Y'know, that is a GREAT point. Taking this one step further then -  
does that also imply that there's one hot spot physically on a  
disk that keeps getting read/written to?
if so then your point has even greater merit for more  
reasons...disk wear for starters,


That is not a problem. Disks don't wear - it is a non-contact medium.

--Toby


and other stuff too, no doubt.

Files are usually created slowly so we don't notice much impact  
from this sharing, but we expect (hope) that files will be deleted  
almost instantaneously.
Indeed, that is completely logical. Also, something most of us  
don't spend time thinking about.

...
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How do I determine dedupe effectiveness?

2009-12-19 Thread Colin Raven
On Sat, Dec 19, 2009 at 19:08, Toby Thain t...@telegraphics.com.au wrote:


 On 19-Dec-09, at 11:34 AM, Colin Raven wrote

 Then again (not sure how gurus feel on this point) but I have this probably
 naive and foolish belief that snapshots (mostly) oughtta reside on a
 separate physical box/disk_array...



 That is not possible, except in the case of a mirror, where one side is
 recoverable separately.

I was referring to zipping up a snapshot and getting it outta Dodge onto
another physical box, or separate array.


 You seem to be confusing snapshots with backup.


No, I wasn't confusing them at all. Backups are backups. Snapshots, however,
do have some limited value as backups. They're no substitute, but they augment a
planned backup schedule rather nicely in many situations.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How do I determine dedupe effectiveness?

2009-12-19 Thread Toby Thain


On 19-Dec-09, at 2:01 PM, Colin Raven wrote:




On Sat, Dec 19, 2009 at 19:08, Toby Thain  
t...@telegraphics.com.au wrote:


On 19-Dec-09, at 11:34 AM, Colin Raven wrote

Then again (not sure how gurus feel on this point) but I have this  
probably naive and foolish belief that snapshots (mostly) oughtta  
reside on a separate physical box/disk_array...



That is not possible, except in the case of a mirror, where one  
side is recoverable separately.
I was referring to zipping up a snapshot and getting it outta Dodge  
onto another physical box, or separate array.


or zfs send
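
e.g. (host and dataset names invented):

# zfs send tank/home@daily-2009-12-19 | ssh backuphost zfs receive -d backuppool

and with -i or -R on the send side you can carry incrementals or the whole
snapshot history across as well.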



You seem to be confusing snapshots with backup.

No, I wasn't confusing them at all. Backups are backups. Snapshots  
however, do have some limited value as backups. They're no  
substitute, but augment a planned backup schedule rather nicely in  
many situations.


--T
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How do I determine dedupe effectiveness?

2009-12-19 Thread Toby Thain


On 19-Dec-09, at 11:34 AM, Colin Raven wrote:



...
When we are children, we are told that sharing is good.  In the  
case of references, sharing is usually good, but if there is a huge  
amount of sharing, then it can take longer to delete a set of files  
since the mutual references create a hot spot which must be  
updated sequentially.


Y'know, that is a GREAT point. Taking this one step further then -  
does that also imply that there's one hot spot physically on a  
disk that keeps getting read/written to?


Also, copy-on-write generally means that the physical location of updates  
is ever-changing.


--T

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss