Re: [zfs-discuss] Performance drop during scrub?

2010-04-30 Thread Tonmaus
 In my opinion periodic scrubs are most useful for pools based on
 mirrors, or raidz1, and much less useful for pools based on raidz2 or
 raidz3.  It is useful to run a scrub at least once on a well-populated
 new pool in order to validate the hardware and OS, but otherwise, the
 scrub is most useful for discovering bit-rot in singly-redundant
 pools.
 
 Bob

Hi,

For one, well-populated pools are rarely new. Second, the Best Practices 
recommendations on scrubbing intervals are based on the disk product line 
(Enterprise monthly vs. Consumer weekly), not on redundancy level or pool 
configuration. Obviously, the issue under discussion affects all imaginable 
configurations; it may only vary in degree.
Recommending not to use scrub doesn't even qualify as a workaround, in my 
view.
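Either interval is trivial to automate, for what it's worth; a minimal sketch 
(pool name and schedule are placeholders, not a recommendation):

# root crontab entry: scrub "tank" at 03:00 on the 1st of every month
0 3 1 * * /usr/sbin/zpool scrub tank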

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore

2010-04-30 Thread Euan Thoms
Thanks Cindy for the links.

I see that this could possibly be a replacement for ufsdump/ufsrestore, but 
unless a further snapshot can be appended to the file containing the recursive 
root pool snapshot, it would still be a regression from the incremental backups 
that ufsdump offers. It would take a long time to run every night, but on the 
plus side an in-situ backup without having to stop services is an improvement 
over the UFS days.
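A rough sketch of how increments can be kept without appending to the original 
file, namely as separate stream files received in order; the paths below are 
hypothetical:

pfexec zfs snapshot -r rpool@full
pfexec zfs send -R rpool@full > /backup/rpool.full
# ...later, after more changes:
pfexec zfs snapshot -r rpool@mon1
pfexec zfs send -R -i rpool@full rpool@mon1 > /backup/rpool.full-to-mon1
# restore: zfs receive the full stream first, then each incremental stream in sequence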

Haven't tried it yet; it sounds a bit more complicated than I had hoped for.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore

2010-04-30 Thread Euan Thoms
Thanks Edward, you understood me perfectly.

Your suggestion sounds very promising. I like the idea of letting the 
installation CD set everything up; that way some hardware/drivers could 
possibly be updated and it would still work. On top of a bare metal recovery, I 
would like to leverage the incredible power of ZFS snapshots, and I love the way 
zfs send / receive works. It's the complexities of the root pool and BEs that worry me.

My ideal solution would be to have the data accessible from the backup media 
(external HDD) as well as be usable for a full system restore. Below is what I 
would consider ideal:

1.) Create a pool on an external HDD called backup-pool
2.) Send the whole rpool (all filesystems within) to the backup pool.
3.) be able to browse the backup pool starting from /backup-pool
4.) be able to export the backup pool and import it on PC2 to browse the files 
there
5.) be able to create another snapshot of rpool and zfs send -i 
rpool@first-snapshot rpool@next-snapshot to backup-pool/rpool (send the increment 
to the backup pool/drive)
6.) be able to browse the latest snapshot data on the backup drive, whilst being 
able to clone an older snapshot
7.) be able to 'zfs send' the latest backup snapshot to a fresh installation, 
thus getting it back to exactly how it was before the disaster.

At the moment I have successfully achieved 1-4 and I'm very impressed. I am 
currently trying to get 5-6 working and am mildly confident that it will work; 
I've done it in part, but got errors with the /export/home filesystem and 
subsequently the pool failed to import/export. It's just copying over again after 
wiping the backup pool and starting again. I hope build 134 is a good build to 
test this on.

However, it's step 7 that I have no idea whether it will work. Edward, your post 
gives me hope; 90% confidence is a good start.

Watch this space for my results.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Can we recover a deleted directory?

2010-04-30 Thread Ashish Nabira

Hello Experts;

There was a CIFS share we were using, /export/cifs1. It got deleted  
accidentally.


Is there any way I can recover this directory? We don't have a snapshot  
of it.
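For next time: if a snapshot of the parent dataset had existed, the deleted 
directory would still have been reachable read-only under its .zfs directory. A 
sketch, with hypothetical dataset and snapshot names:

pfexec zfs snapshot rpool/export@daily-2010-04-30
# after an accidental delete, copy the tree back out of the snapshot:
cp -rp /export/.zfs/snapshot/daily-2010-04-30/cifs1 /export/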



regards;
Ashish
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore

2010-04-30 Thread Euan Thoms
Well, I'm so impressed with ZFS at the moment! I just got steps 5 and 6 (from my 
last post) to work, and it works well. Not only does it send the increment over 
to the backup drive, the latest increment/snapshot appears in the mounted 
filesystem. In Nautilus I can browse an exact copy of my PC, from / to the 
deepest parts of my home folder. And it will back up my entire system in 1-2 
minutes. AMAZING!!

Below are the steps, try it for yourself on a spare USB HDD:



# Create backup storage pool on drive c12t0d0
pfexec zpool create backup-pool c12t0d0
# Recursively snapshot the root pool (rpool)
pfexec zfs snapshot -r rpool@first

# Send the entire pool with all its snapshots to the backup pool, disable mounting
pfexec zfs send rpool@first | pfexec zfs receive -u backup-pool/rpool
pfexec zfs send rpool/ROOT@first | pfexec zfs receive -u backup-pool/rpool/ROOT
pfexec zfs send rpool/ROOT/opensolaris-2009.06-134@first | pfexec zfs receive -u backup-pool/rpool/ROOT/OpenSolaris-2009.06-134
pfexec zfs send rpool/dump@first | pfexec zfs receive -u backup-pool/rpool/dump
pfexec zfs send rpool/swap@first | pfexec zfs receive -u backup-pool/rpool/swap
pfexec zfs send rpool/webspace@first | pfexec zfs receive -u backup-pool/rpool/webspace
pfexec zfs send rpool/export@first | pfexec zfs receive -u backup-pool/rpool/export
pfexec zfs send rpool/export/home@first | pfexec zfs receive -u backup-pool/rpool/export/home
pfexec zfs send rpool/export/home/euan@first | pfexec zfs receive -u backup-pool/rpool/export/home/euan
pfexec zfs send rpool/export/home/euan/Downloads@first | pfexec zfs receive -u backup-pool/rpool/export/home/euan/Downloads
pfexec zfs send rpool/export/home/euan/VBOX-HDD@first | pfexec zfs receive -u backup-pool/rpool/export/home/euan/VBOX-HDD

# Change mount points to correct structure 
pfexec zfs set mountpoint=legacy backup-pool/rpool/ROOT
pfexec zfs set mountpoint=/backup-pool/opensolaris backup-pool/rpool/ROOT/OpenSolaris-2009.06-134
pfexec zfs set mountpoint=/backup-pool/opensolaris/rpool backup-pool/rpool
pfexec zfs set mountpoint=/backup-pool/opensolaris/opt/webspace backup-pool/rpool/webspace
pfexec zfs set mountpoint=/backup-pool/opensolaris/export backup-pool/rpool/export
pfexec zfs set mountpoint=/backup-pool/opensolaris/export/home backup-pool/rpool/export/home
pfexec zfs set mountpoint=/backup-pool/opensolaris/export/home/euan backup-pool/rpool/export/home/euan
pfexec zfs set mountpoint=/backup-pool/opensolaris/export/home/euan/Downloads backup-pool/rpool/export/home/euan/Downloads
pfexec zfs set mountpoint=/backup-pool/opensolaris/export/home/euan/VBOX-HDD backup-pool/rpool/export/home/euan/VBOX-HDD

# Now we can mount the backup pool filesystems
pfexec zfs mount backup-pool/rpool/ROOT/OpenSolaris-2009.06-134
pfexec zfs mount backup-pool/rpool
pfexec zfs mount backup-pool/rpool/webspace
pfexec zfs mount backup-pool/rpool/export
pfexec zfs mount backup-pool/rpool/export/home
pfexec zfs mount backup-pool/rpool/export/home/euan
pfexec zfs mount backup-pool/rpool/export/home/euan/Downloads
pfexec zfs mount backup-pool/rpool/export/home/euan/VBOX-HDD

# Take second snapshot at a later point in time
pfexec zfs snapshot -r rpool@second

# Send the increments to the backup pool
pfexec zfs send -i rpool/ROOT@first rpool/ROOT@second | pfexec zfs recv -F backup-pool/rpool/ROOT
pfexec zfs send -i rpool/ROOT/opensolaris-2009.06-134@first rpool/ROOT/opensolaris-2009.06-134@second | pfexec zfs recv -F backup-pool/rpool/ROOT/OpenSolaris-2009.06-134
pfexec zfs send -i rpool@first rpool@second | pfexec zfs recv -F backup-pool/rpool
pfexec zfs send -i rpool/dump@first rpool/dump@second | pfexec zfs recv -F backup-pool/rpool/dump
pfexec zfs send -i rpool/swap@first rpool/swap@second | pfexec zfs recv -F backup-pool/rpool/swap
pfexec zfs send -i rpool/webspace@first rpool/webspace@second | pfexec zfs recv -F backup-pool/rpool/webspace
pfexec zfs send -i rpool/export@first rpool/export@second | pfexec zfs recv -F backup-pool/rpool/export
pfexec zfs send -i rpool/export/home@first rpool/export/home@second | pfexec zfs recv -F backup-pool/rpool/export/home
pfexec zfs send -i rpool/export/home/euan@first rpool/export/home/euan@second | pfexec zfs recv -F backup-pool/rpool/export/home/euan
pfexec zfs send -i rpool/export/home/euan/VBOX-HDD@first rpool/export/home/euan/VBOX-HDD@second | pfexec zfs recv -F backup-pool/rpool/export/home/euan/VBOX-HDD
pfexec zfs send -i rpool/export/home/euan/Downloads@first rpool/export/home/euan/Downloads@second | pfexec zfs recv -F backup-pool/rpool/export/home/euan/Downloads


#pfexec zfs umount backup-pool/rpool/export/home/euan/VBOX-HDD
#pfexec zfs umount backup-pool/rpool/export/home/euan/Downloads
#pfexec zfs umount backup-pool/rpool/export/home/euan
#pfexec zfs umount backup-pool/rpool/export/home

Re: [zfs-discuss] zfs inherit vs. received properties

2010-04-30 Thread Tom Erickson

Brandon High wrote:

I'm seeing some weird behavior on b133 with 'zfs inherit' that seems
to conflict with what the docs say. According to the man page it
clears the specified property, causing it to be inherited from an
ancestor but that's not the behavior I'm seeing.

For example:

basestar:~$ zfs get compress tank/export/vmware
NAME                PROPERTY     VALUE  SOURCE
tank/export/vmware  compression  gzip   local
basestar:~$ zfs get compress tank/export/vmware/delusional
NAME                           PROPERTY     VALUE  SOURCE
tank/export/vmware/delusional  compression  on     received
bh...@basestar:~$ pfexec zfs inherit compress tank/export/vmware/delusional
basestar:~$ zfs get compress tank/export/vmware/delusional
NAME                           PROPERTY     VALUE  SOURCE
tank/export/vmware/delusional  compression  on     received

Is this a bug in inherit, or is the documentation off?



That would be a bug. 'zfs inherit' is supposed to override received property 
values. This works for me on b140:


: to...@heavy[10]; zfs get compress tank/b
NAME    PROPERTY     VALUE  SOURCE
tank/b  compression  gzip   local
: to...@heavy[11]; zfs get compress tank/b/c
NAME      PROPERTY     VALUE  SOURCE
tank/b/c  compression  on     received
: to...@heavy[12]; zfs inherit compress tank/b/c
: to...@heavy[13]; zfs get compress tank/b/c
NAME      PROPERTY     VALUE  SOURCE
tank/b/c  compression  gzip   inherited from tank/b
: to...@heavy[14];

Then, to restore the received value:

: to...@heavy[14]; zfs inherit -S compress tank/b/c
: to...@heavy[15]; zfs get compress tank/b/c
NAME      PROPERTY     VALUE  SOURCE
tank/b/c  compression  on     received
: to...@heavy[16];

I don't remember this being an issue. I'll let you know if I find out more.

Tom
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore

2010-04-30 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Euan Thoms
 
 My ideal solution would be to have the data accessible from the backup
 media (external HDD) as well as be usable for a full system restore. Below
 is what I would consider ideal:
 
 1.) Create a pool on an external HDD called backup-pool
 2.) Send the whole rpool (all filesystems within) to the backup pool.
 3.) be able to browse the backup pool starting from /backup-pool
 4.) be able to export the backup pool and import it on PC2 to browse the
 files there
 5.) be able to create another snapshot of rpool and zfs send -i
 rpool@first-snapshot rpool@next-snapshot to backup-pool/rpool (send the
 increment to the backup pool/drive)
 6.) be able to browse the latest snapshot data on the backup drive,
 whilst being able to clone an older snapshot
 7.) be able to 'zfs send' the latest backup snapshot to a fresh
 installation, thus getting it back to exactly how it was before the disaster.

Yes, all of the above are possible.  This is what I personally do.


 However, it's step 7 that I have no idea if it will work. Edward, your
 post gives me promise, 90% confidence is a good start.

The remaining 10% is this:
Although I know for sure you can do all your backups as described above, I
have not attempted the bare metal restore.  While I believe I understand
all that's needed about partitions, boot labels, etc., I must acknowledge
some uncertainty about precisely the best way to do the bare metal
restore.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore

2010-04-30 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Euan Thoms

 pfexec zfs send rpool@first | pfexec zfs receive -u backup-pool/rpool
 pfexec zfs send rpool/ROOT@first | pfexec zfs receive -u backup-pool/rpool/ROOT
 pfexec zfs send rpool/ROOT/opensolaris-2009.06-134@first | pfexec zfs receive -u backup-pool/rpool/ROOT/OpenSolaris-2009.06-134
 pfexec zfs send rpool/dump@first | pfexec zfs receive -u backup-pool/rpool/dump
(and so on)

I notice you have many zfs filesystems inside of other zfs filesystems.
While this is common practice, I will personally advise against it in
general, unless you can name a reason why you want to do that.

Here is one reason not to do that:  If you're working in some directory and
you want to access a snapshot of some file you're working on, you have to go
up to the root of the filesystem that you're currently in.  If you go up too
far and land in the .zfs directory of some filesystem above your current
filesystem, you won't find your snapshots there.  You have to know precisely
which .zfs directory is the right one to go into.
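For example, with the layout from your script (where export, export/home, and
export/home/euan are each their own filesystem):

cd /export/home/euan/.zfs/snapshot/first    # the right .zfs for files under euan
cd /export/.zfs/snapshot/first/home/euan    # wrong: the parent's snapshot does not contain euan's data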

Also, as you've demonstrated, it makes your backup scripts much longer.


 #pfexec zfs umount backup-pool/rpool/export/home/euan/VBOX-HDD
 #pfexec zfs umount backup-pool/rpool/export/home/euan/Downloads

Instead of mounting and unmounting the individual zfs filesystems on the
external drive, I would recommend importing and exporting the external zpool.
There's no need to mount/unmount; it happens automatically on zpool
import/export.
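A rough sketch of that workflow, using the pool name from the earlier posts:

pfexec zpool export backup-pool                   # unmounts every dataset in the pool and marks it exportable
pfexec zpool import backup-pool                   # finds the disk, imports, and remounts the datasets
pfexec zpool import -R /mnt/backups backup-pool   # or: import under an alternate root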


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS snapshot versus Netapp - Security and convenience

2010-04-30 Thread Edward Ned Harvey
 From: Peter Jeremy [mailto:peter.jer...@alcatel-lucent.com]
 
 I gather you are suggesting that the inode be extended to contain a
 list of the inode numbers of all directories that contain a filename
 referring to that inode.  

Correct.


 [inodes] can have up to 32767 links [to them].  Where is 
 this list of (up to) 32767 parent inodes going to be stored?

Naively, I suggest storing the list of parents in the inode itself.  Let's
see if that's unreasonable.

How many bytes long is an inode number?  I couldn't find that easily by
googling, so for the moment I'll guess it's a fixed size of 64 bits (8 bytes).
That means every inode with a link count of 1 would grow by 8 bytes, and an
inode could require at most 32767*8 bytes (roughly 256 KB) to store all the
parent inode backpointers.


 Well, you need to find somewhere to store up to 32K inode numbers,
 whilst having minimal space overhead for small numbers of links.  

I think you're saying: the number of bytes in an inode is fixed, not
variable.

How many bytes is that?  Would it be exceptionally difficult to extend
and/or make variable?

Perhaps all inodes (including files) could have a property similar to
directories, where they reference a variable number of bytes written
somewhere on disk (kind of like how directories reference variable sized
files) and that allows the list of parent inodes to be stored in a block
separate from the usual inode information.

One important consideration in that hypothetical scenario would be
fragmentation.  If every inode were fragmented in two, that would be a real
drag on performance.  Perhaps every inode could be extended by (for example)
32 bytes to accommodate a list of up to 4 parent inodes, and whenever the
number of parents exceeds 4, the inode itself gets fragmented to store a
variable-length list of parents.


 In which case, it would be trivially easy to walk back up the whole
 tree, almost instantly identifying every combination of paths that
 could possibly lead to this inode, while simultaneously correctly
 handling security concerns about bypassing security of parent
 directories and everything.
 
 Whilst it's trivially easy to get from the file to the list of
 directories containing that file, actually getting from one directory
 to its parent is less so: A directory containing N sub-directories has
 N+2 links.  Whilst the '.' link is easy to identify (it points to its
 own inode), distinguishing between the name of this directory in its
 parent and the '..' entries in its subdirectories is rather messy
 (requiring directory scans) unless you mandate that the reference to
 the parent directory is in a fixed location (ie 1st or 2nd entry in
 the parent inode list).

Interesting.  In other words, because of the '..' entry in every
subdirectory, every parent directory is linked to not just by its parents
but also by its children.  If inodes are extended to include the list of
inodes that link to them, as I suggested, there would need to be a simple
way of distinguishing which entries in that list are actually parents, and
which are backpointers from children.

I would suggest something simple, like this:
The only reason to create a list of parent inodes is to quickly identify the
absolute path of any arbitrary inode number, so you can quickly locate all
the past snaps of any arbitrary file or directory, even if that file or
directory has been renamed, moved, or relocated in the directory tree.

Instead of creating a list of all inodes that link to this inode, just
make it a "parent inodes" list.  That is: when you create a subdirectory,
even though the subdir does link back to its parent, the inode of the subdir
is not stored in the parent's "parent inodes" list.  Thus, the link count of
a directory is allowed to differ from the number of inodes listed in the
"parent inodes" field.

All inodes listed in the "parent inodes" field would, I think, then all
point to shallower locations in the tree hierarchy.



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ARC Summary

2010-04-30 Thread Tony MacDoodle
I was wondering if any of you see any issues with the following on Solaris
10 u8 with ZFS?

System Memory:
Physical RAM: 11042 MB
Free Memory : 5250 MB
LotsFree: 168 MB

ZFS Tunables (/etc/system):

ARC Size:
Current Size: 4309 MB (arcsize)
Target Size (Adaptive): 10018 MB (c)
Min Size (Hard Limit): 1252 MB (zfs_arc_min)
Max Size (Hard Limit): 10018 MB (zfs_arc_max)

ARC Size Breakdown:
Most Recently Used Cache Size: 49% 5008 MB (p)
Most Frequently Used Cache Size: 50% 5009 MB (c-p)

ARC Efficiency:
Cache Access Total: 5630506
Cache Hit Ratio: 98% 5564770 [Defined State for buffer]
Cache Miss Ratio: 1% 65736 [Undefined State for Buffer]
REAL Hit Ratio: 74% 445 [MRU/MFU Hits Only]

Data Demand Efficiency: 98%
Data Prefetch Efficiency: 23%

CACHE HITS BY CACHE LIST:
Anon: 24% 1342485 [ New Customer, First Cache Hit ]
Most Recently Used: 7% 396106 (mru) [ Return Customer ]
Most Frequently Used: 68% 3826139 (mfu) [ Frequent Customer ]
Most Recently Used Ghost: 0% 16 (mru_ghost) [ Return Customer Evicted, Now
Back ]
Most Frequently Used Ghost: 0% 24 (mfu_ghost) [ Frequent Customer Evicted,
Now Back ]
CACHE HITS BY DATA TYPE:
Demand Data: 50% 2833150
Prefetch Data: 0% 6806
Demand Metadata: 24% 1383720
Prefetch Metadata: 24% 1341094
CACHE MISSES BY DATA TYPE:
Demand Data: 47% 31200
Prefetch Data: 33% 22326
Demand Metadata: 10% 6596
Prefetch Metadata: 8% 5614
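If it helps to sanity-check those numbers, the raw counters the summary is
derived from can be read directly with kstat; a minimal sketch:

kstat -m zfs -n arcstats
# or just the headline values (current size, target, max):
kstat -p zfs:0:arcstats:size zfs:0:arcstats:c zfs:0:arcstats:c_max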
-
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore

2010-04-30 Thread Cindy Swearingen

Hi Ned,

Unless I misunderstand what bare metal recovery means, the following
procedure describes how to boot from CD, recreate the root pool, and
restore the root pool snapshots:

http://docs.sun.com/app/docs/doc/819-5461/ghzur?l=en&a=view

I retest this process at every Solaris release.

Thanks,

Cindy

On 04/29/10 21:42, Edward Ned Harvey wrote:

From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
boun...@opensolaris.org] On Behalf Of Cindy Swearingen

For full root pool recovery see the ZFS Administration Guide, here:

http://docs.sun.com/app/docs/doc/819-5461/ghzvz?l=en&a=view

Recovering the ZFS Root Pool or Root Pool Snapshots


Unless I misunderstand, I think the intent of the OP's question is how to do
bare metal recovery after some catastrophic failure.  In this situation,
recovery is much more complex than what the ZFS Admin Guide describes above.  You
would need to boot from CD, partition and format the disk, create a pool and a
filesystem, zfs send | zfs receive into that filesystem, and finally install
the boot blocks.  Only some of these steps are described in the ZFS Admin
Guide, because simply expanding the rpool is a fundamentally easier thing to do.

Even though I think I could do that ... I don't have a lot of confidence in
it, and I can certainly imagine some pesky little detail being a problem.

This is why I suggested the technique of:
Reinstall the OS just like you did when you first built your machine, before
the catastrophe.  It doesn't even matter if you make the same selections you
made before (IP address, package selection, authentication method, etc.), as
long as you choose to partition and install the bootloader like you did
before.

This way, you're sure the partitions, format, pool, filesystem, and
bootloader are all configured properly.
Then boot from CD again, and zfs send | zfs receive to overwrite your
existing rpool.

And as far as I know, that will take care of everything.  But I only feel
like 90% confident that would work.
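For reference, the procedure behind that link boils down to roughly the
following. This is only a compressed sketch; the device, snapshot, and BE
names are placeholders, and the guide remains the authoritative source:

# on the running system: archive the root pool to a file somewhere safe
pfexec zfs snapshot -r rpool@backup
pfexec zfs send -Rv rpool@backup > /net/backupserver/backups/rpool.backup

# after booting the install/live CD on the replacement disk:
zpool create -f -o failmode=continue -R /a -m legacy -o cachefile=/etc/zfs/zpool.cache rpool c0t0d0s0
cat /net/backupserver/backups/rpool.backup | zfs receive -Fdu rpool
zpool set bootfs=rpool/ROOT/opensolaris rpool
installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c0t0d0s0
# then fix up swap/dump devices if needed and reboot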


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS snapshot versus Netapp - Security and convenience

2010-04-30 Thread Edward Ned Harvey
 From: Peter Jeremy [mailto:peter.jer...@alcatel-lucent.com]
 
 Whilst it's trivially easy to get from the file to the list of
 directories containing that file, actually getting from one directory
 to its parent is less so: A directory containing N sub-directories has
 N+2 links.  Whilst the '.' link is easy to identify (it points to its
 own inode), distinguishing between the name of this directory in its
 parent and the '..' entries in its subdirectories is rather messy

Oh.  Duh.
This should have been obvious from the moment you said '..'

Given:  There is exactly one absolute path to every directory.  You cannot
hardlink subdirectories into multiple parent locations.  You can only
hardlink files.  Every directory has exactly one parent, and the parent
inode number is already stored in every directory inode.

Given:  There is already the '..' entry in every directory.  Which means it
is already trivially easy to identify the absolute path of any directory,
given that you know its inode number, and you have some method to open an
arbitrary inode by number.  Which implies it can only be implemented in
kernel, or perhaps by root.  (A regular user cannot open an inode by number,
due to security reasons, the parent directories may block permission for a
regular user to open that inode.)  But the fact remains:  No change to
filesystem or inode structure is necessary, in order to quickly identify the
absolute path of an arbitrary directory, when your initial knowledge is only
the inode number of the directory.

Therefore, it should be very easy to implement a proof of concept by writing
a setuid-root C program, similar to sudo, which could become root, identify
the absolute path of a directory by its inode number, and then print that
absolute path only if the real UID has permission to ls that path.
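In the meantime, a brute-force userland approximation already exists (no inode
back-pointers, just a full tree walk), which is at least enough to prototype
the permission-checking part; a sketch with placeholder names:

# print every path under /tank whose inode number is 307107
find /tank -inum 307107 -print 2>/dev/null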

Fundamentally, the only difficulties are extending the inodes of files to
include a list of parent directory inodes, and deciding how to make all this
information available over NFS and CIFS.

While not trivial, it's certainly possible to extend the inodes of files to
include parent pointers.

Also not trivial, but certainly possible, is making all this information
available under a proposed directory such as .zfs/inodes.  (Again, the
.zfs/inodes directory would be sufficient for NFS, but some more information
would be necessary to support CIFS, because CIFS, as far as I know, has no
knowledge of inode numbers and therefore cannot even begin to look for an
inode under the .zfs/inodes directory.)

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore

2010-04-30 Thread erik.ableson

On 30 avr. 2010, at 13:47, Euan Thoms wrote:

 Well, I'm so impressed with ZFS at the moment! I just got steps 5 and 6 (from
 my last post) to work, and it works well. Not only does it send the increment
 over to the backup drive, the latest increment/snapshot appears in the
 mounted filesystem. In Nautilus I can browse an exact copy of my PC, from /
 to the deepest parts of my home folder. And it will back up my entire system
 in 1-2 minutes. AMAZING!!
 
 Below are the steps, try it for yourself on a spare USB HDD:
 
 # Create backup storage pool on drive c12t0d0
 pfexec zpool create backup-pool c12t0d0
 # Recursively snapshot the root pool (rpool)
 pfexec zfs snapshot -r rpool@first
 
 # Send the entire pool with all its snapshots to the backup pool, disable mounting
 pfexec zfs send rpool@first | pfexec zfs receive -u backup-pool/rpool
 [snip]
 pfexec zfs send rpool/export/home/euan/VBOX-HDD@first | pfexec zfs receive -u backup-pool/rpool/export/home/euan/VBOX-HDD
 
 # Take second snapshot at a later point in time
 pfexec zfs snapshot -r rpool@second
 
 # Send the increments to the backup pool
 pfexec zfs send -i rpool/ROOT@first rpool/ROOT@second | pfexec zfs recv -F backup-pool/rpool/ROOT
 [snip]
 pfexec zfs send -i rpool/export/home/euan/Downloads@first rpool/export/home/euan/Downloads@second | pfexec zfs recv -F backup-pool/rpool/export/home/euan/Downloads

Just a quick comment on the send/recv operations: adding -R makes the send 
recursive, so you only need one line to send rpool and all descendant filesystems.

I use the send/recv operations for all sorts of backup operations. For the 
equivalent of a full backup of my boot volumes:

NOW=`date +%Y-%m-%d_%H-%M-%S`
pfexec /usr/sbin/zfs snapshot -r rpool@$NOW
pfexec /usr/sbin/zfs send -R rpool@$NOW | /usr/bin/gzip > /mnt/backups/rpool.$NOW.zip
pfexec /usr/sbin/zfs destroy -r rpool@$NOW

But for any incremental transfers it's better to recv into an actual filesystem 
that you can scrub, to confirm that the stream made it over OK.
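A rough sketch of that variant, assuming a second pool named "backup"
(a hypothetical name) instead of a gzipped stream file; $LAST and $NOW are
placeholders for the previous and current snapshot names:

pfexec zfs snapshot -r rpool@$NOW
pfexec zfs send -R -i rpool@$LAST rpool@$NOW | pfexec zfs receive -Fdu backup
pfexec zpool scrub backup   # verifies every block of the received copy against its checksums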

Cheers,

Erik
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Performance drop during scrub?

2010-04-30 Thread David Dyer-Bennet

On Thu, April 29, 2010 17:35, Bob Friesenhahn wrote:

 In my opinion periodic scrubs are most useful for pools based on
 mirrors, or raidz1, and much less useful for pools based on raidz2 or
 raidz3.  It is useful to run a scrub at least once on a well-populated
 new pool in order to validate the hardware and OS, but otherwise, the
 scrub is most useful for discovering bit-rot in singly-redundant
 pools.

I've got 10 years of photos on my disk now, and it's growing at faster
than one year per year (since I'm scanning backwards slowly through the
negatives).  Many of them don't get accessed very often; they're archival,
not current use.  Scrub was one of the primary reasons I chose ZFS for the
fileserver they live on -- I want some assurance, 20 years from now, that
they're still valid.  I needed something to check them periodically, and
something to check *against*, and block checksums and scrub seemed to fill
the bill.

So, yes, I want to catch bit rot -- on a pool of mirrored VDEVs.
-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Panic when deleting a large dedup snapshot

2010-04-30 Thread Cindy Swearingen

Brandon,

You're probably hitting this CR:

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6924824

I'm tracking the existing dedup issues here:

http://hub.opensolaris.org/bin/view/Community+Group+zfs/dedup

Thanks,

Cindy

On 04/29/10 23:11, Brandon High wrote:

I tried destroying a large (710GB) snapshot from a dataset that had
been written with dedup on. The host locked up almost immediately, but
there wasn't a stack trace on the console and the host required a
power cycle, but seemed to reboot normally. Once up, the snapshot was
still there. I was able to get a dump from this. The data was written
with b129, and the system is currently at b134.

I tried destroying it again, and the host started behaving badly.
'less' would hang, and there were several zfs-auto-snapshot processes
that were over an hour old, and the 'zfs snapshot' processes were
stuck on the first dataset of the pool. Eventually the host became
unusable and I rebooted again.

The host seems to be fine now, and is currently running a scrub.

Any ideas on how to avoid this in the future? I'm no longer using
dedup due to performance issues with it, which implies that the DDT is
very large.

bh...@basestar:~$ pfexec zdb -DD tank
DDT-sha256-zap-duplicate: 5339247 entries, size 348 on disk, 162 in core
DDT-sha256-zap-unique: 1479972 entries, size 1859 on disk, 1070 in core
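For scale, a back-of-the-envelope reading of that zdb output, taking the "in
core" figures as bytes per entry (an estimate, not an exact accounting):

echo $((5339247 * 162 + 1479972 * 1070))   # ~2.45e9 bytes, i.e. roughly 2.3 GB of RAM for the whole DDT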

-B


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore

2010-04-30 Thread Bob Friesenhahn

On Thu, 29 Apr 2010, Edward Ned Harvey wrote:

This is why I suggested the technique of:
Reinstall the OS just like you did when you first built your machine, before
the catastrophe.  It doesn't even matter if you make the same selections you


With the new Oracle policies, it seems unlikely that you will be able 
to reinstall the OS and achieve what you had before.  An exact 
recovery method (dd of partition images, or recreating the pool with 'zfs 
receive') seems like the only way to be assured of recovery going 
forward.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Performance drop during scrub?

2010-04-30 Thread Bob Friesenhahn

On Thu, 29 Apr 2010, Tonmaus wrote:

Recommending to not using scrub doesn't even qualify as a 
workaround, in my regard.


As a devoted believer in the power of scrub, I believe that once the 
OS, power supplies, and controller have been verified to function with 
a good scrubbing, scrubs are not really warranted if there is more than 
one level of redundancy.  With just one level of redundancy it becomes 
much more important to verify that both copies were written to disk 
correctly.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Performance drop during scrub?

2010-04-30 Thread Freddie Cash
On Fri, Apr 30, 2010 at 11:35 AM, Bob Friesenhahn 
bfrie...@simple.dallas.tx.us wrote:

 On Thu, 29 Apr 2010, Tonmaus wrote:

  Recommending to not using scrub doesn't even qualify as a workaround, in
 my regard.


 As a devoted believer in the power of scrub, I believe that after the OS,
 power supplies, and controller have been verified to function with a good
 scrubbing, if there is more than one level of redundancy, scrubs are not
 really warranted.  With just one level of redundancy it becomes much more
 important to verify that both copies were written to disk correctly.

Without a periodic scrub that touches every single bit of data in the pool,
how can you be sure that 10-year-old files that haven't been opened in 5 years
are still intact?

Self-healing only comes into play when a file is read.  If you don't read
a file for years, how can you be sure that all copies of that file haven't
succumbed to bit-rot?

Do you really want that "oh shit" moment 5 years from now, when you go to
open "Super Important Doc Saved for Legal Reasons" and find that all copies
are corrupt?

Sure, you don't have to scrub every single week.  But you definitely want to
scrub more than once over the lifetime of the pool.

-- 
Freddie Cash
fjwc...@gmail.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Performance drop during scrub?

2010-04-30 Thread Roy Sigurd Karlsbakk
 On Thu, 29 Apr 2010, Tonmaus wrote:
 
  Recommending to not using scrub doesn't even qualify as a
  workaround, in my regard.
 
 As a devoted believer in the power of scrub, I believe that after the
 OS, power supplies, and controller have been verified to function with
 a good scrubbing, if there is more than one level of redundancy,
 scrubs are not really warranted.  With just one level of redundancy it
 becomes much more important to verify that both copies were written to
 disk correctly.

The scrub should still be available without slowing the system down to 
something barely usable - that's why it's there. Adding new layers of security 
is nice, but dropping scrub because of OS bugs is rather ugly.

roy
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Performance drop during scrub?

2010-04-30 Thread David Dyer-Bennet

On Fri, April 30, 2010 13:44, Freddie Cash wrote:
 On Fri, Apr 30, 2010 at 11:35 AM, Bob Friesenhahn
 bfrie...@simple.dallas.tx.us wrote:

 On Thu, 29 Apr 2010, Tonmaus wrote:

  Recommending to not using scrub doesn't even qualify as a workaround, in
  my regard.


 As a devoted believer in the power of scrub, I believe that after the OS,
 power supplies, and controller have been verified to function with a good
 scrubbing, if there is more than one level of redundancy, scrubs are not
 really warranted.  With just one level of redundancy it becomes much more
 important to verify that both copies were written to disk correctly.

 Without a periodic scrub that touches every single bit of data in the pool,
 how can you be sure that 10-year files that haven't been opened in 5 years
 are still intact?

 Self-healing only comes into play when the file is read.  If you don't read
 a file for years, how can you be sure that all copies of that file haven't
 succumbed to bit-rot?

Yes, that's precisely my point.  That's why it's especially relevant to
archival data -- it's important (to me), but not frequently accessed.

-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Mapping inode numbers to file names

2010-04-30 Thread A Darren Dunham
On Wed, Apr 28, 2010 at 09:49:04PM +0200, Ragnar Sundblad wrote:
 On 28 apr 2010, at 14.06, Edward Ned Harvey wrote:
 
 What indicators do you have that ONTAP/WAFL has inode-name lookup
 functionality? I don't think it has any such thing - WAFL is pretty
 much an UFS/FFS that does COW instead of in-place writing, the main
 difference is that inodes are written to special inode files rather
 than specific static areas. Directories I believe works very much like
 UFS/FFS directories. But I may have misunderstood something and be
 wrong.

You're correct that the FS is very UFS/FFS-like, but they've added stuff.
The inode-name lookup is a bolt-on feature that was added in 7.1 (and
can be disabled).

# touch /net/tester/vol/ntap/file
# ls -li /net/tester/vol/ntap/file
307107 -rw-r--r--   1 root other  0 Apr 30 13:35 
/net/tester/vol/ntap/file

tester* inodepath -v ntap -a 307107
Inode 307107 in volume ntap (fsid 0x29428a) has 1 name.
Volume UUID is: 588b6ef0-5231-11de-9885-00a09800f026
[1] Primary pathname = /vol/ntap/file

In general it is an internal feature and not readily exposed for user
operations.  I've seen it mentioned that it's very handy for their
interaction with virus scanners, since the filer can pass a full path back
to them.  It also helps with error messages.

It's not really exposed via file ops.  The only time I've ever used it
directly was when one of my servers was getting beaten up by clients.
The only diagnostic path was to scan the network and then go from the NFS
filehandle to a filename.  With ZFS on Solaris, I'm assuming that wouldn't
be necessary, because you'd have dtrace to tell you exactly which files were
being accessed.
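Something along these lines would do it (a rough sketch, not a polished
script):

# watch which files are being opened, and by what, system-wide
pfexec dtrace -n 'syscall::open*:entry { printf("%s %s", execname, copyinstr(arg0)); }'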

-- 
Darren
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Panic when deleting a large dedup snapshot

2010-04-30 Thread Jim Horng
Looks like I am hitting the same issue now 
as in the earlier post that you responded to:
http://opensolaris.org/jive/thread.jspa?threadID=128532&tstart=15

I continued my test migration with dedup=off and synced a couple more file 
systems.
I decided to merge two of the file systems by copying one file system into the 
other.  Then, when I tried to delete directories in the first file system, the 
whole system hung.  That file system had been written with dedup turned on for 
the first half of the initial sync and turned off for the rest, since I wasn't 
able to finish the initial sync with dedup enabled.

But now it looks like there is no way to get rid of a deduped file system safely.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Virtual to physical migration

2010-04-30 Thread devsk
I had created a virtualbox VM to test out opensolaris. I updated to latest dev 
build and set my things up. Tested pools and various configs/commands. Learnt 
format/partition etc.

And then I wanted to move this stuff to a Solaris partition on the physical 
disk. VirtualBox provides raw physical disk access, so I exposed my Solaris 
partition to the VM and created slices for root, swap, etc. in that partition 
inside the OpenSolaris VM.

Created a pool called mypool on slice 0 of this solaris partition and did a zfs 
send|recv of my rpool/ROOT and rpool/export into the mypool. Verified that the 
data was correctly copied and mountpoints were set exactly like they are for 
rpool.

Next, I edited the files in /mypool/boot and /mypool/etc to use the new name 
mypool instead of rpool in the signature. I edited menu.lst so that findroot 
now finds the root in (pool_mypool,2,a) instead of (pool_rpool,0,a), because on 
the physical disk the Solaris partition is the 3rd primary partition.

Did an installgrub to slice 0 of the Solaris partition.
Got out of the VM and booted my machine.
It correctly gives me the GRUB menu for OpenSolaris, but when I select it, it 
shows the OpenSolaris splash, moves the progress bar a bit, and just sits there.

So, my questions: 

1. How can I turn the boot splash off to see what's going on underneath? ESC did 
not work. (See the sketch after these questions.)
2. Is there a log inside the mypool root fs that I can look at from an 
OpenSolaris live CD, which will give me more information about what happened?
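A rough sketch of the usual menu.lst tweak for question 1; the entry below is
from memory, so double-check it against your own menu.lst:

# in <pool>/boot/grub/menu.lst, for the OpenSolaris entry:
#   - comment out the splashimage/foreground/background lines
#   - drop ",console=graphics" from the kernel$ line and add -v for a verbose boot, e.g.
kernel$ /platform/i86pc/kernel/$ISADIR/unix -v -B $ZFS-BOOTFS

For question 2, /var/adm/messages inside the migrated root is the usual place
to look once it is mounted from the live CD.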

I am scrubbing the pool from livecd now. Will appreciate your help!
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Virtual to physical migration

2010-04-30 Thread devsk
I think I messed up just a notch!

When I did the zfs send|recv, I used the -u flag (because I didn't want it to 
mount at that time). But it set the canmount property to off for ROOT... YAY!

I booted into the live CD, imported mypool, and fixed the mount points and the 
canmount property. And I am now in the physical OpenSolaris install.
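Roughly what that fix-up looks like from the live CD; the BE name is assumed,
not taken from the original post:

pfexec zpool import -f -R /mnt mypool
pfexec zfs set canmount=on mypool/ROOT/opensolaris    # hypothetical BE name
pfexec zfs set mountpoint=/ mypool/ROOT/opensolaris
pfexec zpool export mypool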

No X so far though...:(

Any hints on ATI drivers?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Virtual to physical migration

2010-04-30 Thread devsk
I am getting a strange reset as soon as I run startx from a normal user's 
console login.

How do I troubleshoot this? Any ideas? I removed /etc/X11/xorg.conf before 
invoking startx, because it would have contained PCI bus IDs which won't be 
valid on the real hardware.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Virtual to physical migration

2010-04-30 Thread devsk
Looks like X's vesa driver can only use the 1600x1200 resolution and not the 
native 1920x1200.

And if I pass -dpi to enforce 96 DPI, it just croaks.

With -dpi out, I am inside X at 1600x1200.

Can anyone tell me how I can get the native 1920x1200 resolution working with 
the vesa driver? I know it's possible because I get 1920x1200 inside my MilaX 
live USB.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Virtual to physical migration

2010-04-30 Thread Ian Collins

On 05/ 1/10 03:09 PM, devsk wrote:

Looks like the X's vesa driver can only use 1600x1200 resolution and not the 
native 1920x1200.

   
Asking these questions on the ZFS list isn't going to get you very far.  
Try the opensolaris-help list.


--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Virtual to physical migration

2010-04-30 Thread devsk
I have no idea why I posted this in zfs-discuss... OK, migration... I will post 
a follow-up on the help list.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore

2010-04-30 Thread Edward Ned Harvey
 From: Bob Friesenhahn [mailto:bfrie...@simple.dallas.tx.us]
 Sent: Friday, April 30, 2010 1:40 PM
 
 With the new Oracle policies, it seems unlikely that you will be able
 to reinstall the OS and achieve what you had before.  An exact
 recovery method (dd of partition images or recreate pool with 'zfs
 receive') seems like the only way to be assured of recovery moving
 forward.

What???

Confusing parts:
"the new Oracle policies"
"unlikely that you will be able to reinstall the OS and achieve what you had
before"

Didn't you see Cindy's post?  Would you like to point out any specific flaws
in what was written, that I guess she probably wrote?

In particular, I found the following to be very valuable:

 From: Cindy Swearingen [mailto:cindy.swearin...@oracle.com]
 
 the following
 procedure describes how to boot from CD, recreate the root pool, and
 restore the root pool snapshots:
 
 http://docs.sun.com/app/docs/doc/819-5461/ghzur?l=en&a=view
 
 I retest this process at every Solaris release.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss