[zfs-discuss] ZFS mount fails at boot

2007-03-22 Thread Matt B
I have about a dozen two-disk systems that were all set up the same way, using a 
combination of SVM and ZFS.

s0 = / SVM mirror
s1 = swap
s3 = /tmp
s4 = metadb
s5 = zfs mirror

The system does boot, but once it gets to ZFS, the ZFS mount fails and all subsequent 
services fail as well (including ssh).

/home, /tmp, and /data are on the ZFS mirror. /var is on its own UFS/SVM mirror, 
as are root and swap.

I included the errors I am getting as well as the exact commands I used to 
build both the SVM and ZFS mirrors. (All of which appeared to work flawlessly)

I am guessing there is just something really simple that needs to be set.

Any ideas?

--Errors--
vfcufs01# cat /var/svc/log/system-filesystem-local:default.log

[ Mar 16 11:02:58 Rereading configuration. ]

[ Mar 16 11:03:37 Executing start method (/lib/svc/method/fs-local) ]

bootadm: no matching entry found: Solaris_reboot_transient

[ Mar 16 11:03:37 Method start exited with status 0 ]

[ Mar 16 13:25:58 Executing start method (/lib/svc/method/fs-local) ]

bootadm: no matching entry found: Solaris_reboot_transient

[ Mar 16 13:25:58 Method start exited with status 0 ]

[ Mar 20 15:26:32 Executing start method (/lib/svc/method/fs-local) ]

bootadm: no matching entry found: Solaris_reboot_transient

WARNING: /usr/sbin/zfs mount -a failed: exit status 1

[ Mar 20 15:26:32 Method start exited with status 95 ]

[ Mar 21 08:27:37 Leaving maintenance because disable requested. ]

[ Mar 21 08:27:37 Disabled. ]

[ Mar 21 08:32:22 Executing start method (/lib/svc/method/fs-local) ]

bootadm: no matching entry found: Solaris_reboot_transient

WARNING: /usr/sbin/zfs mount -a failed: exit status 1

[ Mar 21 08:32:23 Method start exited with status 95 ]

[ Mar 21 08:50:20 Leaving maintenance because disable requested. ]

[ Mar 21 08:50:20 Disabled. ]

[ Mar 21 08:55:07 Executing start method (/lib/svc/method/fs-local) ]

bootadm: no matching entry found: Solaris_reboot_transient

WARNING: /usr/sbin/zfs mount -a failed: exit status 1

[ Mar 21 08:55:07 Method start exited with status 95 ]

--Commands Run to make SVM and ZFS mirror---
prtvtoc /dev/rdsk/c0t0d0s2 | fmthard -s - /dev/rdsk/c0t1d0s2

installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c0t1d0s0

metadb -a -f -c 2 c0t0d0s4 c0t1d0s4

metainit -f d10 1 1 c0t0d0s0
metainit -f d11 1 1 c0t0d0s1
metainit -f d13 1 1 c0t0d0s3

metainit -f d20 1 1 c0t1d0s0
metainit -f d21 1 1 c0t1d0s1
metainit -f d23 1 1 c0t1d0s3

metainit d0 -m d10
metainit d1 -m d11
metainit d3 -m d13

metaroot d0

Update /etc/vfstab so that the swap entry points to d1, just as the last 
command (metaroot) modified the root entry to point to d0

Swap line in vfstab should look like this

/dev/md/dsk/d1  -   -   swap    -   no  -

lockfs -fa

Reboot

After reboot…

metattach d0 d20
metattach d1 d21
metattach d3 d23

Then do this to check the status of the mirroring

metastat | grep %

Wait until the syncs are complete

zpool create zpool mirror c0t0d0s5 c0t1d0s5

Create the filesystem

umount /home
umount /tmp
rm -rf /data
rm -rf /home
rm -rf /tmp
zfs create zpool/data
zfs create zpool/home
zfs create zpool/tmp
sleep 10

Make the directory for the mountpoint

mkdir /data
mkdir /home
mkdir /tmp

Set the mountpoint

zfs set mountpoint=/data zpool/data
zfs set mountpoint=/home zpool/home
zfs set mountpoint=/tmp zpool/tmp

Now you should have the regular roots for these

Turn ZFS compression on

zfs set compression=on zpool/data
zfs set compression=on zpool/home
zfs set compression=on zpool/tmp

Set the quotas

zfs set quota=4G zpool/home
zfs set quota=1G zpool/tmp
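
One thing that may be worth ruling out (this is an assumption, nothing in the log confirms it): ZFS on Solaris refuses to mount a dataset onto a directory that is not empty, so if anything writes into /tmp, /home or /data before `zfs mount -a` runs at boot, the mount would fail. Here is a minimal Python sketch of that check. The helper name `non_empty_mountpoints` is made up for illustration, and the check only means something while the datasets are not mounted (for example in single-user mode).

```python
import os

def non_empty_mountpoints(paths):
    """Return the paths that are existing directories with at least one entry."""
    busy = []
    for path in paths:
        # A missing directory is not a problem here; only a populated one is.
        if os.path.isdir(path) and os.listdir(path):
            busy.append(path)
    return busy

if __name__ == "__main__":
    # Mountpoints taken from the "zfs set mountpoint=" commands in this post.
    for path in non_empty_mountpoints(["/data", "/home", "/tmp"]):
        print(f"{path} is not empty; zfs mount may refuse to use it")
```

If one of the directories does show up, listing its contents should show which service populated it before the ZFS mount ran.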
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: Is there any performance problem with hard links in ZFS?

2007-03-22 Thread Viktor Turskyi
Thank you very much for the consultation. The information was very useful. 

I have one more question, about ZFS file attributes. I have found the figure 
2^56 given as the number of attributes a file can have in ZFS, but I can't find 
any information on the mechanism for creating such attributes. The situation is 
as follows: I want to store a lot of per-file information in file attributes 
rather than in a database. The attributes will be set by a Perl or C program. 
Is there any way to do this?
 
 


Re: [zfs-discuss] migration/acl4 problem

2007-03-22 Thread Mark Shellenbaum

Jens Elkner wrote:

Hi,



2) On zfs
- e.g. as root do:
cp -P -r -p /dir /pool1/zfsdir
# cp: Insufficient memory to save acl entry


I will open a bug on that.


cp  -r -p /dir /pool1/zfsdir
# cp: Insufficient memory to save acl entry
find dir | cpio -puvmdP /pool1/docs/
- as user B do:
cd /pool1/zfsdir/dir
touch y
- as user A do:
cd /pool1/zfsdir/dir
echo bla y


I can't reproduce your simple test.

I have two users, tester1 and tester2, and both are members of tstgrp
tester1$ mkdir a.dir
tester1$ chmod 775 a.dir
tester1$ setfacl -m d:u::rwx,d:g::rwx,d:o:r-x,d:m:rwx a.dir
# su - tester2
tester2$ cd a.dir
tester2$ touch b
tester2$ ls -l b
total 0
-rw-rw-r--   1 tester2  tstgrp 0 Mar 22 08:21 b

# find a.dir -print | cpio -Pvmudp /sandbox
/sandbox/a.dir
/sandbox/a.dir/b
0 blocks

tester1$ cd /sandbox/a.dir
tester1$ touch a
# su tester2
tester2$ touch c
tester2$ ls -l
total 3
-rw-r--r--+  1 tester1  tstgrp 0 Mar 22 08:22 a
-rw-rw-r--+  1 tester2  tstgrp 0 Mar 22 08:21 b
-rw-r--r--+  1 tester2  tstgrp 0 Mar 22 08:22 c


There is one big difference which you see here.  ZFS always honors the 
user's umask, and that is why the file was created with 644 permissions 
rather than 664 as on UFS.  ZFS always has to apply the user's umask 
because of POSIX.
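
The 644-versus-664 difference is plain umask arithmetic. Here is a minimal Python sketch, assuming touch requests mode 0666 and tester2's umask is 022 (the umask is not shown in the transcript, so that value is a guess). The function name `created_mode` is only illustrative.

```python
def created_mode(requested: int, umask: int) -> int:
    """Permission bits of a new file once the umask is applied to the requested mode."""
    return requested & ~umask & 0o777

# touch asks for 0666; the umask clears bits from that request.
print(oct(created_mode(0o666, 0o022)))  # 0o644, what ZFS produced here
print(oct(created_mode(0o666, 0o002)))  # 0o664, would need a umask of 002
print(oct(created_mode(0o666, 0o077)))  # 0o600, a strict umask
```

With the UFS default ACL on a.dir, the umask was not what limited the new file's mode, which is how the same touch produced 664 there.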




So, does anybody have a clue how one can migrate directories from
ufs to zfs without losing functionality?

I've read that it is always possible to translate POSIX_ACLs to ACL4,
but it doesn't seem to work. So I have a big migration problem ... :(((

Also, I haven't found anything that explains how ACL4 really works on
Solaris, i.e. how the rules are applied. Yes, in order, and only for who
matches. But what does 'who matches' mean, what purpose do the
'owner@:--:--:deny' entries have, and what takes precedence
(allow | deny | first match | last match)? I also remember
hearing that once an allow has matched, everything else is
ignored, but then I'm asking why the order of the ACLEs is important.
Last but not least, what purpose do the standard perms (e.g. 0644) have?
Are they completely ignored if ACLEs are present? Or used as a fallback if no
ACLE matches, or if ACLEs match but none of them sets e.g. the r bit?

Any hints?

Regards,
jel.


owner@ entries control the owner permissions
group@ entries control the owning group permissions
everyone@ entries control everyone's permissions, not just the other 
permissions.


A little example will illustrate what everyone@ does.
# chmod A=owner@:r:allow,group@:r:allow,everyone@:rwx:allow file.test
# ls -V file.test
-rwxrwxrwx   1 tester1  tstgrp 0 Mar 22 08:29 file.test
owner@:r-:--:allow
group@:r-:--:allow
 everyone@:rwx---:--:allow

Since everyone@ is giving away rwx and there are no deny entries for 
either  owner@ or group@ the mode of the file becomes 777 and all users 
can rwx the file.


Now if I insert a deny before the group entry the mode will change.
# chmod A1+group@:wx:deny file.test
# ls -V file.test
-rwxr--rwx   1 tester1  tstgrp 0 Mar 22 08:29 file.test
owner@:r-:--:allow
group@:-wx---:--:deny
group@:r-:--:allow
 everyone@:rwx---:--:allow

Now anyone who isn't a member of tstgrp has rwx permission to the 
file.


The ACEs are processed in order, and once a requested permission has been 
granted a subsequent deny can't take it away, but if a permission has 
not yet been granted then a deny for that permission will halt the access check.


For example:

# ls -V file.test
-rw-r--r--+  1 tester1  tstgrp 0 Mar 22 08:35 file.test
  user:tester2:rwx---:--:allow
  user:tester2:-w:--:deny
owner@:--x---:--:deny
owner@:rw-p---A-W-Co-:--:allow
group@:-wxp--:--:deny
group@:r-:--:allow
 everyone@:-wxp---A-W-Co-:--:deny
 everyone@:r-a-R-c--s:--:allow

In this ACL the deny entry for 'w' for tester2 has no effect, since 'w' 
would have already been granted.  If the first two entries had been 
swapped then tester2 would be denied write permission.
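
The ordering rule can be sketched in a few lines of Python. This is a simplified model, not the ZFS implementation: it only tracks the r, w and x letters, the function name `access_allowed` is made up, and the caller is described by the set of ACE "who" tags it matches.

```python
def access_allowed(acl, principals, requested):
    """Walk the ACEs in order and decide whether `requested` is granted.

    acl        -- list of (who, perms, kind) tuples, kind is "allow" or "deny"
    principals -- set of who-tags the caller matches, e.g. {"group@", "everyone@"}
    requested  -- string of permission letters, e.g. "w" or "rw"
    """
    remaining = set(requested)
    for who, perms, kind in acl:
        if who not in principals:
            continue
        if kind == "deny":
            # A deny only matters for permissions that are not granted yet.
            if remaining & set(perms):
                return False
        else:
            remaining -= set(perms)
            if not remaining:
                return True
    return False  # something was never granted by any entry

# The ACL from the example, reduced to the rwx letters.
acl = [
    ("user:tester2", "rwx", "allow"),
    ("user:tester2", "w", "deny"),
    ("owner@", "x", "deny"),
    ("owner@", "rw", "allow"),
    ("group@", "wx", "deny"),
    ("group@", "r", "allow"),
    ("everyone@", "wx", "deny"),
    ("everyone@", "r", "allow"),
]
tester2 = {"user:tester2", "group@", "everyone@"}

print(access_allowed(acl, tester2, "w"))                    # True: allow comes first
swapped = [acl[1], acl[0]] + acl[2:]
print(access_allowed(swapped, tester2, "w"))                # False: deny comes first
print(access_allowed(acl, {"group@", "everyone@"}, "w"))    # False: group@ deny stops it
```

The third call shows the same mechanism for another member of the owning group: the group@ deny is reached before any allow for 'w', so the check stops there.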


A normal file that doesn't really have an ACL will have a number of deny 
entries inserted into the ACL.  The reason for this is to provide POSIX 
compliance in that you are either in the owner class, group class or 
other class.  The deny entries stop the access control from proceeding 
to the next entries.  In the ACL shown below the deny entries on the 
group@ entry will prevent a member of tstgrp from picking up write 
permission from the everyone@ allow entry.


# touch file.2
# ls -V file.2
-rwxr-xrwx   1 tester1  tstgrp 0 Mar 22 08:39 file.2

Re: [zfs-discuss] Re: Proposal: ZFS hotplug support and autoconfiguration

2007-03-22 Thread Eric Schrock
On Wed, Mar 21, 2007 at 11:57:42PM -0700, Matt B wrote:
 
 Literally, someone should be able to make $7/hr with a stack of drives
 and the ability to just look at or listen to a server to determine which
 drive needs to be replaced.
 
 This means ZFS will need to be able to control the HDD status lights
 on the chassis for look, but for listen ZFS could cause the server
 to beep, using one beep for the first slot and two beeps in rapid
 succession for the second slot. A sort of lame Morse code...no
 device integration on ZFS's part required
  

This is part of ongoing work with Solaris platform integration (see my
last blog post) and future ZFS/FMA work.  We will eventually be
leveraging IPMI and SES to manage physical indicators (i.e. LEDs) in
response to Solaris events.  It will take some time to reach this point,
however.

- Eric

--
Eric Schrock, Solaris Kernel Development   http://blogs.sun.com/eschrock


[zfs-discuss] asize is 300MB smaller than lsize - why?

2007-03-22 Thread Robert Milkowski
Hi.

 System is snv_56 x86 32bit
 
bash-3.00# zpool status solaris
  pool: solaris
 state: ONLINE
 scrub: scrub stopped with 0 errors on Thu Mar 22 16:25:23 2007
config:

        NAME        STATE     READ WRITE CKSUM
        solaris     ONLINE       0     0     0
          c0t1d0    ONLINE       0     0     0

errors: No known data errors
bash-3.00# 


bash-3.00# zfs list
NAME                        USED  AVAIL  REFER  MOUNTPOINT
solaris                    11.7G  5.02G  3.27G  /solaris
solaris/d100               1.64G  5.02G  1.64G  /solaris/d100
solaris/[EMAIL PROTECTED]      0      -  1.64G  -
solaris/[EMAIL PROTECTED]      0      -  1.64G  -
solaris/d100-copy          12.0M  5.02G  12.0M  /solaris/d100-copy
solaris/d100-copy1         1.31G  5.02G  1.31G  /solaris/d100-copy1
solaris/d101                348M  5.02G  15.3M  /solaris/d101
solaris/[EMAIL PROTECTED]   333M      -   348M  -
solaris/[EMAIL PROTECTED]      0      -  15.3M  -
solaris/d101-copy          15.3M  5.02G  15.3M  /solaris/d101-copy
solaris/testws             5.13G  5.02G  5.13G  /export/testws/
bash-3.00# 


File systems solaris/d100 and solaris/d100-copy1 contain the same data.

bash-3.00# ls -l /solaris/d100 | wc -l
 163
bash-3.00# ls -l /solaris/d100-copy1 | wc -l
 163
bash-3.00# 

bash-3.00# gtar cvf /solaris/2.tar /solaris/d100-copy1
bash-3.00# gtar cvf /solaris/1.tar /solaris/d100
bash-3.00# ls -l /solaris/1.tar
-rw-r--r--   1 root     other    1755699200 Mar 22 16:15 /solaris/1.tar
bash-3.00# ls -l /solaris/2.tar
-rw-r--r--   1 root     other    1755699200 Mar 22 16:19 /solaris/2.tar
bash-3.00# 


bash-3.00# zdb -v solaris/d100 > /tmp/1
bash-3.00# zdb -v solaris/d100-copy1 > /tmp/2
bash-3.00# diff -u /tmp/1 /tmp/2 
--- /tmp/1  Thu Mar 22 16:41:52 2007
+++ /tmp/2  Thu Mar 22 16:41:57 2007
@@ -1,7 +1,7 @@
-Dataset solaris/d100 [ZPL], ID 189, cr_txg 779704, 1.64G, 807 objects
+Dataset solaris/d100-copy1 [ZPL], ID 128, cr_txg 831226, 1.31G, 807 objects
 
     Object  lvl   iblk   dblk  lsize  asize  type
-         0    7    16K    16K   416K   242K  DMU dnode
+         0    7    16K    16K   416K   239K  DMU dnode
          1    1    16K    512    512     1K  ZFS master node
          2    1    16K    512    512     1K  ZFS delete queue
          3    1    16K  10.5K  10.5K     4K  ZFS directory
@@ -807,5 +807,5 @@
        806    1    16K  66.5K  66.5K  66.5K  ZFS plain file
        807    1    16K  67.5K  67.5K  67.5K  ZFS plain file
        808    1    16K  24.5K  24.5K  24.5K  ZFS plain file
-       809    3    16K   128K  1.58G  1.58G  ZFS plain file
+       809    3    16K   128K  1.58G  1.24G  ZFS plain file
 
bash-3.00# 

bash-3.00# find /solaris/d100-copy1/ -inum 809 -ls
  809 1304748 -rw-r--r--   1 root     other    1692205056 Mar 22 16:05 /solaris/d100-copy1/m1
bash-3.00# find /solaris/d100/ -inum 809 -ls
  809 1652825 -rw-r--r--   1 root     other    1692205056 Mar 22 16:05 /solaris/d100/m1
bash-3.00# diff -b /solaris/d100/m1 /solaris/d100-copy1/m1
bash-3.00# 

While lsize is the same for both files, asize is smaller for the second one.
Why is that? When is this possible? Both file systems have compression turned off 
and default recordsize. Diff claims both files are the same.

Any idea?
 
 


Re[2]: [zfs-discuss] Why replacing a drive generates writes to other disks?

2007-03-22 Thread Robert Milkowski
Hello Matthew,

Wednesday, March 14, 2007, 9:00:28 AM, you wrote:

MA Robert Milkowski wrote:
 Hello zfs-discuss,
 
   Subject says it all.
 
 
 I first checked - no IO activity at all to the pool named thumper-2.
 So I started replacing one drive with 'zpool replace thumper-2 c7t7d0
 c4t1d0'.
 
 Now the question is why am I seeing writes to other disks than c7t7d0?

MA Are you *sure* that nothing else is going on?  Not even atime updates?
MA Do 'zfs umount -a' and see if there's still writes to other disks. 
MA There may be some small amount of writes to update some metadata with 
MA the resilvering status.

MA I just did this yesterday on a raidz2 pool and didn't see writes to the
MA other disks.  Maybe the code has changed since you tried?


There is data in thumper-8, but no activity at all is happening on the
server's local disks. I checked with iostat and zpool iostat
for dozens of seconds: no I/O at all.


bash-3.00# zpool replace thumper-8 c7t7d0 c4t1d0

bash-3.00# zpool status
  pool: misc
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        misc          ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c5t0d0s4  ONLINE       0     0     0
            c5t4d0s4  ONLINE       0     0     0

errors: No known data errors

  pool: thumper-8
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress, 0.12% done, 9h1m to go
config:

        NAME          STATE     READ WRITE CKSUM
        thumper-8     ONLINE       0     0     0
          raidz2      ONLINE       0     0     0
            c0t0d0    ONLINE       0     0     0
            c1t0d0    ONLINE       0     0     0
            c4t0d0    ONLINE       0     0     0
            c6t0d0    ONLINE       0     0     0
            c7t0d0    ONLINE       0     0     0
            c0t1d0    ONLINE       0     0     0
            c1t1d0    ONLINE       0     0     0
            c5t1d0    ONLINE       0     0     0
            c6t1d0    ONLINE       0     0     0
            c7t1d0    ONLINE       0     0     0
            c0t2d0    ONLINE       0     0     0
          raidz2      ONLINE       0     0     0
            c1t2d0    ONLINE       0     0     0
            c5t2d0    ONLINE       0     0     0
            c6t2d0    ONLINE       0     0     0
            c7t2d0    ONLINE       0     0     0
            c0t4d0    ONLINE       0     0     0
            c1t4d0    ONLINE       0     0     0
            c4t4d0    ONLINE       0     0     0
            c6t4d0    ONLINE       0     0     0
            c7t4d0    ONLINE       0     0     0
            c0t3d0    ONLINE       0     0     0
            c1t3d0    ONLINE       0     0     0
          raidz2      ONLINE       0     0     0
            c4t3d0    ONLINE       0     0     0
            c5t3d0    ONLINE       0     0     0
            c6t3d0    ONLINE       0     0     0
            c7t3d0    ONLINE       0     0     0
            c0t5d0    ONLINE       0     0     0
            c1t5d0    ONLINE       0     0     0
            c4t5d0    ONLINE       0     0     0
            c5t5d0    ONLINE       0     0     0
            c6t5d0    ONLINE       0     0     0
            c7t5d0    ONLINE       0     0     0
            c0t6d0    ONLINE       0     0     0
          raidz2      ONLINE       0     0     0
            c1t6d0    ONLINE       0     0     0
            c4t6d0    ONLINE       0     0     0
            c5t6d0    ONLINE       0     0     0
            c6t6d0    ONLINE       0     0     0
            c7t6d0    ONLINE       0     0     0
            c0t7d0    ONLINE       0     0     0
            c1t7d0    ONLINE       0     0     0
            c4t7d0    ONLINE       0     0     0
            c5t7d0    ONLINE       0     0     0
            c6t7d0    ONLINE       0     0     0
            spare     ONLINE       0     0     0
              c7t7d0  ONLINE       0     0     0
              c4t1d0  ONLINE       0     0     0
        spares
          c4t1d0      INUSE     currently in use
          c4t2d0      AVAIL

errors: No known data errors
bash-3.00#


bash-3.00# iostat -xnz 1
[stripped out first output]
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
   58.2    0.0  195.2    0.0  0.0  0.1    0.3    1.6   2  10 c4t0d0
   60.2    0.0  218.7    0.0  0.0  0.1    0.6    1.7   4  10 c6t0d0
   72.2    0.0  239.8    0.0  0.0  0.1    0.2    1.4   1  10 c0t0d0
    0.0  335.1    0.0 1114.8  0.0  0.1    0.1    0.2   2   5 c4t1d0
   62.2    0.0  221.2    0.0  0.0  0.1    0.0    1.5   0   9 c6t1d0
   66.2    0.0  207.2    0.0  0.0  0.1    0.0    1.4   0   9 c0t1d0
   66.2    0.0  214.2    0.0  0.0  0.1    0.1    1.5   0  10 c5t1d0
   16.1    0.0   45.7    0.0  0.0  0.0    0.0    1.3   0   2 

Re: [zfs-discuss] The value of validating your backups...

2007-03-22 Thread Brian Hechinger
On Tue, Mar 20, 2007 at 03:35:41PM -0400, Jim Mauro wrote:
 
 http://www.cnn.com/2007/US/03/20/lost.data.ap/index.html

I worked (briefly; I left right after this, as there was no point working there) at a place
that lost the HDD in its main server (small company).

That's ok!  We have backups!

Guy had been backing up the server every day for 10 years.  To the same tapes.
Never bought new tapes.

That's when I decided working for them wasn't a good idea.  ;)

-brian
-- 
The reason I don't use Gnome: every single other window manager I know of is
very powerfully extensible, where you can switch actions to different mouse
buttons. Guess which one is not, because it might confuse the poor users?
Here's a hint: it's not the small and fast one.--Linus


[zfs-discuss] Overview (rollup) of recent activity on zfs-discuss

2007-03-22 Thread Eric Boutilier


Special note: Here's a question I get a lot:
Q: Why did you miss (or miss us) last time?
A: This is a misconception that stems from the variability of the forums I
   cover each (semi-monthly) period. The set is not static; rather, it's
   based primarily on traffic volume. To illustrate, this period
   website-discuss didn't make the cut, but in previous periods it did.
---

For background on what this is, see:
http://www.opensolaris.org/jive/message.jspa?messageID=24416#24416
http://www.opensolaris.org/jive/message.jspa?messageID=25200#25200

=
zfs-discuss 03/01 - 03/15
=

Size of all threads during period:

Thread size Topic
--- -
 36   update on zfs boot support
 19   writes lost with zfs !
 13   ZFS/UFS layout for 4 disk servers
 12   C'mon ARC, stay small...
 10   X2200-M2
  9   ZFS stalling problem
  8   Why number of NFS threads jumps to the max value?
  8   Layout for multiple large streaming writes.
  7   FAULTED ZFS volume even though it is mirrored
  5   ZFS party - PANIC collection
  5   ZFS and Solaris as a VMWare guest
  5   Cluster File System Use Cases
  4   renumbering and its potential side effects.
  4   Recommended setup?
  4   How to interrupt a zpool scrub?
  3   recover user error
  3   old zfs pool and mounting
  3   nv59 + HA-ZFS
  3   ZFS configuration on X4500 for reliability
  3   Promise Ultra133TX2?
  3   Equavilent to chmod 1777 as ZFS ACl
  2   ZFS info
  2   DMU interfaces
  2   Checksum errors in storage pool
  1   zpool(1M): import -a?
  1   zfs and iscsi: cannot open device: I/O error
  1   system wont boot after zfs
  1   mem vs numbers of file systems?
  1   large numbers of zfs filesystems
  1   anyone want a Solaris 10u3 core file...
  1   [OT] Multipathing on Mac OS X
  1   Why replacing a drive generates writes to other disks?
  1   Question: Zpool replace on a disk which is getting errors
  1   Hourly Consultant Needed
  1   File System Filter Driver??
  1   FAULTED ZFS volume even though it ismirrored
  1   CSI:Munich - How to save the world with ZFS and 12 USB Sticks
  1   A little different look at filesystems ... Just looking for ideas


Posting activity by person for period:

# of posts  By
--   --
  9   rmilkowski at task.gda.pl (robert milkowski)
  7   roch.bourbonnais at sun.com (roch - pae)
  7   james.mauro at sun.com (jim mauro)
  6   richard.elling at sun.com (richard elling)
  6   lin.ling at sun.com (lin ling)
  6   ddunham at taos.com (darren dunham)
  5   manoj at clusterfs.com (manoj joseph)
  5   leon.is.here at gmail.com (leon koll)
  5   fcusack at fcusack.com (frank cusack)
  4   wonko at 4amlunch.net (brian hechinger)
  4   wade.stuart at fallon.com (wade stuart)
  4   toby at smartgames.ca (toby thain)
  4   selim.daoud at gmail.com (selim daoud)
  4   matthew.ahrens at sun.com (matthew ahrens)
  4   malachid at gmail.com (=?iso-8859-1?q?malachi_de_=c6lfweald?=)
  4   johansen-osdev at sun.com (johansen-osdev)
  3   stuart at iseek.com.au (stuart low)
  3   spencer.shepler at sun.com (spencer shepler)
  3   rayrayson at gmail.com (rayson ho)
  3   opensolaris at dotd.com (jesse defer)
  3   mattbreedlove at yahoo.com (matt b)
  3   lscharf at vt.edu (luke scharf)
  3   jasonjwwilliams at gmail.com (jason j. w. williams)
  3   jamesd.wi at gmail.com (james dickens)
  3   ginoruopolo at hotmail.com (gino ruopolo)
  3   anjum at qp.com.qa (ayaz anjum)
  2   werschlein at interlace.ch (thomas werschlein)
  2   tmcmahon2 at yahoo.com (torrey mcmahon)
  2   tim.foster at sun.com (tim foster)
  2   sanjeev.bagewadi at sun.com (sanjeev bagewadi)
  2   roch.bourbonnais at sun.com (roch bourbonnais)
  2   rasputnik at gmail.com (dick davies)
  2   przemolicc at poczta.fm (przemolicc)
  2   mgerdts at gmail.com (mike gerdts)
  2   matty91 at gmail.com (matty)
  2   lori.alt at sun.com (lori alt)
  2   jeff.bonwick at sun.com (jeff bonwick)
  2   ivwang at mail2000.com.tw (ivan wang)
  2   ian at ianshome.com (ian collins)
  2   gary at genashor.com (gary gendel)
  2   erblichs at earthlink.net (erblichs)
  2   darren.reed at sun.com (darren reed)

Re: [zfs-discuss] asize is 300MB smaller than lsize - why?

2007-03-22 Thread Matthew Ahrens

Robert Milkowski wrote:

While lsize is the same for both files, asize is smaller for the second
one. Why is that? When is this possible? Both file systems have 
compression turned off and default recordsize. Diff claims both files
are the same.


Metadata (eg, DMU dnode, and indirect blocks for ZFS plain file,
which you can see broken out by using more -b's) is always compressed.
Because the metadata is necessarily different (there are different block
pointers, also the object numbers could be allocated differently, though
not in your situation), it can compress different amounts.

So, this is always possible, and in fact likely.

--matt


Re: [zfs-discuss] migration/acl4 problem

2007-03-22 Thread Mark Shellenbaum




There is one big difference which you see here.  ZFS always honors the 
user's umask, and that is why the file was created with 644 permissions 
rather than 664 as on UFS.  ZFS always has to apply the user's umask 
because of POSIX.


Wow, that's a big show stopper! If I tell the users, that after the
transition they have to toggle their umask before/after writing to
certain directories or need to do a chmod, I'm sure they wanna hang me
right on the next tree and wanna get their OS changed to Linux/Windooze...



Only if your goal is to ignore a user's intent on what permissions their 
files should be created with.  Think about users who set their umask to 
077.  They will be upset when their files are created with a more 
permissive mode.  The ZFS way is much more secure.


What is your real desired goal?  Are you just wanting anybody in a 
specific group to be able to read,write all files in a certain directory 
tree?  If so, then there are other ways to achieve this, with file and 
directory inheritance.



Isn't there a flag/property for zfs, to get back the old behavior
or to enable POSIX-ACLs instead of zfs-ACLs?
A 'force_directory_create_mode=0770,force_file_create_mode=0660'
(like for samba shares) property would be even better - no need to fight
with ACLs...


That would be bad.  That would mean that every file in a file system 
would be forced to be created with a forced set of permissions.


  -Mark




[zfs-discuss] 6410 expansion shelf

2007-03-22 Thread Frank Cusack

Does anyone have a 6140 expansion shelf that they can hook directly to
a host?  Just wondering if this configuration works.  Previously I
thought the expansion connector was proprietary, but now I see it's
just Fibre Channel.

I tried this before with a 3511 and it kind of worked but ultimately
had various problems and I had to give up on it.

Hoping to avoid the cost of the raid controller.

-frank


[zfs-discuss] ZFS layout for 10 disk?

2007-03-22 Thread John-Paul Drawneek
I've got a 12-disk system, all 18 GB drives.
Two are mirrored for boot; now what should I do with the other ten?

The storage is to be used for user space, web stuff, and to store anything else 
(a dump for data).

I could do 5 mirrors, but that's wasting quite a bit of space.

I was thinking about raidz2, as it's almost as reliable and better for space.

Should I do a 9-disk raidz2 with a spare, or could I do two raidz2 vdevs to get a bit 
of performance?

I've only done tests with striped mirrors, which seem to give a boost, so is it 
worth it with raidz2 vdevs this small?
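
To put rough numbers on the space side of this, here is a back-of-the-envelope Python sketch. It assumes ten 18 GB data disks, counts a mirror pair as one disk of usable space and a raidz2 vdev as its disk count minus two, and ignores metadata and filesystem overhead. The function names are made up for illustration.

```python
DISK_GB = 18  # per-disk size from the post

def mirror_capacity(pairs, disk_gb=DISK_GB):
    """Usable space of striped two-way mirrors: one disk's worth per pair."""
    return pairs * disk_gb

def raidz2_capacity(vdevs, disks_per_vdev, disk_gb=DISK_GB):
    """Usable space of raidz2 vdevs: two disks per vdev go to parity."""
    return vdevs * (disks_per_vdev - 2) * disk_gb

layouts = {
    "5 x 2-way mirror": mirror_capacity(5),
    "1 x 9-disk raidz2 + 1 spare": raidz2_capacity(1, 9),
    "2 x 5-disk raidz2": raidz2_capacity(2, 5),
}
for name, usable_gb in layouts.items():
    print(f"{name}: ~{usable_gb} GB usable")
```

That works out to roughly 90 GB for the mirrors, 126 GB for the single raidz2 with a spare, and 108 GB for two raidz2 vdevs. It says nothing about performance, which would need to be measured.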

Thanks for any help
 
 


Re: [zfs-discuss] migration/acl4 problem

2007-03-22 Thread Peter Tribble

On 3/22/07, Mark Shellenbaum [EMAIL PROTECTED] wrote:


 Wow, that's a big show stopper! If I tell the users, that after the
 transition they have to toggle their umask before/after writing to
 certain directories or need to do a chmod, I'm sure they wanna hang me
 right on the next tree and wanna get their OS changed to Linux/Windooze...


Only if your goal is to ignore a user's intent on what permissions their
files should be created with.  Think about users who set their umask to
077.  They will be upset when their files are created with a more
permissive mode.  The ZFS way is much more secure.


One of the reasons for doing this is explicitly to override the user's umask.

Both up and down. Which allows users to have a strict umask while still allowing
shared workspaces to function correctly. Or for them to have a generous
umask while ensuring secure areas stay secure. In other words, the aim is to
override any mistakes that users might make by enforcing policy using ACLs.


What is your real desired goal?  Are you just wanting anybody in a
specific group to be able to read,write all files in a certain directory
tree?  If so, then there are other ways to achieve this, with file and
directory inheritance.


Please explain how. I've been trying to make this work for months with
no success.

The business requirement is that all files in a directory hierarchy be created
mode 660 - read and write by owner and primary group. How do I do
this?

--
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/


Re: [zfs-discuss] migration/acl4 problem

2007-03-22 Thread Mark Shellenbaum




Please explain how. I've been trying to make this work for months with
no success.

The business requirement is that all files in a directory hierarchy be 
created

mode 660 - read and write by owner and primary group. How do I do
this?



# zfs set aclmode=passthrough dataset
# mkdir dir.test

# chmod A+group:somegroup:desired perms:fd:allow dir.test

create files and directories under dir.test.

This should allow anyone in the desired group to read/write all 
files, and the passthrough of aclmode stops chmod(2) from prepending 
deny entries.


  -Mark


Re: [zfs-discuss] migration/acl4 problem

2007-03-22 Thread Peter Tribble

On 3/22/07, Mark Shellenbaum [EMAIL PROTECTED] wrote:



 Please explain how. I've been trying to make this work for months with
 no success.

 The business requirement is that all files in a directory hierarchy be
 created
 mode 660 - read and write by owner and primary group. How do I do
 this?


# zfs set aclmode=passthrough dataset
# mkdir dir.test

# chmod A+group:somegroup:desired perms:fd:allow dir.test

create files and directories under dir.test.

This should allow anyone in the desired group to read/write all
files, and the passthrough of aclmode stops chmod(2) from prepending
deny entries.


This fails in a number of ways.


The apparent permissions do not show group write:

-rw-r--r--+  1 ptribble sysadmin 796 Mar 22 21:11 foo

Related to this, if you transfer the files somewhere else
that doesn't support these ACLs, then you lose the ACL
protection and get the permission bits, which may well
be incorrect.


You have to specify the group. This isn't always viable. The
requirement in at least some cases is that it is the user's primary
group, and will vary between files and directories.

Related to that, if you do a chgrp, the permissions don't get reset.
The ACL isn't rewritten to change the name of the group.

We need the ability for the ACL to apply to the owner and group owner
of the file, not some named group.


The file has an explicit ACL. That's not what we want. We just need
the permissions set according to the rules defined in various policies.

This leads to a number of other issues (in addition to the copy losing
information as described above). Just because it has an ACL, rcp
can't transfer it onto a non-ZFS filesystem:

rcp bentley:/samba/peter/dir.test/foo .
rcp: failed to set acl

And foo doesn't get transferred at all, leading to data loss.

Having an ACL makes it much harder to do an audit to
verify that access is correctly controlled.


Another interesting issue I just noticed in trying to work around the
above problems is that find -acl doesn't give me the files - it
only finds the top-level directory. This is for both zfs and ufs on S10U3
- it works fine for ufs on S10 FCS.


It looks like we're between a rock and a hard place. We want to use
ZFS for one project because of snapshots and data integrity - both
would give us considerable advantages over ufs (not to mention
filesystem size). Unfortunately, this is critical company data and the
access control has to be exactly right all the time: the default
ACLs as implemented in UFS are exactly what we need and work
perfectly.

My next question was going to be what the best way to transfer
an existing set of data to zfs while preserving the ACLs, but it
would appear that isn't even possible.

--
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/


Re[2]: [zfs-discuss] asize is 300MB smaller than lsize - why?

2007-03-22 Thread Robert Milkowski
Hello Matthew,

Thursday, March 22, 2007, 8:07:14 PM, you wrote:

MA Robert Milkowski wrote:
 While lsize is the same for both files, asize is smaller for the second
 one. Why is that? When is this possible? Both file systems have 
 compression turned off and default recordsize. Diff claims both files
 are the same.

MA Metadata (eg, DMU dnode, and indirect blocks for ZFS plain file,
MA which you can see broken out by using more -b's) is always compressed.
MA Because the metadata is necessarily different (there are different block
MA pointers, also the object numbers could be allocated differently, though
MA not in your situation), it can compress different amounts.

MA So, this is always possible, and in fact likely.

Well, I don't know.
DMU in both cases is so small that it doesn't really matter.
Both are the same files (diff confirms that) about 1.6GB in size and
the actual on disk size is more than 20% different. That's really a big
difference just for one large file.

zdb -b (or -bbb) doesn't work here (b56):

bash-3.00# zdb -b solaris/d100 809
Dataset solaris/d100 [ZPL], ID 189, cr_txg 779704, 1.64G, 807 objects
bash-3.00# zdb -bbb solaris/d100 809
Dataset solaris/d100 [ZPL], ID 189, cr_txg 779704, 1.64G, 807 objects
bash-3.00# zdb -bbbvvv solaris/d100 809
Dataset solaris/d100 [ZPL], ID 189, cr_txg 779704, 1.64G, 807 objects


bash-3.00# zdb - solaris/d100 809 > /tmp/a
bash-3.00# zdb - solaris/d100-copy1 809 > /tmp/b
bash-3.00# cat /tmp/a | wc -l
   13070
bash-3.00# cat /tmp/b | wc -l
   10295

bash-3.00# tail -10 /tmp/a
64d0   L0 0:21342:2 2L/2P F=1 B=831385
64d2   L0 0:21344:2 2L/2P F=1 B=831385
64d4   L0 0:21346:2 2L/2P F=1 B=831385
64d6   L0 0:21348:2 2L/2P F=1 B=831385
64d8   L0 0:2134a:2 2L/2P F=1 B=831385
64da   L0 0:2134c:2 2L/2P F=1 B=831385
64dc   L0 0:ea1c:2 2L/2P F=1 B=831388

segment [, 6500) size 1.58G

bash-3.00# tail -10 /tmp/b
64d0   L0 0:116a6:2 2L/2P F=1 B=831417
64d2   L0 0:116a8:2 2L/2P F=1 B=831417
64d4   L0 0:116aa:2 2L/2P F=1 B=831417
64d6   L0 0:116ac:2 2L/2P F=1 B=831417
64d8   L0 0:116ae:2 2L/2P F=1 B=831417
64da   L0 0:116b0:2 2L/2P F=1 B=831417
64dc   L0 0:116b2:2 2L/2P F=1 B=831417

segment [14c4, 2600) size  276M

bash-3.00#

What's the last line about?
Also only /tmp/a has Deadlist entries:
Deadlist: 33 entries, 235K (114K/114K comp)

Item   0: 0:191e0ea00:e00 4000L/e00P F=0 B=831102
Item   1: 0:ea1a2000:800 4000L/800P F=0 B=831388
Item   2: 0:191d58000:1000 4000L/1000P F=0 B=831102
Item   3: 0:2507b2200:1200 4000L/1200P F=0 B=831294
Item   4: 0:191e06200:1200 4000L/1200P F=0 B=831102
Item   5: 0:191e07400:1200 4000L/1200P F=0 B=831102
Item   6: 0:250186000:1000 4000L/1000P F=0 B=831294
Item   7: 0:191e0b800:e00 4000L/e00P F=0 B=831102
Item   8: 0:191e0d800:1200 4000L/1200P F=0 B=831102
Item   9: 0:191e03e00:1200 4000L/1200P F=0 B=831102
Item  10: 0:250188000:1000 4000L/1000P F=0 B=831294
Item  11: 0:191e09800:1200 4000L/1200P F=0 B=831102
Item  12: 0:191e10a00:1200 4000L/1200P F=0 B=831102
Item  13: 0:191e02c00:1200 4000L/1200P F=0 B=831102
Item  14: 0:191e05000:1200 4000L/1200P F=0 B=831102
Item  15: 0:191e08600:1200 4000L/1200P F=0 B=831102
Item  16: 0:2507b3400:e00 4000L/e00P F=0 B=831294
Item  17: 0:191d57000:1000 4000L/1000P F=0 B=831102
Item  18: 0:191d56000:1000 4000L/1000P F=0 B=831102
Item  19: 0:250189000:1000 4000L/1000P F=0 B=831294
Item  20: 0:191d59000:1000 4000L/1000P F=0 B=831102
Item  21: 0:191e0f800:1200 4000L/1200P F=0 B=831102
Item  22: 0:191e12e00:1200 4000L/1200P F=0 B=831102
Item  23: 0:191e11c00:1200 4000L/1200P F=0 B=831102
Item  24: 0:191e0aa00:e00 4000L/e00P F=0 B=831102
Item  25: 0:25339a400:e00 4000L/e00P F=0 B=831342
Item  26: 0:ea1a2800:800 4000L/800P F=0 B=831388
Item  27: 0:ea1a1c00:400 4000L/400P F=0 B=831388
Item  28: 0:ea1a3000:400 4000L/400P F=0 B=831388
Item  29: 0:ea1a3400:400 4000L/400P F=0 B=831388
Item  30: 0:ea1a3800:400 4000L/400P F=0 B=831388
Item  31: 0:ea1a3c00:400 4000L/400P F=0 B=831388
Item  32: 0:ea1a4000:200 400L/200P F=0 B=831388

What are those?


And even if that is to be expected (such a big difference in actual
space utilization) something is far from perfect here. Both file
systems are in the same pool and over 20% difference in size on just
one large file is huge - perhaps some 

Re: [zfs-discuss] asize is 300MB smaller than lsize - why?

2007-03-22 Thread Matthew Ahrens

Robert Milkowski wrote:

What's the last line about?


Ah -- I think that may help explain things.  It may be that your file 
has some runs of zeros in it, which are represented as holes in 
d100-copy1/m1, but as blocks of zeros in the d100/m1.  It begs the 
question, what is this file and how did you create the copy?
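To make the distinction between holes and stored blocks of zeros concrete, here is a rough sketch. It is not ZFS code: the function names and the tiny block size are made up for illustration, and it only models the idea that an all-zero block can either be written out or left as a hole.

```python
def classify_blocks(data: bytes, block_size: int):
    """Label each fixed-size block 'hole' if it is all zeros, else 'data'."""
    labels = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        all_zero = block.count(0) == len(block)
        labels.append("hole" if all_zero else "data")
    return labels


def stored_bytes(data: bytes, block_size: int, keep_zero_blocks: bool) -> int:
    """Bytes that would be stored if zero blocks are kept or left as holes."""
    total = 0
    for index, label in enumerate(classify_blocks(data, block_size)):
        length = min(block_size, len(data) - index * block_size)
        if label == "data" or keep_zero_blocks:
            total += length
    return total


# Hypothetical 16-byte "file" with 4-byte blocks: only the third block has data.
sample = b"\0" * 8 + b"ab" + b"\0" * 6
print(classify_blocks(sample, 4))
print(stored_bytes(sample, 4, keep_zero_blocks=True))
print(stored_bytes(sample, 4, keep_zero_blocks=False))
```

Both representations read back as identical bytes, which is consistent with diff reporting the files as the same while the allocated sizes differ.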



Also only /tmp/a has Deadlist entries:


That's because you have snapshots of d100 but not of d100-copy1, and 
apparently the contents of the d100 fs have changed since the most 
recent snapshot.


--matt


Re[2]: [zfs-discuss] asize is 300MB smaller than lsize - why?

2007-03-22 Thread Robert Milkowski
Hello Matthew,

Friday, March 23, 2007, 12:01:12 AM, you wrote:

MA Robert Milkowski wrote:
 What's the last line about?

MA Ah -- I think that may help explain things.  It may be that your file 
MA has some runs of zeros in it, which are represented as holes in 
MA d100-copy1/m1, but as blocks of zeros in the d100/m1.  It begs the 
MA question, what is this file and how did you create the copy?

This file is full of 0s - it was created by
  dd if=/dev/zero of=/solaris/d100/m1 bs=32k

Then file system solaris/d100 was replicated in a similar way to zfs
send|zfs recv into solaris/d100-copy1.

Now I wonder how holes were created, and why not for the entire file...




 Also only /tmp/a has Deadlist entries:

MA That's because you have snapshots of d100 but not of d100-copy1, and 
MA apparently the contents of the d100 fs have changed since the most 
MA recent snapshot.

thanks for info

-- 
Best regards,
 Robert    mailto:[EMAIL PROTECTED]
   http://milek.blogspot.com



Re: [zfs-discuss] ZFS layout for 10 disk?

2007-03-22 Thread Nicholas Lee

On 3/23/07, John-Paul Drawneek [EMAIL PROTECTED] wrote:
I've got the same consideration at the moment.


Should i do 9 disk raidz2 with a spare, or could i do two raidz2 to get a
bit of performance?

Only done tests with striped mirrors which seems to give it a boost, so is
it worth it with a raidz2 of this small size?



10 disks with two raidz2 pools leaves no room for a spare. You get 6 disks
of storage and can lose any two disks before worrying.

With Raid 10 you get 5 disks of storage and can lose any one disk before
worrying, but get better performance.
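The capacity figures above can be checked with a little arithmetic. This is only a sketch: the helper names are hypothetical, capacity is counted in whole disks, and it assumes equal-sized disks split evenly across groups.

```python
def raidz2_groups(total_disks: int, groups: int, spares: int = 0) -> dict:
    """Usable capacity (in disks) for equal raidz2 groups; 2 parity disks each."""
    per_group = (total_disks - spares) // groups
    return {
        "data_disks": groups * (per_group - 2),
        "spares": total_disks - groups * per_group,
        "failures_tolerated_per_group": 2,
    }


def striped_mirrors(total_disks: int, spares: int = 0) -> dict:
    """Usable capacity (in disks) for two-way mirrors striped together."""
    pairs = (total_disks - spares) // 2
    return {
        "data_disks": pairs,
        "spares": total_disks - 2 * pairs,
        "failures_tolerated_per_group": 1,
    }


print(raidz2_groups(10, groups=2))            # two 5-disk raidz2 groups
print(striped_mirrors(10))                    # five mirrored pairs
print(raidz2_groups(10, groups=1, spares=1))  # one 9-disk raidz2 plus a spare
```

With 10 disks, two raidz2 groups give 6 disks of data and no spare, striped mirrors give 5, and a single 9-disk raidz2 with one spare gives 7.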

I've thought about getting the following:

Raidz2 over 5 disks and Raid 10 over 4 disks with one spare.  Use the mirror
for small random read stuff, like maildir and use the raidz2 pool for iscsi
system images.


See also:
http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#Should_I_Configure_a_RAID-Z.2C_RAID-Z2.2C_or_a_Mirrored_Storage_Pool.3F

and
http://blogs.sun.com/roch/date/20060531 - WHEN TO (AND NOT TO) USE RAID-Z
and
http://blogs.sun.com/relling/entry/zfs_raid_recommendations_space_performance


Nicholas


Re: [zfs-discuss] C'mon ARC, stay small...

2007-03-22 Thread Darren . Reed

Jim Mauro wrote:


All righty...I set c_max to 512MB, c to 512MB, and p to 256MB...

  arc::print -tad
{
 ...
c02e29e8 uint64_t size = 0t299008
c02e29f0 uint64_t p = 0t16588228608
c02e29f8 uint64_t c = 0t33176457216
c02e2a00 uint64_t c_min = 0t1070318720
c02e2a08 uint64_t c_max = 0t33176457216
...
}
  c02e2a08 /Z 0x20000000
arc+0x48:   0x7b9789000 =   0x20000000
  c02e29f8 /Z 0x20000000
arc+0x38:   0x7b9789000 =   0x20000000
  c02e29f0 /Z 0x10000000
arc+0x30:   0x3dcbc4800 =   0x10000000
  arc::print -tad
{
...
c02e29e8 uint64_t size = 0t299008
c02e29f0 uint64_t p = 0t268435456  -- p is 256MB
c02e29f8 uint64_t c = 0t536870912  -- c is 512MB
c02e2a00 uint64_t c_min = 0t1070318720
c02e2a08 uint64_t c_max = 0t536870912  -- c_max is 512MB
...
}

After a few runs of the workload ...

  arc::print -d size
size = 0t536788992
 


Ah - looks like we're out of the woods. The ARC remains clamped at 512MB.
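For anyone repeating the session above, the values written with /Z are simply the byte counts for 512MB and 256MB in hex, and the 0t-prefixed numbers in the printout are the same counts in decimal. A small sketch of the conversion (the helper names are made up for illustration):

```python
MB = 1024 * 1024


def mdb_hex(megabytes: int) -> str:
    """Byte count for a size in MB, in the hex form written with /Z."""
    return hex(megabytes * MB)


def mdb_decimal(megabytes: int) -> str:
    """Same byte count in the 0t-prefixed decimal form shown by ::print -d."""
    return "0t%d" % (megabytes * MB)


for name, size_mb in (("c_max", 512), ("c", 512), ("p", 256)):
    print(name, mdb_hex(size_mb), mdb_decimal(size_mb))

# The old c_max shown as 0x7b9789000 matches the decimal in the first printout.
print(int("7b9789000", 16))
```

So 512MB is 0x20000000 (0t536870912) and 256MB is 0x10000000 (0t268435456), matching the second printout.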



Is there a way to set these fields using /etc/system?
Or does this require a new or modified init script to
run and do the above with each boot?

Darren



[zfs-discuss] /tmp on ZFS?

2007-03-22 Thread Matt B
Is this something that should work? The assumption is that there is a dedicated 
raw SWAP slice and after install /tmp (which will be on /) will be unmounted 
and mounted to zpool/tmp (just like zpool/home)

Thoughts on this?
 
 
This message posted from opensolaris.org


Re: [zfs-discuss] Re: ZFS layout for 10 disk?

2007-03-22 Thread Nicholas Lee

On 3/23/07, John-Paul Drawneek [EMAIL PROTECTED] wrote:


Can i do to Raidz2 over 5 and a Raidz2 over 4 with a spare for them all?
or two Raidz2 over 4 with 2 spare?



This is a question I was planning to ask as well.

Does zfs allow a hot spare to be allocated to multiple pools or as a system
hot spare?  Or would this be done manually with a cron script?

Nicholas


Re: [zfs-discuss] Re: ZFS layout for 10 disk?

2007-03-22 Thread Eric Schrock
On Fri, Mar 23, 2007 at 12:11:38PM +1200, Nicholas Lee wrote:
 On 3/23/07, John-Paul Drawneek [EMAIL PROTECTED] wrote:
 
 Can i do to Raidz2 over 5 and a Raidz2 over 4 with a spare for them all?
 or two Raidz2 over 4 with 2 spare?
 
 
 This is a question I was planning to ask as well.
 
 Does zfs allow a hot spare to be allocated to multiple pools or as a system
 hot spare?  Or would this be done manually with a cron script?
 
 Nicholas

Spares can belong to multiple pools (they can only be actively spared in
one pool, obviously).

- Eric

--
Eric Schrock, Solaris Kernel Development   http://blogs.sun.com/eschrock


Re: [zfs-discuss] /tmp on ZFS?

2007-03-22 Thread Michael Schuster

Matt B wrote:

Is this something that should work? The assumption is that there is a dedicated 
raw SWAP slice and after install /tmp (which will be on /) will be unmounted 
and mounted to zpool/tmp (just like zpool/home)

Thoughts on this?


You are aware that /tmp by default resides in memory these days? Putting
/tmp on disk can have quite a severe impact on performance.


Michael
--
Michael Schuster
Recursion, n.: see 'Recursion'


Re: [zfs-discuss] migration/acl4 problem

2007-03-22 Thread Jens Elkner
On Thu, Mar 22, 2007 at 01:34:15PM -0600, Mark Shellenbaum wrote:
 
 There is one big difference which you see here.  ZFS always honors the
 user's umask, and that is why the file was created with 644 permission
 rather than 664 as UFS did.  ZFS has to always apply the user's umask
 because of POSIX.
 
 Wow, that's a big show stopper! If I tell the users, that after the
 transition they have to toggle their umask before/after writing to
 certain directories or need to do a chmod, I'm sure they wanna hang me
 right on the next tree and wanna get their OS changed to Linux/Windooze...
 
 
 Only if your goal is to ignore a user's intent on what permissions their
 files should be created with.  Think about users who set their umask to
 077.  They will be upset when their files are created with a more
 permissive mode.  The ZFS way is much more secure.

Nope - you're talking about a different thing. I did not say that
these ACLs would be set on every possible fs|directory on the system!
We and several companies I worked for use it to have a shared data dir;
you might think of it as a kind of workgroup-based CVS, where
the members of the owning workgroup are in the role of committers.

The rationale for this is obvious and actually the same as for CVS:
the only thing that counts is what one can find in /data/$workgroup/**
So there is no need to waste time asking who finally has the latest version
of a document, or which version should be used for communication
with non-internal entities, etc. Furthermore, it allows us to reduce
the huge pile of redundant data drastically...

We used this pattern/policy successfully for more than 10 years: for
Windows users it was achieved easily by using samba, on Linux servers
using XFS ACLs and on Solaris servers using UFS ACLs. ZFS breaks it.
And since Solaris has no smbmnt, we can't even get a workaround that
makes more or less sense...

 What is your real desired goal?  Are you just wanting anybody in a 
 specific group to be able to read,write all files in a certain directory 
 tree?  If so, then there are other ways to achieve this, with file and 
 directory inheritance.

Maybe I didn't use the right settings, but I played around with it
before sending the original posting (zfs aclmode intentionally set
to passthrough and added fd flags), but this didn't work either.
So a working example/demo would be helpful ...

 Isn't there a flag/property for zfs, to get back the old behavior
 or to enable POSIX-ACLs instead of zfs-ACLs?
 A 'force_directory_create_mode=0770,force_file_create_mode=0660'
 (like for samba shares) property would be even better - no need to fight
 with ACLs...
 
 That would be bad.  That would mean that every file in a file system 
 would be forced to be created with forced set of permissions.

And that's exactly the business requirement. And even more, it is practical
experience: Assume users always have to change their umask before
writing to /data/workgroup/**. Since people are usually a little bit
lazy and are focused on getting the job done, it doesn't take very long
until they have added umask 007 to their .login/.profile or whatever.
But now, anybody in the same workgroup is also able to read the user's
private data in their $HOME, e.g. $HOME/Mail/* ...

So in theory you might be right, but in practice it turns out, that you
are achieving exactly the opposite...
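The arithmetic behind both sides of this argument can be sketched as follows. It assumes the usual case of a program requesting mode 0666 for a new file and only models plain umask masking, not ACL inheritance. The function names are illustrative.

```python
def created_mode(requested: int, umask: int) -> int:
    """Permission bits a new file gets when only the umask is applied."""
    return requested & ~umask & 0o777


def group_can_read(mode: int) -> bool:
    """True if the group-read bit is set."""
    return bool(mode & 0o040)


def group_can_write(mode: int) -> bool:
    """True if the group-write bit is set."""
    return bool(mode & 0o020)


for umask in (0o022, 0o002, 0o077, 0o007):
    mode = created_mode(0o666, umask)
    print("umask %03o -> mode %03o, group read: %s, group write: %s"
          % (umask, mode, group_can_read(mode), group_can_write(mode)))
```

With umask 022 the file comes out as 644, so other workgroup members cannot write it. With umask 007 it comes out as 660, which suits the shared directory but also makes new files under $HOME group-readable and group-writable, which is the side effect described above.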

Regards,
jel.
-- 
Otto-von-Guericke University http://www.cs.uni-magdeburg.de/
Department of Computer Science   Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany Tel: +49 391 67 12768


Re: [zfs-discuss] ZFS layout for 10 disk?

2007-03-22 Thread Richard Elling

John-Paul Drawneek wrote:

got a 12 disk system - all 18gb
2 mirror for boot, now what to do with the rest?

The storage is to be used for user space, web stuff and to store anything else 
(dump for data).

I could do 5 mirrors, but that's wasting quite a bit of space.

Was thinking about raidz2, as it's almost as reliable and better for space.

Should i do 9 disk raidz2 with a spare, or could i do two raidz2 to get a bit 
of performance?

Only done tests with striped mirrors which seems to give it a boost, so is it 
worth it with a raidz2 of this small size?

Thanks for any help


Consider that 18GByte disks are old and their failure rate will
increase dramatically over the next few years.  Do something to
have redundancy.  If raidz2 works for your workload, I'd go with
that.

BTW, I was just at Fry's, new 500 GByte Seagate drives are $180.
Prices for new disks tend to approach $150 (USD) after which they
are replaced by larger drives and the inventory is price reduced
until gone. A mirror of 2 new disks will be more reliable than any
reasonable combination of 5-year-old disks. Food for thought.
 -- richard


Re: [zfs-discuss] asize is 300MB smaller than lsize - why?

2007-03-22 Thread Matthew Ahrens

Robert Milkowski wrote:
MA Ah -- I think that may help explain things.  It may be that your file 
MA has some runs of zeros in it, which are represented as holes in 
MA d100-copy1/m1, but as blocks of zeros in the d100/m1.  It begs the 
MA question, what is this file and how did you create the copy?


This file is full of 0s - it was created by
  dd if=/dev/zero of=/solaris/d100/m1 bs=32k

Then file system solaris/d100 was replicated in a similar way to zfs
send|zfs recv into solaris/d100-copy1.

Now I wonder how holes were created, and why not for the entire file...


Hmm, that's definitely curious.  What do you mean by a similar way to 
zfs send | zfs recv?  Can you send me the full output of your 'zdb 
- solaris/d100{-copy1} 809'?


--matt


Re: [zfs-discuss] Re: Is there any performance problem with hard links in ZFS?

2007-03-22 Thread Matthew Ahrens

Viktor Turskyi wrote:

Thank you very much for the consultation. The information was very
useful.


You're welcome!


And I have one more question, about ZFS file attributes. I have found
such information: "2^56 - Number of attributes of a file in ZFS", but I
can't find any information on the mechanism for creating such attributes.
The situation is as follows: I want to save a lot of file information not
in a database but in file attributes. Setting the file attributes will be
done by a Perl or C program. Is there any way to do this?


See fsattr(5)

--matt


[zfs-discuss] Re: Re: Proposal: ZFS hotplug support and autoconfiguration

2007-03-22 Thread Anton B. Rang
  Consider a server [with] three drives, A, B, and C, in which A and B are
  mirrored and C is not. Pull out A, B, and C, and re-insert them as A, C,
  and B. If B is slow to come up for some reason, ZFS will see C in place of
  B, and happily reformat it into a mirror of A.  (Or am I reading this
  incorrectly?)
 Again, thanks to devids, the autoreplace code would not kick in here at
 all.  You would end up with an identical pool.

Is this because C would already have a devid? If I insert an unlabeled disk, 
what happens? What if B takes five minutes to spin up? If it never does?

  I hope that there's a way to disable the periodic probing of hot
  spares.  Spinning these drives up often might be highly annoying in
  some environments (though useful in others, since it could also verify
  that the disk is responding normally).
 
 Why is this highly annoying?  The frequency would be rather low, would
 have no effect on performance, and you're gaining the ability to know
 whether your hot spares are actually working.

Well, in my home office it would be highly annoying if I got to hear 
spin-up/spin-down sounds every half hour. The ability to tune the time interval 
would probably make this OK, though. I could live with once a day or once a 
week.

Anton


Re: [zfs-discuss] Re: Re: Proposal: ZFS hotplug

2007-03-22 Thread Darren Dunham
   Consider a server [with] three drives, A, B, and C, in which A and B are
   mirrored and C is not. Pull out A, B, and C, and re-insert them as A, C,
   and B. If B is slow to come up for some reason, ZFS will see C in place of
   B, and happily reformat it into a mirror of A.  (Or am I reading this
   incorrectly?)
  Again, thanks to devids, the autoreplace code would not kick in here at
  all.  You would end up with an identical pool.
 
 Is this because C would already have a devid?

Well, it's because all the members of the ZFS pool have information
about the pool and their place in it.  The path of a member isn't
important.  

 If I insert an unlabeled disk, what happens?

Nothing.  If ZFS can't find a signature on it, it knows it's not part of
a ZFS pool.

 What if B takes five minutes to spin up?

That sounds like something for FMA to deal better with.  It might hang
for a period of time if the driver doesn't respond quickly.

 If it never does?

At some point the device driver needs to respond.  If the device doesn't
become ready, it'll have to time out and be noted as a failure.
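The point about pool membership depending on what is written on each disk rather than on its path can be illustrated with a toy model. This is not the ZFS on-disk label format; the Label class, the slot names and the pool name "tank" are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Label:
    """Toy stand-in for the identifying information stored on a disk."""
    pool: str
    member: str


def assemble(slots: Dict[str, Optional[Label]], pool: str) -> Dict[str, str]:
    """Map each member of `pool` to the slot it was found in.

    Disks with no label (None) or with another pool's label are ignored.
    """
    return {
        label.member: slot
        for slot, label in slots.items()
        if label is not None and label.pool == pool
    }


# A and B are mirrored in "tank"; C belongs to something else.
before = {"slot0": Label("tank", "A"), "slot1": Label("tank", "B"),
          "slot2": Label("other", "C")}
# Re-inserted as A, C, B, plus one unlabeled disk.
after = {"slot0": Label("tank", "A"), "slot1": Label("other", "C"),
         "slot2": Label("tank", "B"), "slot3": None}

print(assemble(before, "tank"))
print(assemble(after, "tank"))
```

In this model the pool has the same members before and after the shuffle; only the slot in which B is found changes, and neither C nor the unlabeled disk is pulled in.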


-- 
Darren Dunham   [EMAIL PROTECTED]
Senior Technical Consultant TAOS    http://www.taos.com/
Got some Dr Pepper?   San Francisco, CA bay area
  This line left intentionally blank to confuse you. 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss