Re: [zfs-discuss] [zones-discuss] Zones on shared storage - a warning
Update on this one: a workaround, if you will, or rather the more appropriate way to do this, is apparently to use lofiadm(1M) to create a pseudo block device comprising the file hosted on NFS, and to use the created lofi device (e.g. /dev/lofi/1) as the device for zpool create and all subsequent I/O. This did not produce the strange CKSUM errors, e.g.:

osoldev.root./export/home/batschul.= mount -F nfs opteron:/pool/zones /nfszone

osoldev.root./export/home/batschul.= mount -v | grep nfs
opteron:/pool/zones on /nfszone type nfs remote/read/write/setuid/devices/xattr/dev=9080001 on Tue Feb  9 10:37:00 2010

osoldev.root./export/home/batschul.= nfsstat -m
/nfszone from opteron:/pool/zones
 Flags:      vers=4,proto=tcp,sec=sys,hard,intr,link,symlink,acl,rsize=1048576,wsize=1048576,retrans=5,timeo=600
 Attr cache: acregmin=3,acregmax=60,acdirmin=30,acdirmax=60

osoldev.root./export/home/batschul.= mkfile -n 7G /nfszone/remote.file

osoldev.root./export/home/batschul.= ls -la /nfszone
total 28243534
drwxrwxrwx   2 nobody   nobody          6 Feb  9 09:36 .
drwxr-xr-x  30 batschul other          32 Feb  8 22:24 ..
-rw-------   1 nobody   nobody 7516192768 Feb  9 09:36 remote.file

osoldev.root./export/home/batschul.= lofiadm -a /nfszone/remote.file
/dev/lofi/1

osoldev.root./export/home/batschul.= lofiadm
Block Device             File                   Options
/dev/lofi/1              /nfszone/remote.file   -

osoldev.root./export/home/batschul.= zpool create -m /tank/zones/nfszone nfszone /dev/lofi/1
Feb  9 10:50:35 osoldev zfs: [ID 249136 kern.info] created version 22 pool nfszone using 22

osoldev.root./export/home/batschul.= zpool status -v nfszone
  pool: nfszone
 state: ONLINE
 scrub: none requested
config:

        NAME           STATE     READ WRITE CKSUM
        nfszone        ONLINE       0     0     0
          /dev/lofi/1  ONLINE       0     0     0

---
frankB

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] [zones-discuss] Zones on shared storage - a warning
On Fri, 08 Jan 2010 18:33:06 +0100, Mike Gerdts mger...@gmail.com wrote:

> I've written a dtrace script to get the checksums on Solaris 10. Here's what I see with NFSv3 on Solaris 10.

jfyi, I've reproduced it as well, using a Solaris 10 Update 8 SB2000 sparc client and NFSv4. Much like you, I also get READ errors along with the CKSUM errors, which is different from my observation on an ONNV client. Unfortunately your dtrace script did not work for me, i.e. it did not spit out anything :(

cheers
frankB
Re: [zfs-discuss] [zones-discuss] Zones on shared storage - a warning
On Wed, 23 Dec 2009 03:02:47 +0100, Mike Gerdts mger...@gmail.com wrote:

> I've been playing around with zones on NFS a bit and have run into what looks to be a pretty bad snag - ZFS keeps seeing read and/or checksum errors. This exists with S10u8 and OpenSolaris dev build snv_129. This is likely a blocker for anyone thinking of implementing parts of Ed's Zones on Shared Storage: http://hub.opensolaris.org/bin/view/Community+Group+zones/zoss
>
> The OpenSolaris example appears below. The order of events is:
>
> 1) Create a file on NFS, turn it into a zpool
> 2) Configure a zone with the pool as zonepath
> 3) Install the zone, verify that the pool is healthy
> 4) Boot the zone, observe that the pool is sick
>
> [...]
>
> r...@soltrain19# zoneadm -z osol boot
> r...@soltrain19# zpool status osol
>   pool: osol
>  state: DEGRADED
> status: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected.
> action: Determine if the device needs to be replaced, and clear the errors using 'zpool clear' or replace the device with 'zpool replace'.
>    see: http://www.sun.com/msg/ZFS-8000-9P
>  scrub: none requested
> config:
>
>         NAME                 STATE     READ WRITE CKSUM
>         osol                 DEGRADED     0     0     0
>           /mnt/osolzone/root DEGRADED     0     0   117  too many errors
>
> errors: No known data errors

Hey Mike, you're not the only victim of these strange CKSUM errors; I hit the same during my slightly different testing, where I'm NFS-mounting an entire, pre-existing remote file living in the zpool on the NFS server and using that to create a zpool and install zones into it. I've filed today:

6915265 zpools on files (over NFS) accumulate CKSUM errors with no apparent reason

Here's the relevant piece worth investigating out of it (leaving out the actual setup etc.). As in your case, creating the zpool and installing the zone into it still gives a healthy zpool, but immediately after booting the zone, the zpool served over NFS accumulated CKSUM errors.
Of particular interest are the 'cksum_actual' values as reported by Mike for his test case here: http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg33041.html compared to the 'cksum_actual' values I got in the fmdump error output on my test case/system. Note: the NFS server's zpool that is serving and sharing the file we use is healthy.

Zone halted now on my test system, and checking fmdump:

osoldev.batschul./export/home/batschul.= fmdump -eV | grep cksum_actual | sort | uniq -c | sort -n | tail
   2 cksum_actual = 0x4bea1a77300 0xf6decb1097980 0x217874c80a8d9100 0x7cd81ca72df5ccc0
   2 cksum_actual = 0x5c1c805253 0x26fa7270d8d2 0xda52e2079fd74 0x3d2827dd7ee4f21
   6 cksum_actual = 0x28e08467900 0x479d57f76fc80 0x53bca4db5209300 0x983ddbb8c4590e40   *A
   6 cksum_actual = 0x348e6117700 0x765aa1a547b80 0xb1d6d98e59c3d00 0x89715e34fbf9cdc0   *B
   7 cksum_actual = 0x0 0x0 0x0 0x0   *C
  11 cksum_actual = 0x1184cb07d00 0xd2c5aab5fe80 0x69ef5922233f00 0x280934efa6d20f40   *D
  14 cksum_actual = 0x175bb95fc00 0x1767673c6fe00 0xfa9df17c835400 0x7e0aef335f0c7f00   *E
  17 cksum_actual = 0x2eb772bf800 0x5d8641385fc00 0x7cf15b214fea800 0xd4f1025a8e66fe00   *F
  20 cksum_actual = 0xbaddcafe00 0x5dcc54647f00 0x1f82a459c2aa00 0x7f84b11b3fc7f80   *G
  25 cksum_actual = 0x5d6ee57f00 0x178a70d27f80 0x3fc19c3a19500 0x82804bc6ebcfc0

osoldev.root./export/home/batschul.= zpool status -v
  pool: nfszone
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        nfszone     DEGRADED     0     0     0
          /nfszone  DEGRADED     0     0   462  too many errors

errors: No known data errors

==> now compare this with Mike's error output as posted here: http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg33041.html

# fmdump -eV | grep cksum_actual | sort | uniq -c | sort -n | tail
   2 cksum_actual = 0x14c538b06b6 0x2bb571a06ddb0 0x3e05a7c4ac90c62 0x290cbce13fc59dce   *D
   3 cksum_actual = 0x175bb95fc00 0x1767673c6fe00 0xfa9df17c835400 0x7e0aef335f0c7f00   *E
   3 cksum_actual = 0x2eb772bf800 0x5d8641385fc00 0x7cf15b214fea800 0xd4f1025a8e66fe00   *B
   4 cksum_actual = 0x0 0x0 0x0 0x0
   4 cksum_actual = 0x1d32a7b7b00 0x248deaf977d80 0x1e8ea26c8a2e900 0x330107da7c4bcec0
   5 cksum_actual = 0x14b8f7afe6 0x915db8d7f87 0x205dc7979ad73 0x4e0b3a8747b8a8   *C
   6 cksum_actual = 0x1184cb07d00 0xd2c5aab5fe80 0x69ef5922233f00 0x280934efa6d20f40   *A
Re: [zfs-discuss] [zones-discuss] Zones on shared storage - a warning
On Fri, 08 Jan 2010 13:55:13 +0100, Darren J Moffat darr...@opensolaris.org wrote:

> Frank Batschulat (Home) wrote:
>> This just can't be an accident, there must be some coincidence, and thus there's a good chance that these CKSUM errors have a common source, either in ZFS or in NFS?
>
> What are you using for on-the-wire protection with NFS? Is it shared using krb5i, or do you have IPsec configured? If not, I'd recommend trying one of those and seeing if your symptoms change.

Hey Darren, doing krb5i is certainly a good idea for additional protection in general. However, I have some doubts that NFS on-the-wire corruption would produce the exact same wrong checksums in 2 totally different setups and networks, as comparing Mike's and my results showed [see 1].

cheers
frankB

[1]
osoldev.batschul./export/home/batschul.= fmdump -eV | grep cksum_actual | sort | uniq -c | sort -n | tail
   2 cksum_actual = 0x4bea1a77300 0xf6decb1097980 0x217874c80a8d9100 0x7cd81ca72df5ccc0
   2 cksum_actual = 0x5c1c805253 0x26fa7270d8d2 0xda52e2079fd74 0x3d2827dd7ee4f21
   6 cksum_actual = 0x28e08467900 0x479d57f76fc80 0x53bca4db5209300 0x983ddbb8c4590e40   *A
   6 cksum_actual = 0x348e6117700 0x765aa1a547b80 0xb1d6d98e59c3d00 0x89715e34fbf9cdc0   *B
   7 cksum_actual = 0x0 0x0 0x0 0x0   *C
  11 cksum_actual = 0x1184cb07d00 0xd2c5aab5fe80 0x69ef5922233f00 0x280934efa6d20f40   *D
  14 cksum_actual = 0x175bb95fc00 0x1767673c6fe00 0xfa9df17c835400 0x7e0aef335f0c7f00   *E
  17 cksum_actual = 0x2eb772bf800 0x5d8641385fc00 0x7cf15b214fea800 0xd4f1025a8e66fe00   *F
  20 cksum_actual = 0xbaddcafe00 0x5dcc54647f00 0x1f82a459c2aa00 0x7f84b11b3fc7f80   *G
  25 cksum_actual = 0x5d6ee57f00 0x178a70d27f80 0x3fc19c3a19500 0x82804bc6ebcfc0

==> now compare this with Mike's error output as posted here: http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg33041.html

# fmdump -eV | grep cksum_actual | sort | uniq -c | sort -n | tail
   2 cksum_actual = 0x14c538b06b6 0x2bb571a06ddb0 0x3e05a7c4ac90c62 0x290cbce13fc59dce   *D
   3 cksum_actual = 0x175bb95fc00 0x1767673c6fe00
     0xfa9df17c835400 0x7e0aef335f0c7f00   *E
   3 cksum_actual = 0x2eb772bf800 0x5d8641385fc00 0x7cf15b214fea800 0xd4f1025a8e66fe00   *B
   4 cksum_actual = 0x0 0x0 0x0 0x0
   4 cksum_actual = 0x1d32a7b7b00 0x248deaf977d80 0x1e8ea26c8a2e900 0x330107da7c4bcec0
   5 cksum_actual = 0x14b8f7afe6 0x915db8d7f87 0x205dc7979ad73 0x4e0b3a8747b8a8   *C
   6 cksum_actual = 0x1184cb07d00 0xd2c5aab5fe80 0x69ef5922233f00 0x280934efa6d20f40   *A
   6 cksum_actual = 0x348e6117700 0x765aa1a547b80 0xb1d6d98e59c3d00 0x89715e34fbf9cdc0   *F
  16 cksum_actual = 0xbaddcafe00 0x5dcc54647f00 0x1f82a459c2aa00 0x7f84b11b3fc7f80   *G
  48 cksum_actual = 0x5d6ee57f00 0x178a70d27f80 0x3fc19c3a19500 0x82804bc6ebcfc0

Observe that the 'cksum_actual' values causing our CKSUM pool errors (because they mismatch what was expected) are the SAME on 2 totally different client systems against 2 different NFS servers (mine vs. Mike's); see the entries marked with *A to *G.
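The comparison above can be mechanized. A minimal sketch, using a hand-copied subset of the fmdump tuples quoted above (a shell `sort | comm -12` pipeline over the two saved `fmdump -eV` outputs would do the same job):

```python
# Intersect the cksum_actual tuples captured from `fmdump -eV` on the two
# clients. Identical tuples showing up on unrelated client/server pairs
# point at a common software path rather than random on-the-wire corruption.
# The sample values are a small subset of the listings quoted above.
frank = {
    "0x1184cb07d00 0xd2c5aab5fe80 0x69ef5922233f00 0x280934efa6d20f40",
    "0xbaddcafe00 0x5dcc54647f00 0x1f82a459c2aa00 0x7f84b11b3fc7f80",
    "0x4bea1a77300 0xf6decb1097980 0x217874c80a8d9100 0x7cd81ca72df5ccc0",
}
mike = {
    "0x1184cb07d00 0xd2c5aab5fe80 0x69ef5922233f00 0x280934efa6d20f40",
    "0xbaddcafe00 0x5dcc54647f00 0x1f82a459c2aa00 0x7f84b11b3fc7f80",
    "0x14c538b06b6 0x2bb571a06ddb0 0x3e05a7c4ac90c62 0x290cbce13fc59dce",
}

shared = frank & mike          # tuples seen on BOTH clients
for tup in sorted(shared):
    print("seen on BOTH clients:", tup)
```

(It is perhaps also worth noting that entry *G begins with 0xbaddcafe, which looks suspiciously like the Solaris kmem uninitialized-memory fill pattern; that, too, smells like software rather than wire noise.)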
Re: [ufs-discuss] Re: [zfs-discuss] Differences between ZFS and UFS.
On Sat, 30 Dec 2006 18:13:04 +0100, [EMAIL PROTECTED] wrote:

>>> I think removing the ability to use link(2) or unlink(2) on directories would hurt no-one and would make a few things easier.
>>
>> I'd be rather careful here, see the standards implications drafted in 4917742.
>
> The standard gives permission to disallow unlink() on directories:
>
>     The path argument shall not name a directory unless the process has appropriate privileges and the implementation supports using unlink() on directories.
>
> The ZFS implementation disallows it.

I'd be more than happy to see this feature disappear from UFS as well.

---
frankB
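The behavior ZFS chose (and the standard permits) is easy to observe from user land; a minimal sketch, noting that Linux behaves like ZFS here and rejects unlink(2) on a directory regardless of privilege (the exact errno varies by platform):

```python
# unlink(2) on a directory is rejected on implementations that do not
# support it; rmdir(2) is the sanctioned way to remove one. On Linux the
# failure is EISDIR; some other systems return EPERM.
import errno
import os
import tempfile

d = tempfile.mkdtemp()
try:
    os.unlink(d)                              # unlink(2), not rmdir(2)
    raised = None
except OSError as e:
    raised = errno.errorcode.get(e.errno, "?")

print("unlink(dir) failed with:", raised)
os.rmdir(d)                                   # this is the supported path
```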
Re: [ufs-discuss] Re: [zfs-discuss] Differences between ZFS and UFS.
On Sat, 30 Dec 2006 15:50:53 +0100, [EMAIL PROTECTED] wrote:

>>> Link with the target being a directory and the source any file, or only directories? And only as superuser?
>>
>> I'm sorry, I meant unlink(2) here.
>
> Ah, so symmetrical with link(2) to directories. unlink(2) doesn't always work, and rmdir(2) will not remove empty directories with a link count other than 2 (for . and ..). You can't unlink . or ..; though in the early days this was how directories were created: mknod(dir), link(dir, dir/.), link(., dir/..).
>
> I think removing the ability to use link(2) or unlink(2) on directories would hurt no-one and would make a few things easier.

I'd be rather careful here, see the standards implications drafted in 4917742.

---
frankB
[zfs-discuss] Re: [ufs-discuss] Differences between ZFS and UFS.
On Sat, 30 Dec 2006 02:28:49 +0100, Pawel Jakub Dawidek [EMAIL PROTECTED] wrote:

> Hi. Here are some things my file system test suite discovered on Solaris ZFS and UFS. Basically ZFS passes all my tests (about 3000). I see one problem with UFS and two differences:
>
> 1. The link(2) manual page states that privileged processes can make multiple links to a directory. This looks like a general statement, but it's only true for UFS.
>
> 2. UFS allows removing directories via unlink(2); ZFS does not allow this.

Both are actually correct; the standard's wording permits either way:

--- snip link(2)/unlink(2), POSIX IEEE Std 1003.1, 2004 Edition ---
If path1 names a directory, link() shall fail unless the process has appropriate privileges and the implementation supports using link() on directories.
The path argument shall not name a directory unless the process has appropriate privileges and the implementation supports using unlink() on directories.
--- snip end ---

UFS does support link(2)/unlink(2) on directories with appropriate privileges, while ZFS does not. However, it gets interesting when SVID3 comes into play:

--- snip ---
The link(BA_OS) and unlink(BA_OS) descriptions in SVID3 both specify that a process with appropriate privileges is allowed to operate on a directory. We have claimed to conform to SVID3 since Solaris 2.0 and have not announced that we ever plan to EOL SVID3 conformance.
--- snip end ---

> 3. An unsuccessful link(2) can update the file's ctime:
>
> # fstest mkdir foo 0755
> # fstest create foo/bar 0644
> # fstest chown foo/bar 65534 -1
> # ctime1=`fstest stat foo/bar ctime`
> # sleep 1
> # fstest -u 65534 link foo/bar foo/baz   <-- this unsuccessful operation updates ctime
> EACCES
> # ctime2=`fstest stat ${n0} ctime`
> # echo $ctime1 $ctime2
> 1167440797 1167440798

I'd call this a bug. The standard does not precisely mention this case, but at least it says:

--- snip link(2), POSIX IEEE Std 1003.1, 2004 Edition ---
Upon successful completion, link() shall mark for update the st_ctime field of the file.
Also, the st_ctime and st_mtime fields of the directory that contains the new entry shall be marked for update. If link() fails, no link shall be created and the link count of the file shall remain unchanged.
--- snip end ---

Since we fail the "successful completion" part, and since we must stay away from modifying the link count in that case, I think there is a good argument that we should also stay away from updating the ctime.

--
frankB (I'd rather be a forest than a street)
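The principle is easy to check outside the fstest harness; a minimal sketch, provoking EEXIST rather than the EACCES case from the report so that it fails regardless of uid (paths are illustrative):

```python
# A failed link(2) must leave the source file's st_ctime unchanged, by the
# same POSIX clause quoted above: "If link() fails, no link shall be
# created and the link count of the file shall remain unchanged."
import os
import tempfile
import time

d = tempfile.mkdtemp()
src = os.path.join(d, "bar")
dst = os.path.join(d, "baz")
open(src, "w").close()
open(dst, "w").close()               # destination already exists

before = os.stat(src).st_ctime
time.sleep(1.1)                      # make any ctime bump observable
try:
    os.link(src, dst)                # fails with EEXIST; no link created
except FileExistsError:
    pass
after = os.stat(src).st_ctime

print("ctime unchanged:", before == after)
```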
Re: [nfs-discuss] Re: [zfs-discuss] Re: NFS Performance and Tar
On Tue, 10 Oct 2006 01:25:36 +0200, Roch [EMAIL PROTECTED] wrote:

> You tell me? We have 2 issues: can we make 'tar x' over direct attach safe (fsync) and POSIX compliant while staying close to current performance characteristics? In other words, do we have the POSIX leeway to extract files in parallel?

Why fsync(3C)? It is usually more heavyweight than opening the file with O_SYNC, and both provide POSIX synchronized file integrity completion.

---
frankB
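For reference, the two routes to POSIX synchronized I/O file integrity completion being contrasted look like this; a minimal sketch with illustrative paths:

```python
# Route 1: buffered write(2)s followed by one fsync() before close.
# Route 2: O_SYNC at open(2) time, so each write(2) blocks until the data
# and the metadata needed to retrieve it reach stable storage.
# Both satisfy POSIX synchronized file integrity completion; the post
# above argues fsync(3C) is usually the heavier of the two.
import os
import tempfile

d = tempfile.mkdtemp()
path = os.path.join(d, "data.bin")

# Route 1: write buffered, then fsync() once.
fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
os.write(fd, b"payload")
os.fsync(fd)                         # flush file data and metadata
os.close(fd)

# Route 2: O_SYNC makes every write(2) synchronous.
fd = os.open(path, os.O_WRONLY | os.O_SYNC)
os.write(fd, b"payload")             # returns only after stable storage
os.close(fd)

with open(path, "rb") as f:
    data = f.read()
os.unlink(path)
```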