[zfs-discuss] Unusual Resilver Result
Hi,

I just replaced a drive (c12t5d0 in the listing below). For the first 6 hours of the resilver I saw no issues. However, sometime during the last hour of the resilver, the new drive and two others in the same RAID-Z2 stripe threw a couple of checksum errors. Also, sometime in the last hour, two of the other drives in the stripe decided they needed to resilver small amounts of data (128K and 64K respectively). The OS is snv126.

My two questions are:

Should I be worried about these checksum errors?

What caused the small resilverings on c8t5d0 and c11t5d0, which were not replaced or otherwise touched?

Thank you in advance.

-J

  pool: zpool_db_css
 state: ONLINE
status: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: resilver completed after 7h0m with 0 errors on Thu Sep 30 04:59:49 2010
config:

        NAME          STATE     READ WRITE CKSUM
        zpool_db_css  ONLINE       0     0     0
          raidz2-0    ONLINE       0     0     0
            c7t5d0    ONLINE       0     0     0
            c8t5d0    ONLINE       0     0     4  128K resilvered
            c10t5d0   ONLINE       0     0     0
            c11t5d0   ONLINE       0     0     2  64K resilvered
            c12t5d0   ONLINE       0     0     3  61.0G resilvered
            c13t5d0   ONLINE       0     0     0
          raidz2-1    ONLINE       0     0     0
            c7t6d0    ONLINE       0     0     0
            c8t6d0    ONLINE       0     0     0
            c10t6d0   ONLINE       0     0     0
            c11t6d0   ONLINE       0     0     0
            c12t6d0   ONLINE       0     0     0
            c13t6d0   ONLINE       0     0     0
          raidz2-2    ONLINE       0     0     0
            c7t7d0    ONLINE       0     0     0
            c8t7d0    ONLINE       0     0     0
            c10t7d0   ONLINE       0     0     0
            c11t7d0   ONLINE       0     0     0
            c12t7d0   ONLINE       0     0     0
            c13t7d0   ONLINE       0     0     0
        spares
          c13t4d0     AVAIL
          c12t4d0     AVAIL

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Unusual Resilver Result
On Thu, Sep 30, 2010 at 9:08 AM, Jason J. W. Williams jasonjwwilli...@gmail.com wrote:
> Should I be worried about these checksum errors?

Maybe. Your disks, cabling or disk controller are probably having some issue which caused them. Or maybe sunspots are to blame. Run a scrub often and monitor whether there are more, and whether there is a pattern to them. Have backups. Maybe switch hardware one piece at a time to see if that helps.

> What caused the small resilverings on c8t5d0 and c11t5d0 which were not replaced or otherwise touched?

It was the checksum errors. ZFS automatically reconstructed the good data from the pool's redundancy (parity, in the case of raidz2) and replaced the broken blocks with correct data. If you run zpool clear and zpool scrub you will notice these checksum errors have vanished. If they were caused by botched writes, new errors probably should not appear. If they are botched reads, you may see some new ones appearing :(

So, not critical yet but something to keep an eye on.

Tuomas
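To follow the advice above about watching for a pattern in the checksum errors, the CKSUM column of `zpool status` can be tallied after each scrub. The following is only a minimal sketch: it assumes the column layout shown in the original post (plain integer counters), and the function name `devices_with_cksum_errors` and the `SAMPLE` text are illustrative, not part of any ZFS tool.

```python
def devices_with_cksum_errors(status_text):
    """Return {device: cksum_count} for rows whose CKSUM column is nonzero."""
    found = {}
    for line in status_text.splitlines():
        fields = line.split()
        # Device rows look like: NAME STATE READ WRITE CKSUM [notes...]
        if len(fields) < 5 or fields[0] == "NAME":
            continue
        read, write, cksum = fields[2:5]
        # Assumption: counters are plain integers, as in the listing above.
        if not (read.isdigit() and write.isdigit() and cksum.isdigit()):
            continue
        if int(cksum) > 0:
            found[fields[0]] = int(cksum)
    return found


# Excerpt of the config section from the original post.
SAMPLE = """\
    NAME          STATE     READ WRITE CKSUM
    zpool_db_css  ONLINE       0     0     0
      raidz2-0    ONLINE       0     0     0
        c7t5d0    ONLINE       0     0     0
        c8t5d0    ONLINE       0     0     4  128K resilvered
        c11t5d0   ONLINE       0     0     2  64K resilvered
        c12t5d0   ONLINE       0     0     3  61.0G resilvered
"""

print(devices_with_cksum_errors(SAMPLE))
```

Saving the result after each scrub and comparing runs would show whether the same devices (or the same controller and target numbers) keep reappearing.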
Re: [zfs-discuss] Resilver making the system unresponsive
On Thu, Sep 30, 2010 at 1:16 AM, Scott Meilicke scott.meili...@craneaerospace.com wrote:
> Resilver speed has been beaten to death I know, but is there a way to avoid this? For example, is more enterprisy hardware less susceptible to resilvers? This box is used for development VMs, but there is no way I would consider this for production with this kind of performance hit during a resilver.

According to http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6494473 resilver should, in later builds, have some option to limit rebuild speed in order to allow for more I/O during reconstruction, but I haven't found any guides on how to actually make use of this feature. Maybe someone can shed some light on this?
[zfs-discuss] zfs unmount versus umount?
Hi Folks,

Is there any technical difference between using zfs unmount to unmount a ZFS filesystem versus the standard unix umount command? I always use zfs unmount but some of my colleagues still just use umount. Is there any reason to use one over the other?

Thanks.

Doug Linder
--
Learn more about Merchant Link at www.merchantlink.com. THIS MESSAGE IS CONFIDENTIAL. This e-mail message and any attachments are proprietary and confidential information intended only for the use of the recipient(s) named above. If you are not the intended recipient, you may not print, distribute, or copy this message or any attachments. If you have received this communication in error, please notify the sender by return e-mail and delete this message and any attachments from your computer.
Re: [zfs-discuss] zfs unmount versus umount?
On Thu, 30 Sep 2010, Linder, Doug wrote:
> Is there any technical difference between using zfs unmount to unmount a ZFS filesystem versus the standard unix umount command? I always use zfs unmount but some of my colleagues still just use umount. Is there any reason to use one over the other?

No, they're identical. If you use 'zfs umount' the code automatically maps it to 'unmount'. It also maps 'recv' to 'receive' and '-?' to call into the usage function. Here's the relevant code from main():

        /*
         * The 'umount' command is an alias for 'unmount'
         */
        if (strcmp(cmdname, "umount") == 0)
                cmdname = "unmount";

        /*
         * The 'recv' command is an alias for 'receive'
         */
        if (strcmp(cmdname, "recv") == 0)
                cmdname = "receive";

        /*
         * Special case '-?'
         */
        if (strcmp(cmdname, "-?") == 0)
                usage(B_TRUE);
Re: [zfs-discuss] zfs unmount versus umount?
On 30.09.10 15:42, Mark J Musante wrote:
> On Thu, 30 Sep 2010, Linder, Doug wrote:
>> Is there any technical difference between using zfs unmount to unmount a ZFS filesystem versus the standard unix umount command? I always use zfs unmount but some of my colleagues still just use umount. Is there any reason to use one over the other?
>
> No, they're identical. If you use 'zfs umount' the code automatically maps it to 'unmount'. It also maps 'recv' to 'receive' and '-?' to call into the usage function. Here's the relevant code from main():

Mark, I think that wasn't the question, rather, what's the difference between 'zfs u[n]mount' and '/usr/bin/umount'?

HTH
Michael
--
michael.schus...@oracle.com http://blogs.sun.com/recursion
Recursion, n.: see 'Recursion'
Re: [zfs-discuss] zfs unmount versus umount?
Michael Schuster [mailto:michael.schus...@oracle.com] wrote:
> Mark, I think that wasn't the question, rather, what's the difference between 'zfs u[n]mount' and '/usr/bin/umount'?

Yes, that was the question. Sorry I wasn't more clear.

Doug Linder
Re: [zfs-discuss] zfs unmount versus umount?
On Thu, 30 Sep 2010, Darren J Moffat wrote:
> * It can be applied recursively down a ZFS hierarchy

True.

> * It will unshare the filesystems first

Actually, because we use the zfs command to do the unmount, we end up doing the unshare on the filesystem first. See the opensolaris code for details:

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libzfs/common/libzfs_mount.c#zfs_unmount
Re: [zfs-discuss] Resilver making the system unresponsive
On Sep 30, 2010, at 2:32 AM, Tuomas Leikola wrote:
> On Thu, Sep 30, 2010 at 1:16 AM, Scott Meilicke scott.meili...@craneaerospace.com wrote:
>> Resilver speed has been beaten to death I know, but is there a way to avoid this? For example, is more enterprisy hardware less susceptible to resilvers? This box is used for development VMs, but there is no way I would consider this for production with this kind of performance hit during a resilver.
>
> According to http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6494473 resilver should in later builds have some option to limit rebuild speed in order to allow for more IO during reconstruction, but I haven't found any guides on how to actually make use of this feature. Maybe someone can shed some light on this?

Simple. Resilver activity is throttled using a delay method. Nothing to tune here.

In general, if resilver or scrub makes a system seem unresponsive, there is a root cause that is related to the I/O activity. To diagnose, I usually use iostat -zxCn 10 (or similar) and look for unusual asvc_t from a busy disk. One bad disk can ruin performance for the whole pool.

 -- richard

--
OpenStorage Summit, October 25-27, Palo Alto, CA http://nexenta-summit2010.eventbrite.com
ZFS and performance consulting http://www.RichardElling.com
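The "look for unusual asvc_t" step above can be automated roughly as follows. This is a sketch only: it assumes the Solaris `iostat -xn` style column header (`... wsvc_t asvc_t %w %b device`), the `slow_disks` helper and its `factor` threshold are hypothetical, and the sample numbers are invented for illustration rather than taken from any system in this thread.

```python
from statistics import median


def slow_disks(iostat_text, factor=5.0):
    """Return devices whose asvc_t exceeds `factor` times the median asvc_t."""
    header = None
    rows = []
    for line in iostat_text.splitlines():
        fields = line.split()
        # The column header names the fields; remember it for later rows.
        if "asvc_t" in fields and "device" in fields:
            header = fields
            continue
        if header is None or len(fields) != len(header):
            continue
        row = dict(zip(header, fields))
        try:
            rows.append((row["device"], float(row["asvc_t"])))
        except ValueError:
            continue
    if not rows:
        return []
    typical = median(t for _, t in rows)
    return [dev for dev, t in rows if typical > 0 and t > factor * typical]


# Invented sample in the assumed `iostat -xn` layout.
SAMPLE = """\
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
   40.1   10.2 2560.0  640.0  0.0  0.5    0.0    9.8   0  30 c7t5d0
   39.8   10.0 2540.0  632.0  0.0  0.5    0.0   10.4   0  31 c8t5d0
   12.3    3.1  780.0  200.0  0.0  4.9    0.0  318.2   0  99 c11t5d0
   40.5   10.1 2570.0  636.0  0.0  0.5    0.0   10.1   0  30 c12t5d0
"""

print(slow_disks(SAMPLE))
```

Comparing against the median rather than a fixed number of milliseconds is a design choice made here so that the check adapts to whatever service time is normal for the pool under load; a fixed threshold would work equally well if the expected latency is known.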
Re: [zfs-discuss] Resilver making the system unresponsive
If we've found one bad disk, what are our options?

On Thu, Sep 30, 2010 at 10:12 AM, Richard Elling richard.ell...@gmail.com wrote:
> [...]
> One bad disk can ruin performance for the whole pool.
[zfs-discuss] Cannot destroy snapshots: dataset does not exist
Hello,

I have a ZFS filesystem (zpool version 26 on Nexenta CP 3.01) which I'd like to roll back, but it's having an existential crisis. Here's what I see:

    r...@bambi:/# zfs rollback bambi/faline/userd...@autod-2010-09-28
    cannot rollback to 'bambi/faline/userd...@autod-2010-09-28': more recent snapshots exist
    use '-r' to force deletion of the following snapshots:
    bambi/faline/userd...@autoh-2010-09-28t2200
    bambi/faline/userd...@autoh-2010-09-28t0300
    [snip]

That looks right; ZFS sees that there are other snapshots newer than the rollback snapshot. No problem, I'll just run with -r:

    r...@bambi:/# zfs rollback -r bambi/faline/userd...@autod-2010-09-28
    cannot destroy 'bambi/faline/userd...@autoh-2010-09-28t2200': dataset does not exist
    cannot destroy 'bambi/faline/userd...@autoh-2010-09-28t0300': dataset does not exist
    [snip]

Any ideas how to work around this apparent bug?

Best,
Ian
Re: [zfs-discuss] Unusual Resilver Result
Thanks Tuomas. I'll run the scrub. It's an aging X4500.

-J

On Thu, Sep 30, 2010 at 3:25 AM, Tuomas Leikola tuomas.leik...@gmail.com wrote:
> [...]
> So, not critical yet but something to keep an eye on.
Re: [zfs-discuss] Cannot destroy snapshots: dataset does not exist
Hi Ian,

If this is a release prior to b122, you might be running into CR 6860996. Please see this thread for a possible resolution:

http://opensolaris.org/jive/thread.jspa?messageID=493866#493866

Thanks,
Cindy

On 09/30/10 09:34, Ian Levesque wrote:
> [...]
> Any ideas how to work around this apparent bug?
Re: [zfs-discuss] Cannot destroy snapshots: dataset does not exist
Hi Cindy,

Thanks for your email; I noticed that link before sending this to the list. Unfortunately, I'm running b134+ and there aren't any clones reported via zdb.

Ian

On Sep 30, 2010, at 12:33 PM, Cindy Swearingen wrote:
> If this is a release prior to b122, you might be running into CR 6860996.
[zfs-discuss] rpool issue
I tried to do a ZFS root install via flash, which was not successful. Later I did a normal flash installation on my SPARC system, but now zpool import shows the following status:

    # zpool import
      pool: rootpool
        id: 1557419723465062977
     state: UNAVAIL
    status: The pool is formatted using an incompatible version.
    action: The pool cannot be imported. Access the pool on a system running newer software, or recreate the pool from backup.
       see: http://www.sun.com/msg/ZFS-8000-A5
    config:

            rootpool    UNAVAIL  newer version
              c0t1d0s2  ONLINE

      pool: rpool
        id: 5084939592711816445
     state: FAULTED
    status: The pool metadata is corrupted.
    action: The pool cannot be imported due to damaged devices or data. The pool may be active on another system, but can be imported using the '-f' flag.
       see: http://www.sun.com/msg/ZFS-8000-72
    config:

            rpool       FAULTED  corrupted data
              c0t0d0s2  ONLINE

I tried removing the zpool cache and rebooted, and it still shows 2 unavailable pools for import:

    # ls -l /etc/zfs/
    total 0
    #

I'm on UFS-based slices:

    # df -h
    Filesystem          size  used  avail  capacity  Mounted on
    /dev/dsk/c0t1d0s0    52G  6.0G    45G       12%  /
    /devices              0K    0K     0K        0%  /devices
    ctfs                  0K    0K     0K        0%  /system/contract
    proc                  0K    0K     0K        0%  /proc
    mnttab                0K    0K     0K        0%  /etc/mnttab
    swap                 29G  1.2M    29G        1%  /etc/svc/volatile
    objfs                 0K    0K     0K        0%  /system/object
    sharefs               0K    0K     0K        0%  /etc/dfs/sharetab
    fd                    0K    0K     0K        0%  /dev/fd
    swap                512M    0K   512M        0%  /tmp
    swap                 29G   24K    29G        1%  /var/run

--
This message posted from opensolaris.org
[zfs-discuss] ZFS Raidz2 problem, detached drive
I have an X4500 Thumper box with 48x 500GB drives set up in a single pool, split into raidz2 sets of 8 to 10 drives each.

I had a failed disk, which I unconfigured with cfgadm and replaced with no problem, but it wasn't recognised as a Sun drive in format. Unbeknown to me, someone else logged in remotely at the time and issued a zpool replace.

I corrected the system/drive recognition problem, and the drive was seen and partitioned OK, but zpool showed two instances of the same drive: one as failed with corrupt data, the other as online but still in a degraded state, as the spare had been utilized.

I tried a zpool clear on the device, zpool scrub and zpool replace, all with no joy. Then, and I kick myself now, I thought I'd detach and reattach the drive.

The drive detached with no problem and no questions asked. The failed drive is still in zpool status, the online one is gone, and reattaching doesn't seem possible.

As a temporary solution in case of further failures I've attached the new drive as a hot spare.

My question is: how do I reattach the drive to the raidz2 set?

Can I use the replace command to replace the currently used spare with the new drive if I first remove it as a hot spare?

Or do I have to delete the whole pool and restore 24 TB of data... please no!!!
Re: [zfs-discuss] tagged ACL groups: let's just keep digging until we come out the other side
"nw" == Nicolas Williams nicolas.willi...@oracle.com writes:

nw> Keep in mind that Windows lacks a mode_t. We need to interop
nw> with Windows. If a Windows user cannot completely change file
nw> perms because there's a mode_t completely out of their
nw> reach... they'll be frustrated.

well...AIUI this already works very badly, so keep that in mind, too.

In AFS this is handled by most files having 777, and we could do the same if we had an AND-based system. This is both less frustrating and more self-documenting than the current system.

In an AND-based system, some unix users will be able to edit the windows permissions with 'chmod A...'. In shops using older unixes where users can only set mode bits, the rule becomes ``enforced permissions are the lesser of what Unix people and Windows people apply.'' This rule is easy to understand, not frustrating, and readily encourages ad-hoc cooperation (``can you please set everything-everyone on your subtree? we'll handle it in unix.'' / ``can you please set 777 on your subtree? or 770 group windows? we want to add windows silly-sid-permissions.''). This is a big step better than existing systems with subtrees where Unix and Windows users are forced to cooperate.

It would certainly work much better than the current system, where you look at your permissions and don't have any idea whether you've got more, less, or exactly the same permission as what your software is telling you: the crappy autotranslation teaches users that all bets are off.

It would be nice if, under my proposal, we could delete the unix tagspace entirely:

    chpacl '(unix)' chmod -R A- .

but unfortunately, deletion of ACL's is special-cased by Solaris's chmod to ``rewrite ACL's that match the UNIX permissions bits,'' so it would probably have to stay special-cased in a tagspace system.
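The ``lesser of what Unix people and Windows people apply'' rule above amounts to a bitwise AND of two permission sets. The sketch below illustrates only that arithmetic; it assumes, as a deliberate simplification, that the Windows-side ACL can be summarized as a 9-bit mode (real NFSv4-style ACLs carry far more than that), and `effective_mode` is a hypothetical helper, not a ZFS interface.

```python
def effective_mode(unix_mode, acl_mode):
    """AND rule: a permission bit is enforced only if both sides grant it."""
    return unix_mode & acl_mode


def fmt(mode):
    """Render a mode as three octal digits, e.g. 0o664 -> '664'."""
    return format(mode, "03o")


# Unix side left wide open (777): the ACL side alone decides.
print(fmt(effective_mode(0o777, 0o664)))  # 664

# ACL side wide open ("everything: everyone"): the mode bits alone decide.
print(fmt(effective_mode(0o644, 0o777)))  # 644

# Both sides restrict: the result is never more permissive than either.
print(fmt(effective_mode(0o750, 0o664)))  # 640
```

The first two cases correspond to the cooperation patterns described in the message: whichever side sets its permissions fully open hands control to the other side.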
[zfs-discuss] Disk keeps resilvering, was: Replacing a disk never completes
On 09/22/10 04:27 PM, Ben Miller wrote:
> On 09/21/10 09:16 AM, Ben Miller wrote:
>> I had tried a clear a few times with no luck. I just did a detach and that did remove the old disk and has now triggered another resilver which hopefully works. I had tried a remove rather than a detach before, but that doesn't work on raidz2...
>>
>> thanks,
>> Ben
>
> I made some progress. That resilver completed with 4 errors. I cleared those and still had the one error metadata:0x0 so I started a scrub. The scrub restarted the resilver on c4t0d0 again though! There currently are no errors anyway, but the resilver will be running for the next day+. Is this another bug or will doing a scrub eventually lead to a scrub of the pool instead of the resilver?
>
> Ben

Well, not much progress. The one permanent error metadata:0x0 came back. And the disk keeps wanting to resilver when trying to do a scrub. Now after the last resilver I have more checksum errors on the pool, but not on any disks:

        NAME        STATE     READ WRITE CKSUM
        pool2       ONLINE       0     0    37
        ...
          raidz2-1  ONLINE       0     0    74

All other checksum totals are 0. So three problems:

1. How to get the disk to stop resilvering?

2. How do you get checksum errors on the pool, but no disk is identified? If I clear them and let the resilver go again, more checksum errors appear. So how to get rid of these errors?

3. How to get rid of the metadata:0x0 error? I'm currently destroying old snapshots (though that bug was fixed quite a while ago and I'm running b134). I can try unmounting filesystems and remounting next (all are currently mounted). I can also schedule a reboot for next week if anyone thinks that would help.

thanks,
Ben
[zfs-discuss] dedup status
Hi all

I just tested dedup on this test box running OpenIndiana (147) storing bacula backups, and did some more testing on some datasets with ISO images. The results so far show that removing a 30GB deduped dataset is done in a matter of minutes, which is not the case with 134 (where it may take hours). The tests also show that the write speed to the pool is low, very low, if dedup is enabled. This is a box with a 3GHz Core 2 Duo, 8 gigs of RAM, eight 2TB drives and an 80GB X25-M for the SLOG (4 gigs) and L2ARC (the rest of it).

So far I will conclude that dedup should be useful if storage capacity is crucial, but not if performance is taken into consideration. Mind, this is not a high-end box, but still, I think the numbers show something.

Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum is presented intelligibly. It is an elementary imperative for all educators to avoid excessive use of idioms of foreign origin. In most cases, adequate and relevant synonyms exist in Norwegian.
Re: [zfs-discuss] tagged ACL groups: let's just keep digging until we come out the other side
On Thu, Sep 30, 2010 at 02:55:26PM -0400, Miles Nordin wrote:
> "nw" == Nicolas Williams nicolas.willi...@oracle.com writes:
>
> nw> Keep in mind that Windows lacks a mode_t. We need to interop
> nw> with Windows. If a Windows user cannot completely change file
> nw> perms because there's a mode_t completely out of their
> nw> reach... they'll be frustrated.
>
> well...AIUI this already works very badly, so keep that in mind, too.
>
> In AFS this is handled by most files having 777, and we could do the same if we had an AND-based system. This is both less frustrating and more self-documenting than the current system.
>
> In an AND-based system, some unix users will be able to edit the windows permissions with 'chmod A...'. In shops using older unixes where users can only set mode bits, the rule becomes ``enforced permissions are the lesser of what Unix people and Windows people apply.'' This rule is easy to understand, not frustrating, and readily encourages ad-hoc cooperation (``can you please set everything-everyone on your subtree? we'll handle it in unix.'' / ``can you please set 777 on your subtree? or 770 group windows? we want to add windows silly-sid-permissions.''). This is a big step better than existing systems with subtrees where Unix and Windows users are forced to cooperate.

Consider this chronologically-ordered sequence of events:

1) File is created via Windows, gets SMB/ZFS/NFSv4-style ACL, including inheritable ACEs. A mode computed from this ACL might be 664, say.

2) A Unix user does chmod(644) on that file, and one way or another this effectively reduces permissions otherwise granted by the ACL.

3) Another Windows user now fails to get write perm that they should have, so they complain, and then the owner tries to view/change the ACL from a Windows desktop.

Now what?

Can the user in (3) fix the permissions from Windows? For that to be possible the mode must implicitly get recomputed when the ACL is modified.

What if (2) happens again?

But, OK, this is a problem no matter what, whether we do groupmasking, discard, or keep mode separate from the ACL and AND the two.

ZFS does, in fact, keep a separate mode, and it does recompute it when ACLs are modified. So this may just be a matter of doing the AND thing and not touching the ACL on chmod. Is that what you have in mind?

> It would certainly work much better than the current system, where you look at your permissions and don't have any idea whether you've got more, less, or exactly the same permission as what your software is telling you: the crappy autotranslation teaches users that all bets are off.

No, currently the permissions you look at reflect the ACL (with the group bits being the max of all non-owner@ and non-everyone@ ACEs).

> It would be nice if, under my proposal, we could delete the unix tagspace entirely:
>
>     chpacl '(unix)' chmod -R A- .

Huh?

> but unfortunately, deletion of ACL's is special-cased by Solaris's chmod to ``rewrite ACL's that match the UNIX permissions bits,'' so it would probably have to stay special-cased in a tagspace system.

Nico
--
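The statement above that the displayed mode reflects the ACL, with the group bits being the max over all non-owner@ and non-everyone@ ACEs, can be illustrated with a toy model. This is a sketch under loud assumptions: ACEs are reduced to allow-only (who, rwx) pairs, "max" is interpreted as the union of the granted bits, and `mode_from_acl` and the `user:alice` principal are hypothetical. The real ZFS computation handles deny ACEs and many more permission bits.

```python
# Permission bits for one rwx triplet.
R, W, X = 4, 2, 1


def mode_from_acl(aces):
    """Derive a 9-bit mode from simplified allow-only ACEs.

    aces: list of (who, bits) pairs, where who is "owner@", "everyone@",
    or any other principal, and bits is an rwx triplet (0-7).
    """
    owner = group = other = 0
    for who, bits in aces:
        if who == "owner@":
            owner |= bits
        elif who == "everyone@":
            other |= bits
        else:
            # Assumption: every remaining ACE feeds the group bits, and the
            # union of their bits stands in for the "max" described above.
            group |= bits
    return (owner << 6) | (group << 3) | other


# Hypothetical ACL resembling step 1) of the sequence above.
acl = [
    ("owner@", R | W),
    ("group@", R | W),
    ("user:alice", R),
    ("everyone@", R),
]
print(oct(mode_from_acl(acl)))  # 0o664
```

Under this model the group bits can overstate what any single non-owner principal actually has, which is one reason the mode shown to Unix users is only a summary of the ACL.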
Re: [zfs-discuss] tagged ACL groups: let's just keep digging until we come out the other side
On Thu, Sep 30, 2010 at 03:28:14PM -0500, Nicolas Williams wrote:
> Consider this chronologically-ordered sequence of events:
>
> 1) File is created via Windows, gets SMB/ZFS/NFSv4-style ACL, including inheritable ACEs. A mode computed from this ACL might be 664, say.
>
> 2) A Unix user does chmod(644) on that file, and one way or another this effectively reduces permissions otherwise granted by the ACL.
>
> 3) Another Windows user now fails to get write perm that they should have, so they complain, and then the owner tries to view/change the ACL from a Windows desktop.
>
> Now what?
>
> Can the user in (3) fix the permissions from Windows? For that to be possible the mode must implicitly get recomputed when the ACL is modified.

Also, even if in (3) the user can fix the perms from Windows because we'd recompute the mode from the ACL, the user wouldn't be able to see the effective ACL (as reduced by the mode_t that Windows can't see). The only way to address that is... to do groupmasking. And that gets us back to the problems we had with groupmasking.

Nico
--
Re: [zfs-discuss] zfs unmount versus umount?
On 30/09/2010 14:49, Linder, Doug wrote:
> Michael Schuster [mailto:michael.schus...@oracle.com] wrote:
>> Mark, I think that wasn't the question, rather, what's the difference between 'zfs u[n]mount' and '/usr/bin/umount'?
>
> Yes, that was the question. Sorry I wasn't more clear.

The main differences are that zfs umount will do these things that /usr/bin/umount won't do:

* It can be applied recursively down a ZFS hierarchy
* It will unshare the filesystems first

Both ultimately just call umount2(2).

--
Darren J Moffat
Re: [zfs-discuss] tagged ACL groups: let's just keep digging until we come out the other side
nw> Can the user in (3) fix the permissions from Windows?

no, not under my proposal.

but it sounds like currently people cannot ``fix'' permissions through the quirky autotranslation anyway, certainly not to the point where neither unix nor windows users are confused: windows users are always confused, and unix users don't get to see all the permissions.

nw> Now what?

set the unix perms to 777 as a sign to the unix people to either (a) leave it alone, or (b) learn to use 'chmod A...'. This will actually work: it's not a hand-waving hypothetical that just doesn't play out.

What I provide, which we don't have now, is a way to make:

    /tub/dataset/a subtree
        -rwxrwxrwx                          in old unix
        [working, changeable permissions]   in windows

    /tub/dataset/b subtree
        -rw-r--r--                          in old unix
        [everything: everyone]              in windows, but unix permissions still enforced

this means:

* unix writers and windows writers can cooperate even within a single dataset
* an intuitive warning sign when non-native permissions are in effect
* fewer leaked-data surprises

If you accept that the autotranslation between the two permissions regimes is total shit, which it is, then what I offer is the best you can hope for.

My proposal also generalizes to other permissions autoconversion problems:

* Future ACL translation stupidity that will happen as more bizarre ACL mechanisms are invented, or underspecified parts of current spec make different choices in different OS's.

  - POSIX <-> NFSv4, Darwin <-> NFSv4

    If Apple provides a Darwin <-> NFSv4 translation that's silly, a match for Darwin NFS client IP's in the share string could put these clients into a tagged ACL group.

  - AFP <-> NFSv4

    ACL's can be tagged by protocol for new weird protocols. If [new protocol]'s ACLs are a subset of NFSv4 ACL's, then they can be implemented by the bridge and apply to users who don't go through the bridge.
    The [new protocol] bridge will have an ACLspace all to itself, within which it can be certain nothing but itself will change ACL's, so it can rely on never having to read NFSv4 ACL's that do not match the subset it would feel inclined to write. Unix users will get an everything:everyone or 777 warning that someone else is managing the ACLspace. Yet, Unix users can descend into its private subtrees and muck around with ACL's, and the Unix changes will still get enforced.

    It's easy to search for all the changes made by Unix, vs all the changes made by [new protocol] bridge, and see if some are important. It's easy to delete all of them at once if someone shouldn't have been mucking around from unix, or if the [new protocol] bridge was unleashed on a dataset that wasn't dedicated to it and made a mess.

    This is a case where the [new protocol] bridge is using the ACL's for two related but slightly-orthogonal purposes: to enforce security, and to store metadata. My proposal separates the two.

  - SMB <-> NFSv4, NFSv4 <-> NFSv4

    I get that the NFSv4 ACL's are supposed to match Windows perfectly, but if that turns out to be untrue, Linux and Windows clients could be put in separate ACL groups even though they're both, in theory, using NFSv4 ACL's.

* zones running large software packages that have bizarre or misguided ACL behavior

  ACL's are complicated enough that a lot of programmers will get them wrong. If you have a large, assertion-riddled app that will shit itself if it doesn't see the ACL's it expects, or autoset or autoremove ACL's, or does other stupid things with ACL's, you can put it into a zone and configure an ACL tag on the zone, segregating its ACL-writing from the rest of the system. Yet, its restrictions are still respected.

  If the app were setting ACL's that don't give enough permission, it wouldn't work. but it may have hardcoded crap that stupidly opens up ACL's, or refuses to work if ACL's aren't as open as it thinks they should be.
  Now you can fake it out whenever it calls getacl, but set other ACL's kept secret from it and still return permission denied when you like.

* (optional) a backup mechanism.

  If you make the choice ``global zone ignores ACLgroups with 'zoned' bit set'', then you can run backups in the global zone that won't be stopped by ACL's set by the inner zones, however you can still limit your backup process's access by adding zoned=0 ACL's.

> chpacl '(unix)' chmod -R A- .

nw> Huh?

I think you are confused because you didn't read my proposal because it was too long, or the examples I wrote weren't easy to understand. however if I try to repeat it in small pieces, I think it'll just be even longer and harder to understand than the original.

What's more, if you don't agree that the
Re: [zfs-discuss] dedup status
Can you provide some specifics to see how bad the writes are?
Re: [zfs-discuss] tagged ACL groups: let's just keep digging until we come out the other side
On Thu, Sep 30, 2010 at 08:14:24PM -0400, Miles Nordin wrote:
> Can the user in (3) fix the permissions from Windows?
>
> no, not under my proposal.

Then your proposal is a non-starter. Support for multiple remote filesystem access protocols is key for ZFS and Solaris.

The impedance mismatches between these various protocols mean that we need to make some trade-offs. In this case I think the business (as well as the engineers involved) would assert that being a good SMB server is critical, and that being able to authoritatively edit file permissions via SMB clients is part of what it means to be a good SMB server.

Now, you could argue that we should bring aclmode back and let the user choose which trade-offs to make. And you might propose new values for aclmode or enhancements to the groupmask setting of aclmode.

> but it sounds like currently people cannot ``fix'' permissions through the quirky autotranslation anyway, certainly not to the point where neither unix nor windows users are confused: windows users are always confused, and unix users don't get to see all the permissions.

Thus the current behavior is the same as the old aclmode=discard setting.

> Now what?
>
> set the unix perms to 777 as a sign to the unix people to either (a) leave it alone, or (b) learn to use 'chmod A...'. This will actually work: it's not a hand-waving hypothetical that just doesn't play out.

That's not an option, not for a default behavior anyway.

Nico
--
Re: [zfs-discuss] Resilver making the system unresponsive
Replace it. Resilvering should not be as painful if all your disks are functioning normally.