[zfs-discuss] History of EPERM for unlink() of directories on ZFS?
Does someone know the history which led to the EPERM for unlink() of directories on ZFS? Why was this done this way, and not something like allowing the unlink and executing it on the next scrub or remount?

Lionel
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] History of EPERM for unlink() of directories on ZFS?
> Does someone know the history which led to the EPERM for unlink() of directories on ZFS? Why was this done this way, and not something like allowing the unlink and executing it on the next scrub or remount?

It's not about the unlink(), it's about the link() and unlink(). By not allowing link/unlink of directories, you force the filesystem to contain only trees and not graphs. Allowing it would also let you create directories where .. points to a directory whose inode cannot be found, simply because it was just removed.

The support for link() on directories in UFS has always given issues and would create problems fsck couldn't fix. To be honest, I think we should also remove this from all other filesystems, and I think ZFS was created this way because all modern filesystems do it that way.

Casper
Re: [zfs-discuss] (fwd) Re: ZFS NFS service hanging on Sunday
> 2012-06-14 19:11, tpc...@mklab.ph.rhul.ac.uk wrote:
>>> In message 201206141413.q5eedvzq017...@mklab.ph.rhul.ac.uk, tpc...@mklab.ph.rhul.ac.uk writes:
>>>> Memory: 2048M phys mem, 32M free mem, 16G total swap, 16G free swap
>>> My WAG is that your zpool history is hanging due to lack of RAM.
>> Interesting. In the problem state the system is usually quite responsive, e.g. not memory thrashing. Under Linux, which I'm more familiar with, 'used memory' = 'total memory' - 'free memory' refers to physical memory being used for data caching by the kernel (which is still available for processes to allocate as needed) together with memory allocated to processes, as opposed to only physical memory already allocated and therefore really 'used'. Does this mean something different under Solaris?
> Well, it is roughly similar. In Solaris there is a general notion [snipped]

Dear Jim,

Thanks for the detailed explanation of ZFS memory usage. Special thanks also to John D Groenveld for the initial suggestion of a lack-of-RAM problem. Since upping the RAM from 2GB to 4GB the machine has sailed through the last two Sunday mornings w/o problem. I was interested to subsequently discover the Solaris command 'echo ::memstat | mdb -k', which reveals just how much memory ZFS can use.

Best regards
Tom.

--
Tom Crane, Dept. Physics, Royal Holloway, University of London,
Egham Hill, Egham, Surrey, TW20 0EX, England.
Email: T.Crane@rhul dot ac dot uk
Re: [zfs-discuss] (fwd) Re: ZFS NFS service hanging on Sunday
In Solaris, ZFS caches many things, so you should have more RAM. If you set up 16 GB of swap, IMHO, RAM should be higher than 4 GB.

Regards

Sent from my iPad

On Jun 25, 2012, at 5:58, tpc...@mklab.ph.rhul.ac.uk wrote:
> Since upping the RAM from 2GB to 4GB the machine has sailed through the last two Sunday mornings w/o problem. I was interested to subsequently discover the Solaris command 'echo ::memstat | mdb -k', which reveals just how much memory ZFS can use.
> [...]
Re: [zfs-discuss] [developer] History of EPERM for unlink() of directories on ZFS?
I don't know the precise history, but I think it's a mistake to permit direct link() or unlink() of directories. I do note that on BSD (MacOS at least) unlink() returns EPERM if the executing user is not superuser.

I do see that the man page for unlink() says this on illumos:

    The named file is a directory and {PRIV_SYS_LINKDIR} is not asserted
    in the effective set of the calling process, or the filesystem
    implementation does not support unlink() or unlinkat() on directories.

I can't imagine why you'd *ever* want to support unlink() of a *directory* -- what's the use case for it anyway (outside of filesystem repair)?

Garrett D'Amore
garr...@damore.org

On Jun 25, 2012, at 2:23 AM, Lionel Cons wrote:
> Does someone know the history which led to the EPERM for unlink() of directories on ZFS? Why was this done this way, and not something like allowing the unlink and executing it on the next scrub or remount?
>
> Lionel
Re: [zfs-discuss] [developer] History of EPERM for unlink() of directories on ZFS?
The decision to not support link(2) of directories was very deliberate - it is an abomination that never should have been allowed in the first place. My guess is that the behavior of unlink(2) on directories is a direct side-effect of that (if link isn't supported, then why support unlink?). It's also worth noting that ZFS doesn't let you open(2) directories and read(2) from them, something (I believe) UFS does allow.

- Eric

On Mon, Jun 25, 2012 at 10:40 AM, Garrett D'Amore garr...@damore.org wrote:
> I don't know the precise history, but I think it's a mistake to permit direct link() or unlink() of directories. I do note that on BSD (MacOS at least) unlink returns EPERM if the executing user is not superuser.
>
> I do see that the man page for unlink() says this on illumos:
>
>     The named file is a directory and {PRIV_SYS_LINKDIR} is not asserted
>     in the effective set of the calling process, or the filesystem
>     implementation does not support unlink() or unlinkat() on directories.
>
> I can't imagine why you'd *ever* want to support unlink() of a *directory* -- what's the use case for it anyway (outside of filesystem repair)?
>
> Garrett D'Amore
> garr...@damore.org
>
> On Jun 25, 2012, at 2:23 AM, Lionel Cons wrote:
>> Does someone know the history which led to the EPERM for unlink() of directories on ZFS? Why was this done this way, and not something like allowing the unlink and executing it on the next scrub or remount?
>>
>> Lionel

--
Eric Schrock
Delphix
http://blog.delphix.com/eschrock
275 Middlefield Road, Suite 50, Menlo Park, CA 94025
http://www.delphix.com
Re: [zfs-discuss] [developer] History of EPERM for unlink() of directories on ZFS?
> The decision to not support link(2) of directories was very deliberate - it is an abomination that never should have been allowed in the first place. My guess is that the behavior of unlink(2) on directories is a direct side-effect of that (if link isn't supported, then why support unlink?). Also worth noting that ZFS also doesn't let you open(2) directories and read(2) from them, something (I believe) UFS does allow.

In the very beginning, mkdir(1) was a set-uid application; it used mknod to make a directory and then created a link from newdir to newdir/. and from . to newdir/..

Traditionally, this was only allowed for the superuser, and when we added privileges a special privilege was added for it. I think we should remove it for the other filesystems.

Casper
Re: [zfs-discuss] [developer] History of EPERM for unlink() of directories on ZFS?
Eric Schrock eric.schr...@delphix.com wrote:
> The decision to not support link(2) of directories was very deliberate - it is an abomination that never should have been allowed in the first place. My guess is that the behavior of unlink(2) on directories is a direct side-effect of that (if link isn't supported, then why support unlink?). Also worth noting that ZFS also doesn't let you open(2) directories and read(2) from them, something (I believe) UFS does allow.

Link/unlink on directories is not a property of UFS. UFS was designed without that feature; it was added by AT&T with SVr4.

Jörg

--
EMail: jo...@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       j...@cs.tu-berlin.de (uni)
       joerg.schill...@fokus.fraunhofer.de (work)
Blog: http://schily.blogspot.com/
URL: http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
Re: [zfs-discuss] [developer] History of EPERM for unlink() of directories on ZFS?
On Mon, Jun 25, 2012 at 11:19 AM, casper@oracle.com wrote:
> In the very beginning, mkdir(1) was a set-uid application; it used mknod to make a directory and then created a link from newdir to newdir/. and from . to newdir/..

Interesting, guess you learn something new every day :-)

http://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/mkdir.c

Thanks,

- Eric

--
Eric Schrock
Delphix
http://blog.delphix.com/eschrock
275 Middlefield Road, Suite 50, Menlo Park, CA 94025
http://www.delphix.com
Re: [zfs-discuss] [developer] History of EPERM for unlink() of directories on ZFS?
Eric Schrock eric.schr...@delphix.com wrote:
> On Mon, Jun 25, 2012 at 11:19 AM, casper@oracle.com wrote:
>> In the very beginning, mkdir(1) was a set-uid application; it used mknod to make a directory and then created a link from newdir to newdir/. and from . to newdir/..
>
> Interesting, guess you learn something new every day :-)
>
> http://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/mkdir.c

This was a nice way to become superuser in those days. Just run a loop that makes a directory in /tmp, and run another program that tries to remove the directory and replace it by a hardlink to /etc/passwd. mkdir(1) then did a chown of /etc/passwd to you... We tried this, and it took approx. 30 minutes to become superuser this way.

BSD introduced the syscall mkdir(2) to fix this, and this is why UFS was not designed to support link(2) on directories.

BTW: to implement mkdir(2), there was a new struct dirtemplate in the kernel with the following comment:

    /*
     * A virgin directory (no blushing please).
     */
    struct dirtemplate mastertemplate = {
            0, 12, 1, ".",
            0, DIRBLKSIZ - 12, 2, ".."
    };

This is the first time Sun proved to have no humor, as Sun removed that comment...

Jörg

--
EMail: jo...@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       j...@cs.tu-berlin.de (uni)
       joerg.schill...@fokus.fraunhofer.de (work)
Blog: http://schily.blogspot.com/
URL: http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
[zfs-discuss] oddity of slow zfs destroy
I ran into something odd today: zfs destroy -r random/filesystem is mindbogglingly slow. But it seems to me it shouldn't be.

It's slow because the filesystem has two snapshots on it. Presumably, it's busy rolling back the snapshots. But I've already declared by my command line that I DON'T CARE about the contents of the filesystem! Why doesn't zfs simply do:

1. unmount filesystem, if possible (it was possible)
   (1.5 possibly note intent to delete somewhere in the pool records)
2. zero out/free the in-kernel memory in one go
3. update the pool: hey, I deleted the filesystem, all these blocks are now clear

Having this kind of operation take more than even 10 seconds seems like a huge bug to me, yet it can take many minutes. An order of magnitude off. Yuck.
Re: [zfs-discuss] oddity of slow zfs destroy
On Jun 25, 2012, at 10:55 AM, Philip Brown wrote:
> I ran into something odd today: zfs destroy -r random/filesystem is mindbogglingly slow. But it seems to me it shouldn't be. It's slow because the filesystem has two snapshots on it. Presumably, it's busy rolling back the snapshots. But I've already declared by my command line that I DON'T CARE about the contents of the filesystem! Why doesn't zfs simply do:
>
> 1. unmount filesystem, if possible (it was possible)
>    (1.5 possibly note intent to delete somewhere in the pool records)
> 2. zero out/free the in-kernel memory in one go
> 3. update the pool: hey, I deleted the filesystem, all these blocks are now clear
>
> Having this kind of operation take more than even 10 seconds seems like a huge bug to me, yet it can take many minutes. An order of magnitude off. Yuck.

Agree. Asynchronous destroy has been integrated into illumos; look for it soon in the distributions derived from illumos. For more information, see Chris Siden's and Matt Ahrens' discussions on async destroy and ZFS feature flags at the ZFS Meetup in January 2012 here:

http://blog.delphix.com/ahl/2012/zfs10-illumos-meetup/

-- richard

--
ZFS and performance consulting
http://www.RichardElling.com