[zfs-discuss] History of EPERM for unlink() of directories on ZFS?

2012-06-25 Thread Lionel Cons
Does someone know the history which led to the EPERM for unlink() of
directories on ZFS? Why was this done this way, and not something like
allowing the unlink and execute it on the next scrub or remount?

Lionel
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] History of EPERM for unlink() of directories on ZFS?

2012-06-25 Thread Casper.Dik

Does someone know the history which led to the EPERM for unlink() of
directories on ZFS? Why was this done this way, and not something like
allowing the unlink and execute it on the next scrub or remount?


It's not about the unlink(), it's about the link() and unlink().
By not allowing link() & unlink() on directories, you force the filesystem to
contain only trees and not graphs.

It also allows you to create directories where .. points to a directory 
whose inode cannot be found, simply because it was just removed.

The support for link() on directories in UFS has always caused issues
and could create problems fsck couldn't fix.

To be honest, I think we should also remove this from all other
filesystems and I think ZFS was created this way because all modern
filesystems do it that way.
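As a quick illustration of the user-visible behavior being discussed (a hedged sketch in Python, not ZFS-specific: on Linux the kernel reports EISDIR for an unlink() of a directory, while POSIX also permits EPERM, which is what ZFS and BSD return):

```python
import errno
import os
import tempfile

# Try a raw unlink() of a directory; rmdir() is the supported path.
d = tempfile.mkdtemp()
try:
    os.unlink(d)              # not rmdir(): unlink the directory itself
    raised = None
except OSError as e:
    raised = e.errno
finally:
    if os.path.isdir(d):
        os.rmdir(d)           # clean up the supported way

# POSIX allows either error for this case.
assert raised in (errno.EISDIR, errno.EPERM)
```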

Casper



Re: [zfs-discuss] (fwd) Re: ZFS NFS service hanging on Sunday

2012-06-25 Thread TPCzfs

 2012-06-14 19:11, tpc...@mklab.ph.rhul.ac.uk wrote:
 
  In message 201206141413.q5eedvzq017...@mklab.ph.rhul.ac.uk, 
  tpc...@mklab.ph.rhul.ac.uk writes:
  Memory: 2048M phys mem, 32M free mem, 16G total swap, 16G free swap
  My WAG is that your zpool history is hanging due to lack of
  RAM.
 
  Interesting.  In the problem state the system is usually quite responsive,
  eg. not memory thrashing.  Under Linux, which I'm more familiar with,
  'used memory' = 'total memory' - 'free memory' refers to physical memory
  being used for data caching by the kernel (which is still available for
  processes to allocate as needed) together with memory allocated to
  processes, as opposed to only physical memory already allocated and
  therefore really 'used'.  Does this mean something different under Solaris?

 Well, it is roughly similar. In Solaris there is a general notion

[snipped]

Dear Jim,
Thanks for the detailed explanation of ZFS memory usage.  Special 
thanks also to John D Groenveld for the initial suggestion of a lack of RAM
problem.  Since up-ing the RAM from 2GB to 4GB the machine has sailed through 
the last two Sunday mornings w/o problem.  I was interested to
subsequently discover the Solaris command 'echo ::memstat | mdb -k' which 
reveals just how much memory ZFS can use.

Best regards
Tom.

--
Tom Crane, Dept. Physics, Royal Holloway, University of London, Egham Hill,
Egham, Surrey, TW20 0EX, England.
Email:  T.Crane@rhul dot ac dot uk


Re: [zfs-discuss] (fwd) Re: ZFS NFS service hanging on Sunday

2012-06-25 Thread Hung-Sheng Tsao (LaoTsao) Ph.D
In Solaris, ZFS caches many things, so you should have more RAM.
If you set up 16GB of swap, IMHO, RAM should be higher than 4GB.
regards

Sent from my iPad

On Jun 25, 2012, at 5:58, tpc...@mklab.ph.rhul.ac.uk wrote:

 
 [earlier message quoted in full; snipped]


Re: [zfs-discuss] [developer] History of EPERM for unlink() of directories on ZFS?

2012-06-25 Thread Garrett D'Amore
I don't know the precise history, but I think it's a mistake to permit direct 
link() or unlink() of directories.  I do note that on BSD (MacOS at least) 
unlink() returns EPERM if the executing user is not the superuser.  I do see that 
the man page for unlink() says this on illumos:

 The  named   file   is   a   directory   and
 {PRIV_SYS_LINKDIR}  is  not  asserted in the
 effective set of the calling process, or the
 filesystem  implementation  does not support
 unlink() or unlinkat() on directories.

I can't imagine why you'd *ever* want to support unlink() of a *directory* -- 
what's the use case for it anyway (outside of filesystem repair)?

Garrett D'Amore
garr...@damore.org



On Jun 25, 2012, at 2:23 AM, Lionel Cons wrote:

 Does someone know the history which led to the EPERM for unlink() of
 directories on ZFS? Why was this done this way, and not something like
 allowing the unlink and execute it on the next scrub or remount?
 
 Lionel
 
 



Re: [zfs-discuss] [developer] History of EPERM for unlink() of directories on ZFS?

2012-06-25 Thread Eric Schrock
The decision to not support link(2) of directories was very deliberate - it
is an abomination that never should have been allowed in the first place.
My guess is that the behavior of unlink(2) on directories is a direct
side-effect of that (if link isn't supported, then why support unlink?).
It is also worth noting that ZFS doesn't let you open(2) directories and
read(2) from them, something (I believe) UFS does allow.
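As a small illustration of the read(2) point (a hedged sketch, not specific to any one filesystem: on Linux, open(2) of a directory succeeds read-only, but read(2) on the descriptor then fails, whereas old UFS would hand you the raw directory entries):

```python
import errno
import os
import tempfile

# Open a directory read-only, then try to read(2) from the descriptor.
d = tempfile.mkdtemp()
fd = os.open(d, os.O_RDONLY)   # this part is allowed on Linux
try:
    os.read(fd, 512)           # reading the raw dirents is not
    err = None
except OSError as e:
    err = e.errno
finally:
    os.close(fd)
    os.rmdir(d)

# Linux reports EISDIR; some systems historically used EPERM.
assert err in (errno.EISDIR, errno.EPERM)
```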

- Eric

On Mon, Jun 25, 2012 at 10:40 AM, Garrett D'Amore garr...@damore.org wrote:

 I don't know the precise history, but I think its a mistake to permit
 direct link() or unlink() of directories.  I do note that on BSD (MacOS at
 least) unlink returns EPERM if the executing user is not superuser.  I do
 see that the man page for unlink() says this on illumos:

 The  named   file   is   a   directory   and
 {PRIV_SYS_LINKDIR}  is  not  asserted in the
 effective set of the calling process, or the
 filesystem  implementation  does not support
 unlink() or unlinkat() on directories.

 I can't imagine why you'd *ever* want to support unlink() of a *directory*
 -- what's the use case for it anyway (outside of filesystem repair)?

 Garrett D'Amore
 garr...@damore.org



 On Jun 25, 2012, at 2:23 AM, Lionel Cons wrote:

  Does someone know the history which led to the EPERM for unlink() of
  directories on ZFS? Why was this done this way, and not something like
  allowing the unlink and execute it on the next scrub or remount?
 
  Lionel
 
 
  ---
  illumos-developer
  Archives: https://www.listbox.com/member/archive/182179/=now
  RSS Feed:
 https://www.listbox.com/member/archive/rss/182179/21239177-c925e33f
  Modify Your Subscription: https://www.listbox.com/member/?;
  Powered by Listbox: http://www.listbox.com



 ---
 illumos-developer
 Archives: https://www.listbox.com/member/archive/182179/=now
 RSS Feed:
 https://www.listbox.com/member/archive/rss/182179/21175057-f8151d0d
 Modify Your Subscription:
 https://www.listbox.com/member/?member_id=21175057id_secret=21175057-02786781
 Powered by Listbox: http://www.listbox.com




-- 
Eric Schrock
Delphix
http://blog.delphix.com/eschrock

275 Middlefield Road, Suite 50
Menlo Park, CA 94025
http://www.delphix.com


Re: [zfs-discuss] [developer] History of EPERM for unlink() of directories on ZFS?

2012-06-25 Thread Casper.Dik

The decision to not support link(2) of directories was very deliberate - it
is an abomination that never should have been allowed in the first place.
My guess is that the behavior of unlink(2) on directories is a direct
side-effect of that (if link isn't supported, then why support unlink?).
Also worth noting that ZFS also doesn't let you open(2) directories and
read(2) from them, something (I believe) UFS does allow.

In the very beginning, mkdir(1) was a set-uid application; it used
mknod to make a directory and then created a link from
   newdir to newdir/.
and from
   . to newdir/..

Traditionally, this was only allowed for the superuser, and when
we added privileges a special privilege was added for it.

I think we should remove it for the other filesystems.

Casper


Re: [zfs-discuss] [developer] History of EPERM for unlink() of directories on ZFS?

2012-06-25 Thread Joerg Schilling
Eric Schrock eric.schr...@delphix.com wrote:

 The decision to not support link(2) of directories was very deliberate - it
 is an abomination that never should have been allowed in the first place.
 My guess is that the behavior of unlink(2) on directories is a direct
 side-effect of that (if link isn't supported, then why support unlink?).
 Also worth noting that ZFS also doesn't let you open(2) directories and
 read(2) from them, something (I believe) UFS does allow.

Link/unlink on directories is not a property of UFS.

UFS was designed without that feature, but it was added by AT&T with 
SVr4.

Jörg

-- 
 EMail:jo...@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
   j...@cs.tu-berlin.de(uni)  
   joerg.schill...@fokus.fraunhofer.de (work) Blog: 
http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily


Re: [zfs-discuss] [developer] History of EPERM for unlink() of directories on ZFS?

2012-06-25 Thread Eric Schrock
On Mon, Jun 25, 2012 at 11:19 AM, casper@oracle.com wrote:


 In the very beginning, mkdir(1) was a set-uid application; it used
 mknod to make a directory and then created a link from
newdir to newdir/.
 and from
. to newdir/..


Interesting, guess you learn something new every day :-)

http://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/mkdir.c

Thanks,

- Eric

-- 
Eric Schrock
Delphix
http://blog.delphix.com/eschrock

275 Middlefield Road, Suite 50
Menlo Park, CA 94025
http://www.delphix.com


Re: [zfs-discuss] [developer] History of EPERM for unlink() of directories on ZFS?

2012-06-25 Thread Joerg Schilling
Eric Schrock eric.schr...@delphix.com wrote:

 On Mon, Jun 25, 2012 at 11:19 AM, casper@oracle.com wrote:
 
 
  In the very beginning, mkdir(1) was a set-uid application; it used
  mknod to make a directory and then created a link from
 newdir to newdir/.
  and from
 . to newdir/..
 

 Interesting, guess you learn something new every day :-)

 http://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/mkdir.c

This was a nice way to become superuser in those days.

Just run a loop to make a directory in /tmp, and run another program that tries 
to remove the directory and replace it with a hardlink to /etc/passwd. mkdir(1) 
then did a chown of /etc/passwd to you... We tried this, and it took approx. 
30 minutes to become superuser this way.

And BSD introduced the syscall mkdir(2) to fix this, and that is why UFS was 
not designed to support link(2) on directories.

BTW: to implement mkdir(2), there was a new struct dirtemplate in the kernel 
with the following comment:

/*
 * A virgin directory (no blushing please).
 */
struct dirtemplate mastertemplate = {
	0, 12, 1, ".",
	0, DIRBLKSIZ - 12, 2, ".."
};


This was the first time Sun demonstrated that it has no sense of humor, as Sun 
removed that comment...

Jörg

-- 
 EMail:jo...@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
   j...@cs.tu-berlin.de(uni)  
   joerg.schill...@fokus.fraunhofer.de (work) Blog: 
http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily


[zfs-discuss] oddity of slow zfs destroy

2012-06-25 Thread Philip Brown

I ran into something odd today:

zfs destroy -r  random/filesystem

is mind-bogglingly slow, but it seems to me it shouldn't be.
It's slow because the filesystem has two snapshots on it. Presumably, it's 
busy rolling back the snapshots.
But I've already declared, by my command line, that I DON'T CARE about the 
contents of the filesystem!

Why doesn't zfs simply do:

1. unmount the filesystem, if possible (it was possible)
(1.5 possibly note the intent to delete somewhere in the pool records)
2. zero out/free the in-kernel memory in one go
3. update the pool: hey, I deleted the filesystem, all these blocks are now 
clear
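The wished-for behavior can be sketched as a toy model (Python; the class and method names are purely illustrative, not real ZFS interfaces): destroy just unhooks the dataset and notes the intent to free its blocks, and the actual freeing happens in a later background pass, much like the async destroy work that was being done in illumos at the time.

```python
class ToyPool:
    """Toy model of 'note intent, free later' dataset destruction."""

    def __init__(self):
        self.datasets = {}      # dataset name -> set of block ids it holds
        self.dead_lists = []    # block sets queued for deferred freeing
        self.free_blocks = set()

    def destroy(self, name):
        # O(1) from the caller's view: unhook the dataset and record
        # the intent to free its blocks, without walking them now.
        self.dead_lists.append(self.datasets.pop(name))

    def reclaim(self):
        # Background pass: actually free the queued blocks.
        while self.dead_lists:
            self.free_blocks |= self.dead_lists.pop()


pool = ToyPool()
pool.datasets["random/filesystem"] = {1, 2, 3}

pool.destroy("random/filesystem")   # returns immediately
assert "random/filesystem" not in pool.datasets

pool.reclaim()                      # the expensive freeing happens later
assert pool.free_blocks == {1, 2, 3}
```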



Having this kind of operation take more than even 10 seconds seems like a 
huge bug to me, yet it can take many minutes. An order of magnitude off. Yuck.




Re: [zfs-discuss] oddity of slow zfs destroy

2012-06-25 Thread Richard Elling
On Jun 25, 2012, at 10:55 AM, Philip Brown wrote:

 [original message quoted in full; snipped]

Agree. Asynchronous destroy has been integrated into illumos; look for it soon
in the distributions derived from illumos. For more information, see Chris
Siden's and Matt Ahrens's discussion of async destroy and ZFS feature flags at
the ZFS Meetup in January 2012, here:
http://blog.delphix.com/ahl/2012/zfs10-illumos-meetup/

 -- richard

-- 

ZFS and performance consulting
http://www.RichardElling.com