Re: Nullfs leaks i-nodes

2013-05-09 Thread Peter Holm
On Wed, May 08, 2013 at 12:13:17PM +0300, Konstantin Belousov wrote:
 On Tue, May 07, 2013 at 08:30:06AM +0200, Göran Löwkrantz wrote:
  I created a PR, kern/178238, on this but would like to know if anyone has 
  any ideas or patches?
  
  Have updated the system where I see this to FreeBSD 9.1-STABLE #0 r250229 
  and still have the problem.
 
 The patch below should fix the issue for you, at least it did so in my
 limited testing.
 
 What it does:
 1. When inactivating a nullfs vnode, check if the lower vnode is
unlinked, and reclaim upper vnode if so. [This fixes your case].
 
 2. Besides a callback to the upper filesystems for the lower vnode
    reclamation, it also calls the upper fs for lower vnode unlink.
This allows nullfs to purge cached vnodes for the unlinked lower.
[This fixes an opposite case, when the vnode is removed from the
lower mount, but upper aliases prevent the vnode from being
recycled].
 
 3. Fix a wart which has existed since the introduction of the nullfs caching:
do not unlock lower vnode in the nullfs_reclaim_lowervp().  It should
be completely innocent, but now it is also formally safe.
 
 4. Fix vnode reference leak in nullfs_reclaim_lowervp().
 
 Please note that the patch is basically not tested; I only verified your
 scenario and a mirror of it as described in item 2.
 
 diff --git a/sys/fs/nullfs/null.h b/sys/fs/nullfs/null.h
 index 4f37020..a624be6 100644
 --- a/sys/fs/nullfs/null.h

The page fault seen in fifo_close() seems unrelated to this patch;
I will continue testing the patch some more.

The scenario triggering the page fault is the rm:

mdconfig -a -t swap -s 1g -u 5
bsdlabel -w md5 auto
newfs -U md5a
mount /dev/md5a /mnt
mount -t nullfs /mnt /mnt2
mkfifo /mnt2/fifo
rm /mnt/fifo
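
(For completeness, a teardown of the scenario; these are standard
commands, not part of the original report:)

umount /mnt2
umount /mnt
mdconfig -d -u 5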

Not a problem on 8.3-STABLE r247938.

- Peter


Re: Nullfs leaks i-nodes

2013-05-09 Thread Konstantin Belousov
On Thu, May 09, 2013 at 09:02:56AM +0200, Peter Holm wrote:
 On Wed, May 08, 2013 at 12:13:17PM +0300, Konstantin Belousov wrote:
  On Tue, May 07, 2013 at 08:30:06AM +0200, Göran Löwkrantz wrote:
   I created a PR, kern/178238, on this but would like to know if anyone has 
   any ideas or patches?
   
   Have updated the system where I see this to FreeBSD 9.1-STABLE #0 r250229 
   and still have the problem.
  
  The patch below should fix the issue for you, at least it did so in my
  limited testing.
  
  What it does:
  1. When inactivating a nullfs vnode, check if the lower vnode is
 unlinked, and reclaim upper vnode if so. [This fixes your case].
  
  2. Besides a callback to the upper filesystems for the lower vnode
     reclamation, it also calls the upper fs for lower vnode unlink.
 This allows nullfs to purge cached vnodes for the unlinked lower.
 [This fixes an opposite case, when the vnode is removed from the
 lower mount, but upper aliases prevent the vnode from being
 recycled].
  
  3. Fix a wart which has existed since the introduction of the nullfs caching:
 do not unlock lower vnode in the nullfs_reclaim_lowervp().  It should
 be completely innocent, but now it is also formally safe.
  
  4. Fix vnode reference leak in nullfs_reclaim_lowervp().
  
  Please note that the patch is basically not tested; I only verified your
  scenario and a mirror of it as described in item 2.
  
  diff --git a/sys/fs/nullfs/null.h b/sys/fs/nullfs/null.h
  index 4f37020..a624be6 100644
  --- a/sys/fs/nullfs/null.h
 
 The page fault seen in fifo_close() seems unrelated to this patch;
 I will continue testing the patch some more.
 
 The scenario triggering the page fault is the rm:
 
 mdconfig -a -t swap -s 1g -u 5
 bsdlabel -w md5 auto
 newfs -U md5a
 mount /dev/md5a /mnt
 mount -t nullfs /mnt /mnt2
 mkfifo /mnt2/fifo
 rm /mnt/fifo
Yeah, I figured this out from the backtrace.

The panic in kostik563 is related and directly caused by the patch,
since the patch erroneously performs vgone() on the active (removed) vnode.
As a result, a spurious VOP_CLOSE() call is performed on the vnode,
which is more or less innocent for regular files, but fatal for fifos
and probably for NFSv4 as well.

Updated patch is below.

 
 Not a problem on 8.3-STABLE r247938.
Yes, new nullfs is only in HEAD and stable/9.

diff --git a/sys/fs/nullfs/null.h b/sys/fs/nullfs/null.h
index 4f37020..0972dfc 100644
--- a/sys/fs/nullfs/null.h
+++ b/sys/fs/nullfs/null.h
@@ -53,8 +53,12 @@ struct null_node {
 	LIST_ENTRY(null_node)	null_hash;	/* Hash list */
 	struct vnode	*null_lowervp;	/* VREFed once */
 	struct vnode	*null_vnode;	/* Back pointer */
+	u_int	null_flags;
 };
 
+#define	NULLV_NOUNLOCK	0x0001
+#define	NULLV_DROP	0x0002
+
 #define	MOUNTTONULLMOUNT(mp)	((struct null_mount *)((mp)->mnt_data))
 #define	VTONULL(vp)	((struct null_node *)(vp)->v_data)
 #define	NULLTOV(xp)	((xp)->null_vnode)
diff --git a/sys/fs/nullfs/null_vfsops.c b/sys/fs/nullfs/null_vfsops.c
index 02932bd..ad02236 100644
--- a/sys/fs/nullfs/null_vfsops.c
+++ b/sys/fs/nullfs/null_vfsops.c
@@ -65,7 +65,6 @@ static vfs_statfs_t	nullfs_statfs;
 static vfs_unmount_t	nullfs_unmount;
 static vfs_vget_t	nullfs_vget;
 static vfs_extattrctl_t	nullfs_extattrctl;
-static vfs_reclaim_lowervp_t nullfs_reclaim_lowervp;
 
 /*
  * Mount null layer
@@ -391,8 +390,37 @@ nullfs_reclaim_lowervp(struct mount *mp, struct vnode *lowervp)
 	vp = null_hashget(mp, lowervp);
 	if (vp == NULL)
 		return;
+	VTONULL(vp)->null_flags |= NULLV_NOUNLOCK;
 	vgone(vp);
-	vn_lock(lowervp, LK_EXCLUSIVE | LK_RETRY);
+	vput(vp);
+}
+
+static void
+nullfs_unlink_lowervp(struct mount *mp, struct vnode *lowervp)
+{
+	struct vnode *vp;
+	struct null_node *xp;
+
+	vp = null_hashget(mp, lowervp);
+	if (vp == NULL)
+		return;
+	xp = VTONULL(vp);
+	xp->null_flags |= NULLV_DROP | NULLV_NOUNLOCK;
+	vhold(vp);
+	vunref(vp);
+
+	/*
+	 * If vunref() dropped the last use reference on the nullfs
+	 * vnode, it must be reclaimed, and its lock was split from
+	 * the lower vnode lock.  Need to do extra unlock before
+	 * allowing the final vdrop() to free the vnode.
+	 */
+	if (vp->v_usecount == 0) {
+		KASSERT((vp->v_iflag & VI_DOOMED) != 0,
+		    ("not reclaimed %p", vp));
+		VOP_UNLOCK(vp, 0);
+	}
+	vdrop(vp);
 }
 
 static struct vfsops null_vfsops = {
@@ -408,6 +436,7 @@ static struct vfsops null_vfsops = {
 	.vfs_unmount =		nullfs_unmount,
 	.vfs_vget =		nullfs_vget,
 	.vfs_reclaim_lowervp =	nullfs_reclaim_lowervp,
+	.vfs_unlink_lowervp =	nullfs_unlink_lowervp,
 };
 
 VFS_SET(null_vfsops, nullfs, VFCF_LOOPBACK | VFCF_JAIL);
diff --git 

Re: Nullfs leaks i-nodes

2013-05-09 Thread Goran Lowkrantz
--On Wednesday, 08 May, 2013 12:13 PM +0300 Konstantin Belousov 
kostik...@gmail.com wrote:



On Tue, May 07, 2013 at 08:30:06AM +0200, Göran Löwkrantz wrote:

I created a PR, kern/178238, on this but would like to know if anyone
has  any ideas or patches?

Have updated the system where I see this to FreeBSD 9.1-STABLE #0
r250229  and still have the problem.


The patch below should fix the issue for you, at least it did so in my
limited testing.

What it does:
1. When inactivating a nullfs vnode, check if the lower vnode is
   unlinked, and reclaim upper vnode if so. [This fixes your case].

2. Besides a callback to the upper filesystems for the lower vnode
   reclamation, it also calls the upper fs for lower vnode unlink.
   This allows nullfs to purge cached vnodes for the unlinked lower.
   [This fixes an opposite case, when the vnode is removed from the
   lower mount, but upper aliases prevent the vnode from being
   recycled].

3. Fix a wart which has existed since the introduction of the nullfs caching:
   do not unlock lower vnode in the nullfs_reclaim_lowervp().  It should
   be completely innocent, but now it is also formally safe.

4. Fix vnode reference leak in nullfs_reclaim_lowervp().

Please note that the patch is basically not tested; I only verified your
scenario and a mirror of it as described in item 2.

diff --git a/sys/fs/nullfs/null.h b/sys/fs/nullfs/null.h
index 4f37020..a624be6 100644
--- a/sys/fs/nullfs/null.h
+++ b/sys/fs/nullfs/null.h
@@ -53,8 +53,11 @@ struct null_node {
 	LIST_ENTRY(null_node)	null_hash;	/* Hash list */
 	struct vnode	*null_lowervp;	/* VREFed once */
 	struct vnode	*null_vnode;	/* Back pointer */
+	u_int	null_flags;
 };
 
+#define	NULLV_NOUNLOCK	0x0001
+
 #define	MOUNTTONULLMOUNT(mp)	((struct null_mount *)((mp)->mnt_data))
 #define	VTONULL(vp)	((struct null_node *)(vp)->v_data)
 #define	NULLTOV(xp)	((xp)->null_vnode)
diff --git a/sys/fs/nullfs/null_vfsops.c b/sys/fs/nullfs/null_vfsops.c
index 02932bd..544c358 100644
--- a/sys/fs/nullfs/null_vfsops.c
+++ b/sys/fs/nullfs/null_vfsops.c
@@ -65,7 +65,6 @@ static vfs_statfs_t	nullfs_statfs;
 static vfs_unmount_t	nullfs_unmount;
 static vfs_vget_t	nullfs_vget;
 static vfs_extattrctl_t	nullfs_extattrctl;
-static vfs_reclaim_lowervp_t nullfs_reclaim_lowervp;

 /*
  * Mount null layer
@@ -391,8 +390,22 @@ nullfs_reclaim_lowervp(struct mount *mp, struct vnode *lowervp)
 	vp = null_hashget(mp, lowervp);
 	if (vp == NULL)
 		return;
+	VTONULL(vp)->null_flags |= NULLV_NOUNLOCK;
 	vgone(vp);
-	vn_lock(lowervp, LK_EXCLUSIVE | LK_RETRY);
+	vput(vp);
+}
+
+static void
+nullfs_unlink_lowervp(struct mount *mp, struct vnode *lowervp)
+{
+	struct vnode *vp;
+
+	vp = null_hashget(mp, lowervp);
+	if (vp == NULL || vp->v_usecount > 1)
+		return;
+	VTONULL(vp)->null_flags |= NULLV_NOUNLOCK;
+	vgone(vp);
+	vput(vp);
 }

 static struct vfsops null_vfsops = {
@@ -408,6 +421,7 @@ static struct vfsops null_vfsops = {
 	.vfs_unmount =		nullfs_unmount,
 	.vfs_vget =		nullfs_vget,
 	.vfs_reclaim_lowervp =	nullfs_reclaim_lowervp,
+	.vfs_unlink_lowervp =	nullfs_unlink_lowervp,
 };

 VFS_SET(null_vfsops, nullfs, VFCF_LOOPBACK | VFCF_JAIL);
diff --git a/sys/fs/nullfs/null_vnops.c b/sys/fs/nullfs/null_vnops.c
index f59865f..3c41124 100644
--- a/sys/fs/nullfs/null_vnops.c
+++ b/sys/fs/nullfs/null_vnops.c
@@ -692,18 +692,21 @@ null_unlock(struct vop_unlock_args *ap)
 static int
 null_inactive(struct vop_inactive_args *ap __unused)
 {
-	struct vnode *vp;
+	struct vnode *vp, *lvp;
 	struct mount *mp;
 	struct null_mount *xmp;
 
 	vp = ap->a_vp;
+	lvp = NULLVPTOLOWERVP(vp);
 	mp = vp->v_mount;
 	xmp = MOUNTTONULLMOUNT(mp);
-	if ((xmp->nullm_flags & NULLM_CACHE) == 0) {
+	if ((xmp->nullm_flags & NULLM_CACHE) == 0 ||
+	    (lvp->v_vflag & VV_NOSYNC) != 0) {
 		/*
 		 * If this is the last reference and caching of the
-		 * nullfs vnodes is not enabled, then free up the
-		 * vnode so as not to tie up the lower vnodes.
+		 * nullfs vnodes is not enabled, or the lower vnode is
+		 * deleted, then free up the vnode so as not to tie up
+		 * the lower vnodes.
 		 */
 		vp->v_object = NULL;
 		vrecycle(vp);
@@ -748,7 +751,10 @@ null_reclaim(struct vop_reclaim_args *ap)
 	 */
 	if (vp->v_writecount > 0)
 		VOP_ADD_WRITECOUNT(lowervp, -1);
-	vput(lowervp);
+	if ((xp->null_flags & NULLV_NOUNLOCK) != 0)
+		vunref(lowervp);
+	else
+		vput(lowervp);
free(xp, M_NULLFSNODE);

return (0);
diff --git a/sys/kern/vfs_subr.c b/sys/kern/vfs_subr.c
index 

Re: Nullfs leaks i-nodes

2013-05-09 Thread Konstantin Belousov
On Thu, May 09, 2013 at 02:11:44PM +0200, Goran Lowkrantz wrote:
 I assume this is CURRENT? Tried on STABLE but got this:
 cc1: warnings being treated as errors
 /usr/src/sys/kern/vfs_subr.c: In function 'vfs_notify_upper':
 /usr/src/sys/kern/vfs_subr.c:2801: warning: implicit declaration of 
 function 'VFS_PROLOGUE'
 /usr/src/sys/kern/vfs_subr.c:2801: warning: nested extern declaration of 
 'VFS_PROLOGUE' [-Wnested-externs]
 /usr/src/sys/kern/vfs_subr.c:2801: warning: implicit declaration of 
 function 'VFS_EPILOGUE'
 /usr/src/sys/kern/vfs_subr.c:2801: warning: nested extern declaration of 
 'VFS_EPILOGUE' [-Wnested-externs]
 *** [vfs_subr.o] Error code 1

Yes, the patch is for CURRENT. You can simply remove the VFS_PROLOGUE and
VFS_EPILOGUE lines from the patch if applying it on stable.  Not sure whether
it needs more trivial adjustments there.
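
For example, something like this should work (untested sketch; "kib.patch"
is a placeholder file name, and recountdiff comes from the
sysutils/patchutils port and repairs the hunk line counts after the
deletion):

# drop the two macro lines, then fix up the @@ hunk counts
sed -e '/VFS_PROLOGUE/d' -e '/VFS_EPILOGUE/d' kib.patch | \
    recountdiff > kib-stable.patch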




Re: Nullfs leaks i-nodes

2013-05-08 Thread Göran Löwkrantz

--On May 7, 2013 23:41:51 +0300 Mikolaj Golub troc...@freebsd.org wrote:


On Tue, May 07, 2013 at 08:30:06AM +0200, Göran Löwkrantz wrote:

I created a PR, kern/178238, on this but would like to know if anyone
has  any ideas or patches?

Have updated the system where I see this to FreeBSD 9.1-STABLE #0
r250229  and still have the problem.


I am observing an effect that might look like an inode leak, which I
think is due to the free nullfs vnode caching recently added by kib
(r240285): the free inode count does not increase after unlink; but if I
purge the free vnode cache (temporarily setting vfs.wantfreevnodes to 0
and observing vfs.freevnodes decreasing to 0), the free inode count grows
back.

You have only about 1000 inodes available on your underlying fs, while
vfs.wantfreevnodes I think is much higher, resulting in running out of
i-nodes.

If this is really your case, you can disable caching by mounting nullfs
with nocache (it looks like caching is not important in your case).

--
Mikolaj Golub


Thanks Mikolaj, mounting the active fs with 'nocache' fixed it, keeping 
ifree steady.


Any idea how to fix this in NanoBSD? The data partition is created with 
only 1024 i-nodes in the script, so any use that includes file deletion on 
this r/w area will be bitten.


As the nocache attribute is not valid for device mounts, I see no way to
have the nullfs mounts of this specific partition inherit it.
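
An explicit, per-mount alternative to inheriting would be to spell the
option out in fstab; a sketch with placeholder paths for the NanoBSD data
partition and its nullfs alias:

/dev/ufs/data   /data       ufs     rw              2 2
/data           /jail/data  nullfs  rw,nocache      0 0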


Easiest thing could be to set vfs.wantfreevnodes=0 in the default 
sysctl.conf, maybe? But will this have implications for non-nullfs 
filesystems? Only UFS? Even ZFS?
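
As a sketch, the knob floated above is a single line (note it is
system-wide, which is exactly the open question):

# /etc/sysctl.conf
vfs.wantfreevnodes=0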


Thanks again,
Göran





RE: Nullfs leaks i-nodes

2013-05-08 Thread Göran Löwkrantz



--On May 8, 2013 8:35:18 +1000 Dewayne Geraghty 
dewayne.gerag...@heuristicsystems.com.au wrote:



-Original Message-
From: owner-freebsd-sta...@freebsd.org
[mailto:owner-freebsd-sta...@freebsd.org] On Behalf Of Mikolaj Golub
Sent: Wednesday, 8 May 2013 6:42 AM
To: Göran Löwkrantz
Cc: Kostik Belousov; freebsd-stable@freebsd.org
Subject: Re: Nullfs leaks i-nodes

On Tue, May 07, 2013 at 08:30:06AM +0200, Göran Löwkrantz wrote:
 I created a PR, kern/178238, on this but would like to know
if anyone has
 any ideas or patches?

 Have updated the system where I see this to FreeBSD
9.1-STABLE #0 r250229
 and still have the problem.

I am observing an effect that might look like an inode leak, which I
think is due to the free nullfs vnode caching recently added by kib
(r240285): the free inode count does not increase after unlink; but if I
purge the free vnode cache (temporarily setting vfs.wantfreevnodes to 0
and observing vfs.freevnodes decreasing to 0), the free inode count grows
back.

You have only about 1000 inodes available on your underlying fs, while
vfs.wantfreevnodes I think is much higher, resulting in running out of
i-nodes.

If this is really your case, you can disable caching by mounting nullfs
with nocache (it looks like caching is not important in your case).

--
Mikolaj Golub



Hi Goran,

After I included Kib's vnode caching patch, the performance on my port
builder machine decreased significantly.  The port builder is one of
many jails, and nullfs is used extensively.  I was starving the system of
vnodes.  Increasing kern.maxvnodes resulted in better performance
than the original system configuration without vnode caching.  Thanks Kib
:)

I don't think you'll run out of vnodes, as it is self-adjusting (that was
my concern too).

I changed kern.maxvnodes to approx 3 times what it wanted and tuned for my
needs. Try it and keep an eye on:
sysctl vfs.numvnodes vfs.wantfreevnodes vfs.freevnodes
vm.stats.vm.v_vnodepgsout vm.stats.vm.v_vnodepgsin

Regards, Dewayne


Hi Dewayne,

I got a few of those too, but I didn't connect them with the FW problem,
as here there seems to be reclaim pressure.


On the FW I get these numbers:
vfs.numvnodes: 7500
vfs.wantfreevnodes: 27936
vfs.freevnodes: 5663
vm.stats.vm.v_vnodepgsout: 0
vm.stats.vm.v_vnodepgsin: 4399

while on the jail systems I get something like this:
vfs.numvnodes: 51212
vfs.wantfreevnodes: 35668
vfs.freevnodes: 35665
vm.stats.vm.v_vnodepgsout: 5952
vm.stats.vm.v_vnodepgsin: 939563

and as far as I can understand, the fact that vfs.wantfreevnodes and
vfs.freevnodes are almost the same suggests that we have reclaim pressure.


So one fix for small NanoBSD systems would be to lower vfs.wantfreevnodes,
and I will test that on a virtual machine to see if I can get better
reclaim.


Best regards,
Göran





Re: Nullfs leaks i-nodes

2013-05-08 Thread Konstantin Belousov
On Tue, May 07, 2013 at 08:30:06AM +0200, Göran Löwkrantz wrote:
 I created a PR, kern/178238, on this but would like to know if anyone has 
 any ideas or patches?
 
 Have updated the system where I see this to FreeBSD 9.1-STABLE #0 r250229 
 and still have the problem.

The patch below should fix the issue for you, at least it did so in my
limited testing.

What it does:
1. When inactivating a nullfs vnode, check if the lower vnode is
   unlinked, and reclaim upper vnode if so. [This fixes your case].

2. Besides a callback to the upper filesystems for the lower vnode
   reclamation, it also calls the upper fs for lower vnode unlink.
   This allows nullfs to purge cached vnodes for the unlinked lower.
   [This fixes an opposite case, when the vnode is removed from the
   lower mount, but upper aliases prevent the vnode from being
   recycled].

3. Fix a wart which has existed since the introduction of the nullfs caching:
   do not unlock lower vnode in the nullfs_reclaim_lowervp().  It should
   be completely innocent, but now it is also formally safe.

4. Fix vnode reference leak in nullfs_reclaim_lowervp().

Please note that the patch is basically not tested; I only verified your
scenario and a mirror of it as described in item 2.

diff --git a/sys/fs/nullfs/null.h b/sys/fs/nullfs/null.h
index 4f37020..a624be6 100644
--- a/sys/fs/nullfs/null.h
+++ b/sys/fs/nullfs/null.h
@@ -53,8 +53,11 @@ struct null_node {
 	LIST_ENTRY(null_node)	null_hash;	/* Hash list */
 	struct vnode	*null_lowervp;	/* VREFed once */
 	struct vnode	*null_vnode;	/* Back pointer */
+	u_int	null_flags;
 };
 
+#define	NULLV_NOUNLOCK	0x0001
+
 #define	MOUNTTONULLMOUNT(mp)	((struct null_mount *)((mp)->mnt_data))
 #define	VTONULL(vp)	((struct null_node *)(vp)->v_data)
 #define	NULLTOV(xp)	((xp)->null_vnode)
diff --git a/sys/fs/nullfs/null_vfsops.c b/sys/fs/nullfs/null_vfsops.c
index 02932bd..544c358 100644
--- a/sys/fs/nullfs/null_vfsops.c
+++ b/sys/fs/nullfs/null_vfsops.c
@@ -65,7 +65,6 @@ static vfs_statfs_t	nullfs_statfs;
 static vfs_unmount_t	nullfs_unmount;
 static vfs_vget_t	nullfs_vget;
 static vfs_extattrctl_t	nullfs_extattrctl;
-static vfs_reclaim_lowervp_t nullfs_reclaim_lowervp;
 
 /*
  * Mount null layer
@@ -391,8 +390,22 @@ nullfs_reclaim_lowervp(struct mount *mp, struct vnode *lowervp)
 	vp = null_hashget(mp, lowervp);
 	if (vp == NULL)
 		return;
+	VTONULL(vp)->null_flags |= NULLV_NOUNLOCK;
 	vgone(vp);
-	vn_lock(lowervp, LK_EXCLUSIVE | LK_RETRY);
+	vput(vp);
+}
+
+static void
+nullfs_unlink_lowervp(struct mount *mp, struct vnode *lowervp)
+{
+	struct vnode *vp;
+
+	vp = null_hashget(mp, lowervp);
+	if (vp == NULL || vp->v_usecount > 1)
+		return;
+	VTONULL(vp)->null_flags |= NULLV_NOUNLOCK;
+	vgone(vp);
+	vput(vp);
 }
 
 static struct vfsops null_vfsops = {
@@ -408,6 +421,7 @@ static struct vfsops null_vfsops = {
 	.vfs_unmount =		nullfs_unmount,
 	.vfs_vget =		nullfs_vget,
 	.vfs_reclaim_lowervp =	nullfs_reclaim_lowervp,
+	.vfs_unlink_lowervp =	nullfs_unlink_lowervp,
 };
 
 VFS_SET(null_vfsops, nullfs, VFCF_LOOPBACK | VFCF_JAIL);
diff --git a/sys/fs/nullfs/null_vnops.c b/sys/fs/nullfs/null_vnops.c
index f59865f..3c41124 100644
--- a/sys/fs/nullfs/null_vnops.c
+++ b/sys/fs/nullfs/null_vnops.c
@@ -692,18 +692,21 @@ null_unlock(struct vop_unlock_args *ap)
 static int
 null_inactive(struct vop_inactive_args *ap __unused)
 {
-	struct vnode *vp;
+	struct vnode *vp, *lvp;
 	struct mount *mp;
 	struct null_mount *xmp;
 
 	vp = ap->a_vp;
+	lvp = NULLVPTOLOWERVP(vp);
 	mp = vp->v_mount;
 	xmp = MOUNTTONULLMOUNT(mp);
-	if ((xmp->nullm_flags & NULLM_CACHE) == 0) {
+	if ((xmp->nullm_flags & NULLM_CACHE) == 0 ||
+	    (lvp->v_vflag & VV_NOSYNC) != 0) {
 		/*
 		 * If this is the last reference and caching of the
-		 * nullfs vnodes is not enabled, then free up the
-		 * vnode so as not to tie up the lower vnodes.
+		 * nullfs vnodes is not enabled, or the lower vnode is
+		 * deleted, then free up the vnode so as not to tie up
+		 * the lower vnodes.
 		 */
 		vp->v_object = NULL;
 		vrecycle(vp);
@@ -748,7 +751,10 @@ null_reclaim(struct vop_reclaim_args *ap)
 	 */
 	if (vp->v_writecount > 0)
 		VOP_ADD_WRITECOUNT(lowervp, -1);
-	vput(lowervp);
+	if ((xp->null_flags & NULLV_NOUNLOCK) != 0)
+		vunref(lowervp);
+	else
+		vput(lowervp);
free(xp, M_NULLFSNODE);
 
return (0);
diff --git a/sys/kern/vfs_subr.c b/sys/kern/vfs_subr.c
index de118f7..988a114 100644
--- a/sys/kern/vfs_subr.c
+++ b/sys/kern/vfs_subr.c
@@ -2700,19 

Re: Nullfs leaks i-nodes

2013-05-08 Thread Peter Holm
On Wed, May 08, 2013 at 12:13:17PM +0300, Konstantin Belousov wrote:
 On Tue, May 07, 2013 at 08:30:06AM +0200, Göran Löwkrantz wrote:
  I created a PR, kern/178238, on this but would like to know if anyone has 
  any ideas or patches?
  
  Have updated the system where I see this to FreeBSD 9.1-STABLE #0 r250229 
  and still have the problem.
 
 The patch below should fix the issue for you, at least it did so in my
 limited testing.
 
 What it does:
 1. When inactivating a nullfs vnode, check if the lower vnode is
unlinked, and reclaim upper vnode if so. [This fixes your case].
 
 2. Besides a callback to the upper filesystems for the lower vnode
    reclamation, it also calls the upper fs for lower vnode unlink.
This allows nullfs to purge cached vnodes for the unlinked lower.
[This fixes an opposite case, when the vnode is removed from the
lower mount, but upper aliases prevent the vnode from being
recycled].
 
 3. Fix a wart which has existed since the introduction of the nullfs caching:
do not unlock lower vnode in the nullfs_reclaim_lowervp().  It should
be completely innocent, but now it is also formally safe.
 
 4. Fix vnode reference leak in nullfs_reclaim_lowervp().
 
 Please note that the patch is basically not tested; I only verified your
 scenario and a mirror of it as described in item 2.
 
 diff --git a/sys/fs/nullfs/null.h b/sys/fs/nullfs/null.h
 index 4f37020..a624be6 100644

I got this page fault after interrupting a nullfs test that had been
running for three hours:

http://people.freebsd.org/~pho/stress/log/kostik562.txt

- Peter


Re: Nullfs leaks i-nodes

2013-05-08 Thread Peter Holm
On Wed, May 08, 2013 at 07:58:56PM +0200, Peter Holm wrote:
 On Wed, May 08, 2013 at 12:13:17PM +0300, Konstantin Belousov wrote:
  On Tue, May 07, 2013 at 08:30:06AM +0200, Göran Löwkrantz wrote:
   I created a PR, kern/178238, on this but would like to know if anyone has 
   any ideas or patches?
   
   Have updated the system where I see this to FreeBSD 9.1-STABLE #0 r250229 
   and still have the problem.
  
  The patch below should fix the issue for you, at least it did so in my
  limited testing.
  
  What it does:
  1. When inactivating a nullfs vnode, check if the lower vnode is
 unlinked, and reclaim upper vnode if so. [This fixes your case].
  
  2. Besides a callback to the upper filesystems for the lower vnode
     reclamation, it also calls the upper fs for lower vnode unlink.
 This allows nullfs to purge cached vnodes for the unlinked lower.
 [This fixes an opposite case, when the vnode is removed from the
 lower mount, but upper aliases prevent the vnode from being
 recycled].
  
  3. Fix a wart which has existed since the introduction of the nullfs caching:
 do not unlock lower vnode in the nullfs_reclaim_lowervp().  It should
 be completely innocent, but now it is also formally safe.
  
  4. Fix vnode reference leak in nullfs_reclaim_lowervp().
  
  Please note that the patch is basically not tested; I only verified your
  scenario and a mirror of it as described in item 2.
  
  diff --git a/sys/fs/nullfs/null.h b/sys/fs/nullfs/null.h
  index 4f37020..a624be6 100644
 
 I got this page fault after interrupting a nullfs test that had been
 running for three hours:
 
 http://people.freebsd.org/~pho/stress/log/kostik562.txt
 

Seems to be easily reproduced, so I compiled null_vnops.c and
fifo_vnops.c without -O in order to get some more info:

http://people.freebsd.org/~pho/stress/log/kostik563.txt

- Peter


Re: Nullfs leaks i-nodes

2013-05-07 Thread Mikolaj Golub
On Tue, May 07, 2013 at 08:30:06AM +0200, Göran Löwkrantz wrote:
 I created a PR, kern/178238, on this but would like to know if anyone has 
 any ideas or patches?
 
 Have updated the system where I see this to FreeBSD 9.1-STABLE #0 r250229 
 and still have the problem.

I am observing an effect that might look like an inode leak, which I
think is due to the free nullfs vnode caching recently added by kib
(r240285): the free inode count does not increase after unlink; but if I
purge the free vnode cache (temporarily setting vfs.wantfreevnodes to 0
and observing vfs.freevnodes decreasing to 0), the free inode count grows
back.
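
A sketch of that purge as shell commands (the df -i target is a
placeholder; note the knob is system-wide while it is set to 0):

old=$(sysctl -n vfs.wantfreevnodes)
sysctl vfs.wantfreevnodes=0
while [ "$(sysctl -n vfs.freevnodes)" -gt 0 ]; do sleep 1; done
df -i /mnt              # the free inode count should grow back here
sysctl vfs.wantfreevnodes=$old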

You have only about 1000 inodes available on your underlying fs, while
vfs.wantfreevnodes I think is much higher, resulting in running out of
i-nodes.

If this is really your case, you can disable caching by mounting nullfs
with nocache (it looks like caching is not important in your case).
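
For reference, the corresponding mount invocation would be (paths
borrowed from the reproduction scenario elsewhere in this thread):

mount -t nullfs -o nocache /mnt /mnt2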

-- 
Mikolaj Golub

RE: Nullfs leaks i-nodes

2013-05-07 Thread Dewayne Geraghty
 -Original Message-
 From: owner-freebsd-sta...@freebsd.org 
 [mailto:owner-freebsd-sta...@freebsd.org] On Behalf Of Mikolaj Golub
 Sent: Wednesday, 8 May 2013 6:42 AM
 To: Göran Löwkrantz
 Cc: Kostik Belousov; freebsd-stable@freebsd.org
 Subject: Re: Nullfs leaks i-nodes
 
 On Tue, May 07, 2013 at 08:30:06AM +0200, Göran Löwkrantz wrote:
  I created a PR, kern/178238, on this but would like to know 
 if anyone has 
  any ideas or patches?
  
  Have updated the system where I see this to FreeBSD 
 9.1-STABLE #0 r250229 
  and still have the problem.
 
 I am observing an effect that might look like an inode leak, which I
 think is due to the free nullfs vnode caching recently added by kib
 (r240285): the free inode count does not increase after unlink; but if I
 purge the free vnode cache (temporarily setting vfs.wantfreevnodes to 0
 and observing vfs.freevnodes decreasing to 0), the free inode count grows
 back.
 
 You have only about 1000 inodes available on your underlying fs, while
 vfs.wantfreevnodes I think is much higher, resulting in running out of
 i-nodes.
 
 If this is really your case, you can disable caching by mounting nullfs
 with nocache (it looks like caching is not important in your case).
 
 -- 
 Mikolaj Golub
 

Hi Goran,

After I included Kib's vnode caching patch, the performance on my port
builder machine decreased significantly.  The port builder is one of many
jails, and nullfs is used extensively.  I was starving the system of
vnodes.  Increasing kern.maxvnodes resulted in better performance than the
original system configuration without vnode caching.  Thanks Kib :)

I don't think you'll run out of vnodes, as it is self-adjusting (that was
my concern too).

I changed kern.maxvnodes to approx 3 times what it wanted and tuned for my needs.
Try it and keep an eye on:
sysctl vfs.numvnodes vfs.wantfreevnodes vfs.freevnodes 
vm.stats.vm.v_vnodepgsout vm.stats.vm.v_vnodepgsin
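
A hedged sketch for watching those counters over time (plain sh; the
60-second interval is arbitrary):

while :; do
    date
    sysctl vfs.numvnodes vfs.wantfreevnodes vfs.freevnodes \
        vm.stats.vm.v_vnodepgsout vm.stats.vm.v_vnodepgsin
    sleep 60
done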

Regards, Dewayne



Re: Nullfs leaks i-nodes

2013-05-07 Thread Jeremy Chadwick
On Wed, May 08, 2013 at 08:35:18AM +1000, Dewayne Geraghty wrote:
  -Original Message-
  From: owner-freebsd-sta...@freebsd.org 
  [mailto:owner-freebsd-sta...@freebsd.org] On Behalf Of Mikolaj Golub
  Sent: Wednesday, 8 May 2013 6:42 AM
  To: Göran Löwkrantz
  Cc: Kostik Belousov; freebsd-stable@freebsd.org
  Subject: Re: Nullfs leaks i-nodes
  
  On Tue, May 07, 2013 at 08:30:06AM +0200, Göran Löwkrantz wrote:
   I created a PR, kern/178238, on this but would like to know 
  if anyone has 
   any ideas or patches?
   
   Have updated the system where I see this to FreeBSD 
  9.1-STABLE #0 r250229 
   and still have the problem.
  
  I am observing an effect that might look like an inode leak, which I
  think is due to the free nullfs vnode caching recently added by kib
  (r240285): the free inode count does not increase after unlink; but if I
  purge the free vnode cache (temporarily setting vfs.wantfreevnodes to 0
  and observing vfs.freevnodes decreasing to 0), the free inode count grows
  back.
  
  You have only about 1000 inodes available on your underlying fs, while
  vfs.wantfreevnodes I think is much higher, resulting in running out of
  i-nodes.
  
  If this is really your case, you can disable caching by mounting nullfs
  with nocache (it looks like caching is not important in your case).
  
  -- 
  Mikolaj Golub
  
 
 Hi Goran,
 
 After I included Kib's vnode caching patch, the performance on my port
 builder machine decreased significantly.  The port builder is one of many
 jails, and nullfs is used extensively.  I was starving the system of
 vnodes.  Increasing kern.maxvnodes resulted in better performance than the
 original system configuration without vnode caching.  Thanks Kib :)
 
 I don't think you'll run out of vnodes, as it is self-adjusting (that was
 my concern too).
 
 I changed kern.maxvnodes to approx 3 times what it wanted and tuned for my
 needs.
 Try it and keep an eye on:
 sysctl vfs.numvnodes vfs.wantfreevnodes vfs.freevnodes 
 vm.stats.vm.v_vnodepgsout vm.stats.vm.v_vnodepgsin

Telling people "keep an eye on these sysctls" is not exactly helpful
when there isn't an understanding of what they represent -- it's akin to
people using munin or SNMP polling software to monitor some MIBs
without actually knowing, truly, deep down inside, what it is they're
looking at.  (I cannot tell you how often this happens.  In fact, most
systems monitoring software/graphs/other crap I see these days tends
to suffer from exactly this.)

The only thing I'm aware of is what's in vnode(9) and what I could find
here:

http://www.youtube.com/watch?v=SpS7Ajnx9Q8
http://bsd-id.blogspot.com/2007/11/vnode.html

All said -- has anyone actually seen vfs.freevnodes hit 0?  On some of
my systems I've seen it reach small numbers (in the 3-digit range),
but it would later increase (to the mid-4-digit range), even after lots of
(new/unique, i.e. not previously cached) file I/O.  So the
auto-adjusting nature of this makes it very hard for one to say "keep
an eye on these sysctls" when the administrator does not know when
he/she should become concerned or consider increasing kern.maxvnodes.

Next, maxvnodes - numvnodes != freevnodes, which doesn't make sense to
me:

$ sysctl kern.maxvnodes vfs.freevnodes vfs.wantfreevnodes vfs.numvnodes
kern.maxvnodes: 393216
vfs.freevnodes: 51543
vfs.wantfreevnodes: 51545
vfs.numvnodes: 244625
$ expr 393216 - 244625
148591

And finally, the lack of a sysctl description for vfs.wantfreevnodes is
quite bothersome:

$ sysctl -d kern.maxvnodes vfs.freevnodes vfs.wantfreevnodes
vfs.numvnodes
kern.maxvnodes: Maximum number of vnodes
vfs.freevnodes: Number of vnodes in the free list
vfs.wantfreevnodes:
vfs.numvnodes: Number of vnodes in existence

-- 
| Jeremy Chadwick   j...@koitsu.org |
| UNIX Systems Administratorhttp://jdc.koitsu.org/ |
| Mountain View, CA, US|
| Making life hard for others since 1977. PGP 4BD6C0CB |

RE: Nullfs leaks i-nodes

2013-05-07 Thread Dewayne Geraghty
Jeremy, 

Thank you for your advice.  I take your point that the next time I have
an idea that might help (that is, suggesting that maxvnodes should be
increased and that vnode depletion vs starvation shouldn't be a concern),
I'll reflect on your reply, on my failure to review the vnode source, and
on not fully understanding the ramifications of examining the sysctls.

Regards, Dewayne.
