Re: PUFFS and existing file that get ENOENT
YAMAMOTO Takashi y...@mwd.biglobe.ne.jp wrote:
> it should retry from puffs_cookie2pnode in that case.

Here is a patch that works around the problem (I initially had a printf to
check that it did go through the ENOENT case, and it does).

I am about to commit it and pull it up to netbsd-5, unless there are
comments.

Index: sys/fs/puffs/puffs_node.c
===================================================================
RCS file: /cvsroot/src/sys/fs/puffs/puffs_node.c,v
retrieving revision 1.13.10.3
diff -U 4 -r1.13.10.3 puffs_node.c
--- sys/fs/puffs/puffs_node.c 2 Nov 2011 20:11:12 - 1.13.10.3
+++ sys/fs/puffs/puffs_node.c 18 Jan 2012 03:03:25 -
@@ -320,10 +320,18 @@
 	vp = pmp->pmp_root;
 	if (vp) {
 		mutex_enter(vp->v_interlock);
 		mutex_exit(pmp->pmp_lock);
-		if (vget(vp, LK_INTERLOCK) == 0)
+		switch (vget(vp, LK_INTERLOCK)) {
+		case ENOENT:
+			goto retry;
+			break;
+		case 0:
 			return 0;
+			break;
+		default:
+			break;
+		}
 	} else
 		mutex_exit(pmp->pmp_lock);
 
 	/*
@@ -386,8 +394,9 @@
 		*vpp = pmp->pmp_root;
 		return 0;
 	}
 
+retry_vget:
 	mutex_enter(pmp->pmp_lock);
 	pnode = puffs_cookie2pnode(pmp, ck);
 	if (pnode == NULL) {
 		if (willcreate) {
@@ -405,10 +414,18 @@
 
 	vgetflags = LK_INTERLOCK;
 	if (lock)
 		vgetflags |= LK_EXCLUSIVE | LK_RETRY;
-	if ((rv = vget(vp, vgetflags)))
+	switch (rv = vget(vp, vgetflags)) {
+	case ENOENT:
+		goto retry_vget;
+		break;
+	case 0:
+		break;
+	default:
 		return rv;
+		break;
+	}
 
 	*vpp = vp;
 	return 0;
 }

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
Re: PUFFS and existing file that get ENOENT
hi,

> YAMAMOTO Takashi y...@mwd.biglobe.ne.jp wrote:
>> it should retry from puffs_cookie2pnode in that case.
>
> Here is a patch that works around the problem (I initially had printf to
> check it did go through the ENOENT case and it does).
>
> I am about to commit that and pullup to netbsd-5, except if there are
> comments.

> +		case ENOENT:
> +			goto retry;
> +			break;
> +		case 0:
>  			return 0;
> +			break;

> +	case ENOENT:
> +		goto retry_vget;
> +		break;

please clean up these unreachable statements.
otherwise looks ok to me. thanks for fixing.

YAMAMOTO Takashi
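For illustration, here is a minimal userland sketch of what the switch could look like once the unreachable break statements are dropped. It is not the committed code: fake_vget(), enoent_left and get_with_retry() are hypothetical stand-ins invented for this example, and the default case follows the second hunk of the patch (return the error) rather than the first.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for vget(): reports ENOENT a fixed number of
 * times, as if the vnode were being recycled, then succeeds.
 */
static int enoent_left = 2;

static int
fake_vget(void)
{
	if (enoent_left > 0) {
		enoent_left--;
		return ENOENT;
	}
	return 0;
}

/*
 * Same control flow as the patched switch, without a break after
 * goto or return (those statements can never be reached).
 * *attempts counts how many times fake_vget() was called.
 */
static int
get_with_retry(int *attempts)
{
	int rv;

	*attempts = 0;
retry:
	(*attempts)++;
	switch (rv = fake_vget()) {
	case ENOENT:
		goto retry;	/* being recycled: start over */
	case 0:
		return 0;	/* got a usable reference */
	default:
		return rv;	/* any other error is passed up */
	}
}
```

With enoent_left set to 2, get_with_retry() calls fake_vget() three times before returning 0.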
Re: PUFFS and existing file that get ENOENT
On Mon, Jan 16, 2012 at 06:25:39AM +, YAMAMOTO Takashi wrote:
> it should retry from puffs_cookie2pnode in that case.

I also need to build a test case that reliably reproduces the bug. For now
I run our build.sh -Uo release and come back the next day, which is not
very convenient.

As I understand it, I need to look up a node I already know but that is
being recycled. When does the kernel decide to recycle a vnode?

-- 
Emmanuel Dreyfus
m...@netbsd.org
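One possible shape for such a test case, offered only as a sketch: open an existing file over and over and count how often open() fails with ENOENT, while something else puts pressure on the vnode cache. Whether this triggers the race quickly enough is an assumption, not something established in this thread. count_enoent() is a hypothetical helper name.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open and close `path` `iterations` times and return how many of the
 * open() calls failed with ENOENT. For a file that exists the whole
 * time, any non-zero result means the file was reported as missing.
 * Failures with other errno values are ignored by this sketch.
 */
static int
count_enoent(const char *path, int iterations)
{
	int i, fd, misses = 0;

	for (i = 0; i < iterations; i++) {
		fd = open(path, O_RDONLY);
		if (fd == -1) {
			if (errno == ENOENT)
				misses++;
			continue;
		}
		close(fd);
	}
	return misses;
}
```

A caller would point this at a file on the PUFFS mount and treat a non-zero return value as a reproduction of the bug.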
Re: PUFFS and existing file that get ENOENT
On Mon, Jan 16, 2012 at 02:02:41PM +0100, Adam Hamsik wrote:
> Just try to lower that number to some smaller one ?

sysctl(7) says:
    kern.maxvnodes (KERN_MAXVNODES)
        The maximum number of vnodes available on the system. This can
        only be raised.

But it seems I can lower it from 26214 to 200 without a hitch. I have
no idea how much room it has, however. We cannot get the number of used
vnodes from userland, can we?

-- 
Emmanuel Dreyfus
m...@netbsd.org
Re: PUFFS and existing file that get ENOENT
On Mon 16 Jan 2012 at 13:17:17 +, Emmanuel Dreyfus wrote:
> But it seems I can lower it from 26214 to 200 without a hitch. I have
> no idea how much room it has, however. We cannot get the number of used
> vnodes from userland, can we?

pstat -v gives the number of active vnodes; that may be useful.

-Olaf.
-- 
___ Olaf 'Rhialto' Seibert -- There's no point being grown-up if you
\X/ rhialto/at/xs4all.nl   -- can't be childish sometimes. -The 4th Doctor
Re: PUFFS and existing file that get ENOENT
hello. pstat -v should give you what you want to know.
-thanks
-Brian

On Jan 16, 1:17pm, Emmanuel Dreyfus wrote:
} Subject: Re: PUFFS and existing file that get ENOENT
} On Mon, Jan 16, 2012 at 02:02:41PM +0100, Adam Hamsik wrote:
} > Just try to lower that number to some smaller one ?
}
} sysctl(7) says:
}     kern.maxvnodes (KERN_MAXVNODES)
}         The maximum number of vnodes available on the system. This can
}         only be raised.
}
} But it seems I can lower it from 26214 to 200 without a hitch. I have
} no idea how much room it has, however. We cannot get the number of used
} vnodes from userland, can we?
}
} --
} Emmanuel Dreyfus
} m...@netbsd.org
-- End of excerpt from Emmanuel Dreyfus
Re: PUFFS and existing file that get ENOENT
hi,

> On Mon, Jan 16, 2012 at 10:56:33AM +, YAMAMOTO Takashi wrote:
>> you can increase the chance by running
>> while :;do sysctl -w kern.maxvnodes=0; done
>
> It will always fail:
> bacasable# sysctl -w kern.maxvnodes=0
> sysctl: kern.maxvnodes: sysctl() failed with Device busy

it tries to reclaim vnodes before failing.

YAMAMOTO Takashi

> --
> Emmanuel Dreyfus
> m...@netbsd.org
Re: PUFFS and existing file that get ENOENT
On Mon, 16 Jan 2012 10:56:33 + (UTC)
y...@mwd.biglobe.ne.jp (YAMAMOTO Takashi) wrote:
> when the kernel wants to cache other files.
> ie. whenever the kernel decides to reclaim it. :-)
>
> you can increase the chance by running
> while :;do sysctl -w kern.maxvnodes=0; done
> or something like that.

Wouldn't the performance also drop significantly with a permanently low
maxvnodes, though?

Thanks,
-- 
Matt
Re: PUFFS and existing file that get ENOENT
hi,

> On Mon, 16 Jan 2012 10:56:33 + (UTC)
> y...@mwd.biglobe.ne.jp (YAMAMOTO Takashi) wrote:
>> when the kernel wants to cache other files.
>> ie. whenever the kernel decides to reclaim it. :-)
>>
>> you can increase the chance by running
>> while :;do sysctl -w kern.maxvnodes=0; done
>> or something like that.
>
> Wouldn't the performance also drop significantly with a permanently low
> maxvnodes, though?

the sysctl call never succeeds, so maxvnodes does not stay low.
anyway the performance is not a priority when trying to reproduce a bug.

YAMAMOTO Takashi

> Thanks,
> --
> Matt
Re: PUFFS and existing file that get ENOENT
hi,

> Emmanuel Dreyfus m...@netbsd.org wrote:
>> Hence I come to the conclusion that it may come from
>> sys/kern/vfs_lookup.c, but it is very unlikely that there is a bug
>> there that went unnoticed for other filesystems.
>
> Further investigation shows that this ENOENT is returned by the vget()
> call in puffs_cookie2vnode(). That suggests some kind of race
> condition, but that is not obvious. It means a vnode has been created
> on a lookup, then it gets recycled while looking up one of its
> children.

it should retry from puffs_cookie2pnode in that case.

YAMAMOTO Takashi

> --
> Emmanuel Dreyfus
> http://hcpnet.free.fr/pubz
> m...@netbsd.org
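To illustrate why the retry has to restart at the cookie lookup and not just call vget() again, here is a small userland model. Everything in it is hypothetical (struct node, model_lookup(), model_create(), model_get(), model_cookie2node()); it only mimics the shape of the problem and is not the puffs code. As a simplification it always creates a node when the lookup finds none.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of a node identified by a cookie. */
struct node {
	int cookie;
	int dying;	/* set while the node is being recycled */
	int generation;	/* 1 for the first node created, 2 for the next */
};

#define POOL_SIZE 8
static struct node pool[POOL_SIZE];
static int pool_used;
static struct node *current;	/* node currently known for the cookie */

/* Stand-in for the cookie lookup: find the live node, if any. */
static struct node *
model_lookup(int cookie)
{
	if (current != NULL && current->cookie == cookie)
		return current;
	return NULL;
}

/* Stand-in for creating a fresh node when the lookup finds nothing. */
static struct node *
model_create(int cookie)
{
	struct node *np;

	if (pool_used == POOL_SIZE)
		return NULL;
	np = &pool[pool_used++];
	np->cookie = cookie;
	np->dying = 0;
	np->generation = pool_used;
	current = np;
	return np;
}

/* Stand-in for vget(): a dying node goes away and reports ENOENT. */
static int
model_get(struct node *np)
{
	if (np->dying) {
		current = NULL;	/* recycling completes */
		return ENOENT;
	}
	return 0;
}

/* Retry from the lookup: after ENOENT the old pointer is stale. */
static int
model_cookie2node(int cookie, struct node **npp)
{
	struct node *np;
	int rv;

	for (;;) {
		np = model_lookup(cookie);
		if (np == NULL && (np = model_create(cookie)) == NULL)
			return ENOMEM;
		rv = model_get(np);
		if (rv != ENOENT)
			break;
		/* loop back to model_lookup(), do not reuse np */
	}
	if (rv == 0)
		*npp = np;
	return rv;
}
```

In this model, a caller that hits a dying node ends up with a different, freshly created node. Retrying model_get() on the old pointer would keep returning ENOENT.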
PUFFS and existing file that get ENOENT
Hello

I am tracking a rare bug with perfused, where an existing file is reported
as nonexistent. It happens after a few hours of usage, and it happens only
once: if I retry accessing the file, I am successful.

Having added traces in perfused while running ktrace on the calling
process, I am now confident that perfused is not the component that raises
the ENOENT. Here is what I get in the ktrace:

 14085 1 cc1 1326421055.20998 CALL close(4)
 14085 1 cc1 1326421055.214918302 RET close 0
 14085 1 cc1 1326421055.215025195 CALL open(0xbb960a90,4,0x1b6)
 14085 1 cc1 1326421055.215031533 NAMI /gfs/manu/netbsd/usr/src/sys/sys/featuretest.h
 14085 1 cc1 1326421055.216282844 RET open -1 errno 2 No such file or directory

In the PUFFS trace I collect in perfused, this open() only causes LOOKUPs
up to /manu/netbsd/usr, all successful, then nothing.

1326421055.215229913 LOOKUP / cn = manu error = 0
1326421055.215396312 LOOKUP /manu cn = netbsd error = 0
1326421055.215749931 LOOKUP /manu/netbsd cn = usr error = 0

That means the kernel decides on the ENOENT on its own; perfused does not
produce it. The question is: where can it come from?

There are two ENOENT occurrences in sys/fs/puffs. One can happen at mount
time, and I ruled out the other, in puffsop_flush(), by adding a printf()
that never shows up when the bug strikes.

There are also ENOENT returns in sys/dev/putter/putter.c, but they all have
a printf() that I would have seen, therefore it cannot come from there.

Hence I come to the conclusion that it may come from sys/kern/vfs_lookup.c,
but it is very unlikely that there is a bug there that went unnoticed for
other filesystems.

Does anyone have an idea of what could be going on?

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org