Re: panic: vfs_busy: unexpected lock failure
Matthew Dillon wrote:
> :On Tue, Mar 16, 1999 at 12:52:32PM -0800, Matthew Dillon wrote:
> :>     A.. And if you make those AMD mounts normal nfs mounts it doesn't
> :>     fry?  If so, then we have a bug in AMD somewhere.
> :
> :I tried the cp several times again on a regular NFS mount, to make
> :sure, and no, it doesn't seem to panic. So yes, that seems to be
> :AMD-related. Can't it be in the vfs layer though?
> :--
> :Pierre Beyssac p...@enst.fr
>
>     It's probably AMD.  I'm not really up on how AMD works... hasn't someone
>     done some work on it recently to fix other breakages?  Maybe they could
>     look at this panic.

AMD is easy to upset, and that's bad because it's holding a mountpoint in / (i.e. /host) which gets hit by nearly every getcwd() call, since getcwd() ends up doing an lstat(/host...) or whatever. I think this is the single largest source of load on the amd process.

The other problem is that amd is an rpc client, it depends on the libc rpc code for robustness, and that's not the first word that springs to mind when I think of it...

When amd hangs on a dns lookup, there are all sorts of VFS locking cascades and NFS wedges while the kernel is retrying all those retransmitted packets to amd's pseudo-nfs server port. It's been found to be the primary cause of the 'nfsrcv' hangs: processes wedged in getcwd() style situations trying to stat /host.

IMHO, /host needs to move down a level to get it out of the way of getcwd(). NFS mounts should probably move away from / as well, as they cause traffic on each getcwd().

I think the default settings should look something like this:

    /net                  - amd and nfs related stuff
    /net/sysname/mount1   - nfs mount created by amd
    /net/sysname/mount2   - nfs mount created by amd
    /net/host             - /host lives here instead.

and a symlink:

    /host -> /net/host

I think that'll stop amd from being hammered by all those lstat()'s in getcwd and friends in the root directory. And instead of mounting NFS things as /a, mount them as /net/a instead and use a symlink.
This isn't a fix, it's just trying to move a particularly weak link out of the direct line of fire. A real solution would be a proper userfs interface that could cope with kernel<->user_process protocol timeouts, process deaths, etc. Of course, then there's always an in-kernel autofs etc.

Cheers,
-Peter

To Unsubscribe: send mail to majord...@freebsd.org
with unsubscribe freebsd-current in the body of the message
Re: panic: vfs_busy: unexpected lock failure
On Thu, 18 Mar 1999 22:49:10 +0800, Peter Wemm pe...@netplex.com.au said:

> AMD is easy to upset, and that's bad because it's holding a mountpoint in /
> (ie: /host) which often gets hit by every single getcwd() call when it gets
> a lstat(/host...) or whatever. I think this is the single largest source of
> load on the amd process.

> IMHO, /host needs to move down a level to get it out of the way of
> getcwd(). NFS mounts should probably move away from / as well, as they cause
> traffic on each getcwd().

`/host' is non-standard. In the Standard Configuration, `/net' is the directory simulated by amd and `/a/${hostname}/root' is where amd mounts the directory tree. This is done specifically to avoid getcwd wedgitude. The example we ship would sorely puzzle anyone who is experienced running a Standard Configuration amd.

My machine has, throughout its entire history, had `/home' simulated by amd. I have literally *never* had amd hose my configuration (and I would know it fast since both mail and Web service would break).

-GAWollman

--
Garrett A. Wollman   | O Siem / We are all family / O Siem / We're all the same
woll...@lcs.mit.edu  | O Siem / The fires of freedom
Opinions not those of| Dance in the burning flame
MIT, LCS, CRS, or NSA|                     - Susan Aglukark and Chad Irschick
Re: panic: vfs_busy: unexpected lock failure
:On Tue, Mar 16, 1999 at 12:52:32PM -0800, Matthew Dillon wrote:
:>     A.. And if you make those AMD mounts normal nfs mounts it doesn't
:>     fry?  If so, then we have a bug in AMD somewhere.
:
:I tried the cp several times again on a regular NFS mount, to make
:sure, and no, it doesn't seem to panic. So yes, that seems to be
:AMD-related. Can't it be in the vfs layer though?
:--
:Pierre Beyssac p...@enst.fr

    It's probably AMD.  I'm not really up on how AMD works... hasn't someone
    done some work on it recently to fix other breakages?  Maybe they could
    look at this panic.

                                        -Matt
                                        Matthew Dillon
                                        dil...@backplane.com
Re: panic: vfs_busy: unexpected lock failure
On Mon, Mar 15, 1999 at 01:24:46PM -0800, Matthew Dillon wrote:
>     Compile up a kernel with 'options DDB' and get a backtrace when
>     it panics next ( 'trace' command from DDB prompt ).

Ok, here goes. The kernel is compiled without -g for the moment, but I've provided the function offsets if that may help.

vfs_busy() at vfs_busy+0x6d
lookup()       +0x3b9
namei()        +0x180
stat()         +0x44
syscall()      +0x187

I also get what seem to be spurious EPROTONOSUPPORT errors that show up in cp while copying files...
--
Pierre Beyssac p...@enst.fr
Re: panic: vfs_busy: unexpected lock failure
:On Mon, Mar 15, 1999 at 01:24:46PM -0800, Matthew Dillon wrote:
:>     Compile up a kernel with 'options DDB' and get a backtrace when
:>     it panics next ( 'trace' command from DDB prompt ).
:
:Ok, here goes. The kernel is compiled without -g for the moment,
:but I've provided the function offsets if that may help.
:
:vfs_busy() at vfs_busy+0x6d
:lookup()       +0x3b9
:namei()        +0x180
:stat()         +0x44
:syscall()      +0x187
:
:I also get what seems to be spurious EPROTONOSUPPORT errors that
:show up in cp while copying files...
:--
:Pierre Beyssac p...@enst.fr

    The code in lookup() that calls vfs_busy() is:

        while (dp->v_type == VDIR && (mp = dp->v_mountedhere) &&
            (cnp->cn_flags & NOCROSSMOUNT) == 0) {
                if (vfs_busy(mp, 0, 0, p))
                        continue;
                error = VFS_ROOT(mp, &tdp);
                vfs_unbusy(mp, p);
                if (error)
                        goto bad2;
                vput(dp);
                ndp->ni_vp = dp = tdp;
        }

    You shouldn't be crossing a mount point.  Are you by chance doing a
    recursive copy onto itself?

    e.g.  cp -rp src dest     where dest is mounted under src somewhere ?

    Of course, it is still a serious kernel bug.  I would like to try
    to reproduce it in order to track it down.  How are things mounted on
    your system ( df ) and what are the *exact* arguments you are using with
    cp?

                                        -Matt
                                        Matthew Dillon
                                        dil...@backplane.com
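
The loop quoted above can be tried out in user space. The C sketch below is a toy model under stated assumptions, not kernel code: model_vfs_busy, model_vfs_unbusy, cross_mounts, the mnt_root, mnt_busy and mnt_unmounting fields, and the NOCROSSMOUNT value are all made up for illustration, and reading mnt_root stands in for VFS_ROOT(). The model further assumes that a failed busy attempt means an unmount was in progress and has finished by the time the loop retries, so the retry sees no mount on the directory and the loop ends.

```c
#include <assert.h>
#include <stddef.h>

enum vtype { VREG, VDIR };

#define NOCROSSMOUNT 0x0100     /* value chosen for this model only */

struct mount;

struct vnode {
    enum vtype    v_type;
    struct mount *v_mountedhere;    /* file system mounted here, or NULL */
};

struct mount {
    struct vnode *mnt_root;         /* model: root vnode of this file system */
    struct vnode *mnt_vnodecovered; /* directory this file system covers */
    int           mnt_busy;         /* model: outstanding busy references */
    int           mnt_unmounting;   /* model: unmount in progress */
};

/*
 * Model of a busy attempt: fail if an unmount is in progress, and treat
 * the unmount as complete (directory uncovered) before returning.
 */
static int
model_vfs_busy(struct mount *mp)
{
    if (mp->mnt_unmounting) {
        mp->mnt_vnodecovered->v_mountedhere = NULL;
        return (1);
    }
    mp->mnt_busy++;
    return (0);
}

static void
model_vfs_unbusy(struct mount *mp)
{
    mp->mnt_busy--;
}

/*
 * Same shape as the quoted loop: while the current directory has a file
 * system mounted on it, step to that file system's root, unless the
 * caller asked not to cross mount points.
 */
struct vnode *
cross_mounts(struct vnode *dp, int cn_flags)
{
    struct mount *mp;
    struct vnode *tdp;

    while (dp->v_type == VDIR && (mp = dp->v_mountedhere) != NULL &&
        (cn_flags & NOCROSSMOUNT) == 0) {
        if (model_vfs_busy(mp))
            continue;           /* re-check v_mountedhere and retry */
        tdp = mp->mnt_root;     /* stands in for VFS_ROOT(mp, &tdp) */
        model_vfs_unbusy(mp);
        dp = tdp;
    }
    return (dp);
}
```

With two file systems stacked on one directory, cross_mounts() returns the root of the topmost one, and passing NOCROSSMOUNT returns the covered directory itself. The model says nothing about what actually went wrong inside the real vfs_busy() in this panic.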
Re: panic: vfs_busy: unexpected lock failure
On Tue, Mar 16, 1999 at 11:11:44AM -0800, Matthew Dillon wrote:
>             (cnp->cn_flags & NOCROSSMOUNT) == 0) {
>                 if (vfs_busy(mp, 0, 0, p))
>                         continue;
> ...
>     You shouldn't be crossing a mount point.  Are you by chance doing a
>     recursive copy onto itself?
>     e.g.  cp -rp src dest     where dest is mounted under src somewhere ?

No. At first it was from an NFS-mounted volume to another NFS-mounted volume. I then found that it panicked the same way when I copied from a local FFS volume to the same NFS volume.

The NFS volumes are automounted by amd under /a. That may well have something to do with the panic: that's a recent change in my configuration; I previously used NFS mounts in /etc/fstab which didn't cause me any trouble.

>     Of course, it is still a serious kernel bug.  I would like to try
>     to reproduce it in order to track it down.  How are things mounted on
>     your system ( df ) and what are the *exact* arguments you are using with
>     cp?

Here's the df (I removed some of the amd dummy mount points).

$ df
Filesystem        1K-blocks     Used    Avail Capacity  Mounted on
/dev/wd0s1a           49583    34595    11022    76%    /
/dev/wd1s1e         5975845  3556146  1941632    65%    /home
/dev/wd0s1f          148823     1290   135628     1%    /tmp
/dev/wd0s1g         5380597  1615221  3334929    33%    /usr
/dev/wd0s1e          396895    38127   327017    10%    /var
procfs                    4        4        0   100%    /proc
[ ten pid...@bofh:/xyz lines removed ]
pid...@bofh:/cal          0        0        0   100%    /cal
huuh:/home/huuh     1217519  1064153   141191    88%    /a/huuh/home/huuh

The failing cp is:

$ cp -rp /home/beyssac/src/sendmail-8.9.3/cf/ /home/beyssac/nfs/junk/

In the above, /home/beyssac/nfs is a symbolic link to /cal/huuh/cal/beyssac which is automounted by amd (last line in the above df).
--
Pierre Beyssac p...@enst.fr
Re: panic: vfs_busy: unexpected lock failure
:
:On Tue, Mar 16, 1999 at 11:11:44AM -0800, Matthew Dillon wrote:
:>             (cnp->cn_flags & NOCROSSMOUNT) == 0) {
:>                 if (vfs_busy(mp, 0, 0, p))
:>                         continue;
:> ...
:>     You shouldn't be crossing a mount point.  Are you by chance doing a
:>     recursive copy onto itself?
:>     e.g.  cp -rp src dest     where dest is mounted under src somewhere ?
:
:No. At first it was from a NFS-mounted volume to another NFS-mounted
:volume. I then found that it panic'ed the same when I copied from
:a local FFS volume to the same NFS volume.
:
:The NFS volumes are automounted by amd under /a. That may well have
:something to do with the panic: that's a recent change in my
:configuration; I previously used NFS mounts in /etc/fstab which
:didn't cause me any trouble.
:
:>     Of course, it is still a serious kernel bug.  I would like to try
:>     to reproduce it in order to track it down.  How are things mounted on
:>     your system ( df ) and what are the *exact* arguments you are using with
:>     cp?
:
:Here's the df (I removed some of the amd dummy mount points).
:
:$ df
:Filesystem        1K-blocks     Used    Avail Capacity  Mounted on
:/dev/wd0s1a           49583    34595    11022    76%    /
:/dev/wd1s1e         5975845  3556146  1941632    65%    /home
:/dev/wd0s1f          148823     1290   135628     1%    /tmp
:/dev/wd0s1g         5380597  1615221  3334929    33%    /usr
:/dev/wd0s1e          396895    38127   327017    10%    /var
:procfs                    4        4        0   100%    /proc
:[ ten pid...@bofh:/xyz lines removed ]
:pid...@bofh:/cal          0        0        0   100%    /cal
:huuh:/home/huuh     1217519  1064153   141191    88%    /a/huuh/home/huuh
:
:The failing cp is:
:
:$ cp -rp /home/beyssac/src/sendmail-8.9.3/cf/ /home/beyssac/nfs/junk/
:
:In the above, /home/beyssac/nfs is a symbolic link to
:/cal/huuh/cal/beyssac which is automounted by amd (last line in
:the above df).
:--
:Pierre Beyssac p...@enst.fr

    A.. And if you make those AMD mounts normal nfs mounts it doesn't
    fry?  If so, then we have a bug in AMD somewhere.

                                        -Matt
                                        Matthew Dillon
                                        dil...@backplane.com
Re: panic: vfs_busy: unexpected lock failure
On Tue, Mar 16, 1999 at 12:52:32PM -0800, Matthew Dillon wrote:
>     A.. And if you make those AMD mounts normal nfs mounts it doesn't
>     fry?  If so, then we have a bug in AMD somewhere.

I tried the cp several times again on a regular NFS mount, to make sure, and no, it doesn't seem to panic. So yes, that seems to be AMD-related. Can't it be in the vfs layer though?
--
Pierre Beyssac p...@enst.fr
panic: vfs_busy: unexpected lock failure
Hello,

My FreeBSD box keeps panicking when I try to do a simple cp -rp from a local disk to an NFS-mounted disk. The NFS server is a Solaris 2.5 box; the NFS partition is mounted through amd.

The files I try to copy are just sendmail's cf directory (lots of small files) and the panic happens every time I try (with cp -rp; not with piped tars).

The kernel is today's, with NFS compiled in (it's not a module).

I get the following message:

    panic: vfs_busy: unexpected lock failure
--
Pierre Beyssac p...@enst.fr
Re: panic: vfs_busy: unexpected lock failure
:Hello,
:
:My FreeBSD box keeps panicing when I'm trying to do a simple cp
:-rp from a local disk to a NFS-mounted disk. The NFS server is a
:Solaris 2.5 box; the NFS partition is mounted through amd.
:
:The files I try to copy are just sendmail's cf directory (lots of
:small files) and the panic happens every time I try (with cp -rp;
:not with piped tars).
:
:The kernel is today's, with NFS compiled-in (it's not a module).
:
:I'm having the following message:
:    panic: vfs_busy: unexpected lock failure
:--
:Pierre Beyssac p...@enst.fr

    Compile up a kernel with 'options DDB' and get a backtrace when
    it panics next ( 'trace' command from DDB prompt ).

                                        -Matt
                                        Matthew Dillon
                                        dil...@backplane.com