Re: more amd hangs: problem really in syslog?
Mike Smith wrote:
> > On Tue, 13 Jul 1999, Mike Smith wrote:
> > > 'siobi' is someone trying to open the serial console, for whatever
> > > reason. Without knowing who it was that was stuck there, it's hard
> > > to guess what is going on.
> >
> > D'uh, sorry. Long day. It was amd that was hung in the siobi state.
> > No way to clear it without rebooting the box.
>
> Dang. Now I need that stack dump from amd that you posted and I deleted.

OK, sent under different cover.

> Specifically, it'd be handy to know why amd felt it was necessary to
> open the console.

Yeah, I'm kind of curious myself.

BTW, I was going to work on this some more today, but the boss thought that putting the box into production was more important. The good news is that under real world load my freebsd box had 20-40% free cpu and a load average of 1.5. With load as equal as the switch could make it, the linux box had no free cpu and a load average of 8. :) I also (finally) got the approval to install freebsd on the fourth box (there are already two linux machines up) so A) I'm making progress in the office, and B) I should have a chance to pound on the syslog stuff tomorrow.

Happy,
Doug

To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: more amd hangs: problem really in syslog?
> On Tue, 13 Jul 1999, Mike Smith wrote:
> > 'siobi' is someone trying to open the serial console, for whatever
> > reason. Without knowing who it was that was stuck there, it's hard
> > to guess what is going on.
>
> D'uh, sorry. Long day. It was amd that was hung in the siobi state.
> No way to clear it without rebooting the box.

Dang. Now I need that stack dump from amd that you posted and I deleted.

Specifically, it'd be handy to know why amd felt it was necessary to open the console.

--
\\ The mind's the standard   \\ Mike Smith
\\ of the man.               \\ msm...@freebsd.org
\\   -- Joseph Merrick       \\ msm...@cdrom.com
Re: more amd hangs: problem really in syslog?
On Wed, Jul 14, 1999 at 10:56:05PM -0700, Mike Smith wrote:
> > On Tue, 13 Jul 1999, Mike Smith wrote:
> > > 'siobi' is someone trying to open the serial console, for whatever
> > > reason. Without knowing who it was that was stuck there, it's hard
> > > to guess what is going on.
> >
> > D'uh, sorry. Long day. It was amd that was hung in the siobi state.
> > No way to clear it without rebooting the box.
>
> Dang. Now I need that stack dump from amd that you posted and I deleted.
>
> Specifically, it'd be handy to know why amd felt it was necessary to
> open the console.

http://www.egroups.com/group/freebsd-hackers/40590.html

Greg
--
Gregory S. Sutter                 My reality check just bounced.
mailto:gsut...@pobox.com
http://www.pobox.com/~gsutter/
PGP DSS public key 0x40AE3052
RE: more amd hangs: in _start()
On Tue, 13 Jul 1999, Ladavac Marino wrote:
> I don't know if your diagnosis was in jest,

Yes it was, but thank you for asking. :) I should have known better than to attempt subtle humor at the end of a long, tiring day.

Doug
--
On account of being a democracy and run by the people, we are the only
nation in the world that has to keep a government four years, no matter
what it does.
                -- Will Rogers
Re: more amd hangs: problem really in syslog?
:
:	So I started thinking that maybe the problem was actually in
:syslog (or amd's interface to it). So I disabled the following two options
:in my amd.conf file:
:
:log_file = syslog:local7
:log_options = all
:
:	And lo and behold, it worked like a charm. I was able to run my
:conf-building script for my web server 20 times in a row with no ill
:effects. Previously the best I could do was 3 times before it hung.
:
:	After confirming that it worked with no logging, I tried enabling
:logging to a regular file, and that also worked like a charm. After
:turning syslog style logging back on, it locked up cold, with a very
:similar traceback.
:
:	If anyone wants to work on this, let me know.
:
:Doug

    Are you syslogging to the console by any chance? Try messing around
    with /etc/syslog.conf and see if just plain file logging prevents the
    lockup -- you could even try turning off all logging (but leaving syslog
    running, i.e. turning it into a sink-null) to see if that has an effect.

					-Matt
					Matthew Dillon
					[EMAIL PROTECTED]
Re: more amd hangs: problem really in syslog?
> After pounding on this some more with today's -current (prior to the
> MNT_ASYNC flag change) I got a lot more lockups that looked like this:
>
> On Mon, 12 Jul 1999, Doug wrote:
> > Ok, got another hang in "siobi" state (this time after it
> > successfully completed the script). Here is the trace:

'siobi' is in sioopen() in the sio driver. The callout device is already open, but the caller is trying to open it in blocking mode.

It'd be useful to know what is hanging in 'siobi' here, since trying to re-open the console is a bit of a suspicious action.

--
\\ The mind's the standard   \\ Mike Smith
\\ of the man.               \\ [EMAIL PROTECTED]
\\   -- Joseph Merrick       \\ [EMAIL PROTECTED]
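Mike's description implies that the caller sleeps because the open is a blocking one. The following is a minimal Python sketch of how one might probe a device node without risking that sleep. It assumes, and nothing in this thread confirms it, that the driver fails an O_NONBLOCK open with an error instead of waiting; `probe_open` is a hypothetical helper name, not part of amd or libc.

```python
import errno
import os


def probe_open(path, flags=os.O_WRONLY):
    """Attempt a non-blocking open of `path`.

    Returns "ok" if the open succeeds, otherwise the symbolic errno name
    (for example "EBUSY" or "ENOENT"). The descriptor is closed at once;
    nothing is written to the device.
    """
    try:
        fd = os.open(path, flags | os.O_NONBLOCK)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    return "ok"


if __name__ == "__main__":
    # /dev/console is the device the thread is concerned with.
    print(probe_open("/dev/console"))
```

Run as root on the affected box while amd is healthy and again while it is wedged; a different result between the two runs would point at the console device rather than at syslogd itself.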
Re: more amd hangs: problem really in syslog?
On Tue, 13 Jul 1999, Matthew Dillon wrote:
> :	So I started thinking that maybe the problem was actually in
> :syslog (or amd's interface to it). So I disabled the following two options
> :in my amd.conf file:
> :
> :log_file = syslog:local7
> :log_options = all
> :
> :	And lo and behold, it worked like a charm. I was able to run my
> :conf-building script for my web server 20 times in a row with no ill
> :effects. Previously the best I could do was 3 times before it hung.
> :
> :	After confirming that it worked with no logging, I tried enabling
> :logging to a regular file, and that also worked like a charm. After
> :turning syslog style logging back on, it locked up cold, with a very
> :similar traceback.
> :
> :	If anyone wants to work on this, let me know.
> :
> :Doug
>
> Are you syslogging to the console by any chance?

Here is syslog.conf:

*.err;kern.debug;auth.notice;mail.crit           /dev/console
*.notice;kern.debug;lpr.info;mail.crit;news.err  /var/log/messages
mail.info                                        /var/log/maillog
lpr.info                                         /var/log/lpd-errs
cron.*                                           /var/cron/log
*.err                                            root
*.notice;news.err                                root
*.alert                                          root
*.emerg                                          *

local7.*                                         /var/log/amd.log

Basically, it's what comes with the system plus that line for local7. I am using a serial console setup for this box, but as far as I could see from the logs amd did generate there were no events at *.err priority, or to the kern facility, so nothing should have been printed to the serial console. Also, just in case it matters I start syslogd with -svv flags in rc.conf.

> Try messing around with /etc/syslog.conf and see if just plain file
> logging prevents the lockup -- you could even try turning off all
> logging (but leaving syslog running, i.e. turning it into a sink-null)
> to see if that has an effect.

I have to admit that you lost me here. Normal syslog stuff is working just fine (where "normal" is freebsd system stuff), it's amd that locks up. It's been kind of a hectic day here, in addition to this problem so I might just be a little dense. Can you explain in more detail what you'd like me to try?

Thanks,

Doug
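Doug's reading of his syslog.conf (that amd's local7 messages should never reach /dev/console) can be checked mechanically. Below is a minimal Python sketch under stated assumptions: it is not syslogd's parser, it handles only plain `facility.level` selectors joined by `;`, and it ignores `!`, `=`, `none` and comma-separated facility lists. The function names are hypothetical. The `pri=6` seen in the backtraces corresponds to LOG_INFO, so `local7.info` is the case of interest.

```python
# Severity names from most to least severe, as used in syslog.conf.
LEVELS = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]

# The configuration quoted in the message above (whitespace normalized).
SYSLOG_CONF = """\
*.err;kern.debug;auth.notice;mail.crit /dev/console
*.notice;kern.debug;lpr.info;mail.crit;news.err /var/log/messages
mail.info /var/log/maillog
lpr.info /var/log/lpd-errs
cron.* /var/cron/log
*.err root
*.notice;news.err root
*.alert root
*.emerg *
local7.* /var/log/amd.log
"""


def selector_matches(selector, facility, level):
    """True if one 'facility.level' selector covers the given message."""
    sel_facility, sel_level = selector.split(".", 1)
    if sel_facility not in ("*", facility):
        return False
    if sel_level == "*":
        return True
    # A plain level means "this severity or more severe" (lower index).
    return LEVELS.index(level) <= LEVELS.index(sel_level)


def actions_for(conf_text, facility, level):
    """List the actions (files, users, '*') a message would be sent to."""
    actions = []
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        selectors, action = line.split(None, 1)
        if any(selector_matches(s, facility, level) for s in selectors.split(";")):
            actions.append(action)
    return actions


if __name__ == "__main__":
    print(actions_for(SYSLOG_CONF, "local7", "info"))
    print(actions_for(SYSLOG_CONF, "local7", "err"))
```

Under these simplifications a `local7.info` message matches only the `/var/log/amd.log` line, which agrees with Doug's expectation; a `local7.err` message would additionally match the `/dev/console` line.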
Re: more amd hangs: problem really in syslog?
On Tue, 13 Jul 1999, Mike Smith wrote:
> > After pounding on this some more with today's -current (prior to the
> > MNT_ASYNC flag change) I got a lot more lockups that looked like this:
> >
> > On Mon, 12 Jul 1999, Doug wrote:
> > > Ok, got another hang in "siobi" state (this time after it
> > > successfully completed the script). Here is the trace:
>
> 'siobi' is in sioopen() in the sio driver. The callout device is
> already open, but the caller is trying to open it in blocking mode.
>
> It'd be useful to know what is hanging in 'siobi' here, since trying to
> re-open the console is a bit of a suspicious action.

I'm using a serial console, but I directed local7 to a file in syslog.conf. But from what you're saying it sounds like the serial console is a suspect?

Doug
Re: more amd hangs: problem really in syslog?
After pounding on this some more with today's -current (prior to the MNT_ASYNC flag change) I got a lot more lockups that looked like this:

On Mon, 12 Jul 1999, Doug wrote:
> Ok, got another hang in "siobi" state (this time after it successfully
> completed the script). Here is the trace:
>
> (gdb) file /usr/sbin/amd
> Reading symbols from /usr/sbin/amd...done.
> (gdb) attach 155
> Attaching to program: /usr/sbin/amd, process 155
> 0x8063dc4 in open ()
> (gdb) where
> #0  0x8063dc4 in open ()
> #1  0x806b5c3 in vsyslog (pri=6, fmt=0x809279a "%s", ap=0xbfbfb240 "X")
>     at /usr/src/lib/libc/../libc/gen/syslog.c:262
> #2  0x806b2c2 in syslog (pri=6, fmt=0x809279a "%s")
>     at /usr/src/lib/libc/../libc/gen/syslog.c:130
> #3  0x805a3d8 in real_plog (lvl=6,
>     fmt=0x8091ea0 "recompute_portmap: NFS version %d",
>     vargs=0xbfbfba7c "\002")
>     at /usr/src/usr.sbin/amd/libamu/../../../contrib/amd/libamu/xutil.c:443
> #4  0x805a2be in plog (lvl=16,
>     fmt=0x8091ea0 "recompute_portmap: NFS version %d")
>     at /usr/src/usr.sbin/amd/libamu/../../../contrib/amd/libamu/xutil.c:383

So I started thinking that maybe the problem was actually in syslog (or amd's interface to it). So I disabled the following two options in my amd.conf file:

log_file = syslog:local7
log_options = all

And lo and behold, it worked like a charm. I was able to run my conf-building script for my web server 20 times in a row with no ill effects. Previously the best I could do was 3 times before it hung.

After confirming that it worked with no logging, I tried enabling logging to a regular file, and that also worked like a charm. After turning syslog style logging back on, it locked up cold, with a very similar traceback.

If anyone wants to work on this, let me know.

Doug
Re: more amd hangs: problem really in syslog?
On Tue, 13 Jul 1999 13:20:55 MST, Doug wrote:
> After confirming that it worked with no logging, I tried enabling
> logging to a regular file, and that also worked like a charm. After
> turning syslog style logging back on, it locked up cold, with a very
> similar traceback.

Sheesh, Mark Murray wasn't kidding when he told me that AMD tickles bugs. Of course, I thought he meant that it tickles bugs in our NFS code. :-)

This discovery sounds like a real step forward.

Ciao,
Sheldon.
Re: more amd hangs: problem really in syslog?
> On Tue, 13 Jul 1999, Mike Smith wrote:
> > > After pounding on this some more with today's -current (prior to
> > > the MNT_ASYNC flag change) I got a lot more lockups that looked
> > > like this:
> > >
> > > On Mon, 12 Jul 1999, Doug wrote:
> > > > Ok, got another hang in "siobi" state (this time after it
> > > > successfully completed the script). Here is the trace:
> >
> > 'siobi' is in sioopen() in the sio driver. The callout device is
> > already open, but the caller is trying to open it in blocking mode.
> >
> > It'd be useful to know what is hanging in 'siobi' here, since trying
> > to re-open the console is a bit of a suspicious action.
>
> I'm using a serial console, but I directed local7 to a file in
> syslog.conf. But from what you're saying it sounds like the serial
> console is a suspect?

'siobi' is someone trying to open the serial console, for whatever reason. Without knowing who it was that was stuck there, it's hard to guess what is going on.

--
\\ The mind's the standard   \\ Mike Smith
\\ of the man.               \\ msm...@freebsd.org
\\   -- Joseph Merrick       \\ msm...@cdrom.com
Re: more amd hangs: problem really in syslog?
:*.err;kern.debug;auth.notice;mail.crit           /dev/console
:*.notice;kern.debug;lpr.info;mail.crit;news.err  /var/log/messages
:mail.info                                        /var/log/maillog
:lpr.info                                         /var/log/lpd-errs
:cron.*                                           /var/cron/log
:*.err                                            root
:*.notice;news.err                                root
:*.alert                                          root
:*.emerg                                          *
:
:local7.*                                         /var/log/amd.log
:
:	Basically, it's what comes with the system plus that line for
:local7. I am using a serial console setup for this box, but as far as I
:could see from the logs amd did generate there were no events at *.err
:priority, or to the kern facility, so nothing should have been printed to
:the serial console. Also, just in case it matters I start syslogd with
:-svv flags in rc.conf.
:
:> Try messing around
:> with /etc/syslog.conf and see if just plain file logging prevents the
:> lockup -- you could even try turning off all logging (but leaving syslog
:> running, i.e. turning it into a sink-null) to see if that has an effect.
:
:	I have to admit that you lost me here. Normal syslog stuff is
:working just fine (where "normal" is freebsd system stuff), it's amd that
:locks up. It's been kind of a hectic day here, in addition to this problem
:so I might just be a little dense. Can you explain in more detail what
:you'd like me to try?
:
:Thanks,
:
:Doug

    Comment the whole thing out, kill -HUP the syslogd (or kill and restart
    it), and see if amd still locks up.

    If it does not, add the file lines (/var/*) back in, restart, and see
    if amd locks up.

    If it does not, add the /dev/console line back in, restart, and see if
    amd locks up.

    If it does not, add the root and * entries back in and see if amd locks
    up. And so forth.

    We may find that there is something inherently broken with syslogd that
    is causing the lockup even with all entries commented out, or we may
    well find that it is a certain line, such as the /dev/console, root, or
    * line for emergency messages that is causing amd to lockup.

					-Matt
					Matthew Dillon
					dil...@backplane.com
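Matt's step-by-step procedure amounts to re-enabling syslog.conf entries one group at a time. The following Python sketch generates the staged configurations; it is an illustration under assumptions, not a tool from the thread. The stage names and the `staged_conf` helper are hypothetical, and it only produces text: installing each variant, signalling syslogd, and re-running the amd test are left to the operator.

```python
# Each stage decides, from an entry's action field, whether the entry stays
# enabled. Stages are cumulative, mirroring the order suggested above.
STAGES = {
    "none": lambda action: False,
    "files": lambda action: action.startswith("/var/"),
    "console": lambda action: action.startswith("/var/") or action == "/dev/console",
    "all": lambda action: True,  # also re-enables the root and * entries
}


def staged_conf(conf_text, keep):
    """Return conf_text with every entry not accepted by `keep` commented out."""
    out = []
    for line in conf_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            out.append(line)  # blank lines and existing comments pass through
            continue
        parts = stripped.split(None, 1)
        action = parts[1] if len(parts) == 2 else ""
        out.append(line if keep(action) else "#" + line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = "*.err /dev/console\nlocal7.* /var/log/amd.log\n*.emerg *\n"
    for name, keep in STAGES.items():
        print("## stage:", name)
        print(staged_conf(sample, keep))
```

The first stage at which amd wedges again identifies the group of entries worth examining line by line.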
Re: more amd hangs: problem really in syslog?
On Tue, 13 Jul 1999, Mike Smith wrote:
> 'siobi' is someone trying to open the serial console, for whatever
> reason. Without knowing who it was that was stuck there, it's hard to
> guess what is going on.

D'uh, sorry. Long day. It was amd that was hung in the siobi state. No way to clear it without rebooting the box.

Doug
Re: more amd hangs: problem really in syslog?
On Tue, 13 Jul 1999, Matthew Dillon wrote:
> Comment the whole thing out, kill -HUP the syslogd (or kill and restart
> it), and see if amd still locks up.

Ok, now I think I get it. You want me to enable syslog'ing in amd, then do what you're talking about here. I will try this first thing tomorrow morning. We're about to put the box into production to make sure that the bug is really licked, then I'm about to go home. :) We have multiple machines in this configuration, so taking this one down tomorrow should be no problem.

> If it does not, add the file lines (/var/*) back in, restart, and see
> if amd locks up.
>
> If it does not, add the /dev/console line back in, restart, and see if
> amd locks up.
>
> If it does not, add the root and * entries back in and see if amd locks
> up. And so forth.

Gotcha.

> We may find that there is something inherently broken with syslogd that
> is causing the lockup even with all entries commented out, or we may
> well find that it is a certain line, such as the /dev/console, root, or
> * line for emergency messages that is causing amd to lockup.

I think that Mike Smith is on the right track in suspecting that the serial console is involved (due to the siobi state that amd was hanging in). However, which line of the syslog.conf that was causing that is a darn good question, given that none of them *should* have been involved.

Thanks for all the great suggestions,

Doug
Re: more amd hangs
On Fri, 9 Jul 1999, Doug wrote:
> In my continuing efforts to get this freebsd box into shape for web
> hosting at my company (where it relies exclusively on NFS for
> retrieving customer data) I've been making progress thanks to some
> recent commits by Peter. Now I can run the heavy duty NFS access script
> and it completes its mission about 2 out of 3 times. Also, now when it
> fails it doesn't lock the whole box, just amd. Still not where I want
> it to be, but it is definitely big progress. :)
>
> What happens when it hangs is that amd becomes totally wedged. I cannot
> do 'df' or 'ls /' at all (the amd mountpoints are on /), and killing
> the amd process is no help; I have to reboot the box. Ktrace'ing amd at
> this point gets me nothing. The ktrace process just dies and leaves a 0
> byte ktrace.out file. (BTW, I am also still having problems with ktrace
> exiting while the process is still running when I actually get it to
> attach, if anyone cares.)

Still working on this problem. Thanks to some suggestions I got off the list, I have compiled libc and amd with debugging symbols. I wedged the box the same way I have previously, by running a perl script that automounts a directory, reads 250 files in that directory, automounts another one, etc. for a total of 68 directories. Using today's current this time amd wedged in the following state according to top:

  155 root  3  0  736K  520K siobi  1  0:21  0.00%  0.00% amd

Here is the trace after killing it:

Core was generated by `amd'.
Program terminated with signal 3, Quit.
#0  0x8063dc4 in open ()
(gdb) where
#0  0x8063dc4 in open ()
#1  0x806b5c3 in vsyslog (pri=6, fmt=0x809279a "%s", ap=0xbfbfd2c0 "")
    at /usr/src/lib/libc/../libc/gen/syslog.c:262
#2  0x806b2c2 in syslog (pri=6, fmt=0x809279a "%s")
    at /usr/src/lib/libc/../libc/gen/syslog.c:130
#3  0x805a3d8 in real_plog (lvl=6,
    fmt=0x8091440 "prime_nfs_fhandle_cache: NFS version %d",
    vargs=0xbfbfdafc "\002")
    at /usr/src/usr.sbin/amd/libamu/../../../contrib/amd/libamu/xutil.c:443
#4  0x805a2be in plog (lvl=16,
    fmt=0x8091440 "prime_nfs_fhandle_cache: NFS version %d")
    at /usr/src/usr.sbin/amd/libamu/../../../contrib/amd/libamu/xutil.c:383
#5  0x805323a in prime_nfs_fhandle_cache (path=0x80cb287 "/Space/209.132.66",
    fs=0x80b5640, fhbuf=0xbfbfdb34, wchan=0x80b57c0)
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/ops_nfs.c:262
#6  0x805363f in nfs_init (mf=0x80b57c0)
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/ops_nfs.c:542
#7  0x804a7f9 in amfs_auto_bgmount (cp=0x80c8600, mpe=0)
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/amfs_auto.c:676
#8  0x804a4bc in amfs_auto_retry (rc=0, term=0, closure=0x80c8600)
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/amfs_auto.c:402
#9  0x8055212 in do_task_notify ()
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/sched.c:239
#10 0x804df6d in softclock ()
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/clock.c:212
#11 0x8052583 in run_rpc ()
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/nfs_start.c:253
#12 0x80527e6 in mount_automounter (ppid=154)
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/nfs_start.c:467
#13 0x804a109 in main (argc=4, argv=0xbfbfddec)
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/amd.c:544
#14 0x80480e9 in _start ()

I'm going to work on attaching it with gdb while it's locked next. Also based on advice I've made some changes to my configuration files, although it didn't help the 'ls /' or 'df' problems.

Thanks for any help,

Doug

[ global ]
# Only search for maps of this type
map_type = file
# Search this path for maps
search_path = /etc
# Use this directory for amd's private mount points
auto_dir = /usr/amd/realmounts
# Log all activity to syslog (daemon)
log_file = syslog:local7
log_options = all
# Check /etc/hosts for hostnames
normalize_hostnames = yes
# Lock the amd process into memory, improves perf.
plock = yes
# Use the special /default entry in maps
selectors_on_default = yes

# DEFINE AN AMD MOUNT POINT
[ /usr/amd/Interfaces ]
map_name = amd.Interfaces

[ /usr/amd/Hold ]
map_name = amd.Hold

32# more /etc/amd.Interfaces
/defaults type:=nfs;opts:=rw,vers=2,intr,proto=udp,noconn
209.132.4 netapp01:/vol/Space/209.132.4
* rhost:=IP${key};rfs:=/Space/${key}
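Doug's Perl load script was not posted to the list. For anyone wanting to reproduce the load, the following is a minimal Python sketch of the behaviour he describes (touch a directory so that it is automounted, read up to 250 files in it, move on to the next directory). Everything in it is an assumption reconstructed from that description: the function names are hypothetical and the list of directories must be supplied by the reader.

```python
import os


def read_some_files(directory, max_files=250):
    """Read up to max_files regular files in `directory`; return how many.

    Listing the directory is what would trigger the automount when
    `directory` lives under an amd mount point.
    """
    count = 0
    for name in sorted(os.listdir(directory)):
        if count >= max_files:
            break
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        with open(path, "rb") as handle:
            handle.read()  # contents are discarded; only the I/O matters
        count += 1
    return count


def exercise(directories, max_files=250):
    """Visit each directory in turn and report files read per directory."""
    return {d: read_some_files(d, max_files) for d in directories}


if __name__ == "__main__":
    import sys
    # Usage: python exercise.py DIR [DIR ...]
    for directory, n in exercise(sys.argv[1:]).items():
        print(directory, n)
```

Doug's run covered 68 directories; passing that many automounted paths on the command line gives a comparable access pattern, though not necessarily the same timing as the original script.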
Re: more amd hangs
Ok, it's now wedged in a different state (using the same perl script to wedge it). According to top:

  317 root  2  0  648K  456K STOP  0  0:00  0.00%  0.00% amd

I also managed to attach to the running process this time:

(gdb) file /usr/sbin/amd
Reading symbols from /usr/sbin/amd...done.
(gdb) attach 317
Attaching to program: /usr/sbin/amd, process 317
0x8063c34 in select ()
(gdb) where
#0  0x8063c34 in select ()
#1  0x80523b6 in do_select (smask=0, fds=1024, fdp=0xbfbfd990, tvp=0xbfbfd984)
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/nfs_start.c:146
#2  0x80525fd in run_rpc ()
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/nfs_start.c:274
#3  0x80527e6 in mount_automounter (ppid=316)
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/nfs_start.c:467
#4  0x804a109 in main (argc=1, argv=0xbfbfdb84)
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/amd.c:544
#5  0x80480e9 in _start ()

I am noticing that the last function, _start() is the same as in the last traceback. Anyone with suggestions, I'm open to them. :) I tried doing 'continue' with gdb and it wedged gdb and amd, so I just rebooted.

HTH,

Doug
Re: more amd hangs
On 10 Jul 1999 12:56:41 -0400, R. Matthew Emerson wrote:
> I thought that it was almost never proper to soft-mount rw filesystems.
> Am I mistaken about this?

I must admit, it sounds like sensible advice. The only NFS exports which I have to rely on are read-only mounts. The only time I soft-mounted a read-write export was when I was mucking around with buildworld over NFS, and it didn't cause me problems then.

Ciao,
Sheldon.
Re: more amd hangs: in _start()
Ok, got another hang in "siobi" state (this time after it successfully completed the script). Here is the trace:

(gdb) file /usr/sbin/amd
Reading symbols from /usr/sbin/amd...done.
(gdb) attach 155
Attaching to program: /usr/sbin/amd, process 155
0x8063dc4 in open ()
(gdb) where
#0  0x8063dc4 in open ()
#1  0x806b5c3 in vsyslog (pri=6, fmt=0x809279a "%s", ap=0xbfbfb240 "X")
    at /usr/src/lib/libc/../libc/gen/syslog.c:262
#2  0x806b2c2 in syslog (pri=6, fmt=0x809279a "%s")
    at /usr/src/lib/libc/../libc/gen/syslog.c:130
#3  0x805a3d8 in real_plog (lvl=6, fmt=0x8091ea0 "recompute_portmap: NFS version %d", vargs=0xbfbfba7c "\002")
    at /usr/src/usr.sbin/amd/libamu/../../../contrib/amd/libamu/xutil.c:443
#4  0x805a2be in plog (lvl=16, fmt=0x8091ea0 "recompute_portmap: NFS version %d")
    at /usr/src/usr.sbin/amd/libamu/../../../contrib/amd/libamu/xutil.c:383
#5  0x80556f8 in recompute_portmap (fs=0x80c9f80)
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/srvr_nfs.c:285
#6  0x80559ff in nfs_srvr_port (fs=0x80c9f80, port=0xbfbfbabe, wchan=0x0)
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/srvr_nfs.c:564
#7  0x80534cd in call_mountd (fp=0xbfbfdb24, proc=3, f=0, wchan=0x0)
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/ops_nfs.c:438
#8  0x8053a3d in nfs_umounted (mp=0x80cad00)
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/ops_nfs.c:796
#9  0x804dd4f in am_unmounted (mp=0x80cad00)
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/autil.c:366
#10 0x8050b22 in free_map_if_success (rc=0, term=0, closure=0x80cad00)
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/map.c:924
#11 0x8055212 in do_task_notify ()
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/sched.c:239
#12 0x804df6d in softclock ()
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/clock.c:212
#13 0x8052583 in run_rpc ()
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/nfs_start.c:253
#14 0x80527e6 in mount_automounter (ppid=154)
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/nfs_start.c:467
#15 0x804a109 in main (argc=4, argv=0xbfbfddec)
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/amd.c:544
#16 0x80480e9 in _start ()

I'm going to have a go at the code now that I can be fairly certain that _start() is the culprit. (Please everyone, stop laughing, thanks. :)

Comments or suggestions welcome.

Doug
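The trace above shows amd sitting in open(), called from vsyslog(). As a rough illustration of how an open on a device can be kept from blocking the caller, here is a minimal Python sketch that opens the target with O_NONBLOCK, so the call either succeeds at once or fails at once. The function name try_console_write and the default path are invented for this example; this is not a description of what libc's syslog code does, nor of any fix applied to it.

```python
import os


def try_console_write(message, path="/dev/console"):
    """Try to write message to path without blocking in open().

    Hypothetical helper for illustration only. Returns True if the
    message was written, False if the device could not be opened or
    written immediately.
    """
    # O_NONBLOCK makes open() return right away instead of waiting
    # for the device to become ready; O_NOCTTY avoids acquiring it
    # as a controlling terminal.
    flags = os.O_WRONLY | os.O_NONBLOCK | os.O_NOCTTY
    try:
        fd = os.open(path, flags)
    except OSError:
        return False
    try:
        os.write(fd, message.encode() + b"\r\n")
        return True
    except OSError:
        return False
    finally:
        os.close(fd)
```

A caller that gets False back can drop the message or retry later, which is usually preferable to a daemon that can no longer be killed.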
Re: more amd hangs
On Fri, 9 Jul 1999, Doug wrote:

> In my continuing efforts to get this freebsd box into shape for web hosting at my company (where it relies exclusively on NFS for retrieving customer data) I've been making progress thanks to some recent commits by Peter. Now I can run the heavy duty NFS access script and it completes its mission about 2 out of 3 times. Also, now when it fails it doesn't lock the whole box, just amd. Still not where I want it to be, but it is definitely big progress. :)
>
> What happens when it hangs is that amd becomes totally wedged. I cannot do 'df' or 'ls /' at all (the amd mountpoints are on /), and killing the amd process is no help; I have to reboot the box.
>
> Ktrace'ing amd at this point gets me nothing. The ktrace process just dies and leaves a 0 byte ktrace.out file. (BTW, I am also still having problems with ktrace exiting while the process is still running when I actually get it to attach, if anyone cares.)

Still working on this problem. Thanks to some suggestions I got off the list, I have compiled libc and amd with debugging symbols. I wedged the box the same way I have previously, by running a perl script that automounts a directory, reads 250 files in that directory, automounts another one, etc. for a total of 68 directories. Using today's current this time amd wedged in the following state according to top:

155 root 3 0 736K 520K siobi 1 0:21 0.00% 0.00% amd

Here is the trace after killing it:

Core was generated by `amd'.
Program terminated with signal 3, Quit.
#0  0x8063dc4 in open ()
(gdb) where
#0  0x8063dc4 in open ()
#1  0x806b5c3 in vsyslog (pri=6, fmt=0x809279a %s, ap=0xbfbfd2c0 )
    at /usr/src/lib/libc/../libc/gen/syslog.c:262
#2  0x806b2c2 in syslog (pri=6, fmt=0x809279a %s)
    at /usr/src/lib/libc/../libc/gen/syslog.c:130
#3  0x805a3d8 in real_plog (lvl=6, fmt=0x8091440 prime_nfs_fhandle_cache: NFS version %d, vargs=0xbfbfdafc \002)
    at /usr/src/usr.sbin/amd/libamu/../../../contrib/amd/libamu/xutil.c:443
#4  0x805a2be in plog (lvl=16, fmt=0x8091440 prime_nfs_fhandle_cache: NFS version %d)
    at /usr/src/usr.sbin/amd/libamu/../../../contrib/amd/libamu/xutil.c:383
#5  0x805323a in prime_nfs_fhandle_cache (path=0x80cb287 /Space/209.132.66, fs=0x80b5640, fhbuf=0xbfbfdb34, wchan=0x80b57c0)
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/ops_nfs.c:262
#6  0x805363f in nfs_init (mf=0x80b57c0)
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/ops_nfs.c:542
#7  0x804a7f9 in amfs_auto_bgmount (cp=0x80c8600, mpe=0)
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/amfs_auto.c:676
#8  0x804a4bc in amfs_auto_retry (rc=0, term=0, closure=0x80c8600)
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/amfs_auto.c:402
#9  0x8055212 in do_task_notify ()
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/sched.c:239
#10 0x804df6d in softclock ()
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/clock.c:212
#11 0x8052583 in run_rpc ()
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/nfs_start.c:253
#12 0x80527e6 in mount_automounter (ppid=154)
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/nfs_start.c:467
#13 0x804a109 in main (argc=4, argv=0xbfbfddec)
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/amd.c:544
#14 0x80480e9 in _start ()

I'm going to work on attaching it with gdb while it's locked next. Also based on advice I've made some changes to my configuration files, although it didn't help the 'ls /' or 'df' problems.

Thanks for any help,
Doug
--
On account of being a democracy and run by the people, we are the only nation in the world that has to keep a government four years, no matter what it does. -- Will Rogers

[ global ]
# Only search for maps of this type
map_type = file
# Search this path for maps
search_path =/etc
# Use this directory for amd's private mount points
auto_dir = /usr/amd/realmounts
# Log all activity to syslog (daemon)
log_file = syslog:local7
log_options =all
# Check /etc/hosts for hostnames
normalize_hostnames = yes
# Lock the amd process into memory, improves perf.
plock = yes
# Use the special /default entry in maps
selectors_on_default = yes

# DEFINE AN AMD MOUNT POINT
[ /usr/amd/Interfaces ]
map_name = amd.Interfaces

[ /usr/amd/Hold ]
map_name = amd.Hold

32# more /etc/amd.Interfaces
/defaults type:=nfs;opts:=rw,vers=2,intr,proto=udp,noconn
209.132.4 netapp01:/vol/Space/209.132.4
* rhost:=IP${key};rfs:=/Space/${key}
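For anyone who wants to reproduce the load, the access pattern described in this message (touch an automounted directory, read up to 250 files in it, move on to the next, 68 directories in all) can be approximated with the short Python sketch below. It is a stand-in written for illustration, not the perl script used here; the function name stress_automounts, its parameters, and the example path in the usage note are all assumptions.

```python
import os


def stress_automounts(base, keys, files_per_dir=250):
    """Visit base/<key> for each key and read up to files_per_dir
    directory entries from it. Returns the number of files read.

    Hypothetical stand-in for the perl script mentioned in the
    thread. Listing base/<key> is what triggers the automount when
    base is an amd mount point.
    """
    total = 0
    for key in keys:
        directory = os.path.join(base, key)
        names = sorted(os.listdir(directory))[:files_per_dir]
        for name in names:
            path = os.path.join(directory, name)
            if not os.path.isfile(path):
                continue  # skip subdirectories and special files
            with open(path, "rb") as fh:
                fh.read()
            total += 1
    return total
```

Something like stress_automounts("/usr/amd/Interfaces", keys), with keys holding the 68 map keys, would then exercise the mount points defined in the configuration above.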
Re: more amd hangs
Ok, it's now wedged in a different state (using the same perl script to wedge it). According to top:

317 root 2 0 648K 456K STOP 0 0:00 0.00% 0.00% amd

I also managed to attach to the running process this time:

(gdb) file /usr/sbin/amd
Reading symbols from /usr/sbin/amd...done.
(gdb) attach 317
Attaching to program: /usr/sbin/amd, process 317
0x8063c34 in select ()
(gdb) where
#0  0x8063c34 in select ()
#1  0x80523b6 in do_select (smask=0, fds=1024, fdp=0xbfbfd990, tvp=0xbfbfd984)
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/nfs_start.c:146
#2  0x80525fd in run_rpc ()
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/nfs_start.c:274
#3  0x80527e6 in mount_automounter (ppid=316)
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/nfs_start.c:467
#4  0x804a109 in main (argc=1, argv=0xbfbfdb84)
    at /usr/src/usr.sbin/amd/amd/../../../contrib/amd/amd/amd.c:544
#5  0x80480e9 in _start ()

I am noticing that the last function, _start() is the same as in the last traceback. Anyone with suggestions, I'm open to them. :)

I tried doing 'continue' with gdb and it wedged gdb and amd, so I just rebooted.

HTH,
Doug
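Since the outermost frames of any two amd traces will match, the useful comparison is where they stop matching. The Python sketch below pulls the function names out of a gdb 'where' listing and reports the caller-side frames two traces share. The helper names and the regular expression are made up for this example, and the two traces are the ones from this thread with arguments and source paths dropped.

```python
import re

# Matches "#N 0xADDR in func (" and also "#N func (" (no address).
FRAME_RE = re.compile(r"#(\d+)\s+(?:0x[0-9a-f]+ in )?(\w+) \(")

# Abbreviated from the thread: arguments and source paths removed.
SIOBI_TRACE = """
#0 0x8063dc4 in open () #1 0x806b5c3 in vsyslog () #2 0x806b2c2 in syslog ()
#3 0x805a3d8 in real_plog () #4 0x805a2be in plog ()
#5 0x805323a in prime_nfs_fhandle_cache () #6 0x805363f in nfs_init ()
#7 0x804a7f9 in amfs_auto_bgmount () #8 0x804a4bc in amfs_auto_retry ()
#9 0x8055212 in do_task_notify () #10 0x804df6d in softclock ()
#11 0x8052583 in run_rpc () #12 0x80527e6 in mount_automounter ()
#13 0x804a109 in main () #14 0x80480e9 in _start ()
"""
STOP_TRACE = """
#0 0x8063c34 in select () #1 0x80523b6 in do_select ()
#2 0x80525fd in run_rpc () #3 0x80527e6 in mount_automounter ()
#4 0x804a109 in main () #5 0x80480e9 in _start ()
"""


def frame_functions(trace):
    """Return the function names in a trace, innermost frame first."""
    return [m.group(2) for m in FRAME_RE.finditer(trace)]


def shared_outer_frames(a, b):
    """Return the outermost frames two traces have in common,
    starting from the program entry point."""
    shared = []
    for fa, fb in zip(reversed(a), reversed(b)):
        if fa != fb:
            break
        shared.append(fa)
    return shared


if __name__ == "__main__":
    siobi = frame_functions(SIOBI_TRACE)
    stop = frame_functions(STOP_TRACE)
    print("shared:", shared_outer_frames(siobi, stop))
    print("siobi innermost:", siobi[0], "/ STOP innermost:", stop[0])
```

For these two traces the shared part ends at run_rpc; below that, one is sitting in select() and the other is in open() by way of syslog().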
Re: more amd hangs
On Fri, 09 Jul 1999 12:00:52 MST, Doug wrote:

> The amd conf files are below, any insights or suggestions welcome.

I can't remember whether it was you or someone else to whom I offered this advice not so recently, so forgive me if I've suggested this to you before.

I've found that AMD exacerbates NFS-related problems. Since I moved away from AMD toward using proper NFS mounts (soft, interruptible, bg), the hassles I was having with NFS have gone away completely.

Ciao,
Sheldon.
Re: more amd hangs
Sheldon Hearn [EMAIL PROTECTED] writes:

> I've found that AMD exacerbates NFS-related problems. Since I moved away from AMD toward using proper NFS mounts (soft, interruptible, bg), the hassles I was having with NFS have gone away completely.

I thought that it was almost never proper to soft-mount rw filesystems. Am I mistaken about this?

-matt