Re: Bash lockups
Giorgos Keramidas writes:
> On Fri, 21 May 2010 09:30:05 -0700, Carl Johnson wrote:
>> Giorgos Keramidas writes:
>>> Does this lock-up happen if you leave the shell 'idle' for too long
>>> over an ssh session? There may be problems with stateful connection
>>> tracking between your terminal and the remote shell :-/
>>
>> No, I don't think that could be the problem. I am just using ssh
>> between local machines and there is no firewall between them. It also
>> often seems to happen to a shell as I switch away from it to another
>> one. One suspicion is that something is sending a signal to the shell
>> as it switches, and bash sometimes doesn't handle that signal
>> properly.
>>
>> I also should have mentioned that I have been running bash as my
>> default shell for years under Linux and have never seen this problem
>> there.
>>
>> Thanks for the suggestion.
>
> That's ok. If you can attach to the bash process with ktrace please try
> to grab a ktrace file from a deadlocked shell. We may be able to see
> why it gets deadlocked by running kdump(1) on the shell trace file.
>
> You can run a second shell under ktrace (and hope that the parent
> doesn't deadlock before the traced child shell), by running:
>
>     bash$ ktrace -f bash.trace bash --login
>
> When you exit from the child shell you can dump ktrace(1) events from
> the bash.trace file with:
>
>     bash$ kdump -f bash.trace > logfile 2>&1
>
> Looking near the last records dumped in 'logfile' should be quite
> informative if the process is dead-locked or spinning around the same
> code over and over again.

I finally got one after starting ktrace a few days ago. It is
informative, but it raises as many questions as it answers. It
basically just wrote out the prompt, *started* to set up for reading the
input and just stopped. I ran gdb on it and it is stuck looping
somewhere in getenv. I don't have the system compiled with debugging,
so I have limited information on what it is doing there.
I checked multiple times, and I also saw getenv running routines such as
memset, strlen, mbrtowc, and wcsnrtombs. The following is the tail end
of the 'kdump -Ef' output:

    67263 bash 61412.013860 GIO fd 2 wrote 28 bytes
          0x 0d0f 1b5b 316d 5b63 6172 6c6a 4063 6a62 7364 3874 207e 5d24 1b5b |...[1m[ca...@cjbsd8t ~]$.[|
          0x001a 6d20 |m |
    67263 bash 61412.013867 RET write 28/0x1c
    67263 bash 61412.013874 CALL sigprocmask(SIG_SETMASK,0x80e133c,0)
    67263 bash 61412.013880 RET sigprocmask 0

and the following is the similar section of a normal prompt:

    67263 bash 61403.461469 GIO fd 2 wrote 27 bytes
          0x 0f1b 5b31 6d5b 6361 726c 6a40 636a 6273 6438 7420 7e5d 241b 5b6d |..[1m[ca...@cjbsd8t ~]$.[m|
          0x001a 20 | |
    67263 bash 61403.461476 RET write 27/0x1b
    67263 bash 61403.461483 CALL sigprocmask(SIG_SETMASK,0x80e133c,0)
    67263 bash 61403.461489 RET sigprocmask 0
    67263 bash 61403.461497 CALL sigprocmask(SIG_BLOCK,0,0x80e1e3c)
    67263 bash 61403.461504 RET sigprocmask 0
    67263 bash 61403.461513 CALL read(0,0xbfbfd95f,0x1)

I just realized there is an extra CR at the beginning of the hung prompt
(28 bytes instead of 27) that I don't see elsewhere, but nothing else
before that looks different. This one is an i386 8.0 release, but I
also have another hung shell on an amd64 7.3 release system in
VirtualBox. I just checked my other ktrace logs and I found one other
place where that extra CR occurs, but there is no lockup there and that
was my other system.

The following is a section of a backtrace from gdb:

    #0  0x28308540 in mbrtowc () from /lib/libc.so.7
    #1  0x080c7ce6 in getenv ()
    #2  0x080c1335 in getenv ()
    #3  0x080ae1d4 in getenv ()
    #4  0x080ac4b0 in getenv ()
    #5  0x080ac815 in getenv ()
    #6  0x080c3955 in getenv ()
    #7  0x080c3ac9 in getenv ()
    #8  0x080ac4b0 in getenv ()
    #9  0x080ac815 in getenv ()
    #10 0x080acb6c in getenv ()
    #11 0x080acf55 in getenv ()
    #12 0x08054611 in ?? ()
    #13 0x284a9a80 in ?? ()
    ...
    #67 0x2832cbfd in time () from /lib/libc.so.7

The first few entries change when I let it run for a while, but the last
8-9 getenv addresses and everything before them remain the same. There
are a total of about 65 backtrace entries this time, some of which are
0x addresses which seem suspicious. The backtrace from the other hung
shell is also in getenv, but I didn't have ktrace running on that one.

I am at the limit of my experience, so does anybody else have any ideas
about what could cause this, or how I could trace it further? I am
keeping the processes attached to gdb, so I can do further checking on
them if anyone has any other ideas.

Thanks in advance for any help, and thanks for the help that allowed me
to get this far.

--
Carl Johnson    ca...@peak.org
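One way to double-check the extra byte at the start of the hung prompt is to decode the hex groups from the GIO record by hand. The following is only a minimal sketch in POSIX sh: decode_hex is a hypothetical helper written for this illustration, not part of ktrace or kdump, and it assumes its arguments are the hex groups copied from the dump.

```shell
# decode_hex: hypothetical helper (not part of ktrace/kdump).
# Prints printable ASCII bytes as-is and every other byte as <xx>.
decode_hex() (
    hex=$(printf '%s' "$*" | tr -d ' ')
    # Reject non-hex input and odd-length strings.
    case $hex in
        *[!0-9a-fA-F]*) exit 1 ;;
    esac
    [ $(( ${#hex} % 2 )) -eq 0 ] || exit 1

    out=''
    while [ -n "$hex" ]; do
        rest=${hex#??}          # everything after the first two digits
        byte=${hex%"$rest"}     # the first two digits
        hex=$rest
        dec=$(( 0x$byte ))
        if [ "$dec" -ge 32 ] && [ "$dec" -lt 127 ]; then
            # Emit the character via an octal escape in the format string.
            out="$out$(printf "\\$(printf '%03o' "$dec")")"
        else
            out="$out<$byte>"
        fi
    done
    printf '%s\n' "$out"
)
```

Called with the first few groups of the hung prompt, `decode_hex 0d0f 1b5b 316d 5b63` prints `<0d><0f><1b>[1m[c`, which makes the leading carriage return (0d) visible ahead of the usual prompt bytes.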
Re: Bash lockups
Giorgos Keramidas writes:
> On Fri, 21 May 2010 09:30:05 -0700, Carl Johnson wrote:
>> Giorgos Keramidas writes:
>>> Does this lock-up happen if you leave the shell 'idle' for too long
>>> over an ssh session? There may be problems with stateful connection
>>> tracking between your terminal and the remote shell :-/
>>
>> No, I don't think that could be the problem. I am just using ssh
>> between local machines and there is no firewall between them. It also
>> often seems to happen to a shell as I switch away from it to another
>> one. One suspicion is that something is sending a signal to the shell
>> as it switches, and bash sometimes doesn't handle that signal
>> properly.
>>
>> I also should have mentioned that I have been running bash as my
>> default shell for years under Linux and have never seen this problem
>> there.
>>
>> Thanks for the suggestion.
>
> That's ok. If you can attach to the bash process with ktrace please try
> to grab a ktrace file from a deadlocked shell. We may be able to see
> why it gets deadlocked by running kdump(1) on the shell trace file.
>
> You can run a second shell under ktrace (and hope that the parent
> doesn't deadlock before the traced child shell), by running:
>
>     bash$ ktrace -f bash.trace bash --login
>
> When you exit from the child shell you can dump ktrace(1) events from
> the bash.trace file with:
>
>     bash$ kdump -f bash.trace > logfile 2>&1
>
> Looking near the last records dumped in 'logfile' should be quite
> informative if the process is dead-locked or spinning around the same
> code over and over again.

Thanks for the detailed information. I have been mostly a Linux user,
so this is new for me. It hasn't been happening very often lately, so
it might be a while now. I will definitely try to keep any hung
processes around to try your suggestions.

--
Carl Johnson    ca...@peak.org

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "freebsd-questions-unsubscr...@freebsd.org"
Re: Bash lockups
On Fri, 21 May 2010 09:30:05 -0700, Carl Johnson wrote:
> Giorgos Keramidas writes:
>> Does this lock-up happen if you leave the shell 'idle' for too long
>> over an ssh session? There may be problems with stateful connection
>> tracking between your terminal and the remote shell :-/
>
> No, I don't think that could be the problem. I am just using ssh
> between local machines and there is no firewall between them. It also
> often seems to happen to a shell as I switch away from it to another
> one. One suspicion is that something is sending a signal to the shell
> as it switches, and bash sometimes doesn't handle that signal
> properly.
>
> I also should have mentioned that I have been running bash as my
> default shell for years under Linux and have never seen this problem
> there.
>
> Thanks for the suggestion.

That's ok. If you can attach to the bash process with ktrace please try
to grab a ktrace file from a deadlocked shell. We may be able to see
why it gets deadlocked by running kdump(1) on the shell trace file.

You can run a second shell under ktrace (and hope that the parent
doesn't deadlock before the traced child shell), by running:

    bash$ ktrace -f bash.trace bash --login

When you exit from the child shell you can dump ktrace(1) events from
the bash.trace file with:

    bash$ kdump -f bash.trace > logfile 2>&1

Looking near the last records dumped in 'logfile' should be quite
informative if the process is dead-locked or spinning around the same
code over and over again.
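To make "looking near the last records" a little more concrete, the names of the final system calls can be pulled out of the dumped log. This is only a rough sketch in POSIX sh and awk: last_calls is a hypothetical helper, and it assumes the log contains kdump's usual "CALL name(args)" records, with or without a timestamp column.

```shell
# last_calls: hypothetical helper; prints the names of the last N system
# calls (CALL records) found in a kdump log.
#   $1 = log file written by kdump, $2 = how many calls to show (default 10)
last_calls() {
    awk '{
        # The CALL keyword sits in a different column depending on
        # whether kdump printed timestamps, so scan for it.
        for (i = 1; i < NF; i++) {
            if ($i == "CALL") {
                name = $(i + 1)
                sub(/\(.*/, "", name)   # drop the argument list
                print name
                break
            }
        }
    }' "$1" | tail -n "${2:-10}"
}
```

For example, `last_calls logfile 5` lists the last five calls. A long run of the same name at the end would point at a loop that keeps entering the kernel, while a log that simply stops suggests the process is spinning in user space.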
Re: Bash lockups
Giorgos Keramidas writes:
> On Wed, 19 May 2010 16:14:52 -0700, Carl Johnson wrote:
>> I have been experimenting with FreeBSD for a while, and I consistently
>> get bash lockups at irregular intervals when it is otherwise idle. By
>> lockup, I mean that it stops responding to the keyboard and uses 100%
>> CPU. It will sometimes go for days with no problems, but I had two
>> yesterday, and another today. They have occurred on test systems
>> running in VirtualBox and on a real computer, both i386 and amd64
>> images, and a mixture of 7.1, 7.3 and 8.0. They usually seem to
>> happen when I am switching tabs in konsole or switching shells in
>> screen, but other times I think they happen when I am not even using
>> the system. The only thing I have found I can do is to do a kill -9
>> and start a new shell.
>
> Does this lock-up happen if you leave the shell 'idle' for too long over
> an ssh session? There may be problems with stateful connection tracking
> between your terminal and the remote shell :-/

No, I don't think that could be the problem. I am just using ssh
between local machines and there is no firewall between them. It also
often seems to happen to a shell as I switch away from it to another
one. One suspicion is that something is sending a signal to the shell
as it switches, and bash sometimes doesn't handle that signal
properly.

I also should have mentioned that I have been running bash as my
default shell for years under Linux and have never seen this problem
there.

Thanks for the suggestion.

--
Carl Johnson    ca...@peak.org
Re: Bash lockups
On Wed, 19 May 2010 16:14:52 -0700, Carl Johnson wrote:
> I have been experimenting with FreeBSD for a while, and I consistently
> get bash lockups at irregular intervals when it is otherwise idle. By
> lockup, I mean that it stops responding to the keyboard and uses 100%
> CPU. It will sometimes go for days with no problems, but I had two
> yesterday, and another today. They have occurred on test systems
> running in VirtualBox and on a real computer, both i386 and amd64
> images, and a mixture of 7.1, 7.3 and 8.0. They usually seem to
> happen when I am switching tabs in konsole or switching shells in
> screen, but other times I think they happen when I am not even using
> the system. The only thing I have found I can do is to do a kill -9
> and start a new shell.

Does this lock-up happen if you leave the shell 'idle' for too long over
an ssh session? There may be problems with stateful connection tracking
between your terminal and the remote shell :-/
Re: Bash lockups
vogelke+u...@pobox.com (Karl Vogel) writes:
>>> On Wed, 19 May 2010 16:14:52 -0700,
>>> Carl Johnson said:
>
> C> I have been experimenting with FreeBSD for a while, and I consistently
> C> get bash lockups at irregular intervals when it is otherwise idle.
> C> Does anybody have any suggestions on how I could try to trace this?
>
>    1. Get a process-table list every minute or so via cron. It might show
>       something else running or trying to run when you have your lockups.
>       Try "ps -axw -o user,pid,ppid,pgid,tt,start,time,command".
>
>    2. Get the PID of the bash session, and run something like this as root:
>
>         pid=12345
>         k=1
>         while true; do
>             truss -p $pid 2>&1 | head -1000 > /dir-with-lots-of-space/$k
>             k=`expr $k + 1`
>         done
>
>       This should break the truss output into 1000-line chunks and let you
>       clean out the directory before it chews up all your space. Hopefully
>       one of the truss files will show something useful after a lockup.

Thanks for the ideas. I keep several windows with shells open so I
don't want to trace all of them yet. I don't even know what the shells
are doing when they lock up, so for now I'll just wait until one locks
up and then try truss to see what it is actually doing. This happens
only occasionally, so I will probably have to wait a while.

I don't know that this is actually just a bash problem, since I have
never had it happen running on Linux in at least 10 years.

--
Carl Johnson    ca...@peak.org
Re: Bash lockups
>> On Wed, 19 May 2010 16:14:52 -0700,
>> Carl Johnson said:

C> I have been experimenting with FreeBSD for a while, and I consistently
C> get bash lockups at irregular intervals when it is otherwise idle.
C> Does anybody have any suggestions on how I could try to trace this?

   1. Get a process-table list every minute or so via cron. It might show
      something else running or trying to run when you have your lockups.
      Try "ps -axw -o user,pid,ppid,pgid,tt,start,time,command".

   2. Get the PID of the bash session, and run something like this as root:

        pid=12345
        k=1
        while true; do
            truss -p $pid 2>&1 | head -1000 > /dir-with-lots-of-space/$k
            k=`expr $k + 1`
        done

      This should break the truss output into 1000-line chunks and let you
      clean out the directory before it chews up all your space. Hopefully
      one of the truss files will show something useful after a lockup.

--
Karl Vogel                      I don't speak for the USAF or my company

REMOTE CONTROL - female, because it gives a man pleasure, he'd be lost
without it, and while he doesn't always know the right buttons to push,
he keeps trying.                --from the "What gender are they?" list
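The chunked truss output from the loop above still has to be cleaned out by hand. One possible way to automate that is sketched below in POSIX sh. prune_chunks is a hypothetical helper, not something from the thread, and it assumes the chunk files are named with plain numbers exactly as the loop creates them; anything else in the directory is left alone.

```shell
# prune_chunks: hypothetical helper; keeps only the newest N numbered
# chunk files in a directory and removes the older ones.
#   $1 = directory holding the chunks, $2 = number of chunks to keep
prune_chunks() (
    dir=$1
    keep=$2
    cd "$dir" || exit 1
    # Only all-digit file names count as chunks; lowest number is oldest.
    chunks=$(ls | grep -E '^[0-9]+$' | sort -n || true)
    total=$(printf '%s\n' "$chunks" | grep -c . || true)
    excess=$(( total - keep ))
    [ "$excess" -gt 0 ] || exit 0
    printf '%s\n' "$chunks" | head -n "$excess" | while read -r f; do
        rm -f -- "$f"
    done
)
```

Run from cron or from a second loop, for example `prune_chunks /dir-with-lots-of-space 50`, this would leave the 50 most recent chunks in place, which should include the ones written around a lockup.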
Bash lockups
I have been experimenting with FreeBSD for a while, and I consistently
get bash lockups at irregular intervals when it is otherwise idle. By
lockup, I mean that it stops responding to the keyboard and uses 100%
CPU. It will sometimes go for days with no problems, but I had two
yesterday, and another today. They have occurred on test systems
running in VirtualBox and on a real computer, both i386 and amd64
images, and a mixture of 7.1, 7.3 and 8.0. They usually seem to
happen when I am switching tabs in konsole or switching shells in
screen, but other times I think they happen when I am not even using
the system. The only thing I have found I can do is to do a kill -9
and start a new shell.

Does anybody have any suggestions on how I could try to trace this? I
haven't been able to find any bug reports, but I don't know enough to
know how to search the FreeBSD problem reports very well.

Thanks for any help. I already subscribe to this list, so there is no
need to cc me.

--
Carl Johnson    ca...@peak.org