Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load
Kris Kennaway wrote:
> Rainer Hurling wrote:
>> Thanks for your answer.
>>
>> Kris Kennaway wrote:
>>> Rainer Hurling wrote:
>>>> Looking into PR kern/104406, it seems that it describes exactly what I have been experiencing on three of my systems over the last few weeks. They are running FreeBSD 8.0-CURRENT (known as 7.0-CURRENT not long ago ;-) ).
>>>
>>> Actually it sounds nothing like it at all ;)
>>>
>>>> On these machines I often observe hangs, sometimes lasting only a few seconds, at other times 20-30 seconds, before input/output is back. This seems to happen under heavier disk usage (portupgrade, buildworld, browsing complicated websites etc.). During a hang even xterm stops responding, while other applications that do not need the disk, like xclock, keep running. I have no panics; only UFS (and MSDOSFS) are mounted, no NTFS. About two months ago none of my systems showed these hangs.
>>>
>>> Is your system swapping? This is the usual cause of pauses during high application (actually memory) load.
>>>
>>> Kris
>>
>> No, I am working with 2GB RAM, without swapping at all. In the meantime I tested the behaviour described above a little more. The hangs even appeared without Xorg, working only on consoles under heavy disk usage (portupgrade etc.).
>
> OK, configure the system with the debugger and, when it is hung, break to DDB and obtain the data requested in the Developers' Handbook to try to investigate what is going on. You may want to do this a few times to make sure you capture a representative sample.
>
> Kris

I hope to find some time tomorrow for my first kernel debugging session ;-) Is the chapter 'On-Line Kernel Debugging Using DDB' the right one? What kind of information is most useful?

Rainer
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load
Thanks for your answer.

Kris Kennaway wrote:
> Rainer Hurling wrote:
>> Looking into PR kern/104406, it seems that it describes exactly what I have been experiencing on three of my systems over the last few weeks. They are running FreeBSD 8.0-CURRENT (known as 7.0-CURRENT not long ago ;-) ).
>
> Actually it sounds nothing like it at all ;)
>
>> On these machines I often observe hangs, sometimes lasting only a few seconds, at other times 20-30 seconds, before input/output is back. This seems to happen under heavier disk usage (portupgrade, buildworld, browsing complicated websites etc.). During a hang even xterm stops responding, while other applications that do not need the disk, like xclock, keep running. I have no panics; only UFS (and MSDOSFS) are mounted, no NTFS. About two months ago none of my systems showed these hangs.
>
> Is your system swapping? This is the usual cause of pauses during high application (actually memory) load.
>
> Kris

No, I am working with 2GB RAM, without swapping at all. In the meantime I tested the behaviour described above a little more. The hangs even appeared without Xorg, working only on consoles under heavy disk usage (portupgrade etc.).

Rainer
Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load
It looks like Marek Blaszkowski, in his new thread on freebsd-amd64@

http://www.nabble.com/forum/ViewPost.jtp?post=13513077

is describing the same system hangs. He found some strange behaviour with 'sync' of hard disks. Perhaps this is a step towards the cause of the hangs?

Regards,
Rainer

Rainer Hurling wrote:
> Thanks for your answer.
>
> Kris Kennaway wrote:
>> Rainer Hurling wrote:
>>> Looking into PR kern/104406, it seems that it describes exactly what I have been experiencing on three of my systems over the last few weeks. They are running FreeBSD 8.0-CURRENT (known as 7.0-CURRENT not long ago ;-) ).
>>
>> Actually it sounds nothing like it at all ;)
>>
>>> On these machines I often observe hangs, sometimes lasting only a few seconds, at other times 20-30 seconds, before input/output is back. This seems to happen under heavier disk usage (portupgrade, buildworld, browsing complicated websites etc.). During a hang even xterm stops responding, while other applications that do not need the disk, like xclock, keep running. I have no panics; only UFS (and MSDOSFS) are mounted, no NTFS. About two months ago none of my systems showed these hangs.
>>
>> Is your system swapping? This is the usual cause of pauses during high application (actually memory) load.
>>
>> Kris
>
> No, I am working with 2GB RAM, without swapping at all. In the meantime I tested the behaviour described above a little more. The hangs even appeared without Xorg, working only on consoles under heavy disk usage (portupgrade etc.).
>
> Rainer
Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load
Rainer Hurling wrote:
> Thanks for your answer.
>
> Kris Kennaway wrote:
>> Rainer Hurling wrote:
>>> Looking into PR kern/104406, it seems that it describes exactly what I have been experiencing on three of my systems over the last few weeks. They are running FreeBSD 8.0-CURRENT (known as 7.0-CURRENT not long ago ;-) ).
>>
>> Actually it sounds nothing like it at all ;)
>>
>>> On these machines I often observe hangs, sometimes lasting only a few seconds, at other times 20-30 seconds, before input/output is back. This seems to happen under heavier disk usage (portupgrade, buildworld, browsing complicated websites etc.). During a hang even xterm stops responding, while other applications that do not need the disk, like xclock, keep running. I have no panics; only UFS (and MSDOSFS) are mounted, no NTFS. About two months ago none of my systems showed these hangs.
>>
>> Is your system swapping? This is the usual cause of pauses during high application (actually memory) load.
>>
>> Kris
>
> No, I am working with 2GB RAM, without swapping at all. In the meantime I tested the behaviour described above a little more. The hangs even appeared without Xorg, working only on consoles under heavy disk usage (portupgrade etc.).

OK, configure the system with the debugger and, when it is hung, break to DDB and obtain the data requested in the Developers' Handbook to try to investigate what is going on. You may want to do this a few times to make sure you capture a representative sample.

Kris
Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load
Rainer Hurling wrote:
> Looking into PR kern/104406, it seems that it describes exactly what I have been experiencing on three of my systems over the last few weeks. They are running FreeBSD 8.0-CURRENT (known as 7.0-CURRENT not long ago ;-) ).

Actually it sounds nothing like it at all ;)

> On these machines I often observe hangs, sometimes lasting only a few seconds, at other times 20-30 seconds, before input/output is back. This seems to happen under heavier disk usage (portupgrade, buildworld, browsing complicated websites etc.). During a hang even xterm stops responding, while other applications that do not need the disk, like xclock, keep running. I have no panics; only UFS (and MSDOSFS) are mounted, no NTFS. About two months ago none of my systems showed these hangs.

Is your system swapping? This is the usual cause of pauses during high application (actually memory) load.

Kris
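The swapping question can be answered with numbers by sampling swap usage while the hang is being reproduced. A minimal sketch, assuming the usual `swapinfo -k` column layout (Device, 1K-blocks, Used, Avail, Capacity); the function name `swap_used_kb` is made up for illustration and is not part of any FreeBSD tool:

```sh
#!/bin/sh
# Sum the "Used" column (third field) of swapinfo -k style output read from stdin.
# The header line is skipped, and so is the "Total" line that is printed
# when more than one swap device is configured.
swap_used_kb() {
    awk 'NR > 1 && $1 != "Total" { used += $3 } END { print used + 0 }'
}

# Typical use on the affected machine (assumes swapinfo(8) is available):
#   swapinfo -k | swap_used_kb
```

If the printed value stays at or near zero across several hangs, swapping is unlikely to be the cause and the DDB route is the next step.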
Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load
Eugene Grosbein wrote:
> On Fri, Oct 19, 2007 at 03:05:01PM -0700, Alfred Perlstein wrote:
>> Can anyone take a look at PR kern/104406? I have a repeatable hang, but I can't obtain a kernel dump to get the results of all the show commands from here: http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html After I break to the debugger using the Ctrl+Alt+Esc sequence and enter a panic command, the kernel does not write a kernel dump but seems to hang. Can anyone describe how to obtain a kernel dump in this situation, or at least say which show commands' output is needed first to debug this? The output of all the suggested commands is huge, and I am afraid of making mistakes when carrying it from the screen to a sheet of paper and back :-)
>
> This [ufs] uninterruptible deadlock is very easy to reproduce on both RELENG_6 and RELENG_7. Look at this PR: http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/107439

No, ufs and ntfs are different things.

> The PR is closed but the problem is still here with 7.0-PRERELEASE and, perhaps, CURRENT.

It is closed because you could not be contacted by email for feedback. If you are still interested in this PR then you need to rectify that problem and then follow up with remko.

Kris
Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load
On Fri, Oct 19, 2007 at 03:05:01PM -0700, Alfred Perlstein wrote:
> Can anyone take a look at PR kern/104406? I have a repeatable hang, but I can't obtain a kernel dump to get the results of all the show commands from here: http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html After I break to the debugger using the Ctrl+Alt+Esc sequence and enter a panic command, the kernel does not write a kernel dump but seems to hang. Can anyone describe how to obtain a kernel dump in this situation, or at least say which show commands' output is needed first to debug this? The output of all the suggested commands is huge, and I am afraid of making mistakes when carrying it from the screen to a sheet of paper and back :-)

This [ufs] uninterruptible deadlock is very easy to reproduce on both RELENG_6 and RELENG_7. Look at this PR: http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/107439

The PR is closed but the problem is still here with 7.0-PRERELEASE and, perhaps, CURRENT.

Eugene
Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load
>> Can anyone take a look at PR kern/104406? I have a repeatable hang, but I can't obtain a kernel dump to get the results of all the show commands from here: http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html After I break to the debugger using the Ctrl+Alt+Esc sequence and enter a panic command, the kernel does not write a kernel dump but seems to hang. Can anyone describe how to obtain a kernel dump in this situation, or at least say which show commands' output is needed first to debug this? The output of all the suggested commands is huge, and I am afraid of making mistakes when carrying it from the screen to a sheet of paper and back :-)
>
> This [ufs] uninterruptible deadlock is very easy to reproduce on both RELENG_6 and RELENG_7. Look at this PR: http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/107439
>
> The PR is closed but the problem is still here with 7.0-PRERELEASE and, perhaps, CURRENT.

This is probably another bug because:

1. I built a kernel with INVARIANTS as described on the Debugging Deadlocks page of the FreeBSD Developers' Handbook and got no panic, only a deadlock;
2. I have no NTFS filesystem at all and just copy file(s) from FTP to local UFS using mc. In that PR the panic occurred when NTFS was mounted r/w (and did NOT occur when the same NTFS was mounted r/o).

--
Oleg Derevenetz [EMAIL PROTECTED] OOD3-RIPE
Phone: +7 4732 539880 Fax: +7 4732 531415
http://www.vsi.ru CenterTelecom Voronezh ISP http://isp.vsi.ru
Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load
On Sat, Oct 20, 2007 at 12:44:46PM +0400, Oleg Derevenetz wrote:
> This is probably another bug because:

[skip]

Then there should be one more distinct bug, as God likes the Trinity.

Eugene
Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load
Looking into PR kern/104406, it seems that it describes exactly what I have been experiencing on three of my systems over the last few weeks. They are running FreeBSD 8.0-CURRENT (known as 7.0-CURRENT not long ago ;-) ).

On these machines I often observe hangs, sometimes lasting only a few seconds, at other times 20-30 seconds, before input/output is back. This seems to happen under heavier disk usage (portupgrade, buildworld, browsing complicated websites etc.). During a hang even xterm stops responding, while other applications that do not need the disk, like xclock, keep running. I have no panics; only UFS (and MSDOSFS) are mounted, no NTFS. About two months ago none of my systems showed these hangs.

I know that this 'hanging' behaviour has been described several times recently on the STABLE and CURRENT lists, but mostly in a different context. In those discussions people seem to be looking for misbehaviour of the scheduler (namely ULE), Linux emulation, the Java runtime environment or Firefox. From my point of view it more likely has to do with UFS locking under high CPU load or something around it.

I have barely any skills in programming and debugging, but if there is any activity on this topic in the background, what can we do to help?

Sincerely,
Rainer Hurling

Oleg Derevenetz wrote:
>>> Can anyone take a look at PR kern/104406? I have a repeatable hang, but I can't obtain a kernel dump to get the results of all the show commands from here: http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html After I break to the debugger using the Ctrl+Alt+Esc sequence and enter a panic command, the kernel does not write a kernel dump but seems to hang. Can anyone describe how to obtain a kernel dump in this situation, or at least say which show commands' output is needed first to debug this? The output of all the suggested commands is huge, and I am afraid of making mistakes when carrying it from the screen to a sheet of paper and back :-)
>>
>> This [ufs] uninterruptible deadlock is very easy to reproduce on both RELENG_6 and RELENG_7. Look at this PR: http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/107439
>>
>> The PR is closed but the problem is still here with 7.0-PRERELEASE and, perhaps, CURRENT.
>
> This is probably another bug because:
>
> 1. I built a kernel with INVARIANTS as described on the Debugging Deadlocks page of the FreeBSD Developers' Handbook and got no panic, only a deadlock;
> 2. I have no NTFS filesystem at all and just copy file(s) from FTP to local UFS using mc. In that PR the panic occurred when NTFS was mounted r/w (and did NOT occur when the same NTFS was mounted r/o).
>
> --
> Oleg Derevenetz [EMAIL PROTECTED] OOD3-RIPE
> Phone: +7 4732 539880 Fax: +7 4732 531415
> http://www.vsi.ru CenterTelecom Voronezh ISP http://isp.vsi.ru
Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load
>> Can anyone take a look at PR kern/104406? I have a repeatable hang, but I can't obtain a kernel dump to get the results of all the show commands from here: http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html After I break to the debugger using the Ctrl+Alt+Esc sequence and enter a panic command, the kernel does not write a kernel dump but seems to hang. Can anyone describe how to obtain a kernel dump in this situation, or at least say which show commands' output is needed first to debug this? The output of all the suggested commands is huge, and I am afraid of making mistakes when carrying it from the screen to a sheet of paper and back :-)
>
> Oleg, one thing you can do to make this less painful is to run your machine's console over a serial port. First get a crossover serial cable and make sure it works from one box to another; it should be easy to run tip com1 on both boxes to ensure that it works. Then you just need to add console="comconsole" to /boot/loader.conf and your box's console should come over serial. Then, on the machine watching the console, you can just do this:
>
>     % script
>     Script started, output file is typescript
>     % tip com1
>     ...do ddb stuff now...
>     ...stop tip
>     % exit
>
> Now you should have everything logged into a file called typescript, which should save you a big headache.

Thanks, I'll try it on Monday morning.

> As far as getting a dump from ddb, try this:
>
>     ddb> call doadump
>
> I'm completely at a loss why this isn't a base ddb command "dump" but whatever... :)

Unfortunately, this doesn't work either. I called the duty personnel in the datacenter and asked them to do this, and the person on duty tells me that after he enters this command something like this appears on the monitor:

    db> call doadump
    Dumping 3072 MB
    Dump aborted error I/O
    Dump failed. (Error 5)

--
Oleg Derevenetz [EMAIL PROTECTED] OOD3-RIPE
Phone: +7 4732 539880 Fax: +7 4732 531415
http://www.vsi.ru CenterTelecom Voronezh ISP http://isp.vsi.ru
Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load
* Oleg Derevenetz [EMAIL PROTECTED] [071020 09:58] wrote:
>>> Can anyone take a look at PR kern/104406? I have a repeatable hang, but I can't obtain a kernel dump to get the results of all the show commands from here: http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html After I break to the debugger using the Ctrl+Alt+Esc sequence and enter a panic command, the kernel does not write a kernel dump but seems to hang. Can anyone describe how to obtain a kernel dump in this situation, or at least say which show commands' output is needed first to debug this? The output of all the suggested commands is huge, and I am afraid of making mistakes when carrying it from the screen to a sheet of paper and back :-)
>>
>> Oleg, one thing you can do to make this less painful is to run your machine's console over a serial port. First get a crossover serial cable and make sure it works from one box to another; it should be easy to run tip com1 on both boxes to ensure that it works. Then you just need to add console="comconsole" to /boot/loader.conf and your box's console should come over serial. Then, on the machine watching the console, you can just do this:
>>
>>     % script
>>     Script started, output file is typescript
>>     % tip com1
>>     ...do ddb stuff now...
>>     ...stop tip
>>     % exit
>>
>> Now you should have everything logged into a file called typescript, which should save you a big headache.
>
> Thanks, I'll try it on Monday morning.
>
>> As far as getting a dump from ddb, try this:
>>
>>     ddb> call doadump
>>
>> I'm completely at a loss why this isn't a base ddb command "dump" but whatever... :)
>
> Unfortunately, this doesn't work either. I called the duty personnel in the datacenter and asked them to do this, and the person on duty tells me that after he enters this command something like this appears on the monitor:
>
>     db> call doadump
>     Dumping 3072 MB
>     Dump aborted error I/O
>     Dump failed. (Error 5)

Hmmm, it seems you might have a hardware problem. What disk device do you have?

Have you also enabled kernel dumps via dumpdev= in /etc/rc.conf?

--
- Alfred Perlstein
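Whether a dump device is configured at all can be checked before the next hang. A minimal sketch, assuming the setting lives in an rc.conf-style file as a plain `dumpdev=` assignment; the helper name `check_dumpdev` is hypothetical, and the check does not consult /etc/defaults/rc.conf or a device set at run time with dumpon(8):

```sh
#!/bin/sh
# Print the last dumpdev= value found in an rc.conf-style file.
# Returns non-zero when dumps look disabled (unset, empty or "NO").
check_dumpdev() {
    conf="${1:-/etc/rc.conf}"
    # Capture the value without surrounding quotes or a trailing comment;
    # the last assignment in the file wins, as it would when rc.conf is sourced.
    value=$(sed -n 's/^[[:space:]]*dumpdev="\{0,1\}\([^"#[:space:]]*\).*$/\1/p' "$conf" 2>/dev/null | tail -n 1)
    case "$value" in
        ""|[Nn][Oo]) echo "dumps disabled"; return 1 ;;
        *) echo "dumpdev=$value"; return 0 ;;
    esac
}
```

Called as `check_dumpdev` or `check_dumpdev /path/to/rc.conf`. A configured device is only a precondition: an I/O error during `call doadump`, as reported above, can still occur when the device is set correctly.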
Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load
* Oleg Derevenetz [EMAIL PROTECTED] [071019 08:17] wrote:
> Hi all,
>
> Can anyone take a look at PR kern/104406? I have a repeatable hang, but I can't obtain a kernel dump to get the results of all the show commands from here: http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html After I break to the debugger using the Ctrl+Alt+Esc sequence and enter a panic command, the kernel does not write a kernel dump but seems to hang. Can anyone describe how to obtain a kernel dump in this situation, or at least say which show commands' output is needed first to debug this? The output of all the suggested commands is huge, and I am afraid of making mistakes when carrying it from the screen to a sheet of paper and back :-)

Oleg, one thing you can do to make this less painful is to run your machine's console over a serial port.

First get a crossover serial cable and make sure it works from one box to another; it should be easy to run tip com1 on both boxes to ensure that it works.

Then you just need to add console="comconsole" to /boot/loader.conf and your box's console should come over serial.

Then, on the machine watching the console, you can just do this:

    % script
    Script started, output file is typescript
    % tip com1
    ...do ddb stuff now...
    ...stop tip
    % exit

Now you should have everything logged into a file called typescript, which should save you a big headache.

As far as getting a dump from ddb, try this:

    ddb> call doadump

I'm completely at a loss why this isn't a base ddb command "dump" but whatever... :)

-Alfred
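A session captured this way usually contains carriage returns and the banner lines that script(1) writes, which get in the way when the DDB output is pasted into a mail or a PR. A minimal sketch of a clean-up step, assuming the capture is in a file named typescript as above; the function name `clean_typescript` is invented for illustration:

```sh
#!/bin/sh
# Strip carriage returns and script(1)'s own "Script started" / "Script done"
# banner lines from a captured session, writing the result to stdout.
clean_typescript() {
    tr -d '\r' < "${1:-typescript}" | grep -v -e '^Script started' -e '^Script done'
}

# Example (file names are placeholders):
#   clean_typescript typescript > ddb-output.txt
```

The original typescript is left untouched, so the raw capture stays available if a developer asks for it.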
kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load
Hi all,

Can anyone take a look at PR kern/104406? I have a repeatable hang, but I can't obtain a kernel dump to get the results of all the show commands from here:

http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html

After I break to the debugger using the Ctrl+Alt+Esc sequence and enter a panic command, the kernel does not write a kernel dump but seems to hang. Can anyone describe how to obtain a kernel dump in this situation, or at least say which show commands' output is needed first to debug this? The output of all the suggested commands is huge, and I am afraid of making mistakes when carrying it from the screen to a sheet of paper and back :-)

--
Oleg Derevenetz [EMAIL PROTECTED] OOD3-RIPE
Phone: +7 4732 539880 Fax: +7 4732 531415
http://www.vsi.ru CenterTelecom Voronezh ISP http://isp.vsi.ru
Re: Processes get stuck in ufs state
Quoting Oleg Derevenetz [EMAIL PROTECTED]:
>> On Wed, Mar 07, 2007 at 05:22:38AM +0300, Oleg Derevenetz wrote:
>>> Sometimes (once a week approximately) I have a problem with the same symptoms described here on SMP FreeBSD 6.2-STABLE with dual AMD Opteron(tm) Processor 850: http://www.freebsd.org/cgi/query-pr.cgi?pr=104406cat= Sometimes (apparently when CPU load suddenly goes up) all processes that interacts with disk gets stuck in ufs state, but in my case SIGSTOP/SIGCONT seemingly does not help.
>>
>> See developer handbook, Deadlock Debugging chapter for instruction what information shall be gathered to debug the problem.
>
> OK, I built a kernel with debug options and will wait for the next hang. By the way, with the debug options turned on, I see this message on every boot while nullfs is being mounted:
>
> acquiring duplicate lock of same type: vnode interlock
>  1st vnode interlock @ /usr/src/sys/kern/vfs_vnops.c:806
>  2nd vnode interlock @ /usr/src/sys/kern/vfs_subr.c:2040
> KDB: stack backtrace:
> kdb_backtrace(3,cfc60300,c05926d0,c05926d0,c05542c4,...) at kdb_backtrace+0x29
> witness_checkorder(cfd5c4dc,9,c051cf1e,7f8) at witness_checkorder+0x578
> _mtx_lock_flags(cfd5c4dc,0,c051cf1e,7f8,cfb28b90,...) at _mtx_lock_flags+0x78
> vrefcnt(cfd5c414) at vrefcnt+0x20
> null_checkvp(cff5eae0,c050c4a6,215) at null_checkvp+0x56
> null_lock(f02f1a68) at null_lock+0x66
> VOP_LOCK_APV(c054d540,f02f1a68) at VOP_LOCK_APV+0x87
> vn_lock(cff5eae0,1002,cfc60300,cff5eae0,cff5ed04,...) at vn_lock+0xac
> nullfs_root(cff76b90,2,f02f1ae0,cfc60300,0,8,0,c05cfca0,0,c051c79c,407) at nullfs_root+0x26
> vfs_domount(cfc60300,cfe3d340,cfe3d130,d,cfe3d3f0,c05817e0,0,c051c79c,2bf) at vfs_domount+0x975
> vfs_donmount(cfc60300,d,cfe73080,cfe73080,0,...) at vfs_donmount+0x3f9
> nmount(cfc60300,f02f1d04) at nmount+0x8b
> syscall(3b,3b,3b,bf7fe5f5,bf7feea0,...) at syscall+0x25b
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (378, FreeBSD ELF32, nmount), eip = 0x280bc0e7, esp = 0xbf7fe5bc, ebp = 0xbf7fee38 ---
>
> This host has nullfs filesystems. Can this be related to the deadlock?

FYI: after replacing the nullfs filesystems with unionfs (using the new unionfs implementation):

http://people.freebsd.org/~daichi/unionfs/

all deadlocks are gone. It seems to be a problem in the current nullfs implementation, but I can't debug it properly because the deadlocks are relatively rare and the machine that uses nullfs is heavily loaded, so the WITNESS and DEBUG options lead to an unacceptable performance penalty.
Re: Processes get stuck in ufs state
> On Wed, Mar 07, 2007 at 05:22:38AM +0300, Oleg Derevenetz wrote:
>> Sometimes (once a week approximately) I have a problem with the same symptoms described here on SMP FreeBSD 6.2-STABLE with dual AMD Opteron(tm) Processor 850: http://www.freebsd.org/cgi/query-pr.cgi?pr=104406cat= Sometimes (apparently when CPU load suddenly goes up) all processes that interacts with disk gets stuck in ufs state, but in my case SIGSTOP/SIGCONT seemingly does not help.
>
> See developer handbook, Deadlock Debugging chapter for instruction what information shall be gathered to debug the problem.

OK, I built a kernel with debug options and will wait for the next hang. By the way, with the debug options turned on, I see this message on every boot while nullfs is being mounted:

acquiring duplicate lock of same type: vnode interlock
 1st vnode interlock @ /usr/src/sys/kern/vfs_vnops.c:806
 2nd vnode interlock @ /usr/src/sys/kern/vfs_subr.c:2040
KDB: stack backtrace:
kdb_backtrace(3,cfc60300,c05926d0,c05926d0,c05542c4,...) at kdb_backtrace+0x29
witness_checkorder(cfd5c4dc,9,c051cf1e,7f8) at witness_checkorder+0x578
_mtx_lock_flags(cfd5c4dc,0,c051cf1e,7f8,cfb28b90,...) at _mtx_lock_flags+0x78
vrefcnt(cfd5c414) at vrefcnt+0x20
null_checkvp(cff5eae0,c050c4a6,215) at null_checkvp+0x56
null_lock(f02f1a68) at null_lock+0x66
VOP_LOCK_APV(c054d540,f02f1a68) at VOP_LOCK_APV+0x87
vn_lock(cff5eae0,1002,cfc60300,cff5eae0,cff5ed04,...) at vn_lock+0xac
nullfs_root(cff76b90,2,f02f1ae0,cfc60300,0,8,0,c05cfca0,0,c051c79c,407) at nullfs_root+0x26
vfs_domount(cfc60300,cfe3d340,cfe3d130,d,cfe3d3f0,c05817e0,0,c051c79c,2bf) at vfs_domount+0x975
vfs_donmount(cfc60300,d,cfe73080,cfe73080,0,...) at vfs_donmount+0x3f9
nmount(cfc60300,f02f1d04) at nmount+0x8b
syscall(3b,3b,3b,bf7fe5f5,bf7feea0,...) at syscall+0x25b
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (378, FreeBSD ELF32, nmount), eip = 0x280bc0e7, esp = 0xbf7fe5bc, ebp = 0xbf7fee38 ---

This host has nullfs filesystems. Can this be related to the deadlock?

--
Oleg Derevenetz [EMAIL PROTECTED] OOD3-RIPE
Phone: +7 4732 539880 Fax: +7 4732 531415
http://www.vsi.ru CenterTelecom Voronezh ISP http://isp.vsi.ru
Re: Processes get stuck in ufs state
On Fri, Mar 09, 2007 at 06:08:25PM +0300, Oleg Derevenetz wrote:
>> On Wed, Mar 07, 2007 at 05:22:38AM +0300, Oleg Derevenetz wrote:
>>> Sometimes (once a week approximately) I have a problem with the same symptoms described here on SMP FreeBSD 6.2-STABLE with dual AMD Opteron(tm) Processor 850: http://www.freebsd.org/cgi/query-pr.cgi?pr=104406cat= Sometimes (apparently when CPU load suddenly goes up) all processes that interacts with disk gets stuck in ufs state, but in my case SIGSTOP/SIGCONT seemingly does not help.
>>
>> See developer handbook, Deadlock Debugging chapter for instruction what information shall be gathered to debug the problem.
>
> OK, I built a kernel with debug options and will wait for the next hang. By the way, with the debug options turned on, I see this message on every boot while nullfs is being mounted:
>
> acquiring duplicate lock of same type: vnode interlock
>  1st vnode interlock @ /usr/src/sys/kern/vfs_vnops.c:806
>  2nd vnode interlock @ /usr/src/sys/kern/vfs_subr.c:2040
> KDB: stack backtrace:
> kdb_backtrace(3,cfc60300,c05926d0,c05926d0,c05542c4,...) at kdb_backtrace+0x29
> witness_checkorder(cfd5c4dc,9,c051cf1e,7f8) at witness_checkorder+0x578
> _mtx_lock_flags(cfd5c4dc,0,c051cf1e,7f8,cfb28b90,...) at _mtx_lock_flags+0x78
> vrefcnt(cfd5c414) at vrefcnt+0x20
> null_checkvp(cff5eae0,c050c4a6,215) at null_checkvp+0x56
> null_lock(f02f1a68) at null_lock+0x66
> VOP_LOCK_APV(c054d540,f02f1a68) at VOP_LOCK_APV+0x87
> vn_lock(cff5eae0,1002,cfc60300,cff5eae0,cff5ed04,...) at vn_lock+0xac
> nullfs_root(cff76b90,2,f02f1ae0,cfc60300,0,8,0,c05cfca0,0,c051c79c,407) at nullfs_root+0x26
> vfs_domount(cfc60300,cfe3d340,cfe3d130,d,cfe3d3f0,c05817e0,0,c051c79c,2bf) at vfs_domount+0x975
> vfs_donmount(cfc60300,d,cfe73080,cfe73080,0,...) at vfs_donmount+0x3f9
> nmount(cfc60300,f02f1d04) at nmount+0x8b
> syscall(3b,3b,3b,bf7fe5f5,bf7feea0,...) at syscall+0x25b
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (378, FreeBSD ELF32, nmount), eip = 0x280bc0e7, esp = 0xbf7fe5bc, ebp = 0xbf7fee38 ---
>
> This host has nullfs filesystems. Can this be related to the deadlock?

This is harmless, just ignore it.
Processes get stuck in ufs state
Hi!

Sometimes (approximately once a week) I have a problem with the same symptoms as described here, on SMP FreeBSD 6.2-STABLE with dual AMD Opteron(tm) Processor 850:

http://www.freebsd.org/cgi/query-pr.cgi?pr=104406cat=

Sometimes (apparently when CPU load suddenly goes up) all processes that interact with the disk get stuck in the ufs state, but in my case SIGSTOP/SIGCONT does not seem to help.

uname -a output:

FreeBSD serv2.vsi.ru 6.2-STABLE FreeBSD 6.2-STABLE #2: Sat Mar 3 01:59:08 MSK 2007 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/serv2 i386

dmesg.boot:

Copyright (c) 1992-2007 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 6.2-STABLE #2: Sat Mar 3 01:59:08 MSK 2007
    [EMAIL PROTECTED]:/usr/obj/usr/src/sys/serv2
Timecounter i8254 frequency 1193182 Hz quality 0
CPU: AMD Opteron(tm) Processor 850 (2389.26-MHz 686-class CPU)
  Origin = AuthenticAMD Id = 0x20f51 Stepping = 1
  Features=0x78bfbffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2
  Features2=0x1SSE3
  AMD Features=0xe2500800SYSCALL,NX,MMX+,FFXSR,LM,3DNow+,3DNow
  AMD Features2=0x1LAHF
real memory = 8589934592 (8192 MB)
avail memory = 8350457856 (7963 MB)
ACPI APIC Table: PTLTD APIC
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): APIC ID: 0
 cpu1 (AP): APIC ID: 1
MADT: Forcing active-low polarity and level trigger for SCI
ioapic0 Version 1.1 irqs 0-23 on motherboard
ioapic1 Version 1.1 irqs 24-27 on motherboard
ioapic2 Version 1.1 irqs 28-31 on motherboard
ioapic3 Version 1.1 irqs 32-35 on motherboard
ioapic4 Version 1.1 irqs 36-39 on motherboard
ioapic5 Version 1.1 irqs 40-43 on motherboard
ioapic6 Version 1.1 irqs 44-47 on motherboard
kbd1 at kbdmux0
acpi0: PTLTDXSDT on motherboard
acpi0: Power Button (fixed)
unknown: I/O range not supported
unknown: I/O range not supported
Timecounter ACPI-fast frequency 3579545 Hz quality 1000
acpi_timer0: 24-bit timer at 3.579545MHz port 0xf008-0xf00b on acpi0
cpu0: ACPI CPU on acpi0
cpu1: ACPI CPU on acpi0
acpi_button0: Power Button on acpi0
pcib0: ACPI Host-PCI bridge port 0xcf8-0xcff,0xf000-0xf07f,0xf080-0xf0ff iomem 0xd8000-0xdbfff on acpi0
pci0: ACPI PCI bus on pcib0
pcib1: ACPI PCI-PCI bridge at device 6.0 on pci0
pci1: ACPI PCI bus on pcib1
ohci0: OHCI (generic) USB controller mem 0xfc90-0xfc900fff irq 19 at device 0.0 on pci1
ohci0: [GIANT-LOCKED]
usb0: OHCI version 1.0, legacy support
usb0: SMM does not respond, resetting
usb0: OHCI (generic) USB controller on ohci0
usb0: USB revision 1.0
uhub0: AMD OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 3 ports with 3 removable, self powered
ohci1: OHCI (generic) USB controller mem 0xfc901000-0xfc901fff irq 19 at device 0.1 on pci1
ohci1: [GIANT-LOCKED]
usb1: OHCI version 1.0, legacy support
usb1: SMM does not respond, resetting
usb1: OHCI (generic) USB controller on ohci1
usb1: USB revision 1.0
uhub1: AMD OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 3 ports with 3 removable, self powered
pci1: display, VGA at device 5.0 (no driver attached)
isab0: PCI-ISA bridge at device 7.0 on pci0
isa0: ISA bus on isab0
atapci0: AMD 8111 UDMA133 controller port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x1000-0x100f at device 7.1 on pci0
ata0: ATA channel 0 on atapci0
ata1: ATA channel 1 on atapci0
pci0: bridge at device 7.3 (no driver attached)
pcib2: ACPI PCI-PCI bridge at device 10.0 on pci0
pci2: ACPI PCI bus on pcib2
bge0: Broadcom BCM5704 A3, ASIC rev. 0x2003 mem 0xfe01-0xfe01,0xfe00-0xfe00 irq 25 at device 2.0 on pci2
miibus0: MII bus on bge0
brgphy0: BCM5704 10/100/1000baseTX PHY on miibus0
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge0: Ethernet address: 00:09:3d:13:fd:00
bge1: Broadcom BCM5704 A3, ASIC rev. 0x2003 mem 0xfe03-0xfe03,0xfe02-0xfe02 irq 26 at device 2.1 on pci2
miibus1: MII bus on bge1
brgphy1: BCM5704 10/100/1000baseTX PHY on miibus1
brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge1: Ethernet address: 00:09:3d:13:fd:01
mpt0: LSILogic 1030 Ultra4 Adapter port 0x2000-0x20ff mem 0xfe05-0xfe05,0xfe04-0xfe04 irq 27 at device 4.0 on pci2
mpt0: [GIANT-LOCKED]
mpt0: MPI Version=1.2.15.0
mpt0: Capabilities: ( RAID-1E RAID-1 SAFTE )
mpt0: 0 Active Volumes (1 Max)
mpt0: 0 Hidden Drive Members (6 Max)
pci0: base peripheral, interrupt controller at device 10.1 (no driver attached)
pcib3: ACPI PCI-PCI bridge at device 11.0 on pci0
pci3: ACPI PCI bus on pcib3
pci0: base peripheral, interrupt controller at device 11.1 (no driver attached)
pcib4: ACPI Host-PCI bridge iomem 0xfe301000-0xfe301fff,0xfe303000-0xfe303fff,0xfe305000-0xfe305fff,0xfe307000-0xfe307fff on acpi0
pci32: ACPI PCI bus on pcib4
pcib5: ACPI PCI-PCI bridge mem
Re: Processes get stuck in ufs state
On Wed, Mar 07, 2007 at 05:22:38AM +0300, Oleg Derevenetz wrote:
> Hi ! Sometimes (once a week approximately) I have a problem with the
> same symptoms described here on SMP FreeBSD 6.2-STABLE with dual AMD
> Opteron(tm) Processor 850:
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=104406cat=
>
> Sometimes (apparently when CPU load suddenly goes up) all processes
> that interacts with disk gets stuck in ufs state, but in my case
> SIGSTOP/SIGCONT seemingly does not help.

See the Developers' Handbook, Deadlock Debugging chapter, for instructions on what information should be gathered to debug the problem.
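Before breaking into the debugger, it may help to record which processes are actually sitting on the ufs wait channel when the hang occurs. The following is only a minimal sketch, not part of the original report: it assumes the output of `ps -axo pid,state,wchan,command`, and the function names `stuck_in` and `live_ps_output` as well as the sample listing are hypothetical illustrations rather than data from the affected machine.

```python
import subprocess


def stuck_in(ps_output, wchan="ufs"):
    """Return (pid, state, command) for every process whose wait channel
    equals `wchan`. Expects the text printed by
    `ps -axo pid,state,wchan,command`, header line included."""
    stuck = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header line
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue  # ignore malformed or truncated lines
        pid, state, chan, command = fields
        if chan == wchan:
            stuck.append((int(pid), state, command))
    return stuck


def live_ps_output():
    """Run ps on the local machine and return its output as text."""
    result = subprocess.run(
        ["ps", "-axo", "pid,state,wchan,command"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Hypothetical sample listing, for illustration only.
SAMPLE = """\
  PID STAT WCHAN  COMMAND
  512 Ss   select /usr/sbin/syslogd -s
  733 D    ufs    /usr/local/sbin/httpd
  801 D    ufs    cp -R /var/db /backup
  950 R+   -      ps -axo pid,state,wchan,command
"""

for pid, state, command in stuck_in(SAMPLE):
    print(pid, state, command)
```

On a live system one would call `stuck_in(live_ps_output())` instead of using the sample text. This only narrows down which processes are blocked; the lock and vnode details the handbook chapter asks for still have to be collected from DDB.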