Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load

2007-11-01 Thread Rainer Hurling

Kris Kennaway wrote:

> Rainer Hurling wrote:
>
>> Thanks for your answer.
>>
>> Kris Kennaway wrote:
>>
>>> Rainer Hurling wrote:
>>>> Looking into PR kern/104406, it seems that it describes exactly
>>>> what I have been experiencing on three of my systems over the
>>>> last weeks. They are running FreeBSD 8.0-CURRENT (known as
>>>> 7.0-CURRENT not long ago ;-) ).
>>>
>>> Actually it sounds nothing like it at all ;)
>>>
>>>> On these machines I often observe hangs, sometimes lasting only a
>>>> few seconds, at other times 20-30 seconds, before input/output is
>>>> back. This seems to happen when heavier disk usage is needed
>>>> (portupgrade, buildworld, browsing complicated websites etc.).
>>>> During the hang even xterm stops responding, while other
>>>> (diskless) applications like xclock keep running. I have no
>>>> panics; only UFS (and MSDOSFS) are mounted, no NTFS. About two
>>>> months ago none of my systems showed these hangs.
>>>
>>> Is your system swapping?  This is the usual cause of pauses during
>>> high application (actually memory) load.
>>>
>>> Kris
>>
>> No, I am working with 2GB RAM, without swapping at all.
>>
>> In the meantime I tested the behaviour described above a little more.
>> The hangs appeared even without using Xorg, when only working on
>> consoles under heavy disk usage (portupgrade etc.).
>
> OK, configure the system with the debugger and when it is hung, break
> to DDB and obtain the data requested in the developers handbook to try
> and investigate what is going on.  You may want to do this a few times
> to make sure you capture a representative sample.
>
> Kris

I hope to find some time tomorrow for my first session in kernel
debugging ;-)


Is the chapter 'On-line kernel debugging using DDB' the right one?
What kind of information is most useful?

Rainer

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load

2007-10-31 Thread Rainer Hurling

Thanks for your answer.

Kris Kennaway wrote:

> Rainer Hurling wrote:
>> Looking into PR kern/104406, it seems that it describes exactly what
>> I have been experiencing on three of my systems over the last weeks.
>> They are running FreeBSD 8.0-CURRENT (known as 7.0-CURRENT not long
>> ago ;-) ).
>
> Actually it sounds nothing like it at all ;)
>
>> On these machines I often observe hangs, sometimes lasting only a
>> few seconds, at other times 20-30 seconds, before input/output is
>> back. This seems to happen when heavier disk usage is needed
>> (portupgrade, buildworld, browsing complicated websites etc.).
>> During the hang even xterm stops responding, while other (diskless)
>> applications like xclock keep running. I have no panics; only UFS
>> (and MSDOSFS) are mounted, no NTFS. About two months ago none of my
>> systems showed these hangs.
>
> Is your system swapping?  This is the usual cause of pauses during high
> application (actually memory) load.
>
> Kris

No, I am working with 2GB RAM, without swapping at all.

In the meantime I tested the behaviour described above a little more.
The hangs appeared even without using Xorg, when only working on
consoles under heavy disk usage (portupgrade etc.).


Rainer


Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load

2007-10-31 Thread Rainer Hurling

It looks like Marek Blaszkowski, in his new thread on freebsd-amd64@

http://www.nabble.com/forum/ViewPost.jtp?post=13513077

is describing the same system hangs. He found some strange behaviour
with 'sync' of hard disks. Perhaps this is a step towards finding the
cause of the hangs?


Regards,
Rainer




Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load

2007-10-31 Thread Kris Kennaway

Rainer Hurling wrote:

> Thanks for your answer.
>
> Kris Kennaway wrote:
>
>> Rainer Hurling wrote:
>>> Looking into PR kern/104406, it seems that it describes exactly
>>> what I have been experiencing on three of my systems over the last
>>> weeks. They are running FreeBSD 8.0-CURRENT (known as 7.0-CURRENT
>>> not long ago ;-) ).
>>
>> Actually it sounds nothing like it at all ;)
>>
>>> On these machines I often observe hangs, sometimes lasting only a
>>> few seconds, at other times 20-30 seconds, before input/output is
>>> back. This seems to happen when heavier disk usage is needed
>>> (portupgrade, buildworld, browsing complicated websites etc.).
>>> During the hang even xterm stops responding, while other (diskless)
>>> applications like xclock keep running. I have no panics; only UFS
>>> (and MSDOSFS) are mounted, no NTFS. About two months ago none of my
>>> systems showed these hangs.
>>
>> Is your system swapping?  This is the usual cause of pauses during
>> high application (actually memory) load.
>>
>> Kris
>
> No, I am working with 2GB RAM, without swapping at all.
>
> In the meantime I tested the behaviour described above a little more.
> The hangs appeared even without using Xorg, when only working on
> consoles under heavy disk usage (portupgrade etc.).

OK, configure the system with the debugger and when it is hung, break
to DDB and obtain the data requested in the developers handbook to try
and investigate what is going on.  You may want to do this a few times
to make sure you capture a representative sample.
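
A sketch of what "configure the system with the debugger" can look like
in a kernel configuration file. The option set below is assumed to match
the one listed in the Debugging Deadlocks chapter of the Developers'
Handbook for 6.x/7.x; check the chapter for your branch before relying
on it:

```
# Kernel configuration additions (sketch; rebuild and reinstall the kernel)
options 	KDB			# kernel debugger framework
options 	DDB			# interactive in-kernel debugger
makeoptions	DEBUG=-g		# build with debugging symbols
options 	INVARIANTS		# extra run-time consistency checks
options 	INVARIANT_SUPPORT	# required by INVARIANTS
options 	WITNESS			# lock order verification
options 	DEBUG_LOCKS		# record where locks were acquired
options 	DEBUG_VFS_LOCKS		# vnode lock assertions
options 	DIAGNOSTIC		# additional diagnostics
```

These options cost performance, WITNESS in particular, so a hang may
take longer to reproduce on a debug kernel.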


Kris



Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load

2007-10-30 Thread Kris Kennaway

Rainer Hurling wrote:
> Looking into PR kern/104406, it seems that it describes exactly what I
> have been experiencing on three of my systems over the last weeks.
> They are running FreeBSD 8.0-CURRENT (known as 7.0-CURRENT not long
> ago ;-) ).

Actually it sounds nothing like it at all ;)

> On these machines I often observe hangs, sometimes lasting only a few
> seconds, at other times 20-30 seconds, before input/output is back.
> This seems to happen when heavier disk usage is needed (portupgrade,
> buildworld, browsing complicated websites etc.). During the hang even
> xterm stops responding, while other (diskless) applications like
> xclock keep running. I have no panics; only UFS (and MSDOSFS) are
> mounted, no NTFS. About two months ago none of my systems showed these
> hangs.

Is your system swapping?  This is the usual cause of pauses during high
application (actually memory) load.
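
One way to check this from a shell, as a minimal sketch. It assumes that
`swapinfo -k` prints its usual columns (Device, 1K-blocks, Used, Avail,
Capacity) with an optional "Total" line; the helper name swap_used_kb is
made up for this example:

```sh
#!/bin/sh
# swap_used_kb: sum the "Used" column of `swapinfo -k`-style output read
# from stdin. Hypothetical helper; skips the header line and any "Total"
# line so that multi-device output is not counted twice.
swap_used_kb() {
    awk 'NR > 1 && $1 != "Total" { used += $3 } END { print used + 0 }'
}

# On a FreeBSD system, print the amount of swap currently in use (in kB).
# The guard keeps the script harmless where swapinfo(8) is not installed.
if command -v swapinfo >/dev/null 2>&1; then
    swapinfo -k | swap_used_kb
fi
```

A result of 0 while the machine is hanging suggests that swapping is not
the cause. Watching the page-in/page-out columns of `vmstat 5` during a
hang is another way to look at the same thing.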


Kris


Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load

2007-10-30 Thread Kris Kennaway

Eugene Grosbein wrote:

> On Fri, Oct 19, 2007 at 03:05:01PM -0700, Alfred Perlstein wrote:
>
>>> Can anyone take a look at PR kern/104406? I have a repeatable hang,
>>> but I can't obtain a kernel dump to get the output of all the show
>>> commands listed here:
>>>
>>> http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html
>>>
>>> After I break to the debugger using the Ctrl+Alt+Esc sequence and
>>> enter the panic command, the kernel does not write a kernel dump but
>>> seems to hang. Can anyone describe how to obtain a kernel dump in
>>> this situation, or at least say which show command output is needed
>>> first to debug this? The output of all the suggested commands is
>>> huge, and I am afraid of making mistakes when copying it from the
>>> screen to a sheet of paper and back :-)
>
> This [ufs] uninterruptible deadlock is very easy to reproduce on both
> RELENG_6 and RELENG_7. Look at this PR:
> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/107439

No, ufs and ntfs are different things.

> The PR is closed but the problem is still here with 7.0-PRERELEASE
> and, perhaps, CURRENT.

It is closed because you could not be contacted by email for feedback.
If you are still interested in this PR then you need to rectify that
problem and then follow up with remko.

Kris



Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load

2007-10-20 Thread Eugene Grosbein
On Fri, Oct 19, 2007 at 03:05:01PM -0700, Alfred Perlstein wrote:

>> Can anyone take a look at PR kern/104406? I have a repeatable hang,
>> but I can't obtain a kernel dump to get the output of all the show
>> commands listed here:
>>
>> http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html
>>
>> After I break to the debugger using the Ctrl+Alt+Esc sequence and
>> enter the panic command, the kernel does not write a kernel dump but
>> seems to hang. Can anyone describe how to obtain a kernel dump in
>> this situation, or at least say which show command output is needed
>> first to debug this? The output of all the suggested commands is
>> huge, and I am afraid of making mistakes when copying it from the
>> screen to a sheet of paper and back :-)

This [ufs] uninterruptible deadlock is very easy to reproduce on both
RELENG_6 and RELENG_7. Look at this PR:
http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/107439

The PR is closed but the problem is still here with 7.0-PRERELEASE
and, perhaps, CURRENT.

Eugene


Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load

2007-10-20 Thread Oleg Derevenetz
>>> Can anyone take a look at PR kern/104406? I have a repeatable hang,
>>> but I can't obtain a kernel dump to get the output of all the show
>>> commands listed here:
>>>
>>> http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html
>>>
>>> After I break to the debugger using the Ctrl+Alt+Esc sequence and
>>> enter the panic command, the kernel does not write a kernel dump but
>>> seems to hang. Can anyone describe how to obtain a kernel dump in
>>> this situation, or at least say which show command output is needed
>>> first to debug this? The output of all the suggested commands is
>>> huge, and I am afraid of making mistakes when copying it from the
>>> screen to a sheet of paper and back :-)
>
> This [ufs] uninterruptible deadlock is very easy to reproduce on both
> RELENG_6 and RELENG_7. Look at this PR:
> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/107439
>
> The PR is closed but the problem is still here with 7.0-PRERELEASE
> and, perhaps, CURRENT.

This is probably another bug because:

1. I built a kernel with INVARIANTS as described on the Debugging
Deadlocks page of the FreeBSD Developers' Handbook and got no panic,
only a deadlock;
2. I have no NTFS filesystem at all and just copy file(s) from FTP to
local UFS using mc. In that PR the panic occurred when NTFS was mounted
r/w (and did NOT occur when the same NTFS was mounted r/o).

--
Oleg Derevenetz [EMAIL PROTECTED] OOD3-RIPE
Phone: +7 4732 539880
Fax:   +7 4732 531415 http://www.vsi.ru
CenterTelecom Voronezh ISP http://isp.vsi.ru



Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load

2007-10-20 Thread Eugene Grosbein
On Sat, Oct 20, 2007 at 12:44:46PM +0400, Oleg Derevenetz wrote:

> This is probably another bug because:

[skip]

Then there should be yet another distinct bug, as God likes the Trinity.

Eugene


Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load

2007-10-20 Thread Rainer Hurling
Looking into PR kern/104406, it seems that it describes exactly what I
have been experiencing on three of my systems over the last weeks. They
are running FreeBSD 8.0-CURRENT (known as 7.0-CURRENT not long ago ;-) ).


On these machines I often observe hangs, sometimes lasting only a few
seconds, at other times 20-30 seconds, before input/output is back. This
seems to happen when heavier disk usage is needed (portupgrade,
buildworld, browsing complicated websites etc.). During the hang even
xterm stops responding, while other (diskless) applications like xclock
keep running. I have no panics; only UFS (and MSDOSFS) are mounted, no
NTFS. About two months ago none of my systems showed these hangs.


I know that this 'hanging' behaviour has been described several times
recently on the STABLE and CURRENT lists, but mostly the context was
different. In the discussions about these hangs, people seem to be
looking for misbehaviour of the scheduler (namely ULE), Linux emulation,
the Java runtime environment or Firefox. From my point of view it more
likely has to do with UFS locking under high CPU load, or something
around it.


I have hardly any skills in programming and debugging, but if there is
any activity on this topic in the background, what can we do to help?


Sincerely,
Rainer Hurling





Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load

2007-10-20 Thread Oleg Derevenetz
>> Can anyone take a look at PR kern/104406? I have a repeatable hang,
>> but I can't obtain a kernel dump to get the output of all the show
>> commands listed here:
>>
>> http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html
>>
>> After I break to the debugger using the Ctrl+Alt+Esc sequence and
>> enter the panic command, the kernel does not write a kernel dump but
>> seems to hang. Can anyone describe how to obtain a kernel dump in
>> this situation, or at least say which show command output is needed
>> first to debug this? The output of all the suggested commands is
>> huge, and I am afraid of making mistakes when copying it from the
>> screen to a sheet of paper and back :-)
>
> Oleg, one thing you can do to make this less painful is to
> run your machine's console over a serial port.
>
> First get a crossover serial cable and make sure it works from one
> box to another; it should be easy to run tip com1 on both
> boxes to ensure that it works.
>
> Then you just need to add console=comconsole to /boot/loader.conf
> and your box's console should come over serial.
>
> Then on the machine watching the console, you can just do this:
>
> % script
> Script started, output file is typescript
> % tip com1
> ...do ddb stuff now...
> ...stop tip
> % exit
>
> Now you should have everything logged into a file called typescript,
> which should save you a big headache.

Thanks, I'll try it on Monday morning.

> As far as getting a dump from ddb, try this:
>
> ddb> call doadump
>
> I'm completely at a loss why this isn't a base ddb command dump
> but whatever... :)

Unfortunately, this doesn't work either. I called the duty personnel at
the datacenter and asked them to do this, and the person on duty tells
me that after he enters this command, something like this appears on the
monitor:

db> call doadump
Dumping 3072 MB

Dump aborted error I/O
Dump failed. (Error 5)

--
Oleg Derevenetz [EMAIL PROTECTED] OOD3-RIPE
Phone: +7 4732 539880
Fax:   +7 4732 531415 http://www.vsi.ru
CenterTelecom Voronezh ISP http://isp.vsi.ru



Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load

2007-10-20 Thread Alfred Perlstein
* Oleg Derevenetz [EMAIL PROTECTED] [071020 09:58] wrote:
>> As far as getting a dump from ddb, try this:
>>
>> ddb> call doadump
>>
>> I'm completely at a loss why this isn't a base ddb command dump
>> but whatever... :)
>
> Unfortunately, this doesn't work either. I called the duty personnel
> at the datacenter and asked them to do this, and the person on duty
> tells me that after he enters this command, something like this
> appears on the monitor:
>
> db> call doadump
> Dumping 3072 MB
>
> Dump aborted error I/O
> Dump failed. (Error 5)

Hmm, that looks like you might be having a hardware problem.
What disk device do you have?

Have you also enabled kernel dumps via dumpdev= in /etc/rc.conf?
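
For reference, a minimal sketch of that setting. It assumes a 6.x or
later system where the value AUTO is accepted (a specific swap device
path can be named instead) and the default crash directory:

```
# /etc/rc.conf (sketch)
dumpdev="AUTO"		# use the configured swap device for crash dumps
dumpdir="/var/crash"	# where savecore(8) stores the dump at the next boot
```

The dump device has to be big enough to hold the dump, which is worth
checking on a machine with a lot of RAM.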

-- 
- Alfred Perlstein


Re: kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load

2007-10-19 Thread Alfred Perlstein
* Oleg Derevenetz [EMAIL PROTECTED] [071019 08:17] wrote:
> Hi all,
>
> Can anyone take a look at PR kern/104406? I have a repeatable hang,
> but I can't obtain a kernel dump to get the output of all the show
> commands listed here:
>
> http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html
>
> After I break to the debugger using the Ctrl+Alt+Esc sequence and
> enter the panic command, the kernel does not write a kernel dump but
> seems to hang. Can anyone describe how to obtain a kernel dump in
> this situation, or at least say which show command output is needed
> first to debug this? The output of all the suggested commands is
> huge, and I am afraid of making mistakes when copying it from the
> screen to a sheet of paper and back :-)

Oleg, one thing you can do to make this less painful is to
run your machine's console over a serial port.

First get a crossover serial cable and make sure it works from one
box to another; it should be easy to run tip com1 on both
boxes to ensure that it works.

Then you just need to add console=comconsole to /boot/loader.conf
and your box's console should come over serial.

Then on the machine watching the console, you can just do this:

% script
Script started, output file is typescript
% tip com1
...do ddb stuff now...
...stop tip
% exit

Now you should have everything logged into a file called typescript,
which should save you a big headache.

As far as getting a dump from ddb, try this:

ddb> call doadump

I'm completely at a loss why this isn't a base ddb command dump
but whatever... :)
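
A sketch of how the two ends of the serial console setup described above
might look. The log file name ddb-session.log is only an example, and
com1 is assumed to be an existing entry in /etc/remote on the watching
machine:

```
# /boot/loader.conf on the machine being debugged
console="comconsole"

# On the machine watching the console, log the whole tip session:
#   % script ddb-session.log
#   % tip com1
#   ...DDB commands...
#   % exit
```

Giving script(1) a file name keeps each debugging session in its own
log instead of overwriting the default typescript file.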

-Alfred


kern/104406: [ufs] Processes get stuck in ufs state under persistent CPU load

2007-10-19 Thread Oleg Derevenetz

Hi all,

Can anyone take a look at PR kern/104406? I have a repeatable hang, but I can't obtain a kernel dump to get the output of all the
show commands listed here:


http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html

After I break to the debugger using the Ctrl+Alt+Esc sequence and enter the panic command, the kernel does not write a kernel dump
but seems to hang. Can anyone describe how to obtain a kernel dump in this situation, or at least say which show command output is
needed first to debug this? The output of all the suggested commands is huge, and I am afraid of making mistakes when copying it
from the screen to a sheet of paper and back :-)
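
For readers who do not have the chapter at hand, a sketch of the kind of
DDB session it asks for. The command list is assumed to match the one in
the Debugging Deadlocks chapter for 6.x/7.x and is not ranked by
importance; `db>` is the debugger prompt:

```
db> ps
db> show pcpu
db> show allpcpu
db> show locks
db> show alllocks
db> show lockedvnods
db> alltrace
```

The lock-related commands only give useful output on a kernel built with
WITNESS.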


--
Oleg Derevenetz [EMAIL PROTECTED] OOD3-RIPE
Phone: +7 4732 539880
Fax:   +7 4732 531415 http://www.vsi.ru
CenterTelecom Voronezh ISP http://isp.vsi.ru



Re: Processes get stuck in ufs state

2007-03-25 Thread Oleg Derevenetz
Quoting Oleg Derevenetz [EMAIL PROTECTED]:

 
> This host has nullfs filesystems. Can this be related to the deadlock?

FYI: after replacing the nullfs filesystems with unionfs (using the new
unionfs implementation):

http://people.freebsd.org/~daichi/unionfs/

all deadlocks are gone. It seems to be a problem in the current nullfs
implementation, but I can't debug it properly because the deadlocks are
relatively rare and the machine that uses nullfs is heavily loaded, so
the WITNESS and DEBUG options lead to an unacceptable performance penalty.


Re: Processes get stuck in ufs state

2007-03-09 Thread Oleg Derevenetz

> On Wed, Mar 07, 2007 at 05:22:38AM +0300, Oleg Derevenetz wrote:
>
>> Sometimes (approximately once a week) I have a problem with the same
>> symptoms as described here, on SMP FreeBSD 6.2-STABLE with dual AMD
>> Opteron(tm) Processor 850:
>>
>> http://www.freebsd.org/cgi/query-pr.cgi?pr=104406&cat=
>>
>> Sometimes (apparently when CPU load suddenly goes up) all processes
>> that interact with the disk get stuck in the ufs state, but in my case
>> SIGSTOP/SIGCONT seemingly does not help.
>
> See the Developers' Handbook, Debugging Deadlocks chapter, for
> instructions on what information should be gathered to debug the
> problem.

OK, I built a kernel with debug options and will wait for it to get stuck. By the way, with the debug options turned on, I see this
message on every boot while nullfs is being mounted:


acquiring duplicate lock of same type: vnode interlock
1st vnode interlock @ /usr/src/sys/kern/vfs_vnops.c:806
2nd vnode interlock @ /usr/src/sys/kern/vfs_subr.c:2040
KDB: stack backtrace:
kdb_backtrace(3,cfc60300,c05926d0,c05926d0,c05542c4,...) at kdb_backtrace+0x29
witness_checkorder(cfd5c4dc,9,c051cf1e,7f8) at witness_checkorder+0x578
_mtx_lock_flags(cfd5c4dc,0,c051cf1e,7f8,cfb28b90,...) at _mtx_lock_flags+0x78
vrefcnt(cfd5c414) at vrefcnt+0x20
null_checkvp(cff5eae0,c050c4a6,215) at null_checkvp+0x56
null_lock(f02f1a68) at null_lock+0x66
VOP_LOCK_APV(c054d540,f02f1a68) at VOP_LOCK_APV+0x87
vn_lock(cff5eae0,1002,cfc60300,cff5eae0,cff5ed04,...) at vn_lock+0xac
nullfs_root(cff76b90,2,f02f1ae0,cfc60300,0,8,0,c05cfca0,0,c051c79c,407) at nullfs_root+0x26
vfs_domount(cfc60300,cfe3d340,cfe3d130,d,cfe3d3f0,c05817e0,0,c051c79c,2bf) at vfs_domount+0x975
vfs_donmount(cfc60300,d,cfe73080,cfe73080,0,...) at vfs_donmount+0x3f9
nmount(cfc60300,f02f1d04) at nmount+0x8b
syscall(3b,3b,3b,bf7fe5f5,bf7feea0,...) at syscall+0x25b
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (378, FreeBSD ELF32, nmount), eip = 0x280bc0e7, esp = 0xbf7fe5bc, ebp = 0xbf7fee38 ---

This host has nullfs filesystems. Can this be related to the deadlock?

--
Oleg Derevenetz [EMAIL PROTECTED] OOD3-RIPE
Phone: +7 4732 539880
Fax:   +7 4732 531415 http://www.vsi.ru
CenterTelecom Voronezh ISP http://isp.vsi.ru



Re: Processes get stuck in ufs state

2007-03-09 Thread Kostik Belousov
On Fri, Mar 09, 2007 at 06:08:25PM +0300, Oleg Derevenetz wrote:
> acquiring duplicate lock of same type: vnode interlock
> 1st vnode interlock @ /usr/src/sys/kern/vfs_vnops.c:806
> 2nd vnode interlock @ /usr/src/sys/kern/vfs_subr.c:2040
 
> This host has nullfs filesystems. Can this be related to the deadlock?

This is harmless, just ignore it.




Processes get stuck in ufs state

2007-03-06 Thread Oleg Derevenetz

Hi !

Sometimes (approximately once a week) I have a problem with the same
symptoms as described here, on SMP FreeBSD 6.2-STABLE with dual AMD
Opteron(tm) Processor 850:


http://www.freebsd.org/cgi/query-pr.cgi?pr=104406&cat=

Sometimes (apparently when CPU load suddenly goes up) all processes that
interact with the disk get stuck in the ufs state, but in my case
SIGSTOP/SIGCONT seemingly does not help.


uname -a output:

FreeBSD serv2.vsi.ru 6.2-STABLE FreeBSD 6.2-STABLE #2: Sat Mar  3 01:59:08
MSK 2007 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/serv2  i386

dmesg.boot:

Copyright (c) 1992-2007 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
   The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 6.2-STABLE #2: Sat Mar  3 01:59:08 MSK 2007
   [EMAIL PROTECTED]:/usr/obj/usr/src/sys/serv2
Timecounter i8254 frequency 1193182 Hz quality 0
CPU: AMD Opteron(tm) Processor 850 (2389.26-MHz 686-class CPU)
 Origin = AuthenticAMD  Id = 0x20f51  Stepping = 1
 
Features=0x78bfbffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2
 Features2=0x1SSE3
 AMD Features=0xe2500800SYSCALL,NX,MMX+,FFXSR,LM,3DNow+,3DNow
 AMD Features2=0x1LAHF
real memory  = 8589934592 (8192 MB)
avail memory = 8350457856 (7963 MB)
ACPI APIC Table: PTLTD  APIC  
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
cpu0 (BSP): APIC ID:  0
cpu1 (AP): APIC ID:  1
MADT: Forcing active-low polarity and level trigger for SCI
ioapic0 Version 1.1 irqs 0-23 on motherboard
ioapic1 Version 1.1 irqs 24-27 on motherboard
ioapic2 Version 1.1 irqs 28-31 on motherboard
ioapic3 Version 1.1 irqs 32-35 on motherboard
ioapic4 Version 1.1 irqs 36-39 on motherboard
ioapic5 Version 1.1 irqs 40-43 on motherboard
ioapic6 Version 1.1 irqs 44-47 on motherboard
kbd1 at kbdmux0
acpi0: PTLTDXSDT on motherboard
acpi0: Power Button (fixed)
unknown: I/O range not supported
unknown: I/O range not supported
Timecounter ACPI-fast frequency 3579545 Hz quality 1000
acpi_timer0: 24-bit timer at 3.579545MHz port 0xf008-0xf00b on acpi0
cpu0: ACPI CPU on acpi0
cpu1: ACPI CPU on acpi0
acpi_button0: Power Button on acpi0
pcib0: ACPI Host-PCI bridge port 0xcf8-0xcff,0xf000-0xf07f,0xf080-0xf0ff
iomem 0xd8000-0xdbfff on acpi0
pci0: ACPI PCI bus on pcib0
pcib1: ACPI PCI-PCI bridge at device 6.0 on pci0
pci1: ACPI PCI bus on pcib1
ohci0: OHCI (generic) USB controller mem 0xfc90-0xfc900fff irq 19 at
device 0.0 on pci1
ohci0: [GIANT-LOCKED]
usb0: OHCI version 1.0, legacy support
usb0: SMM does not respond, resetting
usb0: OHCI (generic) USB controller on ohci0
usb0: USB revision 1.0
uhub0: AMD OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 3 ports with 3 removable, self powered
ohci1: OHCI (generic) USB controller mem 0xfc901000-0xfc901fff irq 19 at
device 0.1 on pci1
ohci1: [GIANT-LOCKED]
usb1: OHCI version 1.0, legacy support
usb1: SMM does not respond, resetting
usb1: OHCI (generic) USB controller on ohci1
usb1: USB revision 1.0
uhub1: AMD OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 3 ports with 3 removable, self powered
pci1: display, VGA at device 5.0 (no driver attached)
isab0: PCI-ISA bridge at device 7.0 on pci0
isa0: ISA bus on isab0
atapci0: AMD 8111 UDMA133 controller port
0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x1000-0x100f at device 7.1 on pci0
ata0: ATA channel 0 on atapci0
ata1: ATA channel 1 on atapci0
pci0: bridge at device 7.3 (no driver attached)
pcib2: ACPI PCI-PCI bridge at device 10.0 on pci0
pci2: ACPI PCI bus on pcib2
bge0: Broadcom BCM5704 A3, ASIC rev. 0x2003 mem
0xfe01-0xfe01,0xfe00-0xfe00 irq 25 at device 2.0 on pci2
miibus0: MII bus on bge0
brgphy0: BCM5704 10/100/1000baseTX PHY on miibus0
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX,
1000baseTX-FDX, auto
bge0: Ethernet address: 00:09:3d:13:fd:00
bge1: Broadcom BCM5704 A3, ASIC rev. 0x2003 mem
0xfe03-0xfe03,0xfe02-0xfe02 irq 26 at device 2.1 on pci2
miibus1: MII bus on bge1
brgphy1: BCM5704 10/100/1000baseTX PHY on miibus1
brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX,
1000baseTX-FDX, auto
bge1: Ethernet address: 00:09:3d:13:fd:01
mpt0: LSILogic 1030 Ultra4 Adapter port 0x2000-0x20ff mem
0xfe05-0xfe05,0xfe04-0xfe04 irq 27 at device 4.0 on pci2
mpt0: [GIANT-LOCKED]
mpt0: MPI Version=1.2.15.0
mpt0: Capabilities: ( RAID-1E RAID-1 SAFTE )
mpt0: 0 Active Volumes (1 Max)
mpt0: 0 Hidden Drive Members (6 Max)
pci0: base peripheral, interrupt controller at device 10.1 (no driver
attached)
pcib3: ACPI PCI-PCI bridge at device 11.0 on pci0
pci3: ACPI PCI bus on pcib3
pci0: base peripheral, interrupt controller at device 11.1 (no driver
attached)
pcib4: ACPI Host-PCI bridge iomem
0xfe301000-0xfe301fff,0xfe303000-0xfe303fff,0xfe305000-0xfe305fff,0xfe307000-0xfe307fff
on acpi0
pci32: ACPI PCI bus on pcib4
pcib5: ACPI PCI-PCI bridge mem 

Re: Processes get stuck in ufs state

2007-03-06 Thread Kostik Belousov
On Wed, Mar 07, 2007 at 05:22:38AM +0300, Oleg Derevenetz wrote:
 Hi !
 
 About once a week I hit a problem with the same symptoms as described in
 the PR below, on an SMP FreeBSD 6.2-STABLE system with dual AMD Opteron(tm)
 850 processors:
 
 http://www.freebsd.org/cgi/query-pr.cgi?pr=104406&cat=
 
 Sometimes (apparently when CPU load suddenly goes up) all processes that
 interact with the disk get stuck in the ufs state, but in my case
 SIGSTOP/SIGCONT does not seem to help.
See the Deadlock Debugging chapter of the Developers' Handbook for
instructions on what information to gather to debug the problem.

