Re: gvinum: adding plex and subdisks to existing volume panic

2006-01-03 Thread Lukas Ertl

On Thu, 22 Dec 2005, Ludo Koren wrote:


Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x40
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc07339c9
stack pointer           = 0x10:0xe5f9e840
frame pointer           = 0x10:0xe5f9e848
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 2 (g_event)
trap number             = 12
panic: page fault
Uptime: 2m32s
Dumping 1022 MB
16 32 48 64 80 96 112 128 144 160 176 192 208 224 240 256 272 288 304 320 336 
352 368 384 400 416 432 448 464 480 496 512 528 544 560 576 592 608 624 640 656 
672 688 704 720 736 752 768 784 800 816 832 848 864 880 896 912 928 944 960 976 
992 1008

#0  doadump () at pcpu.h:160
160 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) add-symbol-file /usr/src/sys/modules/geom/geom_vinum/geom_vinum.ko 0xc072a9b8
add symbol table from file /usr/src/sys/modules/geom/geom_vinum/geom_vinum.ko at
.text_addr = 0xc072a9b8
(y or n) y
Reading symbols from /usr/src/sys/modules/geom/geom_vinum/geom_vinum.ko...done.
(kgdb) bt
#0  doadump () at pcpu.h:160
#1  0xc04eaf44 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:412
#2  0xc04eb1d8 in panic (fmt=0xc060fcbc %s)
   at /usr/src/sys/kern/kern_shutdown.c:568
#3  0xc05ec2f0 in trap_fatal (frame=0xe5f9e800, eva=64)
   at /usr/src/sys/i386/i386/trap.c:822
#4  0xc05ec057 in trap_pfault (frame=0xe5f9e800, usermode=0, eva=64)
   at /usr/src/sys/i386/i386/trap.c:737
#5  0xc05ebcb9 in trap (frame=
 {tf_fs = -1040646120, tf_es = -1037893616, tf_ds = -1038024688, tf_edi = 
-1038012416, tf_esi = 0, tf_ebp = -436606904, tf_isp = -436606932, tf_ebx = 
-1038077952, tf_edx = 1, tf_ecx = -1066995504, tf_eax = 0, tf_trapno = 12, 
tf_err = 0, tf_eip = -1066190391, tf_cs = 8, tf_eflags = 66182, tf_esp = 
-1038077952, tf_ss = -1038516864}) at /usr/src/sys/i386/i386/trap.c:427
#6  0xc05dc52a in calltrap () at /usr/src/sys/i386/i386/exception.s:140
#7  0xc1f90018 in ?? ()
#8  0xc2230010 in ?? ()
#9  0xc2210010 in ?? ()


I'm afraid that doesn't help me, either, as you can see there's no 
debugging information in there (the ?? question marks should be 
function calls actually).


Apparently there's a NULL pointer deref somewhere, I'll try to track it 
down on my own.
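
A note on why a fault address of 0x40 suggests a NULL pointer dereference: reading a member through a NULL structure pointer faults at that member's offset, so 0x40 is consistent with a member 64 bytes into some structure. The sketch below only illustrates the arithmetic. The structure `HypotheticalObj` and its fields are invented for this example and are not taken from the gvinum source.

```python
import ctypes

# Hypothetical layout, NOT a real gvinum structure: 64 bytes of leading
# fields followed by the member that the faulting code tried to read.
class HypotheticalObj(ctypes.Structure):
    _fields_ = [
        ("leading", ctypes.c_uint32 * 16),  # 16 * 4 = 64 bytes
        ("member", ctypes.c_void_p),        # therefore at offset 0x40
    ]

def fault_address(base, field):
    """Address that would be touched when reading `field` through `base`."""
    return base + getattr(HypotheticalObj, field).offset

# A NULL base pointer plus the member offset gives the faulting address.
print(hex(fault_address(0, "member")))
```

Which structure and member are actually involved can only be determined from a backtrace with symbols.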


Thanks,
le

--
Lukas Ertl http://homepage.univie.ac.at/l.ertl/
[EMAIL PROTECTED] http://people.freebsd.org/~le/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Where have all the sockets gone?

2006-01-03 Thread Randy Bush
 I have connections to several systems and a few daemons listening, both
 INET and INET6. Since my last upgrade, I can't see them with netstat.
 
 Has anyone else seen this?

i have had known listeners not showing in 6 and 7

randy



Re: gvinum: adding plex and subdisks to existing volume panic

2006-01-03 Thread Ludo Koren
 Lukas Ertl [EMAIL PROTECTED] writes:



  I'm afraid that doesn't help me, either, as you can see there's
  no debugging information in there (the ?? question marks
  should be function calls actually).

  Apparently there's a NULL pointer deref somewhere, I'll try to
  track it down on my own.

Is there any way I can help you (the question marks are probably there
because the source is from another module), or is there something else I can do?

lk



Re: Where have all the sockets gone?

2006-01-03 Thread Dag-Erling Smørgrav
Kevin Oberman [EMAIL PROTECTED] writes:
 I have connections to several systems and a few daemons listening, both
 INET and INET6. Since my last upgrade, I can't see them with netstat.

Are you sure your kernel and userland are in synch?

DES
-- 
Dag-Erling Smørgrav - [EMAIL PROTECTED]



Re: Where have all the sockets gone?

2006-01-03 Thread Kevin Oberman
 From: [EMAIL PROTECTED] (Dag-Erling Smørgrav)
 Date: Tue, 03 Jan 2006 17:55:02 +0100
 
 Kevin Oberman [EMAIL PROTECTED] writes:
  I have connections to several systems and a few daemons listening, both
  INET and INET6. Since my last upgrade, I can't see them with netstat.
 
 Are you sure your kernel and userland are in synch?

I would have sworn that my kernel was built from the exact same source
tree as my userland, but I just rebuilt the userland and my sockets are
back.
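
One rough way to spot this kind of mismatch is to compare the version number the running kernel reports with the one in the installed headers. The sketch below assumes the kernel value comes from `sysctl -n kern.osreldate` and the userland value from the `__FreeBSD_version` define in /usr/include/sys/param.h. The sample numbers are made up. Because that number is not bumped on every commit, equal values do not prove the two are in sync; differing values do show that they are not.

```python
import re

def userland_version(param_h_text):
    """Extract __FreeBSD_version from the text of sys/param.h."""
    m = re.search(r"^#define\s+__FreeBSD_version\s+(\d+)", param_h_text, re.M)
    if m is None:
        raise ValueError("__FreeBSD_version not found")
    return int(m.group(1))

def in_sync(kernel_osreldate, param_h_text):
    """True when kernel and installed headers report the same version."""
    return int(kernel_osreldate) == userland_version(param_h_text)

# Sample values only. On a real system, read /usr/include/sys/param.h and
# the output of `sysctl -n kern.osreldate` instead.
sample_header = (
    "#undef __FreeBSD_version\n"
    "#define __FreeBSD_version 600100\t/* sample value */\n"
)
print(in_sync("600100", sample_header))
print(in_sync("600034", sample_header))
```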

Is my system CVSup'ing behind my back? ;-)

In any case, thanks. It's all better.
-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: [EMAIL PROTECTED]   Phone: +1 510 486-8634


Re: Odd performance problems after upgrade from 4.11 to 6.0-Stable

2006-01-03 Thread Kevin Oberman
 Date: Wed, 14 Dec 2005 19:52:03 -0500
 From: Kris Kennaway [EMAIL PROTECTED]
 
 
 On Wed, Dec 14, 2005 at 04:45:47PM -0800, Kevin Oberman wrote:
   Date: Wed, 14 Dec 2005 19:34:04 -0500
   From: Kris Kennaway [EMAIL PROTECTED]

   On Wed, Dec 14, 2005 at 04:26:18PM -0800, Kevin Oberman wrote:

    I am attaching a dmesg. I do have a few drivers (uhci, pcm, psm,
    atkbd0 and ichsmb) that are still marked as GIANT-LOCKED, but I'm not
    using the USB very often. And I'm not using pcm or ichsmb during the
    dump, either. I think everyone has the mouse and keyboard under GIANT,
    but I can't really see those as a problem, either.

   A bunch of things are sharing interrupts with USB. Disable it and see
   if that helps.  Also check vmstat -i to see if some device is
   storming.  If not, turn on MUTEX_PROFILING(9) in your kernel and run
   the dump (or something faster that also exhibits the problem), then
   look for what is contending with Giant.

  Yes, it may be time for MUTEX_PROFILING. I had already looked at
  interrupts. My kernel is sans APIC so I didn't really think that
  interrupts were a problem and I see:
  interrupt                      total    rate
  irq0: clk                  207037779    1000
  irq1: atkbd0                   50208       0
  irq6: fdc0                         9       0
  irq8: rtc                   26498038     128
  irq10: pcm0 ichsmb0                2       0
  irq11: xl0 uhci0            18076067      87
  irq12: psm0                   869500       4
  irq13: npx0                        1       0
  irq14: ata0                 10423468      50
  irq15: ata1                      112       0
  Total                      262955184    1270
 
  Clearly no storms and nothing looks obviously broken. USB and the
  network card share an IRQ, but the USB is not connected to anything and
  I would not think that it is generating many interrupts. The network
  IS being used and I'm not seeing all that many interrupts on IRQ11.
 
 Whenever there is an interrupt on irq11 from the NIC, *both* drivers
 will wake up to process it.  uhci0 will need to acquire Giant.  If
 something else is also trying to acquire Giant (bufdaemon), then they
 will serialize, degrading performance.  This may not be the cause
 since there are only a few interrupts, but MUTEX_PROFILING will tell
 you.

Well, with the holidays and such, this has taken a while, but here is an
update.

I have removed USB support. I hardly ever use it on this system, so that
was an obvious step. No improvement at all.

# vmstat -i
interrupt                      total    rate
irq0: clk                  319818027    1000
irq1: atkbd0                   15443       0
irq6: fdc0                        11       0
irq8: rtc                   40932392     128
irq10: pcm0 ichsmb0           125545       0
irq11: xl0                   3616426      11
irq12: psm0                   281380       0
irq13: npx0                        1       0
irq14: ata0                  8756176      27
irq15: ata1                      144       0
Total                      373545545    1168

Only one shared interrupt and both IRQ 10 devices should have been
totally quiescent during my test run.
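
For anyone who wants to run the same check on their own box, here is a minimal sketch that parses `vmstat -i` output and lists shared IRQs and lines above a chosen rate. The sample text is a subset of the figures above. The threshold and the decision to ignore the `clk` and `rtc` timer lines are assumptions of this sketch, not established rules.

```python
def parse_vmstat_i(text):
    """Parse `vmstat -i` output into (irq, devices, total, rate) tuples."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows look like: "irq11: xl0 uhci0  18076067  87"
        if len(fields) < 4 or not fields[0].startswith("irq"):
            continue  # skips the header and the Total line
        irq = fields[0].rstrip(":")
        devices = fields[1:-2]
        rows.append((irq, devices, int(fields[-2]), int(fields[-1])))
    return rows

def shared_irqs(rows):
    """IRQ lines that more than one device is attached to."""
    return {irq: devs for irq, devs, _, _ in rows if len(devs) > 1}

def busy_irqs(rows, threshold, ignore=("clk", "rtc")):
    """IRQ lines whose rate exceeds `threshold`, skipping timer sources."""
    return [irq for irq, devs, _, rate in rows
            if rate > threshold and not set(devs) & set(ignore)]

SAMPLE = """\
interrupt                      total    rate
irq0: clk                  319818027    1000
irq1: atkbd0                   15443       0
irq10: pcm0 ichsmb0           125545       0
irq11: xl0                   3616426      11
irq14: ata0                  8756176      27
Total                      373545545    1168
"""

rows = parse_vmstat_i(SAMPLE)
print(shared_irqs(rows))
print(busy_irqs(rows, threshold=20))
```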

The test was building a glimpse index of my inbox. CPU at about
20%. System interactive response was terrible. Took about two minutes
just to log in. Starting Gnome takes roughly forever (about 10
minutes).

I collected mutex stats for just about 3 minutes and found nothing
surprising, but I may not know what to look for. Nothing shows a total
time of over 3.1 seconds. The total time for all of them is 28
seconds. The sum of all Giant lock times was only 4.65 seconds and the
largest of these was in kern_sysctl.c, so I expect it was the profiling
that ate 3.1 of those 4.65 seconds.
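
The arithmetic behind that reading, as a small sketch. It takes "just about 3 minutes" as 180 seconds, which is an assumption, and uses only the figures quoted above.

```python
WINDOW_S = 180.0     # "just about 3 minutes" of profiling, taken as 180 s
ALL_MUTEXES_S = 28.0 # total time reported for all mutexes
GIANT_S = 4.65       # sum of all Giant lock times
SYSCTL_S = 3.1       # largest Giant entry (kern_sysctl.c)

def share(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole

# Giant time left once the kern_sysctl.c entry is set aside.
giant_other = GIANT_S - SYSCTL_S

print(f"Giant excluding kern_sysctl.c: {giant_other:.2f} s")
print(f"Giant as share of the window:  {share(GIANT_S, WINDOW_S):.1f}%")
print(f"All mutexes, share of window:  {share(ALL_MUTEXES_S, WINDOW_S):.1f}%")
```

Either way, the Giant time is a small fraction of the profiling window, which is why the profile does not look like the explanation for the poor interactive response.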

I am attaching a spreadsheet with the profile data in case anyone wants
to look at it. (Probably the mail system will strip it, so let me know if I 
should post it.)

Still totally baffled and still feeling the pain.
-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: [EMAIL PROTECTED]   Phone: +1 510 486-8634


Re: sendmail_enable=NO

2006-01-03 Thread Daniel Eischen
On Tue, 3 Jan 2006, Vivek Khera wrote:


 On Dec 31, 2005, at 6:56 PM, security wrote:

  And the rc.sendmail(8) under 5.4 stable says that NONE is
  deprecated and will be removed in a future release.  According to
  the man page,

 It says that in 6.0 also, so it will probably be at least until 7.0
 that it continues to work.

 Personally, I think having to set a bazillion variables to turn off
 sendmail in rc.conf and periodic.conf is just a pain (and probably a

Strongly seconded.  There should be one knob to disable the
entire thing.  SENDMAIL_ENABLE=NO should disable *everything*
without touching any other knobs.

 wrong design as it totally violates POLA), but at least it is just a
 cut/paste every time I set up a new server.

-- 
DE



Re: sendmail_enable=NO

2006-01-03 Thread Doug Barton

Daniel Eischen wrote:

On Tue, 3 Jan 2006, Vivek Khera wrote:


On Dec 31, 2005, at 6:56 PM, security wrote:


And the rc.sendmail(8) under 5.4 stable says that NONE is
deprecated and will be removed in a future release.  According to
the man page,

It says that in 6.0 also, so it will probably be at least until 7.0
that it continues to work.

Personally, I think having to set a bazillion variables to turn off
sendmail in rc.conf and periodic.conf is just a pain (and probably a


Strongly seconded.  There should be one knob to disable the
entire thing.  SENDMAIL_ENABLE=NO should disable *everything*
without touching any other knobs.


First, rc.conf and periodic.conf are totally separate, so having just one 
knob for both isn't practical now, but might be an interesting project down 
the road. Second, IIRC the first implementation of sendmail_enable=no did 
actually disable all of sendmail, but since people could not send mail 
locally that turned out to be a POLA violation itself, so the current 
two-stage system was developed. It's impossible to make everyone happy here, 
so I think the current system is a reasonable compromise.



Doug

--

This .signature sanitized for your protection



Re: sendmail_enable=NO

2006-01-03 Thread Daniel Eischen
On Tue, 3 Jan 2006, Doug Barton wrote:

 First, rc.conf and periodic.conf are totally separate, so having just one
 knob for both isn't practical now, but might be an interesting project down
 the road. Second, IIRC the first implementation of sendmail_enable=no did
 actually disable all of sendmail, but since people could not send mail
 locally that turned out to be a POLA violation itself, so the current
 two-stage system was developed. It's impossible to make everyone happy here,
 so I think the current system is a reasonable compromise.

No, you can still have *one* overriding knob to turn off
everything.  If folks want to enable/disable different
parts of sendmail, they can do that with the other knobs.
POLA says the one overriding sendmail knob should be
sendmail_enable.
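
To make the proposal concrete, here is a minimal sketch of the precedence being argued for. It models the suggested behaviour only and is not how rc.sendmail currently works. The sub-knob names are the ones from rc.conf(5); the defaults used when a knob is unset are assumptions of this sketch.

```python
# Sub-knobs documented in rc.conf(5).
SUB_KNOBS = (
    "sendmail_submit_enable",
    "sendmail_outbound_enable",
    "sendmail_msp_queue_enable",
)

def is_yes(value):
    """Accept the usual rc.conf spellings of 'yes'."""
    return str(value).strip().lower() in ("yes", "true", "on", "1")

def effective_knobs(conf):
    """Proposed behaviour: sendmail_enable=NO overrides every sub-knob."""
    master = is_yes(conf.get("sendmail_enable", "NO"))
    result = {"sendmail_enable": master}
    for knob in SUB_KNOBS:
        # A sub-knob only takes effect when the master knob is on.
        result[knob] = master and is_yes(conf.get(knob, "YES"))
    return result

# Master off: everything is off, whatever the other knobs say.
print(effective_knobs({"sendmail_enable": "NO",
                       "sendmail_submit_enable": "YES"}))

# Master on: the individual knobs select which parts run.
print(effective_knobs({"sendmail_enable": "YES",
                       "sendmail_outbound_enable": "NO"}))
```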

-- 
DE



Re: Recurring problem: processes block accessing UFS file system

2006-01-03 Thread Greg Rivers

On Tue, 22 Nov 2005, I wrote:


On Mon, 21 Nov 2005, Kris Kennaway wrote:

It may not be the same problem.  You should also try to obtain a trace when 
snapshots are not implicated.




Agreed.  I'll do so at the first opportunity.



First, my thanks to all of you for looking into this.

It's taken more than a month, but the problem has recurred without 
snapshots ever having been run.  I've got a good trace of the machine in 
this state (ftp://ftp.fedex.com/incoming/no-snapshots.bz2).  My apologies 
for the size of the debug output, but the processes had really stacked up 
this time before I noticed it.


I have enough capacity that I can afford to have this machine out of 
production for a while, so I've left it suspended in kdb for the time 
being in case additional information is needed.  Please let me know if 
there's anything else I can do to facilitate troubleshooting this. 
Thanks!


--
Greg


Re: Recurring problem: processes block accessing UFS file system

2006-01-03 Thread Don Lewis
On  3 Jan, Greg Rivers wrote:
 On Tue, 22 Nov 2005, I wrote:
 
 On Mon, 21 Nov 2005, Kris Kennaway wrote:

 It may not be the same problem.  You should also try to obtain a trace when 
 snapshots are not implicated.
 

 Agreed.  I'll do so at the first opportunity.

 
 First, my thanks to all of you for looking into this.
 
 It's taken more than a month, but the problem has recurred without 
 snapshots ever having been run.  I've got a good trace of the machine in 
 this state (attached).  My apologies for the size of the debug output, but 
 the processes had really stacked up this time before I noticed it.
 
 I have enough capacity that I can afford to have this machine out of 
 production for a while, so I've left it suspended in kdb for the time 
 being in case additional information is needed.  Please let me know if 
 there's anything else I can do to facilitate troubleshooting this. 
 Thanks!

There are a large number of sendmail processes waiting on vnode locks
which are held by other sendmail processes that are waiting on other
vnode locks, etc. until we get to sendmail pid 87150 which is holding a
vnode lock and waiting to lock a buf.

Tracing command sendmail pid 87150 tid 100994 td 0xcf1c5480
sched_switch(cf1c5480,0,1,b2c5195e,a480a2bc) at sched_switch+0x158
mi_switch(1,0,c04d7b33,dc713fb0,ec26a6ac) at mi_switch+0x1d5
sleepq_switch(dc713fb0,ec26a6e0,c04bb9ce,dc713fb0,50) at sleepq_switch+0x16f
sleepq_wait(dc713fb0,50,c0618ef5,0,202122) at sleepq_wait+0x11
msleep(dc713fb0,c0658430,50,c0618ef5,0) at msleep+0x3d7
acquire(ec26a748,120,6,15c2e6e0,0) at acquire+0x89
lockmgr(dc713fb0,202122,c89855cc,cf1c5480,dc76fe30) at lockmgr+0x45f
getblk(c8985550,15c2e6e0,0,4000,0) at getblk+0x211
breadn(c8985550,15c2e6e0,0,4000,0) at breadn+0x52
bread(c8985550,15c2e6e0,0,4000,0) at bread+0x4c
ffs_vget(c887,ae58b3,2,ec26a8d4,8180) at ffs_vget+0x383
ffs_valloc(c8d41660,8180,c92e8d00,ec26a8d4,c05f9302) at ffs_valloc+0x154
ufs_makeinode(8180,c8d41660,ec26abd4,ec26abe8,ec26aa24) at ufs_makeinode+0x61
ufs_create(ec26aa50,ec26aa24,ec26ad04,ec26abc0,ec26ab0c) at ufs_create+0x36
VOP_CREATE_APV(c0646cc0,ec26aa50,2,ec26aa50,0) at VOP_CREATE_APV+0x3c
vn_open_cred(ec26abc0,ec26acc0,180,c92e8d00,6) at vn_open_cred+0x1fe
vn_open(ec26abc0,ec26acc0,180,6,c679eacb) at vn_open+0x33
kern_open(cf1c5480,81416c0,0,a03,180) at kern_open+0xca
open(cf1c5480,ec26ad04,c,cf1c5480,8169000) at open+0x36
syscall(3b,bfbf003b,bfbf003b,0,a02) at syscall+0x324
Xint0x80_syscall() at Xint0x80_syscall+0x1f
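
The chain described above can be followed mechanically once each blocked process is paired with the process holding what it wants. A minimal sketch: pids 90001 and 90002 are invented placeholders, and only pid 87150 comes from the trace.

```python
def wait_chain(start, waits_for):
    """Follow holder links from `start` to the process at the end of the chain."""
    chain = [start]
    seen = {start}
    while chain[-1] in waits_for:
        holder = waits_for[chain[-1]]
        chain.append(holder)
        if holder in seen:   # came back to a process already visited: a cycle
            break
        seen.add(holder)
    return chain

# waiter pid -> pid holding the vnode lock it wants (placeholder pids,
# except 87150, which is the end of the chain in the trace above).
waits_for = {90001: 90002, 90002: 87150}
print(wait_chain(90001, waits_for))
```

The last entry is the process worth tracing, since everything queued behind it is only a symptom.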

This doesn't appear to be a buf/memory exhaustion problem because
syncer, bufdaemon, and pagedaemon all appear to be idle.

What does show lockedbufs say?



Re: some more on Re: Adventurous fix for wheel mouse not working in FreeBSD 6.0

2006-01-03 Thread Torfinn Ingolfsen
On Thu, 08 Dec 2005 13:01:36 -0200
JoaoBR [EMAIL PROTECTED] wrote:

 I found out this for releng_6 and apparently is the same on former
 versions  since all call the same problem:

Interestingly enough, today I stumbled over a much easier solution than
those discussed earlier in this thread.
I simply removed 'moused_flags=-z4' from /etc/rc.conf, and restarted
moused and Xorg. Now (wheel) scroll works again!
Details:
[EMAIL PROTECTED] uname -a
FreeBSD kg-quiet.kg4.no 6.0-STABLE FreeBSD 6.0-STABLE #1: Thu Dec 22 06:29:24 CET 2005 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/QUIET amd64
[EMAIL PROTECTED] ps ax | grep moused
10104  ??  Is 0:00.04 /usr/sbin/moused -p /dev/psm0 -t auto
[EMAIL PROTECTED] portversion -v | grep xorg-server
xorg-server-6.8.99.903  =  up-to-date with port

I will have to test this on my laptop (6.0-stable / i386) sometime soon.
-- 
Regards,
Torfinn Ingolfsen,
Norway



Re: Recurring problem: processes block accessing UFS file system

2006-01-03 Thread Greg Rivers

On Tue, 3 Jan 2006, Don Lewis wrote:


There are a large number of sendmail processes waiting on vnode locks
which are held by other sendmail processes that are waiting on other
vnode locks, etc. until we get to sendmail pid 87150 which is holding a
vnode lock and waiting to lock a buf.

Tracing command sendmail pid 87150 tid 100994 td 0xcf1c5480
sched_switch(cf1c5480,0,1,b2c5195e,a480a2bc) at sched_switch+0x158
mi_switch(1,0,c04d7b33,dc713fb0,ec26a6ac) at mi_switch+0x1d5
sleepq_switch(dc713fb0,ec26a6e0,c04bb9ce,dc713fb0,50) at sleepq_switch+0x16f
sleepq_wait(dc713fb0,50,c0618ef5,0,202122) at sleepq_wait+0x11
msleep(dc713fb0,c0658430,50,c0618ef5,0) at msleep+0x3d7
acquire(ec26a748,120,6,15c2e6e0,0) at acquire+0x89
lockmgr(dc713fb0,202122,c89855cc,cf1c5480,dc76fe30) at lockmgr+0x45f
getblk(c8985550,15c2e6e0,0,4000,0) at getblk+0x211
breadn(c8985550,15c2e6e0,0,4000,0) at breadn+0x52
bread(c8985550,15c2e6e0,0,4000,0) at bread+0x4c
ffs_vget(c887,ae58b3,2,ec26a8d4,8180) at ffs_vget+0x383
ffs_valloc(c8d41660,8180,c92e8d00,ec26a8d4,c05f9302) at ffs_valloc+0x154
ufs_makeinode(8180,c8d41660,ec26abd4,ec26abe8,ec26aa24) at ufs_makeinode+0x61
ufs_create(ec26aa50,ec26aa24,ec26ad04,ec26abc0,ec26ab0c) at ufs_create+0x36
VOP_CREATE_APV(c0646cc0,ec26aa50,2,ec26aa50,0) at VOP_CREATE_APV+0x3c
vn_open_cred(ec26abc0,ec26acc0,180,c92e8d00,6) at vn_open_cred+0x1fe
vn_open(ec26abc0,ec26acc0,180,6,c679eacb) at vn_open+0x33
kern_open(cf1c5480,81416c0,0,a03,180) at kern_open+0xca
open(cf1c5480,ec26ad04,c,cf1c5480,8169000) at open+0x36
syscall(3b,bfbf003b,bfbf003b,0,a02) at syscall+0x324
Xint0x80_syscall() at Xint0x80_syscall+0x1f

This doesn't appear to be a buf/memory exhaustion problem because
syncer, bufdaemon, and pagedaemon all appear to be idle.

What does show lockedbufs say?



db> show lockedbufs
buf at 0xdc625678
b_flags = 0x2000<vmio>
b_error = 0, b_bufsize = 16384, b_bcount = 16384, b_resid = 0
b_bufobj = (0xc8985610), b_data = 0xddcef000, b_blkno = 365233216
lockstatus = 2, excl count = 1, excl owner 0xfffe
b_npages = 4, pages(OBJ, IDX, PA): (0xc8984108, 0x2b8a088, 
0x53175000),(0xc8984108, 0x2b8a089, 0x7ae56000),(0xc8984108, 0x2b8a08a, 
0xd3f57000),(0xc8984108, 0x2b8a08b, 0xd7d58000)

buf at 0xdc6b8ab0
b_flags = 0x2000<vmio>
b_error = 0, b_bufsize = 16384, b_bcount = 16384, b_resid = 0
b_bufobj = (0xc8985610), b_data = 0xdf9ab000, b_blkno = 365257760
lockstatus = 2, excl count = 1, excl owner 0xfffe
b_npages = 4, pages(OBJ, IDX, PA): (0xc8984108, 0x2b8ac84, 
0x60b1000),(0xc8984108, 0x2b8ac85, 0x454d2000),(0xc8984108, 0x2b8ac86, 
0x1b273000),(0xc8984108, 0x2b8ac87, 0x47b74000)

buf at 0xdc6c3cc8
b_flags = 0x2000<vmio>
b_error = 0, b_bufsize = 16384, b_bcount = 16384, b_resid = 0
b_bufobj = (0xc8985610), b_data = 0xdfbd7000, b_blkno = 365265888
lockstatus = 2, excl count = 1, excl owner 0xfffe
b_npages = 4, pages(OBJ, IDX, PA): (0xc8984108, 0x2b8b07c, 
0xba549000),(0xc8984108, 0x2b8b07d, 0x92eaa000),(0xc8984108, 0x2b8b07e, 
0xbdf4b000),(0xc8984108, 0x2b8b07f, 0x2090c000)

buf at 0xdc6d9e68
b_flags = 0x2000<vmio>
b_error = 0, b_bufsize = 16384, b_bcount = 16384, b_resid = 0
b_bufobj = (0xc8985610), b_data = 0xe0027000, b_blkno = 365240832
lockstatus = 2, excl count = 1, excl owner 0xfffe
b_npages = 4, pages(OBJ, IDX, PA): (0xc8984108, 0x2b8a440, 
0x61b8d000),(0xc8984108, 0x2b8a441, 0xb0f4e000),(0xc8984108, 0x2b8a442, 
0x5f98f000),(0xc8984108, 0x2b8a443, 0x5c21)

buf at 0xdc6e5458
b_flags = 0x2000<vmio>
b_error = 0, b_bufsize = 16384, b_bcount = 16384, b_resid = 0
b_bufobj = (0xc8985610), b_data = 0xe025f000, b_blkno = 364617056
lockstatus = 2, excl count = 1, excl owner 0xfffe
b_npages = 4, pages(OBJ, IDX, PA): (0xc8984108, 0x2b773ac, 
0x4e539000),(0xc8984108, 0x2b773ad, 0x6e13a000),(0xc8984108, 0x2b773ae, 
0xc653b000),(0xc8984108, 0x2b773af, 0x14e3c000)

buf at 0xdc6fd4b8
b_flags = 0x2000<vmio>
b_error = 0, b_bufsize = 16384, b_bcount = 16384, b_resid = 0
b_bufobj = (0xc8985610), b_data = 0xe070f000, b_blkno = 365224960
lockstatus = 2, excl count = 1, excl owner 0xfffe
b_npages = 4, pages(OBJ, IDX, PA): (0xc8984108, 0x2b89c80, 
0x37c6d000),(0xc8984108, 0x2b89c81, 0x2e40e000),(0xc8984108, 0x2b89c82, 
0xa39af000),(0xc8984108, 0x2b89c83, 0x27ff)

buf at 0xdc713f50
b_flags = 0xa00200a0<remfree,vmio,clusterok,delwri,cache>
b_error = 0, b_bufsize = 16384, b_bcount = 16384, b_resid = 0
b_bufobj = (0xc8985610), b_data = 0xe0b7b000, b_blkno = 365094624
lockstatus = 2, excl count = 1, excl owner 0xcfeb5d80
b_npages = 4, pages(OBJ, IDX, PA): (0xc8984108, 0x2b85cdc, 
0xa89e9000),(0xc8984108, 0x2b85cdd, 0xa852a000),(0xc8984108, 0x2b85cde, 
0xa850b000),(0xc8984108, 0x2b85cdf, 0xa836c000)

buf at 0xdc765f50
b_flags = 0xa00200a0<remfree,vmio,clusterok,delwri,cache>
b_error = 0, b_bufsize = 16384, b_bcount = 16384, b_resid = 0
b_bufobj = (0xc9c68b60), b_data = 0xe1b7b000, b_blkno = 364555424
lockstatus = 2, excl count = 1, excl owner 0xce9d4c00
b_npages = 4, pages(OBJ, IDX, PA): (0xcff7, 

Re: Recurring problem: processes block accessing UFS file system

2006-01-03 Thread Don Lewis
On  3 Jan, Greg Rivers wrote:
 On Tue, 3 Jan 2006, Don Lewis wrote:
 
 There are a large number of sendmail processes waiting on vnode locks
 which are held by other sendmail processes that are waiting on other
 vnode locks, etc. until we get to sendmail pid 87150 which is holding a
 vnode lock and waiting to lock a buf.

 Tracing command sendmail pid 87150 tid 100994 td 0xcf1c5480
 sched_switch(cf1c5480,0,1,b2c5195e,a480a2bc) at sched_switch+0x158
 mi_switch(1,0,c04d7b33,dc713fb0,ec26a6ac) at mi_switch+0x1d5
 sleepq_switch(dc713fb0,ec26a6e0,c04bb9ce,dc713fb0,50) at sleepq_switch+0x16f
 sleepq_wait(dc713fb0,50,c0618ef5,0,202122) at sleepq_wait+0x11
 msleep(dc713fb0,c0658430,50,c0618ef5,0) at msleep+0x3d7
 acquire(ec26a748,120,6,15c2e6e0,0) at acquire+0x89
 lockmgr(dc713fb0,202122,c89855cc,cf1c5480,dc76fe30) at lockmgr+0x45f
 getblk(c8985550,15c2e6e0,0,4000,0) at getblk+0x211
 breadn(c8985550,15c2e6e0,0,4000,0) at breadn+0x52
 bread(c8985550,15c2e6e0,0,4000,0) at bread+0x4c
 ffs_vget(c887,ae58b3,2,ec26a8d4,8180) at ffs_vget+0x383
 ffs_valloc(c8d41660,8180,c92e8d00,ec26a8d4,c05f9302) at ffs_valloc+0x154
 ufs_makeinode(8180,c8d41660,ec26abd4,ec26abe8,ec26aa24) at ufs_makeinode+0x61
 ufs_create(ec26aa50,ec26aa24,ec26ad04,ec26abc0,ec26ab0c) at ufs_create+0x36
 VOP_CREATE_APV(c0646cc0,ec26aa50,2,ec26aa50,0) at VOP_CREATE_APV+0x3c
 vn_open_cred(ec26abc0,ec26acc0,180,c92e8d00,6) at vn_open_cred+0x1fe
 vn_open(ec26abc0,ec26acc0,180,6,c679eacb) at vn_open+0x33
 kern_open(cf1c5480,81416c0,0,a03,180) at kern_open+0xca
 open(cf1c5480,ec26ad04,c,cf1c5480,8169000) at open+0x36
 syscall(3b,bfbf003b,bfbf003b,0,a02) at syscall+0x324
 Xint0x80_syscall() at Xint0x80_syscall+0x1f

 This doesn't appear to be a buf/memory exhaustion problem because
 syncer, bufdaemon, and pagedaemon all appear to be idle.

 What does show lockedbufs say?

 
 db> show lockedbufs

[snip]

looks like this is the buf that pid 87150 is waiting for:

 buf at 0xdc713f50
 b_flags = 0xa00200a0<remfree,vmio,clusterok,delwri,cache>
 b_error = 0, b_bufsize = 16384, b_bcount = 16384, b_resid = 0
 b_bufobj = (0xc8985610), b_data = 0xe0b7b000, b_blkno = 365094624
 lockstatus = 2, excl count = 1, excl owner 0xcfeb5d80
 b_npages = 4, pages(OBJ, IDX, PA): (0xc8984108, 0x2b85cdc, 
 0xa89e9000),(0xc8984108, 0x2b85cdd, 0xa852a000),(0xc8984108, 0x2b85cde, 
 0xa850b000),(0xc8984108, 0x2b85cdf, 0xa836c000)

which is locked by this thread:

Tracing command sendmail pid 87117 tid 101335 td 0xcfeb5d80
sched_switch(cfeb5d80,0,1,fd1926a,640c65f9) at sched_switch+0x158
mi_switch(1,0,c04d7b33,dc76fe8c,ec883b2c) at mi_switch+0x1d5
sleepq_switch(dc76fe8c,ec883b60,c04bb9ce,dc76fe8c,4c) at sleepq_switch+0x16f
sleepq_wait(dc76fe8c,4c,c061e9ac,0,0) at sleepq_wait+0x11
msleep(dc76fe8c,c0662f80,4c,c061e9ac,0) at msleep+0x3d7
getdirtybuf(dc76fe30,c0662f80,1,ec883ba8,0) at getdirtybuf+0x221   
softdep_update_inodeblock(cd1bc528,dc713f50,1,4000,0) at softdep_update_inodeblock+0x267
ffs_update(cd953bb0,1,0,cd953bb0,ec883c78,c0529a59,0,0,0,4,1,cd953c2c) at ffs_update+0x27f
ffs_syncvnode(cd953bb0,1,4,ec883c78,c05f9a70) at ffs_syncvnode+0x52e
ffs_fsync(ec883cb4,ec883cd0,c052468a,c0646cc0,ec883cb4) at ffs_fsync+0x1c
VOP_FSYNC_APV(c0646cc0,ec883cb4,0,0,0) at VOP_FSYNC_APV+0x3a
fsync(cfeb5d80,ec883d04,4,cfeb5d80,ec883d2c) at fsync+0x1db
syscall(3b,3b,3b,80c7c1b,bfbfa6b0) at syscall+0x324
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (95, FreeBSD ELF32, fsync), eip = 0x8830f63f, esp = 0xbfbfa66c, ebp = 0xbfbfaf98 ---


Pid 87117 is playing with buf 0xdc76fe30 which is not locked, and is
sleeping on the buf's b_xflags member.  It looks like 87117 is waiting
for an in-progress write to complete.  There are a large number of other
sendmail processes waiting in this same place.
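
The matching above rests on two observations: the address pid 87150 sleeps on (0xdc713fb0) lies just past the start of buf 0xdc713f50, and that buf's "excl owner" equals the td of pid 87117. A minimal sketch of the same bookkeeping follows. The 0x200 upper bound on the size of a buf is an assumption made only so that unrelated addresses are rejected; the other values are copied from the traces and the show lockedbufs output.

```python
BUF_SIZE_GUESS = 0x200  # assumption: upper bound on sizeof(struct buf)

def containing_buf(addr, buf_addrs, max_size=BUF_SIZE_GUESS):
    """Return (buf, offset) for the buf that `addr` falls inside, or None."""
    candidates = [b for b in buf_addrs if b <= addr < b + max_size]
    if not candidates:
        return None
    buf = max(candidates)
    return buf, addr - buf

bufs = [0xdc713f50, 0xdc765f50, 0xdc76fe30]        # buf addresses seen so far
owners = {0xdc713f50: 0xcfeb5d80}                  # buf -> "excl owner" td
threads = {0xcf1c5480: 87150, 0xcfeb5d80: 87117}   # td -> pid, from the traces

# pid 87150 sleeps in lockmgr() on 0xdc713fb0.
buf, offset = containing_buf(0xdc713fb0, bufs)
print(hex(buf), hex(offset), "held by pid", threads[owners[buf]])

# pid 87117 sleeps in getdirtybuf() on 0xdc76fe8c.
buf2, offset2 = containing_buf(0xdc76fe8c, bufs)
print(hex(buf2), hex(offset2))
```

The two offsets differ, which fits the description: one sleep address is the buf's lock and the other is its b_xflags member.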

How about show buffer 0xdc76fe30?

This is getting into an area of the kernel that I do not understand
well.



Re: Recurring problem: processes block accessing UFS file system

2006-01-03 Thread Greg Rivers

On Tue, 3 Jan 2006, Don Lewis wrote:


db> show lockedbufs


[snip]

looks like this is the buf that pid 87150 is waiting for:


buf at 0xdc713f50
b_flags = 0xa00200a0<remfree,vmio,clusterok,delwri,cache>
b_error = 0, b_bufsize = 16384, b_bcount = 16384, b_resid = 0
b_bufobj = (0xc8985610), b_data = 0xe0b7b000, b_blkno = 365094624
lockstatus = 2, excl count = 1, excl owner 0xcfeb5d80
b_npages = 4, pages(OBJ, IDX, PA): (0xc8984108, 0x2b85cdc, 
0xa89e9000),(0xc8984108, 0x2b85cdd, 0xa852a000),(0xc8984108, 0x2b85cde, 
0xa850b000),(0xc8984108, 0x2b85cdf, 0xa836c000)


which is locked by this thread:

Tracing command sendmail pid 87117 tid 101335 td 0xcfeb5d80
sched_switch(cfeb5d80,0,1,fd1926a,640c65f9) at sched_switch+0x158
mi_switch(1,0,c04d7b33,dc76fe8c,ec883b2c) at mi_switch+0x1d5
sleepq_switch(dc76fe8c,ec883b60,c04bb9ce,dc76fe8c,4c) at sleepq_switch+0x16f
sleepq_wait(dc76fe8c,4c,c061e9ac,0,0) at sleepq_wait+0x11
msleep(dc76fe8c,c0662f80,4c,c061e9ac,0) at msleep+0x3d7
getdirtybuf(dc76fe30,c0662f80,1,ec883ba8,0) at getdirtybuf+0x221
softdep_update_inodeblock(cd1bc528,dc713f50,1,4000,0) at softdep_update_inodeblock+0x267
ffs_update(cd953bb0,1,0,cd953bb0,ec883c78,c0529a59,0,0,0,4,1,cd953c2c) at ffs_update+0x27f
ffs_syncvnode(cd953bb0,1,4,ec883c78,c05f9a70) at ffs_syncvnode+0x52e
ffs_fsync(ec883cb4,ec883cd0,c052468a,c0646cc0,ec883cb4) at ffs_fsync+0x1c
VOP_FSYNC_APV(c0646cc0,ec883cb4,0,0,0) at VOP_FSYNC_APV+0x3a
fsync(cfeb5d80,ec883d04,4,cfeb5d80,ec883d2c) at fsync+0x1db
syscall(3b,3b,3b,80c7c1b,bfbfa6b0) at syscall+0x324
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (95, FreeBSD ELF32, fsync), eip = 0x8830f63f, esp = 0xbfbfa66c, ebp = 0xbfbfaf98 ---


Pid 87117 is playing with buf 0xdc76fe30 which is not locked, and is
sleeping on the buf's b_xflags member.  It looks like 87117 is waiting
for an in-progress write to complete.  There are a large number of other
sendmail processes waiting in this same place.

How about show buffer 0xdc76fe30?



db> show buffer 0xdc76fe30
buf at 0xdc76fe30
b_flags = 0x20a0<vmio,delwri,cache>
b_error = 0, b_bufsize = 16384, b_bcount = 16384, b_resid = 0
b_bufobj = (0xc8985610), b_data = 0xe1d6b000, b_blkno = 365086368
lockstatus = 0, excl count = 0, excl owner 0x
b_npages = 4, pages(OBJ, IDX, PA): (0xc8984108, 0x2b858d4, 
0xa8de1000),(0xc8984108, 0x2b858d5, 0xa8c62000),(0xc8984108, 0x2b858d6, 
0xa8de3000),(0xc8984108, 0x2b858d7, 0xa8e64000)
db>



This is getting into an area of the kernel that I do not understand
well.



For me, we've long since passed that point.  :-)

--
Greg