Re: panic with heavy io

2003-02-11 Thread phk
In message [EMAIL PROTECTED], Mark Santcroos writes:
> While doing heavy IO (updating my p4 repo) on my laptop I got the following
> panic. (I was running in X so both backtrace and dmesg are from the core
> dump after reboot)

> I'm wondering whether the ENOMEM's reported by GEOM point out that GEOM
> has a problem or just tell us that the machine was out of memory and some
> other subsystem failed.
>
> ENOMEM 0xc3724300 on 0xc2412c80(ad0s1)

The ENOMEM from GEOM is just a notification that an I/O request failed
due to lack of memory.

GEOM reacts to this by rescheduling the I/O request and entering a
rudimentary back-off mode where further I/O requests are paced so
that some of the outstanding ones get a chance to complete.   The
current pacing is inspired a little bit by TCP slow start, btw.
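
The reschedule-and-pace behaviour described above can be sketched roughly as follows. This is a minimal userland illustration, not the GEOM code: the struct pacer type, both function names, and the PACE_DELAY_MS value are all assumptions made for the example. The only idea taken from the text is that an ENOMEM failure causes the request to be rescheduled and later requests to be slowed down.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical delay applied to each paced request (not a GEOM constant). */
#define PACE_DELAY_MS 100u

/* Hypothetical pacing state; the field names are illustrative only. */
struct pacer {
	unsigned pace;     /* how many upcoming requests must still be delayed */
	unsigned requeued; /* how many requests were rescheduled after ENOMEM */
};

/*
 * Called when a request completes with the given error.
 * Returns true if the caller should reschedule (retry) the request.
 */
static bool
pacer_on_completion(struct pacer *p, int error)
{
	assert(p != NULL);
	if (error == ENOMEM) {
		p->pace++;      /* back off: delay one more future request */
		p->requeued++;
		return true;
	}
	return false;           /* success or a non-retryable error */
}

/*
 * Called before dispatching the next request.
 * Returns how long the caller should wait first, in milliseconds.
 */
static unsigned
pacer_before_dispatch(struct pacer *p)
{
	assert(p != NULL);
	if (p->pace == 0)
		return 0;       /* no back-off in effect */
	p->pace--;              /* one delayed dispatch pays off one ENOMEM */
	return PACE_DELAY_MS;
}
```

In this sketch every ENOMEM costs exactly one delayed dispatch, so the back-off drains on its own once failures stop. That gives outstanding requests time to complete without blocking I/O outright.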

> By looking at the stack it seems that the NULL-pointer dereference is
> going down pretty far.
> The arguments in the lstat (frame #28) already seem bogus.
>
> #10 0xc0381d12 in trap_fatal (frame=0xce5c1700, eva=0) at ../../../i386/i386/trap.c:844

This is the interesting trap I think, all the stuff above is noise.

> #11 0xc03819f2 in trap_pfault (frame=0xce5c1700, usermode=0, eva=20) at ../../../i386/i386/trap.c:758
> #12 0xc03814e0 in trap (frame=
>   {tf_fs = -832831464, tf_es = -1071710192, tf_ds = -951058416,
>    tf_edi = -1037023552, tf_esi = -951046744, tf_ebp = -832825484,
>    tf_isp = -832825556, tf_ebx = 0, tf_edx = 5, tf_ecx = 0,
>    tf_eax = -1740064768, tf_trapno = 12, tf_err = 2,
>    tf_eip = -1071712263, tf_cs = 8, tf_eflags = 66183,
>    tf_esp = -951046744, tf_ss = -951046572})
>   at ../../../i386/i386/trap.c:445
> #13 0xc0371bf8 in calltrap () at {standard input}:96
> #14 0xc01edc00 in spec_xstrategy (vp=0xc23046c0, bp=0xc7502da8) at ../../../fs/specfs/spec_vnops.c:596

This doesn't correspond to my source file, but you should examine this one.


> #15 0xc01edc7b in spec_specstrategy (ap=0x0) at ../../../fs/specfs/spec_vnops.c:633

This, I think, is impossible, so I think we should assume that something
overwrote some memory and cleared out some bits which should have
survived.
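
The trap frame in frame #12 is printed by gdb as signed decimals, which makes it hard to read. A small helper along these lines turns the values back into addresses and decodes the fault error code. The page-fault error bits are the architectural x86 ones; the helper names and the struct are made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* x86 page-fault error code bits (architectural, not FreeBSD-specific). */
#define PF_PROT  0x1u /* set: protection violation; clear: page not present */
#define PF_WRITE 0x2u /* set: write access; clear: read access */
#define PF_USER  0x4u /* set: fault taken in user mode; clear: kernel mode */

/* Hypothetical result type for this sketch. */
struct pf_info {
	int not_present; /* 1 if the page was not mapped */
	int is_write;    /* 1 if the faulting access was a write */
	int from_user;   /* 1 if the fault came from user mode */
};

/* Reinterpret a register that was printed as a signed 32-bit decimal. */
static uint32_t
reg_to_u32(int32_t printed)
{
	return (uint32_t)printed;
}

/* Split a page-fault error code (tf_err) into its three low flag bits. */
static struct pf_info
decode_pf_error(uint32_t err)
{
	struct pf_info info;

	info.not_present = (err & PF_PROT) == 0;
	info.is_write = (err & PF_WRITE) != 0;
	info.from_user = (err & PF_USER) != 0;
	return info;
}

/* Usage example with the values from the trap frame quoted above. */
static void
check_trace_values(void)
{
	struct pf_info info = decode_pf_error(2);        /* tf_err = 2 */

	assert(reg_to_u32(-1071712263) == 0xc01ef7f9u); /* tf_eip */
	assert(info.not_present && info.is_write && !info.from_user);
}
```

Read this way, tf_err = 2 is a kernel-mode write to an unmapped page, and eva=20 in frame #11 puts the faulting address at offset 20 from zero. That fits a store through a NULL structure pointer.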

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
[EMAIL PROTECTED] | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Re: panic with heavy io

2003-02-11 Thread Mark Santcroos
On Tue, Feb 11, 2003 at 12:21:19PM +0100, [EMAIL PROTECTED] wrote:
> > #14 0xc01edc00 in spec_xstrategy (vp=0xc23046c0, bp=0xc7502da8) at ../../../fs/specfs/spec_vnops.c:596
>
> This doesn't correspond to my source file, but you should examine this one.

Sorry, I should have done the backtrace against the source from the date of
the kernel.

It was running spec_vnops.c:1.194

596:   DEV_STRATEGY(bp);

I don't understand what you want me to examine there as the arguments are
not useful anymore (or are they?).

> > #15 0xc01edc7b in spec_specstrategy (ap=0x0) at ../../../fs/specfs/spec_vnops.c:633
>
> This, I think, is impossible, so I think we should assume that something
> overwrote some memory and cleared out some bits which should have
> survived.

That was my feeling too, it wouldn't have gotten so deep with NULL arguments.
Haven't checked the code so it is only an assumption.

Any ideas what to do now, or what to do when I am able to reproduce it?

Mark

-- 
Mark Santcroos                    RIPE Network Coordination Centre
http://www.ripe.net/home/mark/    New Projects Group/TTM




Re: panic with heavy io

2003-02-11 Thread phk
In message [EMAIL PROTECTED], Mark Santcroos writes:

> 596:   DEV_STRATEGY(bp);
>
> I don't understand what you want me to examine there as the arguments are
> not useful anymore (or are they?).

Well that line in my copy was not code, so I just wanted to see what
your sources said.

> > > #15 0xc01edc7b in spec_specstrategy (ap=0x0) at ../../../fs/specfs/spec_vnops.c:633
> >
> > This, I think, is impossible, so I think we should assume that something
> > overwrote some memory and cleared out some bits which should have
> > survived.
>
> That was my feeling too, it wouldn't have gotten so deep with NULL arguments.
> Haven't checked the code so it is only an assumption.
>
> Any ideas what to do now, or what to do when I am able to reproduce it?

No idea at the moment.

