Re: GEOM_MBR breaks my kernel

2003-03-15 Thread walt
Poul-Henning Kamp wrote:
> In message [EMAIL PROTECTED], walt writes:
>
>> I've been unable to boot any kernel I've built since about March 11
>> and I've narrowed it down to the GEOM_MBR option.
>> With GEOM_MBR I get a kernel page fault error when trying to
>> mount the root filesystem at boot time.
>
> Can you get us the messages and a traceback ?

Well, no.  I've been trying to find a kernel configuration that
will allow me to reproduce the bug AND generate a traceback, but
so far I can't find one.
The problem is that just adding GEOM_MBR to a GENERIC kernel
doesn't produce the bug.  My normal custom kernel doesn't contain
the debugging stuff, and if I start changing things the bug
doesn't show.
The only semi-interesting result I've come up with is this:

I normally use only the 'cpu I686_CPU' flag because I have an
Athlon cpu.  But if I also include the 'cpu I586_CPU' flag the
bug completely changes:  the machine boots and the filesystems
mount just fine but about ten seconds after I start X running
the machine panics and reboots shortly thereafter.  The panic
message doesn't appear on the screen because the console is not
visible at that point.
Does this suggest a gcc problem?  I've never really understood
how more than one 'cpu' flag can be included in the kernel config
file, so I'm not sure what actually changes when I do that.
To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message


GEOM_MBR breaks my kernel

2003-03-14 Thread walt
I've been unable to boot any kernel I've built since about March 11
and I've narrowed it down to the GEOM_MBR option.
With GEOM_MBR I get a kernel page fault error when trying to
mount the root filesystem at boot time.
ad0: 76319MB WDC WD800JB-00CRA1 [155061/16/63] at ata0-master UDMA100
ad1: 76319MB ST380021A [155061/16/63] at ata0-slave UDMA100
acd0: CDROM LTN301 at ata1-master PIO4
MBREXT Slice 5 on ad0s2:
[0] f:00 typ:7 s(CHS):6/1/66 e(CHS):250/254/255 s:64 l:12161141
[1] f:00 typ:5 s(CHS):251/0/193 e(CHS):255/254/255 s:12161205 l:20980890
MBREXT Slice 6 on ad0s2:
[0] f:00 typ:11 s(CHS):251/1/193 e(CHS):255/254/255 s:63 l:20980827
[1] f:00 typ:5 s(CHS):255/0/193 e(CHS):255/254/255 s:33142095 l:16787925
MBREXT Slice 7 on ad0s2:
[0] f:00 typ:131 s(CHS):255/1/193 e(CHS):255/254/255 s:63 l:16787862
[1] f:00 typ:5 s(CHS):255/0/193 e(CHS):255/254/255 s:49930020 l:2088450
MBREXT Slice 8 on ad0s2:
[0] f:00 typ:130 s(CHS):255/1/193 e(CHS):255/254/255 s:63 l:2088387
[1] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
MBREXT Slice 5 on ad1s1:
[0] f:00 typ:11 s(CHS):255/1/193 e(CHS):255/254/255 s:63 l:20980827
[1] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
Information from DOS bootblock is:
The data for partition 1 is:
sysid 11 (0x0b),(DOS or Windows 95 with 32 bit FAT)
start 63, size 4208967 (2055 Meg), flag 80 (active)
beg: cyl 0/ head 1/ sector 1;
end: cyl 261/ head 254/ sector 63
The data for partition 2 is:
sysid 15 (0x0f),(Extended DOS (LBA))
start 4209030, size 52018470 (25399 Meg), flag 0
beg: cyl 262/ head 0/ sector 1;
end: cyl 1023/ head 254/ sector 63
The data for partition 3 is:
sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
start 139926150, size 16370235 (7993 Meg), flag 0
beg: cyl 1023/ head 0/ sector 1;
end: cyl 1023/ head 254/ sector 63
The data for partition 4 is:
UNUSED
# /dev/ad0s3c:
type: ESDI
sectors/cylinder: 16065
cylinders: 9729
sectors/unit: 156301488
8 partitions:
#        size   offset    fstype   [fsize bsize bps/cpg]
  a: 15321596  1048576    4.2BSD        0     0     0   # (Cyl.   65*- 1018*)
  b:  1048576        0      swap                        # (Cyl.    0 - 65*)
  c: 16370235        0    unused        0     0         # (Cyl.    0 - 1018)
Warning, partition c doesn't cover the whole unit!
Note that I put the swap partition first.  Could that cause this problem?



Re: GEOM_MBR breaks my kernel

2003-03-14 Thread Poul-Henning Kamp
In message [EMAIL PROTECTED], walt writes:
> I've been unable to boot any kernel I've built since about March 11
> and I've narrowed it down to the GEOM_MBR option.
>
> With GEOM_MBR I get a kernel page fault error when trying to
> mount the root filesystem at boot time.

Can you get us the messages and a traceback ?

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
[EMAIL PROTECTED] | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.



Re: GEOM_MBR breaks my kernel

2003-03-14 Thread Krzysztof Parzyszek
On Fri, Mar 14, 2003 at 03:37:53PM +0100, Poul-Henning Kamp wrote:
> In message [EMAIL PROTECTED], walt writes:
>> I've been unable to boot any kernel I've built since about March 11
>> and I've narrowed it down to the GEOM_MBR option.
>>
>> With GEOM_MBR I get a kernel page fault error when trying to
>> mount the root filesystem at boot time.
>
> Can you get us the messages and a traceback ?

I saw the same thing on my system.  I don't have the exact message
or traceback around, but the problem was essentially a null pointer
dereference while in kernel mode.
I was able to locate the offending line in the source:

In devfs_allocv:

    if (de->de_dirent->d_type == DT_CHR) {
        dev = *devfs_itod(de->de_inode);
        if (dev == NULL)
            return (ENOENT);
    } else {
    ...


The first comparison causes the problem, since de->de_dirent == NULL.


The problem did not exist until I turned WITNESS and INVARIANTS off
(in a kernel with all GEOM_* stuff enabled).


Let me know if you need more information.  If you need the traceback,
I'd appreciate it if you told me how to get it written to a file. :)


Krzysztof


