Someone has to step up and bisect the whole kernel tree.
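Assuming the sources are pulled from CVS by date, a bisect here amounts to a binary search over checkout dates rather than over commits. What follows is only a minimal sketch of that search: `bisect_dates` and the `is_bad` callback are hypothetical names, not an existing tool, and the dates in the demo are made up. A real `is_bad` would have to check out the tree as of the given date (for example with `cvs update -D`), build a kernel with the same config, and report whether it panics at boot. The sketch also assumes that once the breakage appears, every later date is broken too.

```python
from datetime import date, timedelta


def bisect_dates(good, bad, is_bad):
    """Return the earliest date in (good, bad] for which is_bad(date) is true.

    Assumes is_bad(good) is False, is_bad(bad) is True, and that the
    kernel stays broken on every date after the breakage was introduced.
    """
    while (bad - good).days > 1:
        mid = good + timedelta(days=(bad - good).days // 2)
        if is_bad(mid):
            bad = mid   # breakage is at or before mid
        else:
            good = mid  # breakage is after mid
    return bad


# Stand-in predicate for illustration only: it pretends the breakage
# landed on a made-up date. A real one would build and boot-test a kernel.
def demo_is_bad(day):
    return day >= date(2014, 4, 20)


first_bad = bisect_dates(date(2014, 4, 1), date(2014, 5, 10), demo_is_bad)
print("first bad checkout date:", first_bad)
```

Each round halves the date range, so a range of N days needs roughly log2(N) kernel builds and test boots. The result only narrows things down to one day's worth of commits, which would still have to be inspected by hand.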

On Sun, May 11, 2014 at 11:04 PM, Chavdar Ivanov <[email protected]> wrote:
> I have a similar problem - I've discussed it earlier.
>
> C/P from my earlier mail to this list:
> ...
> root file system type: ffs
> fatal protection fault in supervisor mode
> trap type 4 code 0 rip ffffffff80511a2a cs 8 rflags 10246 cr2
> ffff800036c88000 ilevel 0 rsp fffffe8006d08a20
> curlwp 0xfffffe8006d4da00 pid 1.1 lowest kstack 0xfffffe8006d052c0
> kernel: protection fault trap, code=0
> Stopped in pid 1.1 (init) at netbsd:check_exec+0x319: call *8(%rax)
>
> db{1}> bt
> check_exec() at netbsd:check_exec+0x319
> execve_loadvm() at netbsd:execve_loadvm+0x1ca
> execve1() at netbsd:execve1+0x28
> start_init() at netbsd:start_init+0x26f
> db{1}>
> .....
>
> I've tried bisecting with no success. We've located the exact place of
> the crash (Masao pointed it out to me, I sprinkled printfs in the code).
>
> Still no idea what the problem is.
>
>
> On 11 May 2014 14:15, Paul Goyette <[email protected]> wrote:
>> BTW, I just checked a 6.99.41 GENERIC kernel, and it fails in the same
>> manner. So this is not likely a result of my customized kernel config.
>>
>> I'm suspecting that this is due to booting from an auto-configured raid
>> mirror.
>>
>>
>>
>> On Sat, 10 May 2014, Paul Goyette wrote:
>>
>>> On Sat, 10 May 2014, Jonathan A. Kollasch wrote:
>>>
>>>> On Sat, May 10, 2014 at 04:17:10PM -0700, Paul Goyette wrote:
>>>>>
>>>>> 15:43:14). With no other changes than the updated kernel (and
>>>>> modules), it crashes with
>>>>>
>>>>> kernel: pagefault trap, code=0
>>>>> uvm_fault(0xfffffe813aec5e60, 0x0, 2) -> e
>>>>> fatal page fault in supervisor mode
>>>>> trap type 6 code 2 rip ffffffff80230fe2 cs 8 rflags 10246 cr2 0
>>>>> ilevel 8 rsp fffffe813aebb048
>>>>> curlwp 0xfffffe813aec8880 pid 1.1 lowest kstack fffffe8a3aebc2c0
>>>>>
>>>>> The above messages repeat several times, until reaching the bottom
>>>>> of the screen, with a db-more prompt.
>>>>>
>>>>> Funny thing is, the rip reported seems to be in the middle of
>>>>> setting the keyboard!
>>>>>
>>>>
>>>> Is this a DEBUG and/or DIAGNOSTIC kernel?
>
> The funny thing is, with DEBUG and DIAGNOSTIC it works fine. And yes,
> I also have autoconfigured RAID1 root - and another RAID5 array in
> this system.
>
>>>
>>> No. Neither option is included in the kernel config file.
>>>
>>>>> WARNING: double match for boot device (wd0, wd1)
>>>>> raid0: RAID Level 1
>>>>> raid0: Components: /dev/wd0e /dev/wd1e
>>>>> raid0: Total sectors 488395008 (238474 MB)
>>>>> raid1: RAID Level 1
>>>>> raid1: Components: /dev/wd2e /dev/wd2e
>>>>> raid1: Total sectors 976770944 (476938 MB)
>>>>> boot device: raid0
>>>>> root on raid0e dumps on raid0b
>>>>> warning: no /dev/console
>>>>> exec /sbin/init: error 2
>>>>
>>>>
>>>> raid0e looks weird.
>>>
>>>
>>> There is no raid0e.
>>>
>>> wd0e and wd1e are partitioned as follows:
>>>
>>> screamer:netbsd-local {129} disklabel wd0
>>> # /dev/rwd0d:
>>> type: ESDI
>>> disk: ST3250318AS
>>> <snip>
>>> 5 partitions:
>>> #        size    offset     fstype [fsize bsize cpg/sgs]
>>>  c: 488395120      2048     unused      0     0        # (Cyl.      2*- 484520)
>>>  d: 488397168         0     unused      0     0        # (Cyl.      0 - 484520)
>>>  e: 488395120      2048       RAID                     # (Cyl.      2*- 484520)
>>>
>>>
>>> And raid0 is partitioned as
>>>
>>> # /dev/rraid0d:
>>> type: RAID
>>> disk: raid
>>> <snip>
>>> 6 partitions:
>>> #        size    offset     fstype [fsize bsize cpg/sgs]
>>>  a:  41943040         0     4.2BSD   2048 16384     0  # (Cyl.      0 -  40959)
>>>  b:  62914560  41943040       swap                     # (Cyl.  40960 - 102399)
>>>  c: 488395008         0     unused      0     0        # (Cyl.      0 - 476948*)
>>>  d: 488395008         0     unused      0     0        # (Cyl.      0 - 476948*)
>>>  e: 125829120 104857600     4.2BSD   2048 16384     0  # (Cyl. 102400 - 225279)
>>>  f: 257708288 230686720     4.2BSD   2048 16384     0  # (Cyl. 225280 - 476948*)
>>>
>>>
>>> And raid0 config looks like this:
>>>
>>> # raid0.conf RAID-1 configuration
>>> #
>>> # This array should be made bootable!
>>>
>>> # Describe the array
>>> START array
>>>
>>> #numrow numcol numspare
>>> 1 2 0
>>>
>>> # Identify physical disks
>>> START disks
>>> /dev/wd0e
>>> /dev/wd1e
>>>
>>> # Layout is simple - 64 sectors per stripe
>>> START layout
>>>
>>> #Sect/StripeUnit StripeUnit/ParityUnit StripeUnit/ReconUnit RaidLevel
>>> 128 1 1 1
>>>
>>> # No spares
>>> #START spare
>>>
>>> # Command queueing
>>> START queue
>>> fifo 100
>>>
>>>
>>> -------------------------------------------------------------------------
>>> | Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:       |
>>> | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com    |
>>> | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
>>> | Kernel Developer |                          | pgoyette at netbsd.org  |
>>> -------------------------------------------------------------------------
>>>
>>
>> -------------------------------------------------------------------------
>> | Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:       |
>> | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com    |
>> | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
>> | Kernel Developer |                          | pgoyette at netbsd.org  |
>> -------------------------------------------------------------------------
>
>
> Chavdar Ivanov
>
>
> --
> ----
