head -r308247 (12-CURRENT) PowerMac G5 powerpc64: ddb's dump gets "panic: vm_fault: fault on nofault entry, addr: c000000000022000"

2016-11-16 Thread Mark Millard
[A version of the below has been submitted to bugzilla as 214598. The specific 
powerpc64 context is a PowerMac11,3 PowerMac G5 "Quad Core" --actually 2 
sockets with each having 2 cores.]

The failure was: "panic: vm_fault: fault on nofault entry, addr: 
c0022000"

Details. . .

I manually entered ddb and gave it the dump command. 2 of 3 chunks worked. This 
is new since I last tried (long ago): it used to complain that the DMA 
request size was too large. (I've not tried in a long time, under some 10. 
.) So this got a lot farther than I've ever seen before for this type of 
context.

Transcribed from a camera picture of the screen. . . (Typos?)
(Also: I will drop off various leading zeros and some formatting will be 
different.)

KDB: enter: manual escape to debugger
[ thread pid 12 tid 10018 ]
Stopped at   .kdb_enter+0x70:   ori r0, r0, 0x0
db> dump
Dumping 9 MB (3 chunks)
  chunk 0: 10MB (2510 pages) ... ok
  chunk 1: 1MB (24 pages) ... ok
  chunk 2: 1MB (2 pages)panic: vm_fault: fault on nofault entry, addr: 
c0022000
cpuid = 1
KDB: stack backtrace:
0xf93cd0: at .kdb_backtrace+0x6c
0xf93df0: at .vpanic+0x178
0xf93eb0: at .panic+0x34
0xf93f30: at .vm_fault_hold+0x1ac
0xf94120: at .vm_fault+0x98
0xf941c0: at .trap_pfault+0xe8
0xf94240: at .trap+0x1e44
0xf944b0: at .powerpc_interrupt+0x1e0
0xf94550: kernel DSI read trap 0 0xc0022ff8
  by .memcpy+0x144:
  ssr1=0x90001032
  r1  =  0xf94800
  cr  =0x4424
  xer =0x2000
  ctr = 0x200
  r2  = 0x10de000
  sr  =0x4000
0xf94800: at .moea64_kextract+0x124
0xf94840: at tmpstk+0x297c
0xf948c0: at ._bus_dmamap_sync+0x124
0xf94960: at .ata_dmaload+0x1e8
0xf94a00: at .ata_begin_transaction+0x240
0xf94aa0: at .ataaction+0x4e0
0xf94b50: at .xpt_run_devq+0x934
0xf94c40: at .xpt_action_default+0x3a8
0xf94cf0: at .ata_action+0x3c8
0xf94d80: at .xpt_action+0x44
0xf94e00: at .xpt_polled_action+0x60c
0xf94ec0: at .adadump+0x33c
0xf954e0: at .dump_write+0x88
0xf954e0: at .dump_sys_cb_dumpdata+0x124
0xf955c0: at .dumpsys_foreach_chunk+0x68
0xf95660: at .dumpsys_generic+0x26c
0xf95770: at .doadump+0xbc
0xf95800: at .db_dump+0x44
0xf95880: at .db_command+0x3f4

(I omit the rest going back to "0xc000b6b4f920: at blocked_loop+0x38".)


===
Mark Millard
markmi at dsl-only.net

___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: panic: vm_fault: fault on nofault entry, addr: fffffe00873d8000

2015-12-07 Thread Fabian Keil
Konstantin Belousov  wrote:

> On Sun, Dec 06, 2015 at 06:51:36PM +0100, Fabian Keil wrote:
> > > > #16 0x80877d5a in bcopy () at 
> > > > /usr/src/sys/amd64/amd64/support.S:118
> > > > #17 0x805f64e8 in uiomove_faultflag (cp=<value optimized out>, 
> > > > n=<value optimized out>, uio=0xfe009444aae0, nofault=<value optimized out>) at /usr/src/sys/kern/subr_uio.c:208
> > > > #18 0x8046236f in msdosfs_read (ap=) at 
> > > > /usr/src/sys/fs/msdosfs/msdosfs_vnops.c:596
> > > > #19 0x808feb20 in VOP_READ_APV (vop=, 
> > > > a=) at vnode_if.c:930
> > > > #20 0x8039bf3a in mdstart_vnode (sc=0xf8004c7ce000, 
> > > > bp=0xf80028fc81f0) at vnode_if.h:384
> > > From the frame 20, do 'p *bp' in kgdb and mail the result.  Do you have
> > > any non-standard values for buffer cache knobs, esp. for MAXPHYS ?  
> > 
> > (kgdb) p *bp
> > $1 = {bio_cmd = 1 '\001', bio_flags = 16 '\020', bio_cflags = 0 '\0', 
> > bio_pflags = 0 '\0', bio_dev = 0x0, bio_disk = 0x0, bio_offset = 0, 
> > bio_bcount = 0, 
> >   bio_data = 0xfe0077d94000 , 
> > bio_ma = 0xf8000275bc00, bio_ma_offset = 960,  
> 
> bio_ma_n = 33,
> This is the issue.  The upper layer (ZFS ?) passed down the request
> which is max-sized (see bio_length == 32 pages) but not aligned.
> The physical buffer used for transient mapping cannot handle this.
> 
> bio_error = 0, bio_resid = 0, 
> >   bio_done = 0x804e51d0 , bio_driver1 = 0x0, 
> > bio_driver2 = 0x0, bio_caller1 = 0x0, bio_caller2 = 0x0, bio_queue = 
> > {tqe_next = 0x0, tqe_prev = 0xf8004c7ce018}, bio_attribute = 0x0, 
> >   bio_from = 0xf80010131d80, bio_to = 0xf800694f2a00, bio_length = 
> > 131072, bio_completed = 0, bio_children = 0, bio_inbed = 0, bio_parent = 
> > 0xf8000628bd90, bio_t0 = {sec = 33029, 
> > frac = 13163670047247984455}, bio_task = 0, bio_task_arg = 0x0, 
> > bio_classifier1 = 0x0, bio_classifier2 = 0x0, bio_pblkno = 0}
> >  
> > I don't use non-standard values for MAXPHYS or other buffer cache settings.
> >   
> 
> Try the following patch.

With this patch I got:

[400] Fatal trap 9: general protection fault while in kernel mode
[400] cpuid = 0; apic id = 00
[400] instruction pointer   = 0x20:0x8086c603
[400] stack pointer = 0x28:0xfe0094422a60
[400] frame pointer = 0x28:0xfe0094422a80
[400] code segment  = base 0x0, limit 0xf, type 0x1b
[400]   = DPL 0, pres 1, long 1, def32 0, gran 1
[400] processor eflags  = interrupt enabled, resume, IOPL = 0
[400] current process   = 34142 (md0)
[...]
(kgdb) where
#0  doadump (textdump=0) at pcpu.h:221
#1  0x80316e5b in db_dump (dummy=<value optimized out>, dummy2=false, 
dummy3=0, dummy4=0x0) at /usr/src/sys/ddb/db_command.c:533
#2  0x80316c4e in db_command (cmd_table=0x0) at 
/usr/src/sys/ddb/db_command.c:440
#3  0x803169e4 in db_command_loop () at 
/usr/src/sys/ddb/db_command.c:493
#4  0x803194eb in db_trap (type=<value optimized out>, code=0) at 
/usr/src/sys/ddb/db_main.c:251
#5  0x805e2933 in kdb_trap (type=9, code=0, tf=<value optimized out>) 
at /usr/src/sys/kern/subr_kdb.c:654
#6  0x8087d161 in trap_fatal (frame=0xfe00944229b0, eva=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:829
#7  0x8087ce3c in trap (frame=<value optimized out>) at 
/usr/src/sys/amd64/amd64/trap.c:203
#8  0x80861ae7 in calltrap () at 
/usr/src/sys/amd64/amd64/exception.S:234
#9  0x8086c603 in pmap_qenter (sva=18446741876956168192, ma=<value optimized out>, count=32) at /usr/src/sys/amd64/amd64/pmap.c:1991
#10 0x8039e673 in mdstart_vnode (sc=0xf80029ac7800, 
bp=0xf800270c15d0) at /usr/src/sys/dev/md/md.c:928
#11 0x8039c73c in md_kthread (arg=0xf80029ac7800) at 
/usr/src/sys/dev/md/md.c:1158
#12 0x8055c16c in fork_exit (callout=0x8039c510 , 
arg=0xf80029ac7800, frame=0xfe0094422c00) at 
/usr/src/sys/kern/kern_fork.c:1011
#13 0x8086201e in fork_trampoline () at 
/usr/src/sys/amd64/amd64/exception.S:609
#14 0x in ?? ()
Current language:  auto; currently minimal
(kgdb) f 9
#9  0x8086c603 in pmap_qenter (sva=18446741876956168192, ma=<value optimized out>, count=32) at /usr/src/sys/amd64/amd64/pmap.c:1991
1991 m = *ma++;
(kgdb) f 10
#10 0x8039e673 in mdstart_vnode (sc=0xf80029ac7800, 
bp=0xf800270c15d0) at /usr/src/sys/dev/md/md.c:928
928 pmap_qenter((vm_offset_t)pb->b_data,
(kgdb) l
923 unmapped_step:
924 npages = min(MAXPHYS, roundup2(len + ma_offs, 
PAGE_SIZE)) /
925 PAGE_SIZE;
926 iolen = min(npages * PAGE_SIZE - ma_offs, len);
927 KASSERT(iolen > 0, ("zero iolen"));
928 pmap_qenter((vm_offset_t)pb->b_data,
929 &bp->bio_ma[ma_offs / PAGE_SIZE], npages);
930 aiov.iov_base = (void *)((vm_offset_t)pb->b_data +
931 ma_offs % PAGE_SIZE);
932 aiov.iov_len = 
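
For the bio printed earlier in this thread (bio_length = 131072, bio_ma_offset = 960), the computation at lines 924 to 926 of the listing can be worked through by hand. The following is a minimal userland sketch of that single step only, assuming a 4096-byte page and a MAXPHYS of 131072 (32 pages); the names unmapped_first_step, struct step and the SKETCH_ constants are made up for illustration and are not part of md.c:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096	/* assumption: amd64 page size */
#define SKETCH_MAXPHYS   131072	/* assumption: default MAXPHYS, 32 pages */

/* Result of one pass: pages to map and bytes to transfer. */
struct step {
	int64_t npages;
	int64_t iolen;
};

static int64_t
min64(int64_t a, int64_t b)
{
	return (a < b ? a : b);
}

/* Round x up to a multiple of y; y must be a power of two. */
static int64_t
roundup2_64(int64_t x, int64_t y)
{
	return ((x + (y - 1)) & ~(y - 1));
}

/*
 * Hypothetical helper mirroring the two assignments shown in the
 * listing (npages, then iolen) for given len and ma_offs values.
 */
struct step
unmapped_first_step(int64_t len, int64_t ma_offs)
{
	struct step s;

	assert(len > 0 && ma_offs >= 0);
	s.npages = min64(SKETCH_MAXPHYS,
	    roundup2_64(len + ma_offs, SKETCH_PAGE_SIZE)) / SKETCH_PAGE_SIZE;
	s.iolen = min64(s.npages * SKETCH_PAGE_SIZE - ma_offs, len);
	return (s);
}
```

With len = 131072 and ma_offs = 960 this gives npages = 32 and iolen = 130112, so 960 of the 131072 bytes are not covered by this step.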

Re: panic: vm_fault: fault on nofault entry, addr: fffffe00873d8000

2015-12-06 Thread Konstantin Belousov
On Sun, Dec 06, 2015 at 11:45:32AM +0100, Fabian Keil wrote:
> I got the following panic while trying to import a ZFS pool from a
> geli-encrypted memory disk backed by a file located on a msdosfs partition:
I smiled.

> 
> (kgdb) where
> #0  doadump (textdump=0) at pcpu.h:221
> #1  0x80314c1b in db_dump (dummy=, dummy2=false, 
> dummy3=0, dummy4=0x0) at /usr/src/sys/ddb/db_command.c:533
> #2  0x80314a0e in db_command (cmd_table=0x0) at 
> /usr/src/sys/ddb/db_command.c:440
> #3  0x803147a4 in db_command_loop () at 
> /usr/src/sys/ddb/db_command.c:493
> #4  0x803172ab in db_trap (type=, code=0) at 
> /usr/src/sys/ddb/db_main.c:251
> #5  0x805dfe33 in kdb_trap (type=3, code=0, tf=) 
> at /usr/src/sys/kern/subr_kdb.c:654
> #6  0x80879bc7 in trap (frame=0xfe009444a240) at 
> /usr/src/sys/amd64/amd64/trap.c:549
> #7  0x8085eb77 in calltrap () at 
> /usr/src/sys/amd64/amd64/exception.S:234
> #8  0x805df51b in kdb_enter (why=0x8096c7fb "panic", msg=0x32 
> ) at cpufunc.h:63
> #9  0x8059bbdf in vpanic (fmt=<value optimized out>, ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:750
> #10 0x8059ba33 in panic (fmt=0x0) at 
> /usr/src/sys/kern/kern_shutdown.c:688
> #11 0x8082ffb5 in vm_fault_hold (map=, 
> vaddr=, fault_type=, 
> fault_flags=, m_hold=)
> at /usr/src/sys/vm/vm_fault.c:332
> #12 0x8082de18 in vm_fault (map=0xf8000200, vaddr=<value optimized out>, fault_type=2 '\002', fault_flags=0) at 
> /usr/src/sys/vm/vm_fault.c:277
> #13 0x8087a33a in trap_pfault (frame=0xfe009444a8e0, usermode=0) 
> at /usr/src/sys/amd64/amd64/trap.c:734
> #14 0x80879bde in trap (frame=0xfe009444a8e0) at 
> /usr/src/sys/amd64/amd64/trap.c:435
> #15 0x8085eb77 in calltrap () at 
> /usr/src/sys/amd64/amd64/exception.S:234
> #16 0x80877d5a in bcopy () at /usr/src/sys/amd64/amd64/support.S:118
> #17 0x805f64e8 in uiomove_faultflag (cp=<value optimized out>, 
> n=<value optimized out>, uio=0xfe009444aae0, nofault=<value optimized out>) at /usr/src/sys/kern/subr_uio.c:208
> #18 0x8046236f in msdosfs_read (ap=) at 
> /usr/src/sys/fs/msdosfs/msdosfs_vnops.c:596
> #19 0x808feb20 in VOP_READ_APV (vop=<value optimized out>, a=<value optimized out>) at vnode_if.c:930
> #20 0x8039bf3a in mdstart_vnode (sc=0xf8004c7ce000, 
> bp=0xf80028fc81f0) at vnode_if.h:384
From the frame 20, do 'p *bp' in kgdb and mail the result.  Do you have
any non-standard values for buffer cache knobs, esp. for MAXPHYS ?

> #21 0x8039a3cc in md_kthread (arg=0xf8004c7ce000) at 
> /usr/src/sys/dev/md/md.c:979
> #22 0x8055978c in fork_exit (callout=0x8039a1a0 , 
> arg=0xf8004c7ce000, frame=0xfe009444ac00) at 
> /usr/src/sys/kern/kern_fork.c:1011
> #23 0x8085f0ae in fork_trampoline () at 
> /usr/src/sys/amd64/amd64/exception.S:609
> #24 0x in ?? ()
> Current language:  auto; currently minimal
> 
> This is the second time I've seen this, the first time was with a kernel
> based on r290573 in November, but as I wasn't able to intentionally reproduce
> it with a more recent kernel my assumption was that the problem had already
> been fixed.
> 
> Currently my kernel is based on r291706.
> 
> Any ideas?
> 
> Fabian




Re: panic: vm_fault: fault on nofault entry, addr: fffffe00873d8000

2015-12-06 Thread Fabian Keil
Konstantin Belousov  wrote:

> On Sun, Dec 06, 2015 at 11:45:32AM +0100, Fabian Keil wrote:
> > I got the following panic while trying to import a ZFS pool from a
> > geli-encrypted memory disk backed by a file located on a msdosfs partition: 
> >  
> I smiled.

I occasionally use this somewhat non-standard setup to store redundant
backups on a media player whose bootloader may not be prepared to deal
with multiple partitions ...

> > (kgdb) where
> > #0  doadump (textdump=0) at pcpu.h:221
> > #1  0x80314c1b in db_dump (dummy=, 
> > dummy2=false, dummy3=0, dummy4=0x0) at /usr/src/sys/ddb/db_command.c:533
> > #2  0x80314a0e in db_command (cmd_table=0x0) at 
> > /usr/src/sys/ddb/db_command.c:440
> > #3  0x803147a4 in db_command_loop () at 
> > /usr/src/sys/ddb/db_command.c:493
> > #4  0x803172ab in db_trap (type=, code=0) at 
> > /usr/src/sys/ddb/db_main.c:251
> > #5  0x805dfe33 in kdb_trap (type=3, code=0, tf=<value optimized out>) at /usr/src/sys/kern/subr_kdb.c:654
> > #6  0x80879bc7 in trap (frame=0xfe009444a240) at 
> > /usr/src/sys/amd64/amd64/trap.c:549
> > #7  0x8085eb77 in calltrap () at 
> > /usr/src/sys/amd64/amd64/exception.S:234
> > #8  0x805df51b in kdb_enter (why=0x8096c7fb "panic", 
> > msg=0x32 ) at cpufunc.h:63
> > #9  0x8059bbdf in vpanic (fmt=<value optimized out>, ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:750
> > #10 0x8059ba33 in panic (fmt=0x0) at 
> > /usr/src/sys/kern/kern_shutdown.c:688
> > #11 0x8082ffb5 in vm_fault_hold (map=, 
> > vaddr=, fault_type=, 
> > fault_flags=, m_hold=)
> > at /usr/src/sys/vm/vm_fault.c:332
> > #12 0x8082de18 in vm_fault (map=0xf8000200, vaddr=<value optimized out>, fault_type=2 '\002', fault_flags=0) at 
> > /usr/src/sys/vm/vm_fault.c:277
> > #13 0x8087a33a in trap_pfault (frame=0xfe009444a8e0, 
> > usermode=0) at /usr/src/sys/amd64/amd64/trap.c:734
> > #14 0x80879bde in trap (frame=0xfe009444a8e0) at 
> > /usr/src/sys/amd64/amd64/trap.c:435
> > #15 0x8085eb77 in calltrap () at 
> > /usr/src/sys/amd64/amd64/exception.S:234
> > #16 0x80877d5a in bcopy () at /usr/src/sys/amd64/amd64/support.S:118
> > #17 0x805f64e8 in uiomove_faultflag (cp=<value optimized out>, 
> > n=<value optimized out>, uio=0xfe009444aae0, nofault=<value optimized out>) at /usr/src/sys/kern/subr_uio.c:208
> > #18 0x8046236f in msdosfs_read (ap=) at 
> > /usr/src/sys/fs/msdosfs/msdosfs_vnops.c:596
> > #19 0x808feb20 in VOP_READ_APV (vop=<value optimized out>, a=<value optimized out>) at vnode_if.c:930
> > #20 0x8039bf3a in mdstart_vnode (sc=0xf8004c7ce000, 
> > bp=0xf80028fc81f0) at vnode_if.h:384  
> From the frame 20, do 'p *bp' in kgdb and mail the result.  Do you have
> any non-standard values for buffer cache knobs, esp. for MAXPHYS ?

(kgdb) p *bp
$1 = {bio_cmd = 1 '\001', bio_flags = 16 '\020', bio_cflags = 0 '\0', 
bio_pflags = 0 '\0', bio_dev = 0x0, bio_disk = 0x0, bio_offset = 0, bio_bcount 
= 0, 
  bio_data = 0xfe0077d94000 , 
bio_ma = 0xf8000275bc00, bio_ma_offset = 960, bio_ma_n = 33, bio_error = 0, 
bio_resid = 0, 
  bio_done = 0x804e51d0 , bio_driver1 = 0x0, bio_driver2 = 
0x0, bio_caller1 = 0x0, bio_caller2 = 0x0, bio_queue = {tqe_next = 0x0, 
tqe_prev = 0xf8004c7ce018}, bio_attribute = 0x0, 
  bio_from = 0xf80010131d80, bio_to = 0xf800694f2a00, bio_length = 
131072, bio_completed = 0, bio_children = 0, bio_inbed = 0, bio_parent = 
0xf8000628bd90, bio_t0 = {sec = 33029, 
frac = 13163670047247984455}, bio_task = 0, bio_task_arg = 0x0, 
bio_classifier1 = 0x0, bio_classifier2 = 0x0, bio_pblkno = 0}
 
I don't use non-standard values for MAXPHYS or other buffer cache settings.

Fabian




panic: vm_fault: fault on nofault entry, addr: fffffe00873d8000

2015-12-06 Thread Fabian Keil
I got the following panic while trying to import a ZFS pool from a
geli-encrypted memory disk backed by a file located on a msdosfs partition:

(kgdb) where
#0  doadump (textdump=0) at pcpu.h:221
#1  0x80314c1b in db_dump (dummy=<value optimized out>, dummy2=false, 
dummy3=0, dummy4=0x0) at /usr/src/sys/ddb/db_command.c:533
#2  0x80314a0e in db_command (cmd_table=0x0) at 
/usr/src/sys/ddb/db_command.c:440
#3  0x803147a4 in db_command_loop () at 
/usr/src/sys/ddb/db_command.c:493
#4  0x803172ab in db_trap (type=<value optimized out>, code=0) at 
/usr/src/sys/ddb/db_main.c:251
#5  0x805dfe33 in kdb_trap (type=3, code=0, tf=<value optimized out>) 
at /usr/src/sys/kern/subr_kdb.c:654
#6  0x80879bc7 in trap (frame=0xfe009444a240) at 
/usr/src/sys/amd64/amd64/trap.c:549
#7  0x8085eb77 in calltrap () at 
/usr/src/sys/amd64/amd64/exception.S:234
#8  0x805df51b in kdb_enter (why=0x8096c7fb "panic", msg=0x32 
) at cpufunc.h:63
#9  0x8059bbdf in vpanic (fmt=<value optimized out>, ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:750
#10 0x8059ba33 in panic (fmt=0x0) at 
/usr/src/sys/kern/kern_shutdown.c:688
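#11 0x8082ffb5 in vm_fault_hold (map=<value optimized out>, 
vaddr=<value optimized out>, fault_type=<value optimized out>, 
fault_flags=<value optimized out>, m_hold=<value optimized out>)
at /usr/src/sys/vm/vm_fault.c:332
#12 0x8082de18 in vm_fault (map=0xf8000200, vaddr=<value optimized out>, fault_type=2 '\002', fault_flags=0) at 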
/usr/src/sys/vm/vm_fault.c:277
#13 0x8087a33a in trap_pfault (frame=0xfe009444a8e0, usermode=0) at 
/usr/src/sys/amd64/amd64/trap.c:734
#14 0x80879bde in trap (frame=0xfe009444a8e0) at 
/usr/src/sys/amd64/amd64/trap.c:435
#15 0x8085eb77 in calltrap () at 
/usr/src/sys/amd64/amd64/exception.S:234
#16 0x80877d5a in bcopy () at /usr/src/sys/amd64/amd64/support.S:118
#17 0x805f64e8 in uiomove_faultflag (cp=<value optimized out>, n=<value optimized out>, uio=0xfe009444aae0, nofault=<value optimized out>) at 
/usr/src/sys/kern/subr_uio.c:208
#18 0x8046236f in msdosfs_read (ap=<value optimized out>) at 
/usr/src/sys/fs/msdosfs/msdosfs_vnops.c:596
#19 0x808feb20 in VOP_READ_APV (vop=<value optimized out>, a=<value optimized out>) at vnode_if.c:930
#20 0x8039bf3a in mdstart_vnode (sc=0xf8004c7ce000, 
bp=0xf80028fc81f0) at vnode_if.h:384
#21 0x8039a3cc in md_kthread (arg=0xf8004c7ce000) at 
/usr/src/sys/dev/md/md.c:979
#22 0x8055978c in fork_exit (callout=0x8039a1a0 , 
arg=0xf8004c7ce000, frame=0xfe009444ac00) at 
/usr/src/sys/kern/kern_fork.c:1011
#23 0x8085f0ae in fork_trampoline () at 
/usr/src/sys/amd64/amd64/exception.S:609
#24 0x in ?? ()
Current language:  auto; currently minimal

This is the second time I've seen this, the first time was with a kernel
based on r290573 in November, but as I wasn't able to intentionally reproduce
it with a more recent kernel my assumption was that the problem had already
been fixed.

Currently my kernel is based on r291706.

Any ideas?

Fabian




Re: panic: vm_fault: fault on nofault entry, addr: fffffe00873d8000

2015-12-06 Thread Konstantin Belousov
On Sun, Dec 06, 2015 at 06:51:36PM +0100, Fabian Keil wrote:
> > > #16 0x80877d5a in bcopy () at 
> > > /usr/src/sys/amd64/amd64/support.S:118
> > > #17 0x805f64e8 in uiomove_faultflag (cp=<value optimized out>, 
> > > n=<value optimized out>, uio=0xfe009444aae0, nofault=<value optimized out>) at /usr/src/sys/kern/subr_uio.c:208
> > > #18 0x8046236f in msdosfs_read (ap=) at 
> > > /usr/src/sys/fs/msdosfs/msdosfs_vnops.c:596
> > > #19 0x808feb20 in VOP_READ_APV (vop=, 
> > > a=) at vnode_if.c:930
> > > #20 0x8039bf3a in mdstart_vnode (sc=0xf8004c7ce000, 
> > > bp=0xf80028fc81f0) at vnode_if.h:384  
> > From the frame 20, do 'p *bp' in kgdb and mail the result.  Do you have
> > any non-standard values for buffer cache knobs, esp. for MAXPHYS ?
> 
> (kgdb) p *bp
> $1 = {bio_cmd = 1 '\001', bio_flags = 16 '\020', bio_cflags = 0 '\0', 
> bio_pflags = 0 '\0', bio_dev = 0x0, bio_disk = 0x0, bio_offset = 0, 
> bio_bcount = 0, 
>   bio_data = 0xfe0077d94000 , 
> bio_ma = 0xf8000275bc00, bio_ma_offset = 960,

bio_ma_n = 33,
This is the issue.  The upper layer (ZFS ?) passed down the request
which is max-sized (see bio_length == 32 pages) but not aligned.
The physical buffer used for transient mapping cannot handle this.
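
The page count of 33 follows from the offset: a transfer that starts bio_ma_offset bytes into its first page touches ceil((offset + length) / PAGE_SIZE) pages. A minimal sketch of that arithmetic, assuming 4096-byte pages (consistent with 131072 bytes being 32 pages); the helper name pages_spanned and the SKETCH_PAGE_SIZE constant are made up for illustration and are not kernel names:

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4096-byte pages, so that 131072 bytes == 32 pages. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical helper (not a kernel API): number of pages touched by a
 * transfer of `length` bytes that begins `offset` bytes into its first
 * page.
 */
size_t
pages_spanned(size_t offset, size_t length)
{
	assert(offset < SKETCH_PAGE_SIZE);	/* offset lies within the first page */
	if (length == 0)
		return (0);
	return ((offset + length + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE);
}
```

For the bio above, pages_spanned(960, 131072) evaluates to 33, while an aligned request of the same length, pages_spanned(0, 131072), evaluates to 32.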

bio_error = 0, bio_resid = 0, 
>   bio_done = 0x804e51d0 , bio_driver1 = 0x0, bio_driver2 
> = 0x0, bio_caller1 = 0x0, bio_caller2 = 0x0, bio_queue = {tqe_next = 0x0, 
> tqe_prev = 0xf8004c7ce018}, bio_attribute = 0x0, 
>   bio_from = 0xf80010131d80, bio_to = 0xf800694f2a00, bio_length = 
> 131072, bio_completed = 0, bio_children = 0, bio_inbed = 0, bio_parent = 
> 0xf8000628bd90, bio_t0 = {sec = 33029, 
> frac = 13163670047247984455}, bio_task = 0, bio_task_arg = 0x0, 
> bio_classifier1 = 0x0, bio_classifier2 = 0x0, bio_pblkno = 0}
>  
> I don't use non-standard values for MAXPHYS or other buffer cache settings.
> 

Try the following patch.

diff --git a/sys/dev/md/md.c b/sys/dev/md/md.c
index a47066e..52142ed 100644
--- a/sys/dev/md/md.c
+++ b/sys/dev/md/md.c
@@ -836,8 +836,8 @@ mdstart_vnode(struct md_s *sc, struct bio *bp)
struct buf *pb;
bus_dma_segment_t *vlist;
struct thread *td;
-   off_t len, zerosize;
-   int ma_offs;
+   off_t iolen, len, zerosize;
+   int ma_offs, npages;
 
switch (bp->bio_cmd) {
case BIO_READ:
@@ -858,6 +858,7 @@ mdstart_vnode(struct md_s *sc, struct bio *bp)
pb = NULL;
piov = NULL;
ma_offs = bp->bio_ma_offset;
+   len = bp->bio_length;
 
/*
 * VNODE I/O
@@ -890,7 +891,6 @@ mdstart_vnode(struct md_s *sc, struct bio *bp)
auio.uio_iovcnt = howmany(bp->bio_length, zerosize);
piov = malloc(sizeof(*piov) * auio.uio_iovcnt, M_MD, M_WAITOK);
auio.uio_iov = piov;
-   len = bp->bio_length;
while (len > 0) {
piov->iov_base = __DECONST(void *, zero_region);
piov->iov_len = len;
@@ -904,7 +904,6 @@ mdstart_vnode(struct md_s *sc, struct bio *bp)
piov = malloc(sizeof(*piov) * bp->bio_ma_n, M_MD, M_WAITOK);
auio.uio_iov = piov;
vlist = (bus_dma_segment_t *)bp->bio_data;
-   len = bp->bio_length;
while (len > 0) {
piov->iov_base = (void *)(uintptr_t)(vlist->ds_addr +
ma_offs);
@@ -920,11 +919,20 @@ mdstart_vnode(struct md_s *sc, struct bio *bp)
piov = auio.uio_iov;
} else if ((bp->bio_flags & BIO_UNMAPPED) != 0) {
pb = getpbuf(&md_vnode_pbuf_freecnt);
-   pmap_qenter((vm_offset_t)pb->b_data, bp->bio_ma, bp->bio_ma_n);
-   aiov.iov_base = (void *)((vm_offset_t)pb->b_data + ma_offs);
-   aiov.iov_len = bp->bio_length;
+   bp->bio_resid = len;
+unmapped_step:
+   npages = min(MAXPHYS, roundup2(len + ma_offs, PAGE_SIZE)) /
+   PAGE_SIZE;
+   iolen = min(npages * PAGE_SIZE - ma_offs, len);
+   KASSERT(iolen > 0, ("zero iolen"));
+   pmap_qenter((vm_offset_t)pb->b_data,
+   &bp->bio_ma[ma_offs / PAGE_SIZE], npages);
+   aiov.iov_base = (void *)((vm_offset_t)pb->b_data +
+   ma_offs % PAGE_SIZE);
+   aiov.iov_len = iolen;
auio.uio_iov = &aiov;
auio.uio_iovcnt = 1;
+   auio.uio_resid = aiov.iov_len;
} else {
aiov.iov_base = bp->bio_data;
aiov.iov_len = bp->bio_length;
@@ -948,15 +956,21 @@ mdstart_vnode(struct md_s *sc, struct bio *bp)
vn_finished_write(mp);
}
 
-   if (pb) {
-   pmap_qremove((vm_offset_t)pb->b_data, bp->bio_ma_n);
+   if (pb != NULL) {
+   pmap_qremove((vm_offset_t)pb->b_data, npages);
+   if (error == 0) {
+   len -= iolen;
+   

Re: panic: vm_fault: fault on nofault entry

2014-03-20 Thread Sean Bruno
On Mon, 2014-03-10 at 15:10 -0400, Glen Barber wrote:
> On Mon, Mar 10, 2014 at 09:01:12PM +0200, Konstantin Belousov wrote:
> > On Mon, Mar 10, 2014 at 02:05:08PM -0400, Glen Barber wrote:
> > > Unread portion of the kernel message buffer:
> > > Sleeping thread (tid 100702, pid 24712) owns a non-sleepable lock
> > 
> > Would be nice to see the full message before and panic from the console.
> 
> I will include it in the future.
> 
> > From what I see, this is a lock leak, I forgot to unlock the map.
> > It is nice that it is so simple to reproduce the issue in your setup.
> > 
> > Try this update.
> > 
> 
> I will have the machine updated with this patch in the next few minutes.
> 
> Thank you.
> 
> Glen
> 


All 4 machines have been patched and have been grinding away for several
days now.  I'd say this is a good test and we should commit this.

$ for i in 1 2 3 4; do ssh redbuild0${i} uptime; done
 5:47PM  up 1 day, 23:22, 1 user, load averages: 1.36, 1.10, 0.57
 5:47PM  up 1 day, 23:23, 1 user, load averages: 4.33, 3.87, 2.08
 5:47PM  up 3 days, 22:45, 1 user, load averages: 16.87, 12.47, 10.11
 5:47PM  up 9 days, 20:10, 1 user, load averages: 11.58, 12.34, 10.93


sean




Re: panic: vm_fault: fault on nofault entry

2014-03-10 Thread Konstantin Belousov
On Sun, Mar 09, 2014 at 02:16:57PM -0400, Glen Barber wrote:
> panic: vm_fault: fault on nofault entry, addr: fe03becbc000

I see, this panic is for access to the kernel map, not for the direct map.
I think that this is a race with another CPU unmapping some page in the
kernel map, which cannot be solved by access checks.

Please try the following.  I booted with the patch and checked that
kgdb /boot/kernel/kernel /dev/mem works, but did not try to reproduce
the issue.

diff --git a/sys/amd64/amd64/mem.c b/sys/amd64/amd64/mem.c
index a21..5a4d8a9 100644
--- a/sys/amd64/amd64/mem.c
+++ b/sys/amd64/amd64/mem.c
@@ -76,14 +76,16 @@ MALLOC_DEFINE(M_MEMDESC, "memdesc", "memory range descriptors");
 int
 memrw(struct cdev *dev, struct uio *uio, int flags)
 {
-   int o;
-   u_long c = 0, v;
struct iovec *iov;
-   int error = 0;
+   u_long c, v;
+   int error, o, sflags;
vm_offset_t addr, eaddr;
 
GIANT_REQUIRED;
 
+   error = 0;
+   c = 0;
+   sflags = curthread_pflags_set(TDP_DEVMEMIO);
while (uio->uio_resid > 0 && error == 0) {
iov = uio->uio_iov;
if (iov->iov_len == 0) {
@@ -98,7 +100,15 @@ memrw(struct cdev *dev, struct uio *uio, int flags)
 kmemphys:
o = v & PAGE_MASK;
c = min(uio->uio_resid, (u_int)(PAGE_SIZE - o));
-   error = uiomove((void *)PHYS_TO_DMAP(v), (int)c, uio);
+   v = PHYS_TO_DMAP(v);
+   if (v < DMAP_MIN_ADDRESS ||
+   (v > DMAP_MIN_ADDRESS + dmaplimit &&
+   v <= DMAP_MAX_ADDRESS) ||
+   pmap_kextract(v) == 0) {
+   error = EFAULT;
+   goto ret;
+   }
+   error = uiomove((void *)v, (int)c, uio);
continue;
}
else if (dev2unit(dev) == CDEV_MINOR_KMEM) {
@@ -119,22 +129,30 @@ kmemphys:
addr = trunc_page(v);
eaddr = round_page(v + c);
 
-   if (addr < VM_MIN_KERNEL_ADDRESS)
-   return (EFAULT);
-   for (; addr < eaddr; addr += PAGE_SIZE) 
-   if (pmap_extract(kernel_pmap, addr) == 0)
-   return (EFAULT);
-
+   if (addr < VM_MIN_KERNEL_ADDRESS) {
+   error = EFAULT;
+   goto ret;
+   }
+   for (; addr < eaddr; addr += PAGE_SIZE) {
+   if (pmap_extract(kernel_pmap, addr) == 0) {
+   error = EFAULT;
+   goto ret;
+   }
+   }
if (!kernacc((caddr_t)(long)v, c,
uio->uio_rw == UIO_READ ? 
-   VM_PROT_READ : VM_PROT_WRITE))
-   return (EFAULT);
+   VM_PROT_READ : VM_PROT_WRITE)) {
+   error = EFAULT;
+   goto ret;
+   }
 
error = uiomove((caddr_t)(long)v, (int)c, uio);
continue;
}
/* else panic! */
}
+ret:
+   curthread_pflags_restore(sflags);
return (error);
 }
 
diff --git a/sys/amd64/amd64/trap.c b/sys/amd64/amd64/trap.c
index f7d0afd..b1cbdbc 100644
--- a/sys/amd64/amd64/trap.c
+++ b/sys/amd64/amd64/trap.c
@@ -787,6 +787,12 @@ nogo:
frame->tf_rip = (long)curpcb->pcb_onfault;
return (0);
}
+   if ((td->td_pflags & TDP_DEVMEMIO) != 0) {
+   KASSERT(curpcb->pcb_onfault != NULL,
+   ("/dev/mem without pcb_onfault"));
+   frame->tf_rip = (long)curpcb->pcb_onfault;
+   return (0);
+   }
trap_fatal(frame, eva);
return (-1);
}
diff --git a/sys/sys/proc.h b/sys/sys/proc.h
index fce1f8a..e7cd022 100644
--- a/sys/sys/proc.h
+++ b/sys/sys/proc.h
@@ -424,6 +424,7 @@ do { \
 #define TDP_RESETSPUR   0x0400 /* Reset spurious page fault history. */
 #define TDP_NERRNO  0x0800 /* Last errno is already in td_errno */
 #define TDP_UIOHELD 0x1000 /* Current uio has pages held in td_ma */
+#define TDP_DEVMEMIO    0x2000 /* Accessing memory for /dev/mem */
 
 /*
  * Reasons that the current thread can not be run yet.
diff --git a/sys/vm/vm_fault.c b/sys/vm/vm_fault.c
index 4a6495f..023860c 100644
--- a/sys/vm/vm_fault.c
+++ b/sys/vm/vm_fault.c
@@ -269,6

Re: panic: vm_fault: fault on nofault entry

2014-03-10 Thread Glen Barber
On Mon, Mar 10, 2014 at 05:46:06PM +0200, Konstantin Belousov wrote:
> On Sun, Mar 09, 2014 at 02:16:57PM -0400, Glen Barber wrote:
> > panic: vm_fault: fault on nofault entry, addr: fe03becbc000
> 
> I see, this panic is for access to the kernel map, not for the direct map.
> I think that this is a race with another CPU unmapping some page in the
> kernel map, which cannot be solved by access checks.
> 
> Please try the following.  I booted with the patch and checked that
> kgdb /boot/kernel/kernel /dev/mem works, but did not try to reproduce
> the issue.
> 

Thank you for looking into this.  I will report back.

Glen





Re: panic: vm_fault: fault on nofault entry

2014-03-10 Thread Glen Barber
On Mon, Mar 10, 2014 at 11:51:15AM -0400, Glen Barber wrote:
> On Mon, Mar 10, 2014 at 05:46:06PM +0200, Konstantin Belousov wrote:
> > On Sun, Mar 09, 2014 at 02:16:57PM -0400, Glen Barber wrote:
> > > panic: vm_fault: fault on nofault entry, addr: fe03becbc000
> > 
> > I see, this panic is for access to the kernel map, not for the direct map.
> > I think that this is a race with another CPU unmapping some page in the
> > kernel map, which cannot be solved by access checks.
> > 
> > Please try the following.  I booted with the patch and checked that
> > kgdb /boot/kernel/kernel /dev/mem works, but did not try to reproduce
> > the issue.
> > 
> 
> Thank you for looking into this.  I will report back.
> 

The machine this was tested on panicked again, but a bit differently.

This is the kgdb session from this vmcore:


Script started on Mon Mar 10 17:58:33 2014
command: /bin/sh
# kgdb ./kernel.debug /var/crash/vmcore.last
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
Sleeping thread (tid 100702, pid 24712) owns a non-sleepable lock
KDB: stack backtrace of thread 100702:
sched_switch() at sched_switch+0x29e/frame 0xfe18390b8820
mi_switch() at mi_switch+0xe1/frame 0xfe18390b8860
sleepq_catch_signals() at sleepq_catch_signals+0xab/frame 0xfe18390b88e0
sleepq_wait_sig() at sleepq_wait_sig+0xf/frame 0xfe18390b8910
_sleep() at _sleep+0x2a3/frame 0xfe18390b8990
pipe_read() at pipe_read+0x34a/frame 0xfe18390b89f0
dofileread() at dofileread+0x95/frame 0xfe18390b8a40
kern_readv() at kern_readv+0x68/frame 0xfe18390b8a90
sys_read() at sys_read+0x63/frame 0xfe18390b8ae0
amd64_syscall() at amd64_syscall+0x3fb/frame 0xfe18390b8bf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe18390b8bf0
--- syscall (3, FreeBSD ELF64, sys_read), rip = 0x800b8443a, rsp = 
0x7fffac88, rbp = 0x7fffb500 ---
panic: sleeping thread
cpuid = 19
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe18392db010
kdb_backtrace() at kdb_backtrace+0x39/frame 0xfe18392db0c0
panic() at panic+0x155/frame 0xfe18392db140
propagate_priority() at propagate_priority+0x259/frame 0xfe18392db170
turnstile_wait() at turnstile_wait+0x3fe/frame 0xfe18392db1c0
__mtx_lock_sleep() at __mtx_lock_sleep+0x163/frame 0xfe18392db240
vm_map_lookup() at vm_map_lookup+0x38/frame 0xfe18392db2c0
vm_fault_hold() at vm_fault_hold+0xd1/frame 0xfe18392db510
vm_fault() at vm_fault+0x77/frame 0xfe18392db550
trap_pfault() at trap_pfault+0x199/frame 0xfe18392db5f0
trap() at trap+0x4a0/frame 0xfe18392db800
calltrap() at calltrap+0x8/frame 0xfe18392db800
--- trap 0xc, rip = 0x80d972cd, rsp = 0xfe18392db8c0, rbp = 
0xfe18392db920 ---
copyin() at copyin+0x3d/frame 0xfe18392db920
pipe_write() at pipe_write+0x10ea/frame 0xfe18392db9f0
dofilewrite() at dofilewrite+0x87/frame 0xfe18392dba40
kern_writev() at kern_writev+0x68/frame 0xfe18392dba90
sys_write() at sys_write+0x63/frame 0xfe18392dbae0
amd64_syscall() at amd64_syscall+0x3fb/frame 0xfe18392dbbf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe18392dbbf0
--- syscall (4, FreeBSD ELF64, sys_write), rip = 0x800b35afc, rsp = 
0x7fffd3b8, rbp = 0x41 ---
KDB: enter: panic

Reading symbols from /boot/kernel/zfs.ko.symbols...done.
Loaded symbols for /boot/kernel/zfs.ko.symbols
Reading symbols from /boot/kernel/opensolaris.ko.symbols...done.
Loaded symbols for /boot/kernel/opensolaris.ko.symbols
Reading symbols from /boot/kernel/ums.ko.symbols...done.
Loaded symbols for /boot/kernel/ums.ko.symbols
Reading symbols from /boot/kernel/tmpfs.ko.symbols...done.
Loaded symbols for /boot/kernel/tmpfs.ko.symbols
Reading symbols from /boot/kernel/nullfs.ko.symbols...done.
Loaded symbols for /boot/kernel/nullfs.ko.symbols
Reading symbols from /boot/kernel/linprocfs.ko.symbols...done.
Loaded symbols for /boot/kernel/linprocfs.ko.symbols
Reading symbols from /boot/kernel/linux.ko.symbols...done.
Loaded symbols for /boot/kernel/linux.ko.symbols
#0  doadump (textdump=-959294432) at pcpu.h:219
219 __asm("movq %%gs:%1,%0" : "=r" (td)
(kgdb) bt
#0  doadump (textdump=-959294432) at pcpu.h:219
#1  0x8034a175 in db_fncall (dummy1=<value optimized out>, 
dummy2=<value optimized out>, dummy3=<value optimized out>, dummy4=<value 
optimized out>)
at /usr/src/sys/ddb/db_command.c:578
#2  0x80349e5d in db_command (cmd_table=0x0) at 
/usr/src/sys/ddb/db_command.c:449
#3  0x80349bd4 in db_command_loop () at 
/usr/src/sys/ddb/db_command.c:502
#4  0x8034c630 in db_trap (type=<value optimized out>, code=0

Re: panic: vm_fault: fault on nofault entry

2014-03-10 Thread Konstantin Belousov
 {
\
 #define TDP_RESETSPUR   0x0400 /* Reset spurious page fault history. */
 #define TDP_NERRNO  0x0800 /* Last errno is already in td_errno */
 #define TDP_UIOHELD 0x1000 /* Current uio has pages held in td_ma */
+#define TDP_DEVMEMIO    0x2000 /* Accessing memory for /dev/mem */
 
 /*
  * Reasons that the current thread can not be run yet.
diff --git a/sys/vm/vm_fault.c b/sys/vm/vm_fault.c
index 4a6495f..ab48462 100644
--- a/sys/vm/vm_fault.c
+++ b/sys/vm/vm_fault.c
@@ -269,6 +269,10 @@ RetryFault:;
        map_generation = fs.map->timestamp;
 
        if (fs.entry->eflags & MAP_ENTRY_NOFAULT) {
+               if ((curthread->td_pflags & TDP_DEVMEMIO) != 0) {
+                       vm_map_unlock_read(fs.map);
+                       return (KERN_FAILURE);
+               }
                panic("vm_fault: fault on nofault entry, addr: %lx",
                    (u_long)vaddr);
        }
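
For readers following the logic of the vm_fault.c hunk, here is a minimal
userspace sketch of the decision it introduces.  It is not kernel code: the
sk_ names, the flag values, and the lock counter are hypothetical stand-ins
chosen only for illustration.  What it tries to show is that the new
early-return path drops the map read lock before returning, while any other
fault on a nofault entry still ends in a panic.

```c
#include <stdint.h>

/* Hypothetical stand-ins; the real kernel flag values differ. */
#define SK_MAP_ENTRY_NOFAULT 0x1u
#define SK_TDP_DEVMEMIO      0x2u

/* Possible outcomes of the nofault check in this sketch. */
enum sk_result { SK_PROCEED, SK_FAILURE, SK_PANIC };

/* Toy map that only counts outstanding read locks. */
struct sk_map {
	int read_locks;
};

/*
 * Model of the patched check.  The caller is assumed to hold one read
 * lock on the map when it gets here.
 */
static enum sk_result
sk_check_nofault(struct sk_map *map, uint32_t entry_eflags, uint32_t td_pflags)
{
	if ((entry_eflags & SK_MAP_ENTRY_NOFAULT) == 0)
		return (SK_PROCEED);	/* ordinary entry, keep handling the fault */
	if ((td_pflags & SK_TDP_DEVMEMIO) != 0) {
		map->read_locks--;	/* stands in for vm_map_unlock_read() */
		return (SK_FAILURE);	/* stands in for KERN_FAILURE */
	}
	return (SK_PANIC);		/* the real code calls panic() here */
}
```

Calling sk_check_nofault() with both flags set on a map whose read_locks is 1
should leave the count at 0, which is the property a leaked lock would violate.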




Re: panic: vm_fault: fault on nofault entry

2014-03-10 Thread Glen Barber
On Mon, Mar 10, 2014 at 09:01:12PM +0200, Konstantin Belousov wrote:
 On Mon, Mar 10, 2014 at 02:05:08PM -0400, Glen Barber wrote:
  Unread portion of the kernel message buffer:
  Sleeping thread (tid 100702, pid 24712) owns a non-sleepable lock
 
 Would be nice to see the full message before and panic from the console.

I will include it in the future.

 From what I see, this is a lock leak: I forgot to unlock the map.
 It is nice that the issue is so simple to reproduce in your setup.
 
 Try this update.
 

I will have the machine updated with this patch in the next few minutes.

Thank you.

Glen





panic: vm_fault: fault on nofault entry

2014-03-09 Thread Glen Barber
We are having regular panics on several machines in the cluster.

Below follows the script from the kgdb(1) session, hopefully providing
enough information.  This machine runs 11.0-CURRENT #2 r262892, from
2 days ago.

It uses tmpfs(5) for the port build workspace.  I have an unconfirmed
suspicion that use of sysutils/lsof is involved somehow, but cannot be
sure.  (In my experience with panics with port building, removing lsof
from the system did have an effect, but I may be going down the wrong
rabbit hole.)


Script started on Sun Mar  9 16:40:07 2014
r...@redbuild01.nyi:/usr/obj/usr/src/sys/REDBUILD # sh
# kgdb ./kernel.debug /var/crash/vmcore.1
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type show copying to see the conditions.
There is absolutely no warranty for GDB.  Type show warranty for details.
This GDB was configured as amd64-marcel-freebsd...

Unread portion of the kernel message buffer:
panic: vm_fault: fault on nofault entry, addr: fe035021a000
cpuid = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe1839a54180
kdb_backtrace() at kdb_backtrace+0x39/frame 0xfe1839a54230
panic() at panic+0x155/frame 0xfe1839a542b0
vm_fault_hold() at vm_fault_hold+0x1e7a/frame 0xfe1839a54500
vm_fault() at vm_fault+0x77/frame 0xfe1839a54540
trap_pfault() at trap_pfault+0x199/frame 0xfe1839a545e0
trap() at trap+0x4a0/frame 0xfe1839a547f0
calltrap() at calltrap+0x8/frame 0xfe1839a547f0
--- trap 0xc, rip = 0x80d97bab, rsp = 0xfe1839a548b0, rbp = 
0xfe1839a54910 ---
copyout() at copyout+0x3b/frame 0xfe1839a54910
memrw() at memrw+0x19f/frame 0xfe1839a54950
giant_read() at giant_read+0xa4/frame 0xfe1839a54990
devfs_read_f() at devfs_read_f+0xeb/frame 0xfe1839a549f0
dofileread() at dofileread+0x95/frame 0xfe1839a54a40
kern_readv() at kern_readv+0x68/frame 0xfe1839a54a90
sys_read() at sys_read+0x63/frame 0xfe1839a54ae0
amd64_syscall() at amd64_syscall+0x3fb/frame 0xfe1839a54bf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe1839a54bf0
--- syscall (3, FreeBSD ELF64, sys_read), rip = 0x800b8444a, rsp = 
0x7fffd088, rbp = 0x7fffd0d0 ---
KDB: enter: panic

Reading symbols from /boot/kernel/zfs.ko.symbols...done.
Loaded symbols for /boot/kernel/zfs.ko.symbols
Reading symbols from /boot/kernel/opensolaris.ko.symbols...done.
Loaded symbols for /boot/kernel/opensolaris.ko.symbols
Reading symbols from /boot/kernel/ums.ko.symbols...done.
Loaded symbols for /boot/kernel/ums.ko.symbols
Reading symbols from /boot/kernel/tmpfs.ko.symbols...done.
Loaded symbols for /boot/kernel/tmpfs.ko.symbols
Reading symbols from /boot/kernel/nullfs.ko.symbols...done.
Loaded symbols for /boot/kernel/nullfs.ko.symbols
Reading symbols from /boot/kernel/linprocfs.ko.symbols...done.
Loaded symbols for /boot/kernel/linprocfs.ko.symbols
Reading symbols from /boot/kernel/linux.ko.symbols...done.
Loaded symbols for /boot/kernel/linux.ko.symbols
#0  doadump (textdump=-967130448) at pcpu.h:219
219		__asm("movq %%gs:%1,%0" : "=r" (td)
(kgdb) bt
#0  doadump (textdump=-967130448) at pcpu.h:219
#1  0x8034a1a5 in db_fncall (dummy1=<value optimized out>, 
dummy2=<value optimized out>, dummy3=<value optimized out>, dummy4=<value 
optimized out>)
at /usr/src/sys/ddb/db_command.c:578
#2  0x80349e8d in db_command (cmd_table=0x0) at 
/usr/src/sys/ddb/db_command.c:449
#3  0x80349c04 in db_command_loop () at 
/usr/src/sys/ddb/db_command.c:502
#4  0x8034c660 in db_trap (type=<value optimized out>, code=0)
at /usr/src/sys/ddb/db_main.c:231
#5  0x80987ae9 in kdb_trap (type=3, code=0, tf=<value optimized out>)
at /usr/src/sys/kern/subr_kdb.c:656
#6  0x80d999b9 in trap (frame=0xfe1839a54160)
at /usr/src/sys/amd64/amd64/trap.c:571
#7  0x80d7e6e2 in calltrap () at 
/usr/src/sys/amd64/amd64/exception.S:231
#8  0x8098724e in kdb_enter (why=0x8100f4ba "panic", msg=<value 
optimized out>)
at cpufunc.h:63
#9  0x80946a75 in panic (fmt=<value optimized out>)
at /usr/src/sys/kern/kern_shutdown.c:752
#10 0x80c0a1fa in vm_fault_hold (map=<value optimized out>, 
vaddr=<value optimized out>, fault_type=<value optimized out>, 
fault_flags=<value optimized out>, m_hold=<value optimized out>)
at /usr/src/sys/vm/vm_fault.c:272
#11 0x80c08337 in vm_fault (map=0xf8000200, vaddr=<value 
optimized out>, 
fault_type=1 '\001', fault_flags=128) at /usr/src/sys/vm/vm_fault.c:217
#12 0x80d9a1a9 in trap_pfault (frame=0xfe1839a54800, usermode=0)
at /usr/src/sys/amd64/amd64/trap.c:767
#13 0x80d999d0 in trap (frame=0xfe1839a54800)
at /usr/src/sys/amd64/amd64/trap.c:455
#14 0x80d7e6e2 in calltrap () at 
/usr/src/sys/amd64/amd64

Re: panic: vm_fault: fault on nofault entry

2014-03-09 Thread Konstantin Belousov
On Sun, Mar 09, 2014 at 12:56:48PM -0400, Glen Barber wrote:
 We are having regular panics on several machines in the cluster.
 
 Below follows the script from the kgdb(1) session, hopefully providing
 enough information.  This machine runs 11.0-CURRENT #2 r262892, from
 2 days ago.
 
 It uses tmpfs(5) for the port build workspace.  I have an unconfirmed
 suspicion that use of sysutils/lsof is involved somehow, but cannot be
 sure.  (In my experience with panics with port building, removing lsof
 from the system did have an effect, but I may be going down the wrong
 rabbit hole.)
 
 
 Script started on Sun Mar  9 16:40:07 2014
 r...@redbuild01.nyi:/usr/obj/usr/src/sys/REDBUILD # sh
 # kgdb ./kernel.debug /var/crash/vmcore.1
 GNU gdb 6.1.1 [FreeBSD]
 Copyright 2004 Free Software Foundation, Inc.
 GDB is free software, covered by the GNU General Public License, and you are
 welcome to change it and/or distribute copies of it under certain conditions.
 Type show copying to see the conditions.
 There is absolutely no warranty for GDB.  Type show warranty for details.
 This GDB was configured as amd64-marcel-freebsd...
 
 Unread portion of the kernel message buffer:
 panic: vm_fault: fault on nofault entry, addr: fe035021a000
 cpuid = 1
 KDB: stack backtrace:
 db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe1839a54180
 kdb_backtrace() at kdb_backtrace+0x39/frame 0xfe1839a54230
 panic() at panic+0x155/frame 0xfe1839a542b0
 vm_fault_hold() at vm_fault_hold+0x1e7a/frame 0xfe1839a54500
 vm_fault() at vm_fault+0x77/frame 0xfe1839a54540
 trap_pfault() at trap_pfault+0x199/frame 0xfe1839a545e0
 trap() at trap+0x4a0/frame 0xfe1839a547f0
 calltrap() at calltrap+0x8/frame 0xfe1839a547f0
 --- trap 0xc, rip = 0x80d97bab, rsp = 0xfe1839a548b0, rbp = 
 0xfe1839a54910 ---
 copyout() at copyout+0x3b/frame 0xfe1839a54910
 memrw() at memrw+0x19f/frame 0xfe1839a54950
 giant_read() at giant_read+0xa4/frame 0xfe1839a54990
 devfs_read_f() at devfs_read_f+0xeb/frame 0xfe1839a549f0
 dofileread() at dofileread+0x95/frame 0xfe1839a54a40
 kern_readv() at kern_readv+0x68/frame 0xfe1839a54a90
 sys_read() at sys_read+0x63/frame 0xfe1839a54ae0
 amd64_syscall() at amd64_syscall+0x3fb/frame 0xfe1839a54bf0
 Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe1839a54bf0
 --- syscall (3, FreeBSD ELF64, sys_read), rip = 0x800b8444a, rsp = 
 0x7fffd088, rbp = 0x7fffd0d0 ---
 KDB: enter: panic

This is very similar to an issue reported some time ago.
Try this patch.  I never got any feedback on it.

diff --git a/sys/amd64/amd64/mem.c b/sys/amd64/amd64/mem.c
index a21..fd9c5df 100644
--- a/sys/amd64/amd64/mem.c
+++ b/sys/amd64/amd64/mem.c
@@ -98,7 +98,13 @@ memrw(struct cdev *dev, struct uio *uio, int flags)
 kmemphys:
                        o = v & PAGE_MASK;
                        c = min(uio->uio_resid, (u_int)(PAGE_SIZE - o));
-                       error = uiomove((void *)PHYS_TO_DMAP(v), (int)c, uio);
+                       v = PHYS_TO_DMAP(v);
+                       if (v < DMAP_MIN_ADDRESS ||
+                           (v > DMAP_MIN_ADDRESS + dmaplimit &&
+                           v <= DMAP_MAX_ADDRESS) ||
+                           pmap_kextract(v) == 0)
+                               return (EFAULT);
+                       error = uiomove((void *)v, (int)c, uio);
                        continue;
                }
                else if (dev2unit(dev) == CDEV_MINOR_KMEM) {
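
The range test in the mem.c hunk can be sketched in userspace as follows.
This is only an illustration under assumptions: the struct, its field names,
and the sk_ function are hypothetical, the comparison operators are read as
"<", ">" combined with "&&", and "<=", and the pmap_kextract(v) == 0 check is
left out because it needs the real page tables.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of a direct map window. */
struct sk_dmap {
	uint64_t min;	/* plays the role of DMAP_MIN_ADDRESS */
	uint64_t max;	/* plays the role of DMAP_MAX_ADDRESS */
	uint64_t limit;	/* plays the role of dmaplimit */
};

/*
 * Return true when v passes the two range tests from the patch:
 * it is not below the window, and it is not in the part of the
 * window that lies beyond the mapped limit.
 */
static bool
sk_dmap_range_ok(const struct sk_dmap *d, uint64_t v)
{
	if (v < d->min)
		return (false);
	if (v > d->min + d->limit && v <= d->max)
		return (false);
	return (true);
}
```

Note that an address above the window's maximum passes both range tests in
this sketch; in the patch as posted such an address would be left to the
pmap_kextract() check that is omitted here.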




Re: panic: vm_fault: fault on nofault entry

2014-03-09 Thread Glen Barber
On Sun, Mar 09, 2014 at 08:01:32PM +0200, Konstantin Belousov wrote:
 On Sun, Mar 09, 2014 at 12:56:48PM -0400, Glen Barber wrote:
  We are having regular panics on several machines in the cluster.
  
  Below follows the script from the kgdb(1) session, hopefully providing
  enough information.  This machine runs 11.0-CURRENT #2 r262892, from
  2 days ago.
  
  It uses tmpfs(5) for the port build workspace.  I have an unconfirmed
  suspicion that use of sysutils/lsof is involved somehow, but cannot be
  sure.  (In my experience with panics with port building, removing lsof
  from the system did have an effect, but I may be going down the wrong
  rabbit hole.)
  
 
 This is very similar to an issue reported some time ago.
 Try this patch.  I never got any feedback on it.
 
 diff --git a/sys/amd64/amd64/mem.c b/sys/amd64/amd64/mem.c
 index a21..fd9c5df 100644
 --- a/sys/amd64/amd64/mem.c
 +++ b/sys/amd64/amd64/mem.c
 @@ -98,7 +98,13 @@ memrw(struct cdev *dev, struct uio *uio, int flags)
  kmemphys:
   o = v & PAGE_MASK;
   c = min(uio->uio_resid, (u_int)(PAGE_SIZE - o));
 - error = uiomove((void *)PHYS_TO_DMAP(v), (int)c, uio);
 + v = PHYS_TO_DMAP(v);
 + if (v < DMAP_MIN_ADDRESS ||
 + (v > DMAP_MIN_ADDRESS + dmaplimit &&
 + v <= DMAP_MAX_ADDRESS) ||
 + pmap_kextract(v) == 0)
 + return (EFAULT);
 + error = uiomove((void *)v, (int)c, uio);
   continue;
   }
   else if (dev2unit(dev) == CDEV_MINOR_KMEM) {

There is a very similar patch on one of these machines.

  Index: sys/amd64/amd64/mem.c
  ===
  --- sys/amd64/amd64/mem.c (revision 262298)
  +++ sys/amd64/amd64/mem.c (working copy)
  @@ -98,6 +98,12 @@
   kmemphys:
    o = v & PAGE_MASK;
    c = min(uio->uio_resid, (u_int)(PAGE_SIZE - o));
  + v = PHYS_TO_DMAP(v);
  + if (v < DMAP_MIN_ADDRESS ||
  + (v > DMAP_MIN_ADDRESS + dmaplimit &&
  + v <= DMAP_MAX_ADDRESS) ||
  + pmap_kextract(v) == 0)
  + return (EFAULT);
error = uiomove((void *)PHYS_TO_DMAP(v), (int)c, uio);
continue;
}
  Index: sys/amd64/amd64/pmap.c
  ===
  --- sys/amd64/amd64/pmap.c(revision 262298)
  +++ sys/amd64/amd64/pmap.c(working copy)
  @@ -321,7 +321,7 @@
   "Number of kernel page table pages allocated on bootup");
   
   static int ndmpdp;
  -static vm_paddr_t dmaplimit;
  +vm_paddr_t dmaplimit;
   vm_offset_t kernel_vm_end = VM_MIN_KERNEL_ADDRESS;
   pt_entry_t pg_nx;
   
  Index: sys/amd64/include/pmap.h
  ===
  --- sys/amd64/include/pmap.h  (revision 262298)
  +++ sys/amd64/include/pmap.h  (working copy)
  @@ -369,6 +369,7 @@
   extern vm_paddr_t dump_avail[];
   extern vm_offset_t virtual_avail;
   extern vm_offset_t virtual_end;
  +extern vm_paddr_t dmaplimit;
   
   #define  pmap_page_get_memattr(m)        ((vm_memattr_t)(m)->md.pat_mode)
   #define  pmap_page_is_write_mapped(m)    (((m)->aflags & PGA_WRITEABLE) != 0)

The machine this change is on panicked today as well.  That machine runs
r262298M, and I have a vmcore from Feb 24 (there was not enough
available space to get a crash dump today).

The backtrace from Feb 24 follows.

Script started on Sun Mar  9 18:14:41 2014
r...@redbuild04.nyi:/usr/obj/usr/src/sys/REDBUILD # sh
# kgdb ./kernel.debug /var/crash/vmcore.3
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type show copying to see the conditions.
There is absolutely no warranty for GDB.  Type show warranty for details.
This GDB was configured as amd64-marcel-freebsd...

Unread portion of the kernel message buffer:
panic: vm_fault: fault on nofault entry, addr: fe03becbc000
cpuid = 23
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe1838ec1180
kdb_backtrace() at kdb_backtrace+0x39/frame 0xfe1838ec1230
panic() at panic+0x155/frame 0xfe1838ec12b0
vm_fault_hold() at vm_fault_hold+0x1e7a/frame 0xfe1838ec1500
vm_fault() at vm_fault+0x77/frame 0xfe1838ec1540
trap_pfault() at trap_pfault+0x199/frame 0xfe1838ec15e0
trap() at trap+0x4a0/frame 0xfe1838ec17f0
calltrap() at calltrap+0x8/frame 0xfe1838ec17f0
--- trap 0xc, rip = 0x80d971fb, rsp = 0xfe1838ec18b0, rbp = 
0xfe1838ec1910 ---
copyout() at copyout+0x3b/frame

Re: panic: vm_fault: fault on nofault entry

2014-03-09 Thread Sean Bruno
On Sun, 2014-03-09 at 14:16 -0400, Glen Barber wrote:
 On Sun, Mar 09, 2014 at 08:01:32PM +0200, Konstantin Belousov wrote:
  On Sun, Mar 09, 2014 at 12:56:48PM -0400, Glen Barber wrote:
   We are having regular panics on several machines in the cluster.
   
   Below follows the script from the kgdb(1) session, hopefully providing
   enough information.  This machine runs 11.0-CURRENT #2 r262892, from
   2 days ago.
   
   It uses tmpfs(5) for the port build workspace.  I have an unconfirmed
   suspicion that use of sysutils/lsof is involved somehow, but cannot be
   sure.  (In my experience with panics with port building, removing lsof
   from the system did have an effect, but I may be going down the wrong
   rabbit hole.)
   
  
  This is very similar to an issue reported some time ago.
  Try this patch.  I never got any feedback on it.
  
  diff --git a/sys/amd64/amd64/mem.c b/sys/amd64/amd64/mem.c
  index a21..fd9c5df 100644
  --- a/sys/amd64/amd64/mem.c
  +++ b/sys/amd64/amd64/mem.c
  @@ -98,7 +98,13 @@ memrw(struct cdev *dev, struct uio *uio, int flags)
   kmemphys:
  o = v & PAGE_MASK;
  c = min(uio->uio_resid, (u_int)(PAGE_SIZE - o));
  -   error = uiomove((void *)PHYS_TO_DMAP(v), (int)c, uio);
  +   v = PHYS_TO_DMAP(v);
  +   if (v < DMAP_MIN_ADDRESS ||
  +   (v > DMAP_MIN_ADDRESS + dmaplimit &&
  +   v <= DMAP_MAX_ADDRESS) ||
  +   pmap_kextract(v) == 0)
  +   return (EFAULT);
  +   error = uiomove((void *)v, (int)c, uio);
  continue;
  }
  else if (dev2unit(dev) == CDEV_MINOR_KMEM) {
 
 There is a very similar patch on one of these machines.
 
   Index: sys/amd64/amd64/mem.c
   ===
   --- sys/amd64/amd64/mem.c   (revision 262298)
   +++ sys/amd64/amd64/mem.c   (working copy)
   @@ -98,6 +98,12 @@
kmemphys:
   o = v & PAGE_MASK;
   c = min(uio->uio_resid, (u_int)(PAGE_SIZE - o));
   +   v = PHYS_TO_DMAP(v);
   +   if (v < DMAP_MIN_ADDRESS ||
   +   (v > DMAP_MIN_ADDRESS + dmaplimit &&
   +   v <= DMAP_MAX_ADDRESS) ||
   +   pmap_kextract(v) == 0)
   +   return (EFAULT);
   error = uiomove((void *)PHYS_TO_DMAP(v), (int)c, uio);
   continue;
   }
   Index: sys/amd64/amd64/pmap.c
   ===
   --- sys/amd64/amd64/pmap.c  (revision 262298)
   +++ sys/amd64/amd64/pmap.c  (working copy)
   @@ -321,7 +321,7 @@
    "Number of kernel page table pages allocated on bootup");

static int ndmpdp;
   -static vm_paddr_t dmaplimit;
   +vm_paddr_t dmaplimit;
vm_offset_t kernel_vm_end = VM_MIN_KERNEL_ADDRESS;
pt_entry_t pg_nx;

   Index: sys/amd64/include/pmap.h
   ===
   --- sys/amd64/include/pmap.h(revision 262298)
   +++ sys/amd64/include/pmap.h(working copy)
   @@ -369,6 +369,7 @@
extern vm_paddr_t dump_avail[];
extern vm_offset_t virtual_avail;
extern vm_offset_t virtual_end;
   +extern vm_paddr_t dmaplimit;

    #define pmap_page_get_memattr(m)        ((vm_memattr_t)(m)->md.pat_mode)
    #define pmap_page_is_write_mapped(m)    (((m)->aflags & PGA_WRITEABLE) != 0)
 
 The machine this change is on panicked today as well.  That machine runs
 r262298M, and I have a vmcore from Feb 24 (there was not enough
 available space to get a crash dump today).
 
 The backtrace from Feb 24 follows.
 
 Script started on Sun Mar  9 18:14:41 2014
 r...@redbuild04.nyi:/usr/obj/usr/src/sys/REDBUILD # sh
 # kgdb ./kernel.debug /var/crash/vmcore.3
 GNU gdb 6.1.1 [FreeBSD]
 Copyright 2004 Free Software Foundation, Inc.
 GDB is free software, covered by the GNU General Public License, and you are
 welcome to change it and/or distribute copies of it under certain conditions.
 Type show copying to see the conditions.
 There is absolutely no warranty for GDB.  Type show warranty for details.
 This GDB was configured as amd64-marcel-freebsd...
 
 Unread portion of the kernel message buffer:
 panic: vm_fault: fault on nofault entry, addr: fe03becbc000
 cpuid = 23
 KDB: stack backtrace:
 db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe1838ec1180
 kdb_backtrace() at kdb_backtrace+0x39/frame 0xfe1838ec1230
 panic() at panic+0x155/frame 0xfe1838ec12b0
 vm_fault_hold() at vm_fault_hold+0x1e7a/frame 0xfe1838ec1500
 vm_fault() at vm_fault+0x77/frame 0xfe1838ec1540
 trap_pfault() at trap_pfault+0x199/frame 0xfe1838ec15e0
 trap() at trap+0x4a0/frame 0xfe1838ec17f0
 calltrap() at calltrap+0x8/frame 0xfe1838ec17f0
 --- trap 0xc, rip

panic: vm_fault: fault on nofault entry, addr: c3dce000

2010-04-03 Thread Doug Barton
This is happening at boot time, but not every time. Yesterday's
-current, r206116. core.txt.3 is in freefall:~dougb.

#0  doadump () at pcpu.h:246
246 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0  doadump () at pcpu.h:246
#1  0xc05f754f in boot (howto=260)
at /usr/local/src/sys/kern/kern_shutdown.c:416
#2  0xc05f7832 in panic (fmt=Variable fmt is not available.
) at /usr/local/src/sys/kern/kern_shutdown.c:579
#3  0xc0810598 in vm_fault (map=0xc19b6000, vaddr=3286032384,
fault_type=Variable fault_type is not available.
)
at /usr/local/src/sys/vm/vm_fault.c:255
#4  0xc0866903 in trap_pfault (frame=0xd9bbaa38, usermode=0, eva=3286033576)
at /usr/local/src/sys/i386/i386/trap.c:842
#5  0xc08671c3 in trap (frame=0xd9bbaa38)
at /usr/local/src/sys/i386/i386/trap.c:533
#6  0xc08498cb in calltrap () at
/usr/local/src/sys/i386/i386/exception.s:165
#7  0xc0828ce2 in vm_reserv_alloc_page (object=0xc19a33b8, pindex=3073)
at /usr/local/src/sys/vm/vm_reserv.c:320
#8  0xc082220e in vm_page_alloc (object=0xc19a33b8, pindex=3073, req=64)
at /usr/local/src/sys/vm/vm_page.c:1086
#9  0xc08109e5 in vm_fault (map=0xc593ae80, vaddr=679325696,
fault_type=2 '\002', fault_flags=Variable fault_flags is not
available.
) at /usr/local/src/sys/vm/vm_fault.c:399
#10 0xc086688e in trap_pfault (frame=0xd9bbad38, usermode=1, eva=679325760)
at /usr/local/src/sys/i386/i386/trap.c:832
#11 0xc086700e in trap (frame=0xd9bbad38)
at /usr/local/src/sys/i386/i386/trap.c:401
#12 0xc08498cb in calltrap () at
/usr/local/src/sys/i386/i386/exception.s:165
#13 0x2a3d02ac in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)
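
The decimal values in this backtrace line up with the hex address in the
panic message.  A small sketch of the arithmetic, assuming 4 KiB pages on
i386 (the SK_ and sk_ names are made up for illustration): vaddr=3286032384
in frame #3 is 0xc3dce000, and eva=3286033576 in frame #4 is the same page
plus a byte offset.

```c
#include <stdint.h>

/* Assumed page size for this sketch: 4 KiB. */
#define SK_PAGE_SIZE 4096u
#define SK_PAGE_MASK (SK_PAGE_SIZE - 1u)

/* Round a virtual address down to the start of its page. */
static uint32_t
sk_trunc_page(uint32_t va)
{
	return (va & ~SK_PAGE_MASK);
}

/* Byte offset of a virtual address within its page. */
static uint32_t
sk_page_offset(uint32_t va)
{
	return (va & SK_PAGE_MASK);
}
```

With these helpers, sk_trunc_page(3286033576u) gives 0xc3dce000 and
sk_page_offset(3286033576u) gives 0x4a8, so the faulting access was 0x4a8
bytes into the page named in the panic.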


-- 

... and that's just a little bit of history repeating.
-- Propellerheads


___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: panic: vm_fault: fault on nofault entry

2003-11-23 Thread Shin-ichi Yoshimoto
Subject: Re: panic: vm_fault: fault on nofault entry,
On Wed, 19 Nov 2003 12:38:14 +0900, Jun Kuriyama wrote:
 After CVSup'ing to the latest source, the panic can be reproduced.  It happens
 during make release.  The /mnt below may indicate that this happened while
 making floppies with an mfs filesystem.

Yeah, the latest kernel also caused the same panic at make release :-(

-- 
Shin-ichi YOSHIMOTO [EMAIL PROTECTED]
http://diary.waishi.jp/~yosimoto/diary/


Re: panic: vm_fault: fault on nofault entry

2003-11-18 Thread Jun Kuriyama

After CVSup'ing to the latest source, the panic can be reproduced.  It happens
during make release.  The /mnt below may indicate that this happened while
making floppies with an mfs filesystem.


- serial console
/mnt: correcting fs_sblockloc from 8192 to 65536
panic: vm_fault: fault on nofault entry, addr: daef5000
cpuid = 0; 
Debugger(panic)
Stopped at  Debugger+0x55:  xchgl   %ebx,in_Debugger.0
db> trace
Debugger(c075e5bd,0,c0777dc0,ece63608,100) at Debugger+0x55
panic(c0777dc0,daef5000,1,ece636b8,ece636a8) at panic+0x156
vm_fault(c1031000,daef5000,1,0,c8f21500) at vm_fault+0x122e
trap_pfault(ece6379c,0,daef5000,c07617f2,daef5000) at trap_pfault+0x152
trap(ece60018,c0550010,c0810010,cacfa000,daef5000) at trap+0x313
calltrap() at calltrap+0x5
--- trap 0xc, eip = 0xc0699384, esp = 0xece637dc, ebp = 0xece637f0 ---
ffs_load_inode(d4cc6310,caaff1a4,c87cb000,150,0) at ffs_load_inode+0xa4
ffs_vget(c8f53c00,150,2,ece638e0,8180) at ffs_vget+0x3a2
ffs_valloc(cacf4410,8180,c9082780,ece638e0,ece638f8) at ffs_valloc+0x100
ufs_makeinode(8180,cacf4410,ece63bec,ece63c00,202) at ufs_makeinode+0x69
ufs_create(ece63a68,ece63b24,c05c690e,ece63a68,ece63a64) at ufs_create+0x39
ufs_vnoperate(ece63a68,ece63a64,2,c07e0940,c8f21500) at ufs_vnoperate+0x18
vn_open_cred(ece63bd8,ece63cd8,180,c9082780,4) at vn_open_cred+0x19e
vn_open(ece63bd8,ece63cd8,180,4,c07e2690) at vn_open+0x33
kern_open(c8f21500,8059040,0,202,180) at kern_open+0xce
open(c8f21500,ece63d10,c077efce,3ee,3) at open+0x30
syscall(2f,2f,2f,3,bfbfe7b0) at syscall+0x2c0
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (5, FreeBSD ELF32, open), eip = 0x280ca08f, esp = 0xbfbfe78c, ebp = 
0xbfbfe898 ---

- gdb -k
#0  doadump () at ../../../kern/kern_shutdown.c:240
240 dumping++;
(kgdb) where
#0  doadump () at ../../../kern/kern_shutdown.c:240
#1  0xc055fa7b in boot (howto=260) at ../../../kern/kern_shutdown.c:372
#2  0xc055fe7d in panic () at ../../../kern/kern_shutdown.c:550
#3  0xc046f632 in db_panic () at ../../../ddb/db_command.c:450
#4  0xc046f592 in db_command (last_cmdp=0xc07d2760, cmd_table=0x0, 
aux_cmd_tablep=0xc0784658, aux_cmd_tablep_end=0xc078465c)
at ../../../ddb/db_command.c:346
#5  0xc046f6d5 in db_command_loop () at ../../../ddb/db_command.c:472
#6  0xc04726d5 in db_trap (type=3, code=0) at ../../../ddb/db_trap.c:73
#7  0xc06f8f8c in kdb_trap (type=3, code=0, regs=0xece6357c)
at ../../../i386/i386/db_interface.c:171
#8  0xc070e678 in trap (frame=
  {tf_fs = -1065484264, tf_es = -923664368, tf_ds = 16, tf_edi = -1065910848, 
tf_esi = 1, tf_ebp = -320457272, tf_isp = -320457304, tf_ebx = 0, tf_edx = 0, tf_ecx = 
1, tf_eax = 18, tf_trapno = 3, tf_err = 0, tf_eip = -1066429803, tf_cs = 8, tf_eflags 
= 642, tf_esp = -1065892694, tf_ss = -1066015299})
at ../../../i386/i386/trap.c:580
#9  0xc06fa9d8 in calltrap () at {standard input}:94
#10 0xc055fe16 in panic (
fmt=0xc0777dc0 "vm_fault: fault on nofault entry, addr: %lx")
at ../../../kern/kern_shutdown.c:534
#11 0xc06b0fae in vm_fault (map=0xc1031000, vaddr=3673116672, 
fault_type=1 '\001', fault_flags=0) at ../../../vm/vm_fault.c:891
#12 0xc070e8c2 in trap_pfault (frame=0xece6379c, usermode=0, eva=3673116672)
at ../../../i386/i386/trap.c:723
#13 0xc070e4f3 in trap (frame=
  {tf_fs = -320471016, tf_es = -1068171248, tf_ds = -1065287664, tf_edi = 
-892362752, tf_esi = -621850624, tf_ebp = -320456720, tf_isp = -320456760, tf_ebx = 
-894439004, tf_edx = -621850624, tf_ecx = 64, tf_eax = 10, tf_trapno = 12, tf_err = 0, 
tf_eip = -1066822780, tf_cs = 8, tf_eflags = 66182, tf_esp = -892362752, tf_ss = 16}) 
at ../../../i386/i386/trap.c:420
#14 0xc06fa9d8 in calltrap () at {standard input}:94
#15 0xc069c362 in ffs_vget (mp=0xc8f53c00, ino=3402604544, flags=2, 
vpp=0xece638e0) at ../../../ufs/ffs/ffs_vfsops.c:1333
#16 0xc0681400 in ffs_valloc (pvp=0xcacf4410, mode=33152, cred=0xc9082780, 
vpp=0xece638e0) at ../../../ufs/ffs/ffs_alloc.c:861
#17 0xc06aac19 in ufs_makeinode (mode=33152, dvp=0xcacf4410, vpp=0xece63bec, 
cnp=0xece63c00) at ../../../ufs/ufs/ufs_vnops.c:2358
#18 0xc06a71b9 in ufs_create (ap=0xece63a68)
at ../../../ufs/ufs/ufs_vnops.c:199
#19 0xc06ab328 in ufs_vnoperate (ap=0x0) at ../../../ufs/ufs/ufs_vnops.c:2793
#20 0xc05c690e in vn_open_cred (ndp=0xece63bd8, flagp=0xece63cd8, cmode=384, 
cred=0xc9082780, fdidx=0) at vnode_if.h:118
#21 0xc05c6763 in vn_open (ndp=0x0, flagp=0x0, cmode=0, fdidx=0)
at ../../../kern/vfs_vnops.c:93
#22 0xc05bfc3e in kern_open (td=0xc8f21500, path=0x0, pathseg=UIO_USERSPACE, 
flags=514, mode=384) at ../../../kern/vfs_syscalls.c:963
#23 0xc05bfb60 in open (td=0x0, uap=0x0) at ../../../kern/vfs_syscalls.c:933
#24 0xc070f020 in syscall (frame=
  {tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 3, tf_esi = -1077942352, tf_ebp = 
-1077942120, tf_isp = -320455308, tf_ebx = -1077942344, tf_edx = -1, tf_ecx = 2, 
tf_eax = 5, tf_trapno = 0, tf_err = 2, tf_eip = 671916175, tf_cs = 31, tf_eflags = 
518, tf_esp

panic: vm_fault: fault on nofault entry

2003-11-17 Thread Jun Kuriyama

Hmm, I'm not sure where the actual panic is caused.  This is
yesterday's kernel (about 16 hours ago).


-
% gdb -k kernel.debug /work/crash/vmcore.5 
GNU gdb 5.2.1 (FreeBSD)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type show copying to see the conditions.
There is absolutely no warranty for GDB.  Type show warranty for details.
This GDB was configured as i386-undermydesk-freebsd...
panic: vm_fault: fault on nofault entry, addr: d84a5000
panic messages:
---
Syntax error: Unterminated quoted string
---
Reading symbols from /boot/kernel/mga.ko...done.
Loaded symbols for /boot/kernel/mga.ko
#0  doadump () at ../../../kern/kern_shutdown.c:240
240 dumping++;
(kgdb) bt
#0  doadump () at ../../../kern/kern_shutdown.c:240
#1  0xc055f60b in boot (howto=260) at ../../../kern/kern_shutdown.c:372
#2  0xc055fa0d in panic () at ../../../kern/kern_shutdown.c:550
#3  0xc046f5c2 in db_panic () at ../../../ddb/db_command.c:450
#4  0xc046f522 in db_command (last_cmdp=0xc07d2160, cmd_table=0x0, 
aux_cmd_tablep=0xc0784140, aux_cmd_tablep_end=0xc0784144)
at ../../../ddb/db_command.c:346
#5  0xc046f665 in db_command_loop () at ../../../ddb/db_command.c:472
#6  0xc0472665 in db_trap (type=3, code=0) at ../../../ddb/db_trap.c:73
#7  0xc06f8b2c in kdb_trap (type=3, code=0, regs=0xed17657c)
at ../../../i386/i386/db_interface.c:171
#8  0xc070e218 in trap (frame=
  {tf_fs = -1065484264, tf_es = -765460464, tf_ds = 16, tf_edi = -1065912277, 
tf_esi = 1, tf_ebp = -317233720, tf_isp = -317233752, tf_ebx = 0, tf_edx = 0, tf_ecx = 
1, tf_eax = 18, tf_trapno = 3, tf_err = 0, tf_eip = -1066430923, tf_cs = 8, tf_eflags 
= 642, tf_esp = -1065894006, tf_ss = -1066016728})
at ../../../i386/i386/trap.c:580
#9  0xc06fa578 in calltrap () at {standard input}:94
#10 0xc055f9a6 in panic (
fmt=0xc077782b "vm_fault: fault on nofault entry, addr: %lx")
at ../../../kern/kern_shutdown.c:534
#11 0xc06b0aee in vm_fault (map=0xc1031000, vaddr=3628748800, 
fault_type=1 '\001', fault_flags=0) at ../../../vm/vm_fault.c:891
#12 0xc070e462 in trap_pfault (frame=0xed17679c, usermode=0, eva=3628748800)
at ../../../i386/i386/trap.c:723
#13 0xc070e093 in trap (frame=
  {tf_fs = -317259752, tf_es = -1068171248, tf_ds = -1065222128, tf_edi = 
-825817600, tf_esi = -666218496, tf_ebp = -317233168, tf_isp = -317233208, tf_ebx = 
-877446136, tf_edx = -666218496, tf_ecx = 64, tf_eax = 0, tf_trapno = 12, tf_err = 0, 
tf_eip = -1066823996, tf_cs = 8, tf_eflags = 66182, tf_esp = -825817600, tf_ss = 16}) 
at ../../../i386/i386/trap.c:420
#14 0xc06fa578 in calltrap () at {standard input}:94
#15 0xc069bea2 in ffs_vget (mp=0xd121dc00, ino=3469149696, flags=2, 
vpp=0xed1768e0) at ../../../ufs/ffs/ffs_vfsops.c:1333
#16 0xc0680f40 in ffs_valloc (pvp=0xcbf07a28, mode=33152, cred=0xd1bdec00, 
vpp=0xed1768e0) at ../../../ufs/ffs/ffs_alloc.c:861
#17 0xc06aa759 in ufs_makeinode (mode=33152, dvp=0xcbf07a28, vpp=0xed176bec, 
cnp=0xed176c00) at ../../../ufs/ufs/ufs_vnops.c:2358
#18 0xc06a6cf9 in ufs_create (ap=0xed176a68)
at ../../../ufs/ufs/ufs_vnops.c:199
#19 0xc06aae68 in ufs_vnoperate (ap=0x0) at ../../../ufs/ufs/ufs_vnops.c:2793
#20 0xc05c648e in vn_open_cred (ndp=0xed176bd8, flagp=0xed176cd8, cmode=384, 
cred=0xd1bdec00, fdidx=0) at vnode_if.h:118
#21 0xc05c62e3 in vn_open (ndp=0x0, flagp=0x0, cmode=0, fdidx=0)
at ../../../kern/vfs_vnops.c:93
#22 0xc05bf7be in kern_open (td=0xd2606780, path=0x0, pathseg=UIO_USERSPACE, 
flags=514, mode=384) at ../../../kern/vfs_syscalls.c:963
#23 0xc05bf6e0 in open (td=0x0, uap=0x0) at ../../../kern/vfs_syscalls.c:933
#24 0xc070ebc0 in syscall (frame=
  {tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 3, tf_esi = -1077941360, tf_ebp = 
-1077941128, tf_isp = -317231756, tf_ebx = -1077941352, tf_edx = -1, tf_ecx = 2, 
tf_eax = 5, tf_trapno = 12, tf_err = 2, tf_eip = 671916207, tf_cs = 31, tf_eflags = 
518, tf_esp = -1077941396, tf_ss = 47})
at ../../../i386/i386/trap.c:1010
#25 0xc06fa5cd in Xint0x80_syscall () at {standard input}:136
---Can't read userspace from dump, or kernel process---

(kgdb) up 15
#15 0xc069bea2 in ffs_vget (mp=0xd121dc00, ino=3469149696, flags=2, 
vpp=0xed1768e0) at ../../../ufs/ffs/ffs_vfsops.c:1333
1333            ffs_load_inode(bp, ip, fs, ino);
(kgdb) list
1328            }
1329            if (ip->i_ump->um_fstype == UFS1)
1330                    ip->i_din1 = uma_zalloc(uma_ufs1, M_WAITOK);
1331            else
1332                    ip->i_din2 = uma_zalloc(uma_ufs2, M_WAITOK);
1333            ffs_load_inode(bp, ip, fs, ino);
1334            if (DOINGSOFTDEP(vp))
1335                    softdep_load_inodeblock(ip);
1336            else
1337                    ip->i_effnlink = ip->i_nlink;
(kgdb) down
#14 0xc06fa578 in calltrap () at {standard input}:94
94  {standard input

Re: panic: vm_fault: fault on nofault entry, addr: fffffe0007e8e000

2003-08-02 Thread Kris Kennaway
On Thu, Jul 31, 2003 at 08:04:11PM -0700, Kris Kennaway wrote:
 On Thu, Jul 31, 2003 at 01:48:42PM -0700, Kris Kennaway wrote:
 
   I upgraded the alpha package machines tonight, and one of them fell
   over shortly after taking load, with the following:
   
   panic: vm_fault: fault on nofault entry, addr: fe0007e8e000
 
  Two more panics on alpha:
  
  panic: vm_fault: fault on nofault entry, addr: fe0007fde000
 
  fatal kernel trap:
  
  trap entry = 0x2 (memory management fault)
 
 I'm also seeing a lot of random corruption going on on the alpha
 machines (these packages build successfully if I retry):

Another one:

panic: vm_fault: fault on nofault entry, addr: fe0007fbe000
Debugger() at Debugger+0x38
panic() at panic+0x168
vm_fault() at vm_fault+0x1360
trap() at trap+0x5c8
XentMM() at XentMM+0x2c
--- memory management fault (from ipl 0) ---
bcopy() at bcopy+0x14
copyout() at copyout+0x38
uiomove() at uiomove+0x19c
pipe_read() at pipe_read+0x290
dofileread() at dofileread+0x100
read() at read+0x64
syscall() at syscall+0x33c
XentSys() at XentSys+0x64
--- syscall (3, FreeBSD ELF64, read) ---
--- user mode ---

Kris




Re: panic: vm_fault: fault on nofault entry, addr: fffffe0007e8e000

2003-08-02 Thread Alan L. Cox
Kris Kennaway wrote:
>
> On Thu, Jul 31, 2003 at 08:04:11PM -0700, Kris Kennaway wrote:
> > On Thu, Jul 31, 2003 at 01:48:42PM -0700, Kris Kennaway wrote:
> >
> > > > I upgraded the alpha package machines tonight, and one of them fell
> > > > over shortly after taking load, with the following:
> > > >
> > > > panic: vm_fault: fault on nofault entry, addr: fffffe0007e8e000
> >
> > > Two more panics on alpha:
> > >
> > > panic: vm_fault: fault on nofault entry, addr: fe0007fde000
> >
> > > fatal kernel trap:
> > >
> > > trap entry = 0x2 (memory management fault)
> >
> > I'm also seeing a lot of random corruption going on on the alpha
> > machines (these packages build successfully if I retry):
>
> Another one:
>
> panic: vm_fault: fault on nofault entry, addr: fe0007fbe000
> Debugger() at Debugger+0x38
> panic() at panic+0x168
> vm_fault() at vm_fault+0x1360
> trap() at trap+0x5c8
> XentMM() at XentMM+0x2c
> --- memory management fault (from ipl 0) ---
> bcopy() at bcopy+0x14
> copyout() at copyout+0x38
> uiomove() at uiomove+0x19c
> pipe_read() at pipe_read+0x290
> dofileread() at dofileread+0x100
> read() at read+0x64
> syscall() at syscall+0x33c
> XentSys() at XentSys+0x64
> --- syscall (3, FreeBSD ELF64, read) ---
> --- user mode ---
>

Could you please do an objdump -d on sys_pipe.o (or similar) to verify
that pipe_read+0x290 is the second call to uiomove() in pipe_read()?

Alan
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to [EMAIL PROTECTED]
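
[Archive note] The check Alan asks for above amounts to turning a backtrace frame such as pipe_read+0x290 into an address that can be found in the objdump -d listing. The following is a minimal Python sketch of that arithmetic, not anything posted in the thread. The helper names (parse_frame, address_in_object) and the start address PIPE_READ_START are hypothetical, chosen only for illustration; the real start address would have to be read from the disassembly or symbol table of sys_pipe.o. A saved return address normally points at the instruction just after the call, so the call itself would be expected immediately before the computed address.

```python
import re

# Matches ddb-style frames such as "pipe_read() at pipe_read+0x290".
# A frame without a "+0x..." suffix is treated as offset zero.
FRAME_RE = re.compile(r"^\s*(\w+)\(\) at (\w+)(?:\+0x([0-9a-fA-F]+))?\s*$")


def parse_frame(line):
    """Return (symbol, offset) for a ddb backtrace line, or None if it is not a frame."""
    m = FRAME_RE.match(line)
    if m is None:
        return None
    offset = int(m.group(3), 16) if m.group(3) else 0
    return m.group(2), offset


def address_in_object(symbol_start, offset):
    """Address to look for in the disassembly: symbol start plus frame offset."""
    return symbol_start + offset


# Hypothetical start address of pipe_read inside sys_pipe.o (illustrative only,
# not taken from the thread).
PIPE_READ_START = 0x1A40

symbol, offset = parse_frame("pipe_read() at pipe_read+0x290")
target = address_in_object(PIPE_READ_START, offset)
print(f"{symbol}+{offset:#x}: look near {target:#x} in the disassembly")
```

Marker lines such as "--- user mode ---" are not frames, so parse_frame returns None for them and they can be skipped when walking a whole trace.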


Re: panic: vm_fault: fault on nofault entry, addr: fffffe0007e8e000

2003-08-02 Thread Kris Kennaway
On Sat, Aug 02, 2003 at 01:29:22AM -0500, Alan L. Cox wrote:

> Could you please do an objdump -d on sys_pipe.o (or similar) to verify
> that pipe_read+0x290 is the second call to uiomove() in pipe_read()?

Confirmed: 

/sys/kern/sys_pipe.c:572

Kris





Re: panic: vm_fault: fault on nofault entry, addr: fffffe0007e8e000

2003-08-01 Thread Andrew Gallatin

Kris Kennaway writes:
 > On Thu, Jul 31, 2003 at 01:48:59AM -0700, Kris Kennaway wrote:
 > > I upgraded the alpha package machines tonight, and one of them fell
 > > over shortly after taking load, with the following:


..

 > Two more panics on alpha:
 >
 > panic: vm_fault: fault on nofault entry, addr: fe0007fde000
 > Debugger() at Debugger+0x38
 > panic() at panic+0x168
 > vm_fault() at vm_fault+0x1360
 > trap() at trap+0x5c8
 > XentMM() at XentMM+0x2c
 > --- memory management fault (from ipl 0) ---
 > bcopy_samealign_lp() at bcopy_samealign_lp+0x8
 > copyout() at copyout+0x38
 > uiomove() at uiomove+0x19c
 > pipe_read() at pipe_read+0x290
 > dofileread() at dofileread+0x100
 > read() at read+0x64
 > syscall() at syscall+0x33c
 > XentSys() at XentSys+0x64
 > --- syscall (3, FreeBSD ELF64, read) ---
 > --- user mode ---


The crashdump might actually be useful here.  You'd have only the
trap() and vm_fault() frames, but at least you'd have information
about the state of the vm system.

I updated a UP alpha here at roughly the same time.  I was able to
do a buildworld with the new kernel from sources dated July 30, around
noon PDT.

Is this GENERIC or an otherwise SMP kernel?  Just for the heck of it,
edit GENERIC and get rid of the SMP option if it's a UP box.  That's
what I always run and I'm curious if that might be your stability problem.


Drew


Re: panic: vm_fault: fault on nofault entry, addr: fffffe0007e8e000

2003-08-01 Thread Kris Kennaway
On Fri, Aug 01, 2003 at 09:00:44AM -0400, Andrew Gallatin wrote:

> The crashdump might actually be useful here.  You'd have only the
> trap() and vm_fault() frames, but at least you'd have information
> about the state of the vm system.

Two crashdumps coming up!  I'll move them onto beast:/j/kris/crash
together with the kernel.debug.

> I updated a UP alpha here at roughly the same time.  I was able to
> do a buildworld with the new kernel from sources dated July 30, around
> noon PDT.

The problems I'm seeing are only intermittent (about a dozen spurious
port build failures over 27 hours of package building on 9 machines,
plus 3 VM panics), but they're definitely ongoing.

> Is this GENERIC or an otherwise SMP kernel?  Just for the heck of it,
> edit GENERIC and get rid of the SMP option if it's a UP box.  That's
> what I always run and I'm curious if that might be your stability problem.

It's a UP kernel.

Kris




Re: panic: vm_fault: fault on nofault entry, addr: fffffe0007e8e000

2003-08-01 Thread Andrew Gallatin

Kris Kennaway writes:
 > On Fri, Aug 01, 2003 at 09:00:44AM -0400, Andrew Gallatin wrote:
 >
 > > The crashdump might actually be useful here.  You'd have only the
 > > trap() and vm_fault() frames, but at least you'd have information
 > > about the state of the vm system.
 >
 > Two crashdumps coming up!  I'll move them onto beast:/j/kris/crash
 > together with the kernel.debug.
 >

I may have wasted your time.  The first one is unusable (lots of ddb
cruft).  Damned gdb -k.  Grrr.

I don't have read perms on vmcore.{1,2}, so I don't know if they are
helpful.   

If you're willing to get your traces via ddb's debug.trace_on_panic
and to set debug.debugger_on_panic=0, then we might get at least a
partial trace.  FWIW, I have to do this to get any sort of crashdump
at all on SMP x86.  I'm amazed you were able to call doadump from ddb.
When I try that on x86,   I just get a continuous stream of panics or
fatal traps.


Drew


Re: panic: vm_fault: fault on nofault entry, addr: fffffe0007e8e000

2003-08-01 Thread Kris Kennaway
On Fri, Aug 01, 2003 at 10:59:06AM -0400, Andrew Gallatin wrote:
>
> Kris Kennaway writes:
>  > On Fri, Aug 01, 2003 at 09:00:44AM -0400, Andrew Gallatin wrote:
>  >
>  > > The crashdump might actually be useful here.  You'd have only the
>  > > trap() and vm_fault() frames, but at least you'd have information
>  > > about the state of the vm system.
>  >
>  > Two crashdumps coming up!  I'll move them onto beast:/j/kris/crash
>  > together with the kernel.debug.
>  >
>
> I may have wasted your time.  The first one is unusable (lots of ddb
> cruft).  Damned gdb -k.  Grrr.

That one was the third panic I posted, which was the:

trap entry = 0x2 (memory management fault)

> I don't have read perms on vmcore.{1,2}, so I don't know if they are
> helpful.

Oops, sorry.

> If you're willing to get your traces via ddb's debug.trace_on_panic
> and to set debug.debugger_on_panic=0, then we might get at least a
> partial trace.  FWIW, I have to do this to get any sort of crashdump
> at all on SMP x86.  I'm amazed you were able to call doadump from ddb.
> When I try that on x86,   I just get a continuous stream of panics or
> fatal traps.

Hmm, I don't see this on the UP i386 machines.  Bento hasn't panicked
in a while (time to upgrade!), although I do seem to recall problems
the last time it did.

kris




panic: vm_fault: fault on nofault entry, addr: fffffe0007e8e000

2003-07-31 Thread Kris Kennaway
I upgraded the alpha package machines tonight, and one of them fell
over shortly after taking load, with the following:

panic: vm_fault: fault on nofault entry, addr: fffffe0007e8e000
Stack backtrace:
db_print_backtrace() at db_print_backtrace+0x18
backtrace() at backtrace+0x2c
panic() at panic+0x148
vm_fault() at vm_fault+0x1360
trap() at trap+0x5c8
XentMM() at XentMM+0x2c
--- memory management fault (from ipl 0) ---
bcopy_samealign_lp() at bcopy_samealign_lp
cpu_fork() at cpu_fork+0x12c
vm_forkproc() at vm_forkproc+0x204
fork1() at fork1+0xfec
vfork() at vfork+0x30
syscall() at syscall+0x33c
XentSys() at XentSys+0x64
--- syscall (66) ---
--- user mode ---


Kris



Re: panic: vm_fault: fault on nofault entry, addr: fffffe0007e8e000

2003-07-31 Thread Kris Kennaway
On Thu, Jul 31, 2003 at 01:48:59AM -0700, Kris Kennaway wrote:
> I upgraded the alpha package machines tonight, and one of them fell
> over shortly after taking load, with the following:
>
> panic: vm_fault: fault on nofault entry, addr: fffffe0007e8e000
> Stack backtrace:
> db_print_backtrace() at db_print_backtrace+0x18
> backtrace() at backtrace+0x2c
> panic() at panic+0x148
> vm_fault() at vm_fault+0x1360
> trap() at trap+0x5c8
> XentMM() at XentMM+0x2c
> --- memory management fault (from ipl 0) ---
> bcopy_samealign_lp() at bcopy_samealign_lp
> cpu_fork() at cpu_fork+0x12c
> vm_forkproc() at vm_forkproc+0x204
> fork1() at fork1+0xfec
> vfork() at vfork+0x30
> syscall() at syscall+0x33c
> XentSys() at XentSys+0x64
> --- syscall (66) ---
> --- user mode ---

Two more panics on alpha:

panic: vm_fault: fault on nofault entry, addr: fe0007fde000
Debugger() at Debugger+0x38
panic() at panic+0x168
vm_fault() at vm_fault+0x1360
trap() at trap+0x5c8
XentMM() at XentMM+0x2c
--- memory management fault (from ipl 0) ---
bcopy_samealign_lp() at bcopy_samealign_lp+0x8
copyout() at copyout+0x38
uiomove() at uiomove+0x19c
pipe_read() at pipe_read+0x290
dofileread() at dofileread+0x100
read() at read+0x64
syscall() at syscall+0x33c
XentSys() at XentSys+0x64
--- syscall (3, FreeBSD ELF64, read) ---
--- user mode ---
db>

fatal kernel trap:

trap entry = 0x2 (memory management fault)
faulting va= 0xfe0007f9c088
type   = access violation
cause  = load instructon
pc = 0xfc45570c
ra = 0xfc455700
sp = 0xfe0008021ce0
usp= 0x11ffccd8
curthread  = 0xfc00014e84c0
pid = 4214, comm = bzip2

Stopped at  ast+0x3ec:  ldq t0,0x88(t1) 0xfe0007f9c088
t0=0xcfa0,t1=0xfe0007f9c00







Re: panic: vm_fault: fault on nofault entry, addr: fffffe0007e8e000

2003-07-31 Thread Kris Kennaway
On Thu, Jul 31, 2003 at 01:48:42PM -0700, Kris Kennaway wrote:

> > I upgraded the alpha package machines tonight, and one of them fell
> > over shortly after taking load, with the following:
> >
> > panic: vm_fault: fault on nofault entry, addr: fffffe0007e8e000

> Two more panics on alpha:
>
> panic: vm_fault: fault on nofault entry, addr: fe0007fde000

> fatal kernel trap:
>
> trap entry = 0x2 (memory management fault)

I'm also seeing a lot of random corruption going on on the alpha
machines (these packages build successfully if I retry):


(cd .  ln -s  )
usage: ln [-fhinsv] file1 file2
   ln [-fhinsv] file ... directory
   link file1 file2
*** Error code 1



bzip2: I/O or other error, bailing out.  Possible reason follows.
bzip2: Bad address
Input file = (stdin), output file = (stdout)

cc1plus: Bad address: error writing to -
*** Error code 1


Something funny is definitely going on.  The alpha machines were not
running with witness, but I have just upgraded the i386 package
machines (which do have witness) to try and reproduce it there.

Kris

