Re: [markiyan.kush...@gmail.com: Re: 11.0-CURRENT panic (nfsd?)]

2014-01-06 Thread Alexander Motin

Thank you for the report. Bug fixed at r260367.


- Forwarded message from Markiyan Kushnir markiyan.kush...@gmail.com -

Date: Sun, 5 Jan 2014 19:47:37 +0200
Subject: Re: 11.0-CURRENT panic (nfsd?)
From: Markiyan Kushnir markiyan.kush...@gmail.com
To: Markiyan Kushnir markiyan.kush...@gmail.com, freebsd-current@freebsd.org

$ nm /boot/kernel/kernel | grep svc_run_internal
80714db0 t svc_run_internal
$ addr2line -e /boot/kernel/kernel 0x80715779
/usr/src.svnup/sys/rpc/svc.c:971

949  static void
950  svc_executereq(struct svc_req *rqstp)
951  {
952          SVCXPRT *xprt = rqstp->rq_xprt;
953          SVCPOOL *pool = xprt->xp_pool;
954          int prog_found;
955          rpcvers_t low_vers;
956          rpcvers_t high_vers;
957          struct svc_callout *s;
958
959          /* now match message with a registered service*/
960          prog_found = FALSE;
961          low_vers = (rpcvers_t) -1L;
962          high_vers = (rpcvers_t) 0L;
963          TAILQ_FOREACH(s, &pool->sp_callouts, sc_link) {
964                  if (s->sc_prog == rqstp->rq_prog) {
965                          if (s->sc_vers == rqstp->rq_vers) {
966                                  /*
967                                   * We hand ownership of r to the
968                                   * dispatch method - they must call
969                                   * svc_freereq.
970                                   */
971                                  (*s->sc_dispatch)(rqstp, xprt);
972                                  return;
973                          }  /* found correct version */
974                          prog_found = TRUE;
975                          if (s->sc_vers < low_vers)
976                                  low_vers = s->sc_vers;
977                          if (s->sc_vers > high_vers)
978                                  high_vers = s->sc_vers;
979                  }   /* found correct program */
980          }
981
982          /*
983           * if we got here, the program or version
984           * is not served ...
985           */
986          if (prog_found)
987                  svcerr_progvers(rqstp, low_vers, high_vers);
988          else
989                  svcerr_noprog(rqstp);
990
991          svc_freereq(rqstp);
992  }
993

2014/1/5 John-Mark Gurney j...@funkthat.com:

Markiyan Kushnir wrote this message on Sun, Jan 05, 2014 at 11:06 +0200:

2014/1/5 John-Mark Gurney j...@funkthat.com:

Markiyan Kushnir wrote this message on Sun, Jan 05, 2014 at 10:57 +0200:

I started to see a reliable panic on a recent CURRENT:

$ uname -a
FreeBSD mkushnir.mooo.com 11.0-CURRENT FreeBSD 11.0-CURRENT #0
r260296: Sun Jan  5 07:14:50 EET 2014
r...@vm.mkushnir.mooo.com:/usr/obj/usr/src.svnup/sys/MAREK  amd64

The panic is always triggered by the first request to the nfs service
(this machine runs a PXE server).

The core.txt is attached. Please let me know if I can help more.


Apparently the mime-type on the attachment was bad and got scrubbed...

Maybe include it inline if it isn't too long?



It's 144KB long. I will share it via Google Drive:

https://drive.google.com/file/d/0B9Q-zpUXxqCnNVhBY0M5ZzU4d1k/edit?usp=sharing


Looks like a NULL function pointer was called:
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x0
fault code              = supervisor read instruction, page not present
instruction pointer     = 0x20:0x0
stack pointer           = 0x28:0xfe00d9a2bea0
frame pointer           = 0x28:0xfe00d9a2c010
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1323 (nfsd: master)
trap number             = 12
panic: page fault

--- trap 0xc, rip = 0, rsp = 0xfe00d9a2bea0, rbp = 0xfe00d9a2c010 ---
uart_sab82532_class() at 0/frame 0xfe00d9a2c010
svc_run_internal() at svc_run_internal+0x9c9/frame 0xfe00d9a2c1b0
svc_run() at svc_run+0xed/frame 0xfe00d9a2c1f0
nfsrvd_nfsd() at nfsrvd_nfsd+0x19a/frame 0xfe00d9a2c350
nfssvc_nfsd() at nfssvc_nfsd+0x11a/frame 0xfe00d9a2c970
sys_nfssvc() at sys_nfssvc+0xd2/frame 0xfe00d9a2c9a0
amd64_syscall() at amd64_syscall+0x265/frame 0xfe00d9a2cab0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe00d9a2cab0
--- syscall (155, FreeBSD ELF64, sys_nfssvc), rip = 0x80088c13a, rsp = 
0x7fffd438, rbp = 0x7fffd6e0 ---

The uart_sab82532_class is just the closest symbol to 0, so it's in
svc_run_internal that's the problem...  Could you run:
nm /boot/kernel/kernel | grep svc_run_internal

This should return a line w/ a large hex number at the front, then run:
addr2line -e /boot/kernel/kernel $( expr 0xlargehexnumber+0x9c9)

This will give you

Re: 11.0-CURRENT panic (nfsd?)

2014-01-05 Thread Markiyan Kushnir
Please ignore the attached core.txt.1.gz and see the new
core.txt.2.gz in this attachment instead. I mixed up the files.

--
Markiyan.


2014/1/5 Markiyan Kushnir markiyan.kush...@gmail.com:
 I started to see a reliable panic on a recent CURRENT:

 $ uname -a
 FreeBSD mkushnir.mooo.com 11.0-CURRENT FreeBSD 11.0-CURRENT #0
 r260296: Sun Jan  5 07:14:50 EET 2014
 r...@vm.mkushnir.mooo.com:/usr/obj/usr/src.svnup/sys/MAREK  amd64

 The panic is always triggered by the first request to the nfs service
 (this machine runs a PXE server).

 The core.txt is attached. Please let me know if I can help more.

 --
 Markiyan.
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: 11.0-CURRENT panic (nfsd?)

2014-01-05 Thread John-Mark Gurney
Markiyan Kushnir wrote this message on Sun, Jan 05, 2014 at 10:57 +0200:
 I started to see a reliable panic on a recent CURRENT:
 
 $ uname -a
 FreeBSD mkushnir.mooo.com 11.0-CURRENT FreeBSD 11.0-CURRENT #0
 r260296: Sun Jan  5 07:14:50 EET 2014
 r...@vm.mkushnir.mooo.com:/usr/obj/usr/src.svnup/sys/MAREK  amd64
 
 The panic is always triggered by the first request to the nfs service
 (this machine runs a PXE server).
 
 The core.txt is attached. Please let me know if I can help more.

Apparently the mime-type on the attachment was bad and got scrubbed...

Maybe include it inline if it isn't too long?

-- 
  John-Mark Gurney  Voice: +1 415 225 5579

 All that I will do, has been done, All that I have, has not.


Re: 11.0-CURRENT panic (nfsd?)

2014-01-05 Thread Markiyan Kushnir
2014/1/5 John-Mark Gurney j...@funkthat.com:
 Markiyan Kushnir wrote this message on Sun, Jan 05, 2014 at 10:57 +0200:
 I started to see a reliable panic on a recent CURRENT:

 $ uname -a
 FreeBSD mkushnir.mooo.com 11.0-CURRENT FreeBSD 11.0-CURRENT #0
 r260296: Sun Jan  5 07:14:50 EET 2014
 r...@vm.mkushnir.mooo.com:/usr/obj/usr/src.svnup/sys/MAREK  amd64

 The panic is always triggered by the first request to the nfs service
 (this machine runs a PXE server).

 The core.txt is attached. Please let me know if I can help more.

 Apparently the mime-type on the attachment was bad and got scrubbed...

 Maybe include it inline if it isn't too long?


It's 144KB long. I will share it via Google Drive:

https://drive.google.com/file/d/0B9Q-zpUXxqCnNVhBY0M5ZzU4d1k/edit?usp=sharing

--
Markiyan.


 --
   John-Mark Gurney  Voice: +1 415 225 5579

  All that I will do, has been done, All that I have, has not.


Re: 11.0-CURRENT panic (nfsd?)

2014-01-05 Thread John-Mark Gurney
Markiyan Kushnir wrote this message on Sun, Jan 05, 2014 at 11:06 +0200:
 2014/1/5 John-Mark Gurney j...@funkthat.com:
  Markiyan Kushnir wrote this message on Sun, Jan 05, 2014 at 10:57 +0200:
  I started to see a reliable panic on a recent CURRENT:
 
  $ uname -a
  FreeBSD mkushnir.mooo.com 11.0-CURRENT FreeBSD 11.0-CURRENT #0
  r260296: Sun Jan  5 07:14:50 EET 2014
  r...@vm.mkushnir.mooo.com:/usr/obj/usr/src.svnup/sys/MAREK  amd64
 
  The panic is always triggered by the first request to the nfs service
  (this machine runs a PXE server).
 
  The core.txt is attached. Please let me know if I can help more.
 
  Apparently the mime-type on the attachment was bad and got scrubbed...
 
  Maybe include it inline if it isn't too long?
 
 
 It's 144KB long. I will share it via Google Drive:
 
 https://drive.google.com/file/d/0B9Q-zpUXxqCnNVhBY0M5ZzU4d1k/edit?usp=sharing

Looks like a NULL function pointer was called:
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x0
fault code              = supervisor read instruction, page not present
instruction pointer     = 0x20:0x0
stack pointer           = 0x28:0xfe00d9a2bea0
frame pointer           = 0x28:0xfe00d9a2c010
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1323 (nfsd: master)
trap number             = 12
panic: page fault

--- trap 0xc, rip = 0, rsp = 0xfe00d9a2bea0, rbp = 0xfe00d9a2c010 ---
uart_sab82532_class() at 0/frame 0xfe00d9a2c010
svc_run_internal() at svc_run_internal+0x9c9/frame 0xfe00d9a2c1b0
svc_run() at svc_run+0xed/frame 0xfe00d9a2c1f0
nfsrvd_nfsd() at nfsrvd_nfsd+0x19a/frame 0xfe00d9a2c350
nfssvc_nfsd() at nfssvc_nfsd+0x11a/frame 0xfe00d9a2c970
sys_nfssvc() at sys_nfssvc+0xd2/frame 0xfe00d9a2c9a0
amd64_syscall() at amd64_syscall+0x265/frame 0xfe00d9a2cab0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe00d9a2cab0
--- syscall (155, FreeBSD ELF64, sys_nfssvc), rip = 0x80088c13a, rsp = 
0x7fffd438, rbp = 0x7fffd6e0 ---

uart_sab82532_class is just the closest symbol to address 0, so the
problem is actually in svc_run_internal...  Could you run:
nm /boot/kernel/kernel | grep svc_run_internal

This should return a line w/ a large hex number at the front, then run:
addr2line -e /boot/kernel/kernel $( expr 0xlargehexnumber+0x9c9)

This will give you a file name and line number, and can you copy/paste
the lines around and including that line number?  This will help make
sure we get the correct code...

Thanks.

-- 
  John-Mark Gurney  Voice: +1 415 225 5579

 All that I will do, has been done, All that I have, has not.


Re: 11.0-CURRENT panic (nfsd?)

2014-01-05 Thread Markiyan Kushnir
$ nm /boot/kernel/kernel | grep svc_run_internal
80714db0 t svc_run_internal
$ addr2line -e /boot/kernel/kernel 0x80715779
/usr/src.svnup/sys/rpc/svc.c:971

   949  static void
   950  svc_executereq(struct svc_req *rqstp)
   951  {
   952          SVCXPRT *xprt = rqstp->rq_xprt;
   953          SVCPOOL *pool = xprt->xp_pool;
   954          int prog_found;
   955          rpcvers_t low_vers;
   956          rpcvers_t high_vers;
   957          struct svc_callout *s;
   958
   959          /* now match message with a registered service*/
   960          prog_found = FALSE;
   961          low_vers = (rpcvers_t) -1L;
   962          high_vers = (rpcvers_t) 0L;
   963          TAILQ_FOREACH(s, &pool->sp_callouts, sc_link) {
   964                  if (s->sc_prog == rqstp->rq_prog) {
   965                          if (s->sc_vers == rqstp->rq_vers) {
   966                                  /*
   967                                   * We hand ownership of r to the
   968                                   * dispatch method - they must call
   969                                   * svc_freereq.
   970                                   */
   971                                  (*s->sc_dispatch)(rqstp, xprt);
   972                                  return;
   973                          }  /* found correct version */
   974                          prog_found = TRUE;
   975                          if (s->sc_vers < low_vers)
   976                                  low_vers = s->sc_vers;
   977                          if (s->sc_vers > high_vers)
   978                                  high_vers = s->sc_vers;
   979                  }   /* found correct program */
   980          }
   981
   982          /*
   983           * if we got here, the program or version
   984           * is not served ...
   985           */
   986          if (prog_found)
   987                  svcerr_progvers(rqstp, low_vers, high_vers);
   988          else
   989                  svcerr_noprog(rqstp);
   990
   991          svc_freereq(rqstp);
   992  }
   993

2014/1/5 John-Mark Gurney j...@funkthat.com:
 Markiyan Kushnir wrote this message on Sun, Jan 05, 2014 at 11:06 +0200:
 2014/1/5 John-Mark Gurney j...@funkthat.com:
  Markiyan Kushnir wrote this message on Sun, Jan 05, 2014 at 10:57 +0200:
  I started to see a reliable panic on a recent CURRENT:
 
  $ uname -a
  FreeBSD mkushnir.mooo.com 11.0-CURRENT FreeBSD 11.0-CURRENT #0
  r260296: Sun Jan  5 07:14:50 EET 2014
  r...@vm.mkushnir.mooo.com:/usr/obj/usr/src.svnup/sys/MAREK  amd64
 
  The panic is always triggered by the first request to the nfs service
  (this machine runs a PXE server).
 
  The core.txt is attached. Please let me know if I can help more.
 
  Apparently the mime-type on the attachment was bad and got scrubbed...
 
  Maybe include it inline if it isn't too long?
 

 It's 144KB long. I will share it via Google Drive:

 https://drive.google.com/file/d/0B9Q-zpUXxqCnNVhBY0M5ZzU4d1k/edit?usp=sharing

 Looks like a NULL function pointer was called:
 Fatal trap 12: page fault while in kernel mode
 cpuid = 0; apic id = 00
 fault virtual address   = 0x0
 fault code              = supervisor read instruction, page not present
 instruction pointer     = 0x20:0x0
 stack pointer           = 0x28:0xfe00d9a2bea0
 frame pointer           = 0x28:0xfe00d9a2c010
 code segment            = base 0x0, limit 0xf, type 0x1b
                         = DPL 0, pres 1, long 1, def32 0, gran 1
 processor eflags        = interrupt enabled, resume, IOPL = 0
 current process         = 1323 (nfsd: master)
 trap number             = 12
 panic: page fault

 --- trap 0xc, rip = 0, rsp = 0xfe00d9a2bea0, rbp = 0xfe00d9a2c010 ---
 uart_sab82532_class() at 0/frame 0xfe00d9a2c010
 svc_run_internal() at svc_run_internal+0x9c9/frame 0xfe00d9a2c1b0
 svc_run() at svc_run+0xed/frame 0xfe00d9a2c1f0
 nfsrvd_nfsd() at nfsrvd_nfsd+0x19a/frame 0xfe00d9a2c350
 nfssvc_nfsd() at nfssvc_nfsd+0x11a/frame 0xfe00d9a2c970
 sys_nfssvc() at sys_nfssvc+0xd2/frame 0xfe00d9a2c9a0
 amd64_syscall() at amd64_syscall+0x265/frame 0xfe00d9a2cab0
 Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe00d9a2cab0
 --- syscall (155, FreeBSD ELF64, sys_nfssvc), rip = 0x80088c13a, rsp = 
 0x7fffd438, rbp = 0x7fffd6e0 ---

 The uart_sab82532_class is just the closest symbol to 0, so it's in
 svc_run_internal that's the problem...  Could you run:
 nm /boot/kernel/kernel | grep svc_run_internal

 This should return a line w/ a large hex number at the front, then run:
 addr2line -e /boot/kernel/kernel $( expr 0xlargehexnumber+0x9c9)

 This will give you a file name and line number, and can you copy/paste
 the lines around and including that line number?  This will help make
 sure we get the correct code...

 Thanks.

 --
   John-Mark Gurney  Voice: +1 415 225 5579

  All that I will do, has been done, All that I have, has not.