Re: kernel page fault with nfs
Tobias C. Berner wrote:
> Hi Marcelo
>
> The following is the current fstab line, which seems to run smoothly:
>
> odo.firefly:/storage/multimedia /multimedia nfs readahead=4,soft,intr,rw,tcp,wsize=32768,rsize=32768,late 0 0

I've just committed r273486 to head, which rounds rsize and wsize down to the largest power of 2 less than or equal to the requested value.

I don't think anyone needs support for non-power-of-2 rsize/wsize values, so this seems to be a reasonable resolution, given that the actual bug and its fix are not known.

rick
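As a rough illustration of the rounding described for r273486: the sketch below is not the committed code (the real change lives in the kernel NFS client and is written in C), and the helper name round_down_pow2 is made up for this example. It only shows what "round down to the power of 2 less than or equal to the value" means for the sizes discussed in this thread.

```python
def round_down_pow2(size: int) -> int:
    """Return the largest power of 2 that is <= size.

    Hypothetical helper for illustration only; any minimum/maximum
    limits the real NFS client applies are not modelled here.
    """
    if size < 1:
        raise ValueError("size must be a positive integer")
    # bit_length() is the index of the highest set bit plus one,
    # so shifting 1 by (bit_length - 1) keeps only that highest bit.
    return 1 << (size.bit_length() - 1)


if __name__ == "__main__":
    for requested in (32767, 32768):
        print(requested, "->", round_down_pow2(requested))
    # 32767 -> 16384
    # 32768 -> 32768
```

With this kind of rounding, the mistyped value 32767 would end up as 16384 rather than as a non-power-of-2 size.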
Re: kernel page fault with nfs
The system now has an uptime of 24h under heavy NFS use. So wsize/rsize=2^15-1 seems to have been the problem, which IMHO makes this a bug.

Regards,
Tobias

2014-10-21 5:11 GMT+02:00 Marcelo Araujo araujobsdp...@gmail.com:
> Hello Tobias,
>
> Any news?
>
> Best Regards,
>
> 2014-10-20 20:55 GMT+08:00 Rick Macklem rmack...@uoguelph.ca:
>
> Tobias C. Berner wrote:
>
> Now that I posted it, 32767 should of course be 2^15=32768. Let me recheck whether it still hangs with the correct value.
>
> On Monday 20 October 2014 09.15:39 Tobias C. Berner wrote:
>
> Hi Marcelo
>
> Yes, I'm using readahead. The mount options are
>
> readahead=4,soft,intr,rw,tcp,wsize=32767,rsize=32767,late
>
> If you type nfsstat -m, you will see what is actually being used. (I suspect the above rsize/wsize got clipped to 32256 or something like that; I think it clips to a multiple of 512.) If rsize/wsize are not a power of 2, there are issues, although I've never been able to see why it is broken. Maybe it should clip to the power of 2 below the value, since it causes unexplained problems otherwise.
>
> rick
>
> Regards,
> Tobias
>
> On Monday 20 October 2014 10.41:30 Marcelo Araujo wrote:
>
> Hello Tobias,
>
> Could you show how you are mounting the NFS share? Are you using the 'readahead' option?
>
> Best Regards,
>
> 2014-10-19 17:40 GMT+08:00 Tobias C. Berner tcber...@gmail.com:
>
> Both are at 1100038.
>
> On Sunday 19 October 2014 11.12:36 Marcelo Araujo wrote:
>
> It is still strange. Could you do what Allan said and send us the result, in case you are not sure that world and kernel are at the same revision?
>
> On Oct 19, 2014 6:48 AM, Tobias C. Berner tcber...@gmail.com wrote:
>
> Hi
>
> World is from October 16; world and kernel were installed then. The kernel was later rebuilt with debug options.
>
> Is the following more sensible?
>
> ##
> # kgdb NOXON/kernel.debug vmcore.1
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 5; apic id = 05
> fault virtual address = 0xfe07d1744000
> fault code = supervisor write data, page not present
> instruction pointer = 0x20:0x80d4d58a
> stack pointer = 0x28:0xfe086057f240
> frame pointer = 0x28:0xfe086057f2f0
> code segment = base 0x0, limit 0xf, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 6524 (python2.7)
>
> (kgdb) bt
> #0 doadump (textdump=1) at pcpu.h:219
> #1 0x80926b6d in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:447
> #2 0x809270c0 in panic (fmt=value optimized out) at /usr/src/sys/kern/kern_shutdown.c:746
> #3 0x8035f167 in db_panic (addr=value optimized out, have_addr=2, count=0, modif=0x0) at /usr/src/sys/ddb/db_command.c:473
> #4 0x8035ed7d in db_command (cmd_table=0x0) at /usr/src/sys/ddb/db_command.c:440
> #5 0x8035eaf4 in db_command_loop () at /usr/src/sys/ddb/db_command.c:493
> #6 0x80361600 in db_trap (type=value optimized out, code=0) at /usr/src/sys/ddb/db_main.c:251
> #7 0x80966f01 in kdb_trap (type=12, code=0, tf=value optimized out) at /usr/src/sys/kern/subr_kdb.c:654
> #8 0x80d4fa7c in trap_fatal (frame=0xfe086057f190, eva=value optimized out) at /usr/src/sys/amd64/amd64/trap.c:861
> #9 0x80d4fe0c in trap_pfault (frame=0xfe086057f190, usermode=value optimized out) at /usr/src/sys/amd64/amd64/trap.c:677
> #10 0x80d4f42e in trap (frame=0xfe086057f190) at /usr/src/sys/amd64/amd64/trap.c:426
> #11 0x80d33972 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:231
> #12 0x80d4d58a in bzero () at /usr/src/sys/amd64/amd64/support.S:53
> #13 0x80830463 in ncl_doio (vp=0xf801e7f99938, bp=0xfe07c5a168e8, cr=value optimized out, td=value optimized out, called_from_strategy=value optimized out) at /usr/src/sys/fs/nfsclient/nfs_clbio.c:1648
>
> ___
> freebsd-current@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
>
> --
> Marcelo Araujo (__) ara...@freebsd.org
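Rick's guess that the client clips rsize/wsize to a multiple of 512 is easy to check with plain arithmetic. The sketch below is only that arithmetic, not the kernel's code; the granularity of 512 is taken from his remark, and the function names are made up for this example.

```python
CLIP_GRANULARITY = 512  # assumption taken from Rick's remark in the thread


def clip_to_multiple(size: int, granularity: int = CLIP_GRANULARITY) -> int:
    """Round size down to the nearest multiple of granularity (illustrative)."""
    return (size // granularity) * granularity


def is_power_of_two(size: int) -> bool:
    """True when size is a positive power of 2."""
    return size > 0 and (size & (size - 1)) == 0


if __name__ == "__main__":
    for requested in (32767, 32768):
        effective = clip_to_multiple(requested)
        print(requested, "->", effective, "power of 2:", is_power_of_two(effective))
    # 32767 -> 32256 power of 2: False
    # 32768 -> 32768 power of 2: True
```

So a request of 32767 would land on 32256 (63 * 512), which matches the value Rick suspected and is not a power of 2.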
Re: kernel page fault with nfs
Hello Tobias,

That sounds good; at least you haven't had any crashes so far. I agree with you that it seems to be a bug, and I'm going to take a look at it. Could you share your testbed with me, or describe how you reproduce the issue?

Best Regards,
Re: kernel page fault with nfs
Hi Marcelo

The following is the current fstab line, which seems to run smoothly:

odo.firefly:/storage/multimedia /multimedia nfs readahead=4,soft,intr,rw,tcp,wsize=32768,rsize=32768,late 0 0

nfsstat -m:

odo.firefly:/storage/multimedia on /multimedia
nfsv3,tcp,resvport,soft,intr,cto,lockd,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=32768,wsize=32768,readdirsize=32768,readahead=4,wcommitsize=2798255,timeout=120,retrans=2

Now the bad line (no different apart from the typo):

odo.firefly:/storage/multimedia /multimedia nfs readahead=4,soft,intr,rw,tcp,wsize=32767,rsize=32767,late 0 0

which leads to the page faults. And as you said, wsize/rsize gets rounded down to a multiple of 512:

odo.firefly:/storage/multimedia on /multimedia
nfsv3,tcp,resvport,soft,intr,cto,lockd,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=32256,wsize=32256,readdirsize=32256,readahead=4,wcommitsize=2798255,timeout=120,retrans=2

I can easily reproduce the page fault by letting, for example, multimedia/gpodder write to the NFS mount.

Hope this helps.

Regards,
Tobias
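The comparison Tobias did by eye (requested sizes in fstab versus the effective sizes reported by nfsstat -m) could be scripted roughly as follows. This is a sketch under assumptions: parse_mount_options is a made-up helper, not part of nfsstat, and the option string is the one quoted in this thread for the bad mount, pasted in by hand rather than read from a live system.

```python
def parse_mount_options(option_string: str) -> dict:
    """Split a comma-separated NFS option string into a dict.

    key=value items become entries (numeric values as int);
    bare flags such as 'tcp' map to True. Illustrative helper only.
    """
    options = {}
    for item in option_string.strip().split(","):
        key, sep, value = item.partition("=")
        if not sep:
            options[key] = True
        elif value.isdigit():
            options[key] = int(value)
        else:
            options[key] = value
    return options


# Option line reported by nfsstat -m for the mount requested with 32767.
EFFECTIVE = (
    "nfsv3,tcp,resvport,soft,intr,cto,lockd,sec=sys,acdirmin=3,acdirmax=60,"
    "acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=32256,"
    "wsize=32256,readdirsize=32256,readahead=4,wcommitsize=2798255,"
    "timeout=120,retrans=2"
)

if __name__ == "__main__":
    opts = parse_mount_options(EFFECTIVE)
    for name in ("rsize", "wsize"):
        size = opts[name]
        power_of_two = size > 0 and (size & (size - 1)) == 0
        print(name, size, "power of 2" if power_of_two else "NOT a power of 2")
```

For the bad mount, both rsize and wsize come out as 32256 and are flagged as not being a power of 2.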
Re: kernel page fault with nfs
Tobias,

Thank you very much; this really helps to simulate the problem. I'm going to try it as soon as possible, and I will keep you informed.

Best Regards,
Re: kernel page fault with nfs
No problem, I hope you are successful.

Regards,
Tobias
Re: kernel page fault with nfs
Marcelo Araujo wrote:
> Tobias,
>
> Thank you very much, it really helps to simulate the problem. I'm gonna try as soon as possible and I will keep you informed.

You are more than welcome to try and find this. However, I don't think folks need to use a non-power-of-2 rsize/wsize, so I think I'll change the client to clip rsize/wsize to a power of 2. (This example was just a typo.)

Is there anyone out there who thinks that an rsize/wsize that isn't a power of 2 is needed?

Thanks, rick
Re: kernel page fault with nfs
Marcelo Araujo wrote:
> Tobias,
>
> Thank you very much, it really helps to simulate the problem. I'm gonna try as soon as possible and I will keep you informed.

Oh, and since the NFS client code is straightforward, I think the bug is somewhere in the buffer cache/VM code, maybe in cases where the size of the buffer in the buffer cache isn't an exact multiple of PAGE_SIZE.

rick
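To make Rick's hypothesis concrete, here is the arithmetic for the two sizes seen in this thread. It assumes PAGE_SIZE is 4096 bytes, the usual base page size on amd64 (the reported panic came from an amd64 kernel). It does not reproduce or locate the bug; it only shows that the clipped size 32256 leaves a partial page while 32768 does not.

```python
PAGE_SIZE = 4096  # assumption: base page size on amd64


def page_layout(buffer_size: int, page_size: int = PAGE_SIZE) -> tuple:
    """Return (full_pages, leftover_bytes) for a buffer of buffer_size bytes."""
    return divmod(buffer_size, page_size)


if __name__ == "__main__":
    for size in (32768, 32256):
        full_pages, leftover = page_layout(size)
        print(f"{size}: {full_pages} full pages, {leftover} bytes left over")
    # 32768: 8 full pages, 0 bytes left over
    # 32256: 7 full pages, 3584 bytes left over
```

A 32256-byte buffer therefore ends 512 bytes short of a page boundary, which is the kind of case Rick suggests the buffer cache/VM code may mishandle.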
Re: kernel page fault with nfs
2014-10-21 20:48 GMT+08:00 Rick Macklem rmack...@uoguelph.ca: Marcelo Araujo wrote: Tobias, Thank you very much, it really helps to simulate the problem. I'm gonna try as soon as possible and I will keep you informed. You are more than welcome to try and find this. However, I don't think folks need to use a non-power of 2 rsize/wsize, so I think I'll change the client to clip rsize/wsize to that. (This example was just a typo.) Yes, that would be nice if you can do that. As seems you gonna take a look on it, I'm gonna move to other things. Is there anyone out there that thinks having an rsize/wsize that isn't a power of 2 is needed? Personally, I can't see where it would be needed. Thanks, rick Best Regards, 2014-10-21 17:10 GMT+08:00 Tobias C. Berner tcber...@gmail.com: Hi Marcelo The following ist the current fstab-line which seems to run smoothly: odo.firefly:/storage/multimedia /multimedia nfs readahead=4,soft,intr,rw,tcp,wsize=32768,rsize=32768,late 0 0 nfsstat -m: odo.firefly:/storage/multimedia on /multimedia nfsv3,tcp,resvport,soft,intr,cto,lockd,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=32768,wsize=32768,readdirsize=32768,readahead=4,wcommitsize=2798255,timeout=120,retrans=2 Now the bad line (no different appart from the typo) odo.firefly:/storage/multimedia /multimedia nfs readahead=4,soft,intr,rw,tcp,wsize=32767,rsize=32767,late 0 0 which leads to the page-faults. And as you said wsize/rsize gets rounded down to the multiple of 512: odo.firefly:/storage/multimedia on /multimedia nfsv3,tcp,resvport,soft,intr,cto,lockd,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=32256,wsize=32256,readdirsize=32256,readahead=4,wcommitsize=2798255,timeout=120,retrans=2 I can easily reproduce the pagefault by letting for example multimedia/gpodder write to the nfs. 
hope this helps, mfg Tobias On Tuesday 21 October 2014 15.45:24 Marcelo Araujo wrote: Hello Tobias, That sounds good, at least you don't have any crash so far. I agree with you, seems a bug, I'm gonna take a look on that. Could you share with me your testbed or how you can reproduce the issue? Best Regards, 2014-10-21 15:36 GMT+08:00 T.C.Berner tcber...@gmail.com: The system now has an uptime of 24h using NFS heavily. So wsize/rsize=2^15-1 seems to have been the problem which is imho a bug therefore. mfg Tobias 2014-10-21 5:11 GMT+02:00 Marcelo Araujo araujobsdp...@gmail.com: Hello Tibias, Any news? Best Regards, 2014-10-20 20:55 GMT+08:00 Rick Macklem rmack...@uoguelph.ca: Tobias C. Berner wrote: Now that I posted it, 32767 should of course be 2^15=32768. Let me recheck if it still hangs with the correct value. On Monday 20 October 2014 09.15:39 Tobias C. Berner wrote: Hi Marcelo Yes, I'm using readahead: The mountoptions are readahead=4,soft,intr,rw,tcp,wsize=32767,rsize=32767,late If you type nfsstat -m, you will see what is actually getting used. (I suspect the above rsize/wsize got clipped to 32256 or something like that. I think it clips it to a multiple of 512.) If rsize/wsize are not a power of 2, there are issues, although I've never been able to see why it is broken. Maybe it should clip it to the power of 2 below the value, since it causes unexplained problems otherwise. rick mfg Tobias On Monday 20 October 2014 10.41:30 Marcelo Araujo wrote: Hello Tobias, Could you show how you are mount the NFS share? Are you using 'readahead' option? Best Regards, 2014-10-19 17:40 GMT+08:00 Tobias C. Berner tcber...@gmail.com: both are at 1100038. On Sunday 19 October 2014 11.12:36 Marcelo Araujo wrote: It is still strange, could you do what Allan said and send us the result in case you are not sure you have world and kernel in the same revision! On Oct 19, 2014 6:48 AM, Tobias C. Berner tcber...@gmail.com wrote: Hi World ist from october 16, installed world and
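The difference between the good and the bad fstab line in this message is only the rsize/wsize value, so a quick sanity check on an option string may be handy. The following is a minimal sketch, not anything taken from mount_nfs or the kernel: the helper names `is_pow2`, `get_numeric_opt` and `xfer_sizes_ok` are hypothetical, and it assumes the options arrive as a plain comma-separated string like the ones quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Return true if v is a nonzero power of two. */
static bool is_pow2(unsigned long v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/*
 * Look up a numeric option such as "rsize" in a comma-separated
 * option string (hypothetical helper, for illustration only).
 * Matching is done at the start of each token, so "readdirsize"
 * is not mistaken for "rsize".
 * Returns 0 if the option is absent.
 */
static unsigned long get_numeric_opt(const char *opts, const char *name)
{
    size_t nlen;
    const char *p;

    assert(opts != NULL && name != NULL);
    nlen = strlen(name);
    p = opts;

    while (p != NULL && *p != '\0') {
        if (strncmp(p, name, nlen) == 0 && p[nlen] == '=')
            return strtoul(p + nlen + 1, NULL, 10);
        p = strchr(p, ',');     /* advance to the next token */
        if (p != NULL)
            p++;                /* skip the comma itself */
    }
    return 0;
}

/* True if both rsize and wsize are present and are powers of two. */
static bool xfer_sizes_ok(const char *opts)
{
    return is_pow2(get_numeric_opt(opts, "rsize")) &&
           is_pow2(get_numeric_opt(opts, "wsize"));
}
```

Calling `xfer_sizes_ok()` on the option string of the working fstab line should return true, and on the line with 32767 it should return false. The same check applied to the values nfsstat -m reports (32256) also fails, which matches the observation that the clipped value is still not a power of 2.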
Re: kernel page fault with nfs
Hi Marcelo Yes, I'm using readahead: The mountoptions are readahead=4,soft,intr,rw,tcp,wsize=32767,rsize=32767,late mfg Tobias On Monday 20 October 2014 10.41:30 Marcelo Araujo wrote: Hello Tobias, Could you show how you are mount the NFS share? Are you using 'readahead' option? Best Regards, 2014-10-19 17:40 GMT+08:00 Tobias C. Berner tcber...@gmail.com: both are at 1100038. On Sunday 19 October 2014 11.12:36 Marcelo Araujo wrote: It is still strange, could you do what Allan said and send us the result in case you are not sure you have world and kernel in the same revision! On Oct 19, 2014 6:48 AM, Tobias C. Berner tcber...@gmail.com wrote: Hi World ist from october 16, installed world and kernel then. Kernel was later rebuilt with debug-options. Is the following more sensible? ## # kgdb NOXON/kernel.debug vmcore.1 Fatal trap 12: page fault while in kernel mode cpuid = 5; apic id = 05 fault virtual address = 0xfe07d1744000 fault code = supervisor write data, page not present instruction pointer = 0x20:0x80d4d58a stack pointer = 0x28:0xfe086057f240 frame pointer = 0x28:0xfe086057f2f0 code segment = base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 6524 (python2.7) (kgdb) bt #0 doadump (textdump=1) at pcpu.h:219 #1 0x80926b6d in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:447 #2 0x809270c0 in panic (fmt=value optimized out) at /usr/src/sys/kern/kern_shutdown.c:746 #3 0x8035f167 in db_panic (addr=value optimized out, have_addr=2, count=0, modif=0x0) at /usr/src/sys/ddb/db_command.c:473 #4 0x8035ed7d in db_command (cmd_table=0x0) at /usr/src/sys/ddb/db_command.c:440 #5 0x8035eaf4 in db_command_loop () at /usr/src/sys/ddb/db_command.c:493 #6 0x80361600 in db_trap (type=value optimized out, code=0) at /usr/src/sys/ddb/db_main.c:251 #7 0x80966f01 in kdb_trap (type=12, code=0, tf=value optimized out) at /usr/src/sys/kern/subr_kdb.c:654 #8 0x80d4fa7c in 
trap_fatal (frame=0xfe086057f190, eva=value optimized out) at /usr/src/sys/amd64/amd64/trap.c:861 #9 0x80d4fe0c in trap_pfault (frame=0xfe086057f190, usermode=value optimized out) at /usr/src/sys/amd64/amd64/trap.c:677 #10 0x80d4f42e in trap (frame=0xfe086057f190) at /usr/src/sys/amd64/amd64/trap.c:426 #11 0x80d33972 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:231 #12 0x80d4d58a in bzero () at /usr/src/sys/amd64/amd64/support.S:53 #13 0x80830463 in ncl_doio (vp=0xf801e7f99938, bp=0xfe07c5a168e8, cr=value optimized out, td=value optimized out, called_from_strategy=value optimized out) at /usr/src/sys/fs/nfsclient/nfs_clbio.c:1648 #14 0x80831acf in ncl_write (ap=value optimized out) at /usr/src/sys/fs/nfsclient/nfs_clbio.c:1124 #15 0x80e646a5 in VOP_WRITE_APV (vop=value optimized out, a=value optimized out) at vnode_if.c:997 #16 0x809f52f9 in vn_write (fp=0xf80101c62780, uio=0xfe086057f970, active_cred=value optimized out, flags=320, td=0x0) at vnode_if.h:413 #17 0x809f5602 in vn_io_fault_doio (args=value optimized out, uio=0xa00, td=0x0) at /usr/src/sys/kern/vfs_vnops.c:991 #18 0x809f2aec in vn_io_fault1 () at /usr/src/sys/kern/vfs_vnops.c:1047 #19 0x809f0e3b in vn_io_fault (fp=0xf80101c62780, uio=0xfe086057f970, active_cred=value optimized out, flags=0, td=0xf80171d79920) at /usr/src/sys/kern/vfs_vnops.c:1152 #20 0x80982357 in dofilewrite (td=0xf80171d79920, fd=19, fp=0xf80101c62780, auio=0xfe086057f970, offset=value optimized out, flags=0) at file.h:306 #21 0x80982088 in kern_writev (td=0xf80171d79920, fd=19, auio=0xfe086057f970) at /usr/src/sys/kern/sys_generic.c:467 #22 0x80982013 in sys_write (td=value optimized out, uap=value optimized out) at /usr/src/sys/kern/sys_generic.c:382 #23 0x80d5051b in amd64_syscall (td=0xf80171d79920, traced=0) at subr_syscall.c:133 #24 0x80d33c5b in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:390 #25 0x00080137de4a in ?? () ##
Re: kernel page fault with nfs
Now that I posted it, 32767 should of course be 2^15=32768. Let me recheck if it still hangs with the correct value. On Monday 20 October 2014 09.15:39 Tobias C. Berner wrote: Hi Marcelo Yes, I'm using readahead: The mountoptions are readahead=4,soft,intr,rw,tcp,wsize=32767,rsize=32767,late mfg Tobias On Monday 20 October 2014 10.41:30 Marcelo Araujo wrote: Hello Tobias, Could you show how you are mount the NFS share? Are you using 'readahead' option? Best Regards, 2014-10-19 17:40 GMT+08:00 Tobias C. Berner tcber...@gmail.com: both are at 1100038. On Sunday 19 October 2014 11.12:36 Marcelo Araujo wrote: It is still strange, could you do what Allan said and send us the result in case you are not sure you have world and kernel in the same revision! On Oct 19, 2014 6:48 AM, Tobias C. Berner tcber...@gmail.com wrote: Hi World ist from october 16, installed world and kernel then. Kernel was later rebuilt with debug-options. Is the following more sensible? ## # kgdb NOXON/kernel.debug vmcore.1 Fatal trap 12: page fault while in kernel mode cpuid = 5; apic id = 05 fault virtual address = 0xfe07d1744000 fault code = supervisor write data, page not present instruction pointer = 0x20:0x80d4d58a stack pointer = 0x28:0xfe086057f240 frame pointer = 0x28:0xfe086057f2f0 code segment = base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 6524 (python2.7) (kgdb) bt #0 doadump (textdump=1) at pcpu.h:219 #1 0x80926b6d in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:447 #2 0x809270c0 in panic (fmt=value optimized out) at /usr/src/sys/kern/kern_shutdown.c:746 #3 0x8035f167 in db_panic (addr=value optimized out, have_addr=2, count=0, modif=0x0) at /usr/src/sys/ddb/db_command.c:473 #4 0x8035ed7d in db_command (cmd_table=0x0) at /usr/src/sys/ddb/db_command.c:440 #5 0x8035eaf4 in db_command_loop () at /usr/src/sys/ddb/db_command.c:493 #6 0x80361600 in db_trap (type=value optimized 
out, code=0) at /usr/src/sys/ddb/db_main.c:251 #7 0x80966f01 in kdb_trap (type=12, code=0, tf=value optimized out) at /usr/src/sys/kern/subr_kdb.c:654 #8 0x80d4fa7c in trap_fatal (frame=0xfe086057f190, eva=value optimized out) at /usr/src/sys/amd64/amd64/trap.c:861 #9 0x80d4fe0c in trap_pfault (frame=0xfe086057f190, usermode=value optimized out) at /usr/src/sys/amd64/amd64/trap.c:677 #10 0x80d4f42e in trap (frame=0xfe086057f190) at /usr/src/sys/amd64/amd64/trap.c:426 #11 0x80d33972 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:231 #12 0x80d4d58a in bzero () at /usr/src/sys/amd64/amd64/support.S:53 #13 0x80830463 in ncl_doio (vp=0xf801e7f99938, bp=0xfe07c5a168e8, cr=value optimized out, td=value optimized out, called_from_strategy=value optimized out) at /usr/src/sys/fs/nfsclient/nfs_clbio.c:1648 ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: kernel page fault with nfs
Tobias C. Berner wrote:

Now that I posted it, 32767 should of course be 2^15=32768. Let me recheck if it still hangs with the correct value.

On Monday 20 October 2014 09.15:39 Tobias C. Berner wrote:

Hi Marcelo

Yes, I'm using readahead. The mount options are

readahead=4,soft,intr,rw,tcp,wsize=32767,rsize=32767,late

If you type nfsstat -m, you will see what is actually getting used. (I suspect the above rsize/wsize got clipped to 32256 or something like that. I think it clips it to a multiple of 512.)

If rsize/wsize are not a power of 2, there are issues, although I've never been able to see why it is broken. Maybe it should clip it to the power of 2 below the value, since it causes unexplained problems otherwise.

rick

Regards, Tobias

On Monday 20 October 2014 10.41:30 Marcelo Araujo wrote:

Hello Tobias,

Could you show how you are mounting the NFS share? Are you using the 'readahead' option?

Best Regards,

2014-10-19 17:40 GMT+08:00 Tobias C. Berner tcber...@gmail.com:

Both are at 1100038.

On Sunday 19 October 2014 11.12:36 Marcelo Araujo wrote:

It is still strange. Could you do what Allan said and send us the result, in case you are not sure you have world and kernel at the same revision?

On Oct 19, 2014 6:48 AM, Tobias C. Berner tcber...@gmail.com wrote:

Hi

World is from October 16; I installed world and kernel then. The kernel was later rebuilt with debug options. Is the following more sensible?
## # kgdb NOXON/kernel.debug vmcore.1 Fatal trap 12: page fault while in kernel mode cpuid = 5; apic id = 05 fault virtual address = 0xfe07d1744000 fault code = supervisor write data, page not present instruction pointer = 0x20:0x80d4d58a stack pointer = 0x28:0xfe086057f240 frame pointer = 0x28:0xfe086057f2f0 code segment = base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 6524 (python2.7) (kgdb) bt #0 doadump (textdump=1) at pcpu.h:219 #1 0x80926b6d in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:447 #2 0x809270c0 in panic (fmt=value optimized out) at /usr/src/sys/kern/kern_shutdown.c:746 #3 0x8035f167 in db_panic (addr=value optimized out, have_addr=2, count=0, modif=0x0) at /usr/src/sys/ddb/db_command.c:473 #4 0x8035ed7d in db_command (cmd_table=0x0) at /usr/src/sys/ddb/db_command.c:440 #5 0x8035eaf4 in db_command_loop () at /usr/src/sys/ddb/db_command.c:493 #6 0x80361600 in db_trap (type=value optimized out, code=0) at /usr/src/sys/ddb/db_main.c:251 #7 0x80966f01 in kdb_trap (type=12, code=0, tf=value optimized out) at /usr/src/sys/kern/subr_kdb.c:654 #8 0x80d4fa7c in trap_fatal (frame=0xfe086057f190, eva=value optimized out) at /usr/src/sys/amd64/amd64/trap.c:861 #9 0x80d4fe0c in trap_pfault (frame=0xfe086057f190, usermode=value optimized out) at /usr/src/sys/amd64/amd64/trap.c:677 #10 0x80d4f42e in trap (frame=0xfe086057f190) at /usr/src/sys/amd64/amd64/trap.c:426 #11 0x80d33972 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:231 #12 0x80d4d58a in bzero () at /usr/src/sys/amd64/amd64/support.S:53 #13 0x80830463 in ncl_doio (vp=0xf801e7f99938, bp=0xfe07c5a168e8, cr=value optimized out, td=value optimized out, called_from_strategy=value optimized out) at /usr/src/sys/fs/nfsclient/nfs_clbio.c:1648 ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to 
freebsd-current-unsubscr...@freebsd.org
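The two kinds of clipping Rick mentions in this message (down to a multiple of 512, and down to the power of 2 below the value) can be illustrated with a few lines of C. This is only a sketch under stated assumptions: the names `XFER_BLOCK`, `clip_to_block` and `round_down_pow2` are made up for the example, and the code is not taken from the NFS client sources.

```c
#include <assert.h>
#include <stdint.h>

/* Granularity the client is suspected to clip to (assumption from the thread). */
#define XFER_BLOCK 512u

/* Round v down to a multiple of XFER_BLOCK. */
static uint32_t clip_to_block(uint32_t v)
{
    /* The mask trick below only works for a power-of-two block size. */
    assert((XFER_BLOCK & (XFER_BLOCK - 1u)) == 0);
    return v & ~(XFER_BLOCK - 1u);
}

/* Round v down to the largest power of two that is <= v; 0 stays 0. */
static uint32_t round_down_pow2(uint32_t v)
{
    uint32_t p = 1;

    if (v == 0)
        return 0;
    /* Comparing against v / 2 avoids overflow when doubling p. */
    while (p <= v / 2)
        p *= 2;
    return p;
}
```

With the mistyped value from the thread, `clip_to_block(32767)` gives 32256, the figure Rick guessed, which is 63 * 512 and therefore not a power of 2. `round_down_pow2(32767)` gives 16384, while the intended 32768 passes through both functions unchanged.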
Re: kernel page fault with nfs
Hello Tobias,

Any news?

Best Regards,

2014-10-20 20:55 GMT+08:00 Rick Macklem rmack...@uoguelph.ca:

Tobias C. Berner wrote:

Now that I posted it, 32767 should of course be 2^15=32768. Let me recheck if it still hangs with the correct value.

On Monday 20 October 2014 09.15:39 Tobias C. Berner wrote:

Hi Marcelo

Yes, I'm using readahead. The mount options are

readahead=4,soft,intr,rw,tcp,wsize=32767,rsize=32767,late

If you type nfsstat -m, you will see what is actually getting used. (I suspect the above rsize/wsize got clipped to 32256 or something like that. I think it clips it to a multiple of 512.)

If rsize/wsize are not a power of 2, there are issues, although I've never been able to see why it is broken. Maybe it should clip it to the power of 2 below the value, since it causes unexplained problems otherwise.

rick

Regards, Tobias

On Monday 20 October 2014 10.41:30 Marcelo Araujo wrote:

Hello Tobias,

Could you show how you are mounting the NFS share? Are you using the 'readahead' option?

Best Regards,

2014-10-19 17:40 GMT+08:00 Tobias C. Berner tcber...@gmail.com:

Both are at 1100038.

On Sunday 19 October 2014 11.12:36 Marcelo Araujo wrote:

It is still strange. Could you do what Allan said and send us the result, in case you are not sure you have world and kernel at the same revision?

On Oct 19, 2014 6:48 AM, Tobias C. Berner tcber...@gmail.com wrote:

Hi

World is from October 16; I installed world and kernel then. The kernel was later rebuilt with debug options. Is the following more sensible?
## # kgdb NOXON/kernel.debug vmcore.1 Fatal trap 12: page fault while in kernel mode cpuid = 5; apic id = 05 fault virtual address = 0xfe07d1744000 fault code = supervisor write data, page not present instruction pointer = 0x20:0x80d4d58a stack pointer = 0x28:0xfe086057f240 frame pointer = 0x28:0xfe086057f2f0 code segment = base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 6524 (python2.7) (kgdb) bt #0 doadump (textdump=1) at pcpu.h:219 #1 0x80926b6d in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:447 #2 0x809270c0 in panic (fmt=value optimized out) at /usr/src/sys/kern/kern_shutdown.c:746 #3 0x8035f167 in db_panic (addr=value optimized out, have_addr=2, count=0, modif=0x0) at /usr/src/sys/ddb/db_command.c:473 #4 0x8035ed7d in db_command (cmd_table=0x0) at /usr/src/sys/ddb/db_command.c:440 #5 0x8035eaf4 in db_command_loop () at /usr/src/sys/ddb/db_command.c:493 #6 0x80361600 in db_trap (type=value optimized out, code=0) at /usr/src/sys/ddb/db_main.c:251 #7 0x80966f01 in kdb_trap (type=12, code=0, tf=value optimized out) at /usr/src/sys/kern/subr_kdb.c:654 #8 0x80d4fa7c in trap_fatal (frame=0xfe086057f190, eva=value optimized out) at /usr/src/sys/amd64/amd64/trap.c:861 #9 0x80d4fe0c in trap_pfault (frame=0xfe086057f190, usermode=value optimized out) at /usr/src/sys/amd64/amd64/trap.c:677 #10 0x80d4f42e in trap (frame=0xfe086057f190) at /usr/src/sys/amd64/amd64/trap.c:426 #11 0x80d33972 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:231 #12 0x80d4d58a in bzero () at /usr/src/sys/amd64/amd64/support.S:53 #13 0x80830463 in ncl_doio (vp=0xf801e7f99938, bp=0xfe07c5a168e8, cr=value optimized out, td=value optimized out, called_from_strategy=value optimized out) at /usr/src/sys/fs/nfsclient/nfs_clbio.c:1648 ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to 
freebsd-current-unsubscr...@freebsd.org

--
Marcelo Araujo
ara...@freebsd.org
http://www.FreeBSD.org
Power To Server.
Re: kernel page fault with nfs
both are at 1100038. On Sunday 19 October 2014 11.12:36 Marcelo Araujo wrote: It is still strange, could you do what Allan said and send us the result in case you are not sure you have world and kernel in the same revision! On Oct 19, 2014 6:48 AM, Tobias C. Berner tcber...@gmail.com wrote: Hi World ist from october 16, installed world and kernel then. Kernel was later rebuilt with debug-options. Is the following more sensible? ## # kgdb NOXON/kernel.debug vmcore.1 Fatal trap 12: page fault while in kernel mode cpuid = 5; apic id = 05 fault virtual address = 0xfe07d1744000 fault code = supervisor write data, page not present instruction pointer = 0x20:0x80d4d58a stack pointer = 0x28:0xfe086057f240 frame pointer = 0x28:0xfe086057f2f0 code segment = base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 6524 (python2.7) (kgdb) bt #0 doadump (textdump=1) at pcpu.h:219 #1 0x80926b6d in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:447 #2 0x809270c0 in panic (fmt=value optimized out) at /usr/src/sys/kern/kern_shutdown.c:746 #3 0x8035f167 in db_panic (addr=value optimized out, have_addr=2, count=0, modif=0x0) at /usr/src/sys/ddb/db_command.c:473 #4 0x8035ed7d in db_command (cmd_table=0x0) at /usr/src/sys/ddb/db_command.c:440 #5 0x8035eaf4 in db_command_loop () at /usr/src/sys/ddb/db_command.c:493 #6 0x80361600 in db_trap (type=value optimized out, code=0) at /usr/src/sys/ddb/db_main.c:251 #7 0x80966f01 in kdb_trap (type=12, code=0, tf=value optimized out) at /usr/src/sys/kern/subr_kdb.c:654 #8 0x80d4fa7c in trap_fatal (frame=0xfe086057f190, eva=value optimized out) at /usr/src/sys/amd64/amd64/trap.c:861 #9 0x80d4fe0c in trap_pfault (frame=0xfe086057f190, usermode=value optimized out) at /usr/src/sys/amd64/amd64/trap.c:677 #10 0x80d4f42e in trap (frame=0xfe086057f190) at /usr/src/sys/amd64/amd64/trap.c:426 #11 0x80d33972 in calltrap () at 
/usr/src/sys/amd64/amd64/exception.S:231 #12 0x80d4d58a in bzero () at /usr/src/sys/amd64/amd64/support.S:53 #13 0x80830463 in ncl_doio (vp=0xf801e7f99938, bp=0xfe07c5a168e8, cr=value optimized out, td=value optimized out, called_from_strategy=value optimized out) at /usr/src/sys/fs/nfsclient/nfs_clbio.c:1648 #14 0x80831acf in ncl_write (ap=value optimized out) at /usr/src/sys/fs/nfsclient/nfs_clbio.c:1124 #15 0x80e646a5 in VOP_WRITE_APV (vop=value optimized out, a=value optimized out) at vnode_if.c:997 #16 0x809f52f9 in vn_write (fp=0xf80101c62780, uio=0xfe086057f970, active_cred=value optimized out, flags=320, td=0x0) at vnode_if.h:413 #17 0x809f5602 in vn_io_fault_doio (args=value optimized out, uio=0xa00, td=0x0) at /usr/src/sys/kern/vfs_vnops.c:991 #18 0x809f2aec in vn_io_fault1 () at /usr/src/sys/kern/vfs_vnops.c:1047 #19 0x809f0e3b in vn_io_fault (fp=0xf80101c62780, uio=0xfe086057f970, active_cred=value optimized out, flags=0, td=0xf80171d79920) at /usr/src/sys/kern/vfs_vnops.c:1152 #20 0x80982357 in dofilewrite (td=0xf80171d79920, fd=19, fp=0xf80101c62780, auio=0xfe086057f970, offset=value optimized out, flags=0) at file.h:306 #21 0x80982088 in kern_writev (td=0xf80171d79920, fd=19, auio=0xfe086057f970) at /usr/src/sys/kern/sys_generic.c:467 #22 0x80982013 in sys_write (td=value optimized out, uap=value optimized out) at /usr/src/sys/kern/sys_generic.c:382 #23 0x80d5051b in amd64_syscall (td=0xf80171d79920, traced=0) at subr_syscall.c:133 #24 0x80d33c5b in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:390 #25 0x00080137de4a in ?? () ## Thanks in advance, Tobias Berner On Saturday 18 October 2014 20.43:12 Marcelo Araujo wrote: When you rebuild your system, did you rebuild and install all kernel and world? Best Regards, On Oct 18, 2014 7:57 PM, John Baldwin j...@freebsd.org wrote: On Friday, October 17, 2014 11:11:26 PM Tobias C. Berner wrote: Hi For some days now I've had problems with my current (last test with r273178M). 
Sometimes when accessing a nfs-share there is a pagefault: ### Fatal trap 12: page fault while in kernel mode
Re: kernel page fault with nfs
Hello Tobias, Could you show how you are mount the NFS share? Are you using 'readahead' option? Best Regards, 2014-10-19 17:40 GMT+08:00 Tobias C. Berner tcber...@gmail.com: both are at 1100038. On Sunday 19 October 2014 11.12:36 Marcelo Araujo wrote: It is still strange, could you do what Allan said and send us the result in case you are not sure you have world and kernel in the same revision! On Oct 19, 2014 6:48 AM, Tobias C. Berner tcber...@gmail.com wrote: Hi World ist from october 16, installed world and kernel then. Kernel was later rebuilt with debug-options. Is the following more sensible? ## # kgdb NOXON/kernel.debug vmcore.1 Fatal trap 12: page fault while in kernel mode cpuid = 5; apic id = 05 fault virtual address = 0xfe07d1744000 fault code = supervisor write data, page not present instruction pointer = 0x20:0x80d4d58a stack pointer = 0x28:0xfe086057f240 frame pointer = 0x28:0xfe086057f2f0 code segment = base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 6524 (python2.7) (kgdb) bt #0 doadump (textdump=1) at pcpu.h:219 #1 0x80926b6d in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:447 #2 0x809270c0 in panic (fmt=value optimized out) at /usr/src/sys/kern/kern_shutdown.c:746 #3 0x8035f167 in db_panic (addr=value optimized out, have_addr=2, count=0, modif=0x0) at /usr/src/sys/ddb/db_command.c:473 #4 0x8035ed7d in db_command (cmd_table=0x0) at /usr/src/sys/ddb/db_command.c:440 #5 0x8035eaf4 in db_command_loop () at /usr/src/sys/ddb/db_command.c:493 #6 0x80361600 in db_trap (type=value optimized out, code=0) at /usr/src/sys/ddb/db_main.c:251 #7 0x80966f01 in kdb_trap (type=12, code=0, tf=value optimized out) at /usr/src/sys/kern/subr_kdb.c:654 #8 0x80d4fa7c in trap_fatal (frame=0xfe086057f190, eva=value optimized out) at /usr/src/sys/amd64/amd64/trap.c:861 #9 0x80d4fe0c in trap_pfault (frame=0xfe086057f190, usermode=value optimized out) at 
/usr/src/sys/amd64/amd64/trap.c:677 #10 0x80d4f42e in trap (frame=0xfe086057f190) at /usr/src/sys/amd64/amd64/trap.c:426 #11 0x80d33972 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:231 #12 0x80d4d58a in bzero () at /usr/src/sys/amd64/amd64/support.S:53 #13 0x80830463 in ncl_doio (vp=0xf801e7f99938, bp=0xfe07c5a168e8, cr=value optimized out, td=value optimized out, called_from_strategy=value optimized out) at /usr/src/sys/fs/nfsclient/nfs_clbio.c:1648 #14 0x80831acf in ncl_write (ap=value optimized out) at /usr/src/sys/fs/nfsclient/nfs_clbio.c:1124 #15 0x80e646a5 in VOP_WRITE_APV (vop=value optimized out, a=value optimized out) at vnode_if.c:997 #16 0x809f52f9 in vn_write (fp=0xf80101c62780, uio=0xfe086057f970, active_cred=value optimized out, flags=320, td=0x0) at vnode_if.h:413 #17 0x809f5602 in vn_io_fault_doio (args=value optimized out, uio=0xa00, td=0x0) at /usr/src/sys/kern/vfs_vnops.c:991 #18 0x809f2aec in vn_io_fault1 () at /usr/src/sys/kern/vfs_vnops.c:1047 #19 0x809f0e3b in vn_io_fault (fp=0xf80101c62780, uio=0xfe086057f970, active_cred=value optimized out, flags=0, td=0xf80171d79920) at /usr/src/sys/kern/vfs_vnops.c:1152 #20 0x80982357 in dofilewrite (td=0xf80171d79920, fd=19, fp=0xf80101c62780, auio=0xfe086057f970, offset=value optimized out, flags=0) at file.h:306 #21 0x80982088 in kern_writev (td=0xf80171d79920, fd=19, auio=0xfe086057f970) at /usr/src/sys/kern/sys_generic.c:467 #22 0x80982013 in sys_write (td=value optimized out, uap=value optimized out) at /usr/src/sys/kern/sys_generic.c:382 #23 0x80d5051b in amd64_syscall (td=0xf80171d79920, traced=0) at subr_syscall.c:133 #24 0x80d33c5b in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:390 #25 0x00080137de4a in ?? () ## Thanks in advance, Tobias Berner On Saturday 18 October 2014 20.43:12 Marcelo Araujo wrote: When you rebuild your system, did you rebuild and install all kernel and world? 
Best Regards, On Oct 18, 2014 7:57 PM, John Baldwin j...@freebsd.org wrote: On Friday, October 17, 2014 11:11:26 PM Tobias C. Berner wrote: Hi For some days now I've had problems with my
Re: kernel page fault with nfs
On Friday, October 17, 2014 11:11:26 PM Tobias C. Berner wrote:

Hi

For some days now I've had problems with my current (last test with r273178M). Sometimes when accessing a nfs-share there is a pagefault:

###
Fatal trap 12: page fault while in kernel mode
cpuid = 7; apic id = 07
fault virtual address = 0xfe07cfe60400
fault code = supervisor read data, page not present
instruction pointer = 0x20:0x80d4d4b6
stack pointer = 0x28:0xfe086088b380
frame pointer = 0x28:0xfe086088b3f0
code segment = base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 43868 (mplayer)

#0 0x80926d29 in shutdown_nice (howto=1) at /usr/src/sys/kern/kern_shutdown.c:207
#1 0x80926a2d in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:444
#2 0x80926f80 in panic (fmt=0x104 Address 0x104 out of bounds) at /usr/src/sys/kern/kern_shutdown.c:698
#3 0x8035f147 in panic_cmd_del (arg=0x0) at /usr/src/sys/ddb/db_command.c:244
#4 0x8035ed5d in db_command (cmd_table=0x0) at /usr/src/sys/ddb/db_command.c:439
#5 0x8035ead4 in db_command_loop () at /usr/src/sys/ddb/db_command.c:488
#6 0x803615e0 in db_trap (type=value optimized out, code=0) at /usr/src/sys/ddb/db_main.c:247
#7 0x80966db1 in kdb_trap (type=12, code=0, tf=0xfe086088b2d0) at /usr/src/sys/kern/subr_kdb.c:626
#8 0x80d4f92c in trap_fatal (frame=0xfe086088b2d0, eva=value optimized out) at /usr/src/sys/amd64/amd64/trap.c:835
#9 0x80d4fcbc in trap_pfault (frame=0xfe086088b2d0, usermode=0) at atomic.h:161
#10 0x80d4f2de in trap (frame=0xfe086088b2d0) at /usr/src/sys/amd64/amd64/trap.c:594
#11 0x80d33822 in Xtss () at /usr/src/sys/amd64/amd64/exception.S:154
#12 0x80d4d4b6 in stack_save_td (st=value optimized out, td=value optimized out) at /usr/src/sys/amd64/amd64/stack_machdep.c:74
#13 0x809f30b2 in foffset_unlock (fp=value optimized out, val=value optimized out, flags=value optimized out) at /usr/src/sys/kern/vfs_vnops.c:700
#14 0x8082faad in ncl_bioread (vp=0xf80201dd7490, uio=0xfe086088b7d8, ioflag=value optimized out, cred=0xf8015816a600) at /usr/src/sys/fs/nfsclient/nfs_clbio.c:511
#15 0x80e64381 in VOP_MARKATIME_APV (vop=value optimized out, a=0xfe086088b650) at vnode_if.c:856
#16 0x809f4dd5 in vn_read (fp=0xf80272490cd0, uio=0xfe086088b7d8, active_cred=0xf8015816a600, flags=128, td=0xf800) at vnode_if.h:859
#17 0x809f5502 in get_advice (fp=0xfe086088b730, uio=0x400) at /usr/src/sys/kern/vfs_vnops.c:729
#18 0x809f2b80 in vn_io_fault1 () at /usr/src/sys/kern/vfs_vnops.c:1058
#19 0x809f0d3b in vn_io_fault (fp=0xf80272490cd0, uio=0xfe086088b970, active_cred=0x400, flags=128, td=0xf800) at /usr/src/sys/kern/vfs_vnops.c:128
#20 0x80981d95 in freebsd6_pread (td=0xf802d93204a8, uap=0xf9fb094c00a8) at /usr/src/sys/kern/sys_generic.c:217
#21 0x80981ab8 in sys_cap_fcntls_get (td=value optimized out, uap=0x800) at /usr/src/sys/kern/sys_capability.c:576
#22 0x80981a43 in sys_cap_fcntls_get (td=value optimized out, uap=0x0) at sx.h:183
#23 0x80d503cb in amd64_syscall (td=0x45e400, traced=0) at subr_syscall.c:87
#24 0x80d33b0b in Xprot () at /usr/src/sys/amd64/amd64/exception.S:324
#25 0x000806a7fe6a in ?? ()
##

The functions in this stack trace don't make sense. It is as if you are running kgdb against the wrong kernel.

--
John Baldwin
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: kernel page fault with nfs
When you rebuild your system, did you rebuild and install all kernel and world? Best Regards, On Oct 18, 2014 7:57 PM, John Baldwin j...@freebsd.org wrote: On Friday, October 17, 2014 11:11:26 PM Tobias C. Berner wrote: Hi For some days now I've had problems with my current (last test with r273178M). Sometimes when accessing a nfs-share there is a pagefault: ### Fatal trap 12: page fault while in kernel mode cpuid = 7; apic id = 07 fault virtual address = 0xfe07cfe60400 fault code = supervisor read data, page not present instruction pointer = 0x20:0x80d4d4b6 stack pointer = 0x28:0xfe086088b380 frame pointer = 0x28:0xfe086088b3f0 code segment= base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags= interrupt enabled, resume, IOPL = 0 current process = 43868 (mplayer) #0 0x80926d29 in shutdown_nice (howto=1) at /usr/src/sys/kern/kern_shutdown.c:207 #1 0x80926a2d in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:444 #2 0x80926f80 in panic (fmt=0x104 Address 0x104 out of bounds) at /usr/src/sys/kern/kern_shutdown.c:698 #3 0x8035f147 in panic_cmd_del (arg=0x0) at /usr/src/sys/ddb/db_command.c:244 #4 0x8035ed5d in db_command (cmd_table=0x0) at /usr/src/sys/ddb/db_command.c:439 #5 0x8035ead4 in db_command_loop () at /usr/src/sys/ddb/db_command.c:488 #6 0x803615e0 in db_trap (type=value optimized out, code=0) at /usr/src/sys/ddb/db_main.c:247 #7 0x80966db1 in kdb_trap (type=12, code=0, tf=0xfe086088b2d0) at /usr/src/sys/kern/subr_kdb.c:626 #8 0x80d4f92c in trap_fatal (frame=0xfe086088b2d0, eva=value optimized out) at /usr/src/sys/amd64/amd64/trap.c:835 #9 0x80d4fcbc in trap_pfault (frame=0xfe086088b2d0, usermode=0) at atomic.h:161 #10 0x80d4f2de in trap (frame=0xfe086088b2d0) at /usr/src/sys/amd64/amd64/trap.c:594 #11 0x80d33822 in Xtss () at /usr/src/sys/amd64/amd64/exception.S:154 #12 0x80d4d4b6 in stack_save_td (st=value optimized out, td=value optimized out) at /usr/src/sys/amd64/amd64/stack_machdep.c:74 #13 0x809f30b2 in 
foffset_unlock (fp=value optimized out, val=value optimized out, flags=value optimized out) at /usr/src/sys/kern/vfs_vnops.c:700 #14 0x8082faad in ncl_bioread (vp=0xf80201dd7490, uio=0xfe086088b7d8, ioflag=value optimized out, cred=0xf8015816a600) at /usr/src/sys/fs/nfsclient/nfs_clbio.c:511 #15 0x80e64381 in VOP_MARKATIME_APV (vop=value optimized out, a=0xfe086088b650) at vnode_if.c:856 #16 0x809f4dd5 in vn_read (fp=0xf80272490cd0, uio=0xfe086088b7d8, active_cred=0xf8015816a600, flags=128, td=0xf800) at vnode_if.h:859 #17 0x809f5502 in get_advice (fp=0xfe086088b730, uio=0x400) at /usr/src/sys/kern/vfs_vnops.c:729 #18 0x809f2b80 in vn_io_fault1 () at /usr/src/sys/kern/vfs_vnops.c:1058 #19 0x809f0d3b in vn_io_fault (fp=0xf80272490cd0, uio=0xfe086088b970, active_cred=0x400, flags=128, td=0xf800) at /usr/src/sys/kern/vfs_vnops.c:128 #20 0x80981d95 in freebsd6_pread (td=0xf802d93204a8, uap=0xf9fb094c00a8) at /usr/src/sys/kern/sys_generic.c:217 #21 0x80981ab8 in sys_cap_fcntls_get (td=value optimized out, uap=0x800) at /usr/src/sys/kern/sys_capability.c:576 #22 0x80981a43 in sys_cap_fcntls_get (td=value optimized out, uap=0x0) at sx.h:183 #23 0x80d503cb in amd64_syscall (td=0x45e400, traced=0) at subr_syscall.c:87 #24 0x80d33b0b in Xprot () at /usr/src/sys/amd64/amd64/exception.S:324 #25 0x000806a7fe6a in ?? () ## The functions in this stack trace don't make sense. It is as if you are running kgdb against the wrong kernel. -- John Baldwin ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: kernel page fault with nfs
Hi

World is from October 16; I installed world and kernel then. The kernel was later rebuilt with debug options.

Is the following more sensible?

##
# kgdb NOXON/kernel.debug vmcore.1

Fatal trap 12: page fault while in kernel mode
cpuid = 5; apic id = 05
fault virtual address = 0xfe07d1744000
fault code = supervisor write data, page not present
instruction pointer = 0x20:0x80d4d58a
stack pointer = 0x28:0xfe086057f240
frame pointer = 0x28:0xfe086057f2f0
code segment = base 0x0, limit 0xf, type 0x1b
             = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 6524 (python2.7)

(kgdb) bt
#0 doadump (textdump=1) at pcpu.h:219
#1 0x80926b6d in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:447
#2 0x809270c0 in panic (fmt=value optimized out) at /usr/src/sys/kern/kern_shutdown.c:746
#3 0x8035f167 in db_panic (addr=value optimized out, have_addr=2, count=0, modif=0x0) at /usr/src/sys/ddb/db_command.c:473
#4 0x8035ed7d in db_command (cmd_table=0x0) at /usr/src/sys/ddb/db_command.c:440
#5 0x8035eaf4 in db_command_loop () at /usr/src/sys/ddb/db_command.c:493
#6 0x80361600 in db_trap (type=value optimized out, code=0) at /usr/src/sys/ddb/db_main.c:251
#7 0x80966f01 in kdb_trap (type=12, code=0, tf=value optimized out) at /usr/src/sys/kern/subr_kdb.c:654
#8 0x80d4fa7c in trap_fatal (frame=0xfe086057f190, eva=value optimized out) at /usr/src/sys/amd64/amd64/trap.c:861
#9 0x80d4fe0c in trap_pfault (frame=0xfe086057f190, usermode=value optimized out) at /usr/src/sys/amd64/amd64/trap.c:677
#10 0x80d4f42e in trap (frame=0xfe086057f190) at /usr/src/sys/amd64/amd64/trap.c:426
#11 0x80d33972 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:231
#12 0x80d4d58a in bzero () at /usr/src/sys/amd64/amd64/support.S:53
#13 0x80830463 in ncl_doio (vp=0xf801e7f99938, bp=0xfe07c5a168e8, cr=value optimized out, td=value optimized out, called_from_strategy=value optimized out) at /usr/src/sys/fs/nfsclient/nfs_clbio.c:1648
#14 0x80831acf in ncl_write (ap=value optimized out) at /usr/src/sys/fs/nfsclient/nfs_clbio.c:1124
#15 0x80e646a5 in VOP_WRITE_APV (vop=value optimized out, a=value optimized out) at vnode_if.c:997
#16 0x809f52f9 in vn_write (fp=0xf80101c62780, uio=0xfe086057f970, active_cred=value optimized out, flags=320, td=0x0) at vnode_if.h:413
#17 0x809f5602 in vn_io_fault_doio (args=value optimized out, uio=0xa00, td=0x0) at /usr/src/sys/kern/vfs_vnops.c:991
#18 0x809f2aec in vn_io_fault1 () at /usr/src/sys/kern/vfs_vnops.c:1047
#19 0x809f0e3b in vn_io_fault (fp=0xf80101c62780, uio=0xfe086057f970, active_cred=value optimized out, flags=0, td=0xf80171d79920) at /usr/src/sys/kern/vfs_vnops.c:1152
#20 0x80982357 in dofilewrite (td=0xf80171d79920, fd=19, fp=0xf80101c62780, auio=0xfe086057f970, offset=value optimized out, flags=0) at file.h:306
#21 0x80982088 in kern_writev (td=0xf80171d79920, fd=19, auio=0xfe086057f970) at /usr/src/sys/kern/sys_generic.c:467
#22 0x80982013 in sys_write (td=value optimized out, uap=value optimized out) at /usr/src/sys/kern/sys_generic.c:382
#23 0x80d5051b in amd64_syscall (td=0xf80171d79920, traced=0) at subr_syscall.c:133
#24 0x80d33c5b in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:390
#25 0x00080137de4a in ?? ()
##

Thanks in advance,
Tobias Berner

On Saturday 18 October 2014 20.43:12 Marcelo Araujo wrote:
When you rebuilt your system, did you rebuild and install both kernel and world?
Best Regards,

On Oct 18, 2014 7:57 PM, John Baldwin j...@freebsd.org wrote:
On Friday, October 17, 2014 11:11:26 PM Tobias C. Berner wrote:
Hi
For some days now I've had problems with my current (last test with r273178M). Sometimes when accessing an NFS share there is a page fault:

## #
Fatal trap 12: page fault while in kernel mode
cpuid = 7; apic id = 07
fault virtual address = 0xfe07cfe60400
fault code = supervisor read data, page not present
instruction pointer = 0x20:0x80d4d4b6
stack pointer = 0x28:0xfe086088b380
frame pointer = 0x28:0xfe086088b3f0
code segment = base 0x0, limit 0xf, type 0x1b
             = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 43868 (mplayer)
#0 0x80926d29 in shutdown_nice
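A note on reading the kgdb backtrace above: frames #0 through #11 are the panic, debugger, and trap-handling path, so the frame of interest is the first one after calltrap (here bzero, called from ncl_doio in the NFS client write path). The following is only a minimal sketch of that reading, assuming the backtrace is available as plain text; `parse_frames` and `frames_after_trap` are hypothetical helper names, not part of kgdb, and the sample is an excerpt copied from the trace.

```python
import re

# Matches "#12 0x80d4d58a in bzero (" as well as "#0 doadump (".
# The address part is optional because frame #0 has none.
FRAME_RE = re.compile(r"#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\S+)\s+\(")


def parse_frames(backtrace):
    """Return (frame_number, function_name) pairs found in the text."""
    return [(int(num), name) for num, name in FRAME_RE.findall(backtrace)]


def frames_after_trap(backtrace, trap_entry="calltrap"):
    """Return the frames that follow the trap entry point.

    Assumption: the code that faulted is the frame right after
    `trap_entry`. If no such frame is found, all frames are returned.
    """
    frames = parse_frames(backtrace)
    for index, (_, name) in enumerate(frames):
        if name == trap_entry:
            return frames[index + 1:]
    return frames


# Excerpt of the backtrace from the message above.
sample = """
#10 0x80d4f42e in trap (frame=0xfe086057f190) at /usr/src/sys/amd64/amd64/trap.c:426
#11 0x80d33972 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:231
#12 0x80d4d58a in bzero () at /usr/src/sys/amd64/amd64/support.S:53
#13 0x80830463 in ncl_doio (vp=0xf801e7f99938, bp=0xfe07c5a168e8, cr=value optimized out, td=value optimized out, called_from_strategy=value optimized out) at /usr/src/sys/fs/nfsclient/nfs_clbio.c:1648
#14 0x80831acf in ncl_write (ap=value optimized out) at /usr/src/sys/fs/nfsclient/nfs_clbio.c:1124
"""

print(frames_after_trap(sample)[:2])  # [(12, 'bzero'), (13, 'ncl_doio')]
```

The regular expression works on the text as a whole instead of line by line, so it also copes with a backtrace whose line breaks were lost in transit.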
Re: kernel page fault with nfs
It is still strange. Could you do what Allan said and send us the result, in case you are not sure that world and kernel are at the same revision?
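One possible way to compare kernel and world is sketched below. It assumes a FreeBSD system whose uname(1) supports the -K (kernel) and -U (user environment) flags, which print the respective __FreeBSD_version numbers; whether this is the check Allan had in mind is not stated in this part of the thread. The helper names are hypothetical. Equal values are necessary but not sufficient, since __FreeBSD_version only changes on version bumps and not on every commit.

```python
import subprocess


def freebsd_version(flag):
    """Return the number printed by `uname <flag>`.

    Assumption: FreeBSD uname(1) with -K (kernel) and -U (userland)
    support. On other systems this call will fail.
    """
    result = subprocess.run(
        ["uname", flag], capture_output=True, text=True, check=True
    )
    return int(result.stdout.strip())


def describe(kernel_version, world_version):
    """Summarize whether kernel and world report the same version."""
    if kernel_version == world_version:
        return f"kernel and world agree: {kernel_version}"
    return f"mismatch: kernel={kernel_version}, world={world_version}"


# Example with the value reported later in the thread for both sides.
print(describe(1100038, 1100038))  # kernel and world agree: 1100038
```

On a FreeBSD machine the live values could be checked with `describe(freebsd_version("-K"), freebsd_version("-U"))`.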