Re: Kernel crash during heavy disk access

2013-07-10 Thread Kirk McKusick
 Date: Tue, 9 Jul 2013 18:29:01 -0700
 Subject: Re: Kernel crash during heavy disk access
 From: Adrian Chadd adr...@freebsd.org
 To: Benjamin Kaduk b...@freebsd.org, Jeff Roberson j...@freebsd.org,
 Kirk McKusick mckus...@mckusick.com
 Cc: Eric Camachat eric.camac...@gmail.com, curr...@freebsd.org
 
 Well, best to tell kirk and jeffr.
 
 Jeffr wrote the journaling stuff.
 
 .. but I thought they knew there's still problems?
 
 -adrian

Jeff has fixed all the journaling issues that we have some way of
reproducing. We do still have some reports of problems, but they
come with only a vague description and nothing that we can use to
reproduce them on our systems.

One of the inherent characteristics of any type of journaling is that
once it thinks that it has fixed something, it never goes back and
checks it again later. So, if there is some inconsistency that gets
into your filesystem through media error or an earlier journaling bug,
it will stay there and continue to plague you until a full fsck is
run to clean it up. So, if you are getting filesystem related crashes,
the first thing you should do is a full (fsck -f) check to make sure
that you are starting from a clean state. After that, if you find that
the journaling is not keeping it consistent, please send Jeff and me
a report of what you are doing, what problems it creates, and most
importantly a transcript of a run of `fsck_ffs -d', first using the
journal and then a second time with a full check (fsck_ffs -f -d)
so that we can try to analyse what is going wrong. Note that you
need to run fsck_ffs explicitly because the fsck front end will not
pass the -d (debug output) flag through to fsck_ffs.

Kirk McKusick
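
The reporting procedure described above can be sketched as a small sh
script. This is only an illustration, not a command sequence taken from
the thread: the device name /dev/ada0p2, the LOGDIR location, and the
helper names fsck_cmd and run_pass are hypothetical, and using script(1)
to capture the transcript is an assumption (any way of saving the output
would do). It defaults to a dry run that only prints the commands. A real
run would normally be done from single-user mode, with the logs written
to a filesystem other than the one being checked.

```shell
#!/bin/sh
# Sketch only: DEV, LOGDIR and the helper names are hypothetical.
DEV="${DEV:-/dev/ada0p2}"     # placeholder device; substitute your own
LOGDIR="${LOGDIR:-/tmp}"      # must be writable and not on $DEV
DRY_RUN="${DRY_RUN:-1}"       # 1 = only print the commands (default)

# Print the fsck_ffs command line for one pass.
#   journal: debug output, journal may be used   (fsck_ffs -d)
#   full:    debug output, forced full check     (fsck_ffs -f -d)
fsck_cmd() {
    case "$1" in
        journal) echo "fsck_ffs -d $2" ;;
        full)    echo "fsck_ffs -f -d $2" ;;
        *)       echo "unknown pass: $1" >&2; return 1 ;;
    esac
}

# Run (or, in dry-run mode, just print) one pass under script(1)
# so that the whole session ends up in a transcript file.
run_pass() {
    cmd=$(fsck_cmd "$1" "$DEV") || return 1
    log="$LOGDIR/fsck-$1.log"
    if [ "$DRY_RUN" = 1 ]; then
        echo "script $log $cmd"
    else
        # fsck_ffs is invoked directly because the fsck front end
        # does not pass -d through to it.
        script "$log" $cmd
    fi
}

run_pass journal   # first pass: using the journal
run_pass full      # second pass: full check
```

Setting DRY_RUN=0 and DEV to the affected device runs both passes for
real; the two resulting log files are what would be attached to a report.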

 On 9 July 2013 17:48, Benjamin Kaduk b...@freebsd.org wrote:
 On Tue, 9 Jul 2013, Adrian Chadd wrote:

 On 9 July 2013 09:24, Eric Camachat eric.camac...@gmail.com wrote:

 On Mon, 2013-07-08 at 23:05 -0700, Adrian Chadd wrote:

 Hi,

 Try doing a full, non-journal fsck.

 -adrian


 Thank you, it fixed the problem!
 Does it mean journal didn't work?


 Yup :(


 So, you are going to tell Kirk about it?

 -Ben
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Kernel crash during heavy disk access

2013-07-10 Thread Adrian Chadd
I still get issues with latest stable/9 and panics during or just
after a bunch of disk IO.

I can try to reproduce this if you'd like.



-adrian


Re: Kernel crash during heavy disk access

2013-07-09 Thread Adrian Chadd
Hi,

Try doing a full, non-journal fsck.



-adrian

On 8 July 2013 21:41, Eric Camachat eric.camac...@gmail.com wrote:
 I experienced kernel crashes while running make world or building ports.
 For example:
 # cd /usr/ports/lang/mono
 # make

 Will cause the crash, from /var/crash/core.txt:
 eb8460p dumped core - see /var/crash/vmcore.5

 Mon Jul  8 21:22:58 PDT 2013

 FreeBSD eb8460p 10.0-CURRENT FreeBSD 10.0-CURRENT #5 r253048: Mon Jul  8 19:07:18 PDT 2013 root@eb8460p:/usr/obj/usr/src/sys/EB8460p  amd64

 panic: ffs_valloc: dup alloc

 GNU gdb 6.1.1 [FreeBSD]
 Copyright 2004 Free Software Foundation, Inc.
 GDB is free software, covered by the GNU General Public License, and you are
 welcome to change it and/or distribute copies of it under certain conditions.
 Type "show copying" to see the conditions.
 There is absolutely no warranty for GDB.  Type "show warranty" for details.
 This GDB was configured as "amd64-marcel-freebsd"...

 Unread portion of the kernel message buffer:
 mode = 0100600, inum = 52969060, fs = /
 panic: ffs_valloc: dup alloc
 cpuid = 0
 KDB: stack backtrace:
 #0 0x805fd6d0 at kdb_backtrace+0x60
 #1 0x805c5b65 at panic+0x155
 #2 0x807dda6a at ffs_valloc+0x88a
 #3 0x8081a34c at ufs_makeinode+0x7c
 #4 0x808d2872 at VOP_CREATE_APV+0x92
 #5 0x80670c49 at vn_open_cred+0x2c9
 #6 0x8066a22f at kern_openat+0x1ef
 #7 0x8085db47 at amd64_syscall+0x357
 #8 0x808475db at Xfast_syscall+0xfb
 Uptime: 6m57s
 Dumping 599 out of 3972 MB:..3%..11%..22%..33%..41%..51%..62%..73%..81%..91%

 Reading symbols from /boot/modules/cuse4bsd.ko...done.
 Loaded symbols for /boot/modules/cuse4bsd.ko
 Reading symbols from /boot/kernel/fdescfs.ko.symbols...done.
 Loaded symbols for /boot/kernel/fdescfs.ko.symbols
 Reading symbols from /boot/kernel/ng_ubt.ko.symbols...done.
 Loaded symbols for /boot/kernel/ng_ubt.ko.symbols
 Reading symbols from /boot/kernel/netgraph.ko.symbols...done.
 Loaded symbols for /boot/kernel/netgraph.ko.symbols
 Reading symbols from /boot/kernel/ng_hci.ko.symbols...done.
 Loaded symbols for /boot/kernel/ng_hci.ko.symbols
 Reading symbols from /boot/kernel/ng_bluetooth.ko.symbols...done.
 Loaded symbols for /boot/kernel/ng_bluetooth.ko.symbols
 Reading symbols from /boot/kernel/ums.ko.symbols...done.
 Loaded symbols for /boot/kernel/ums.ko.symbols
 Reading symbols from /boot/kernel/ng_l2cap.ko.symbols...done.
 Loaded symbols for /boot/kernel/ng_l2cap.ko.symbols
 Reading symbols from /boot/kernel/ng_btsocket.ko.symbols...done.
 Loaded symbols for /boot/kernel/ng_btsocket.ko.symbols
 Reading symbols from /boot/kernel/ng_socket.ko.symbols...done.
 Loaded symbols for /boot/kernel/ng_socket.ko.symbols
 #0  doadump (textdump=value optimized out) at pcpu.h:236
 236 pcpu.h: No such file or directory.
 in pcpu.h
 (kgdb) #0  doadump (textdump=value optimized out) at pcpu.h:236
 #1  0x805c57e0 in kern_reboot (howto=260)
 at /usr/src/sys/kern/kern_shutdown.c:447
 #2  0x805c5ba4 in panic (fmt=value optimized out)
 at /usr/src/sys/kern/kern_shutdown.c:754
 #3  0x807dda6a in ffs_valloc (pvp=value optimized out,
 mode=value optimized out, cred=value optimized out,
 vpp=value optimized out) at /usr/src/sys/ufs/ffs/ffs_alloc.c:1022
 #4  0x8081a34c in ufs_makeinode (mode=value optimized out,
 dvp=0xfe011bf44ce8, vpp=0xff811ba058d8,
 cnp=0xff811ba05900)
 at /usr/src/sys/ufs/ufs/ufs_vnops.c:2620
 #5  0x808d2872 in VOP_CREATE_APV (vop=value optimized out,
 a=value optimized out) at vnode_if.c:265
 #6  0x80670c49 in vn_open_cred (ndp=0xff811ba05880,
 flagp=0xff811ba0595c, cmode=420, vn_open_flags=value optimized out,
 cred=0xfe0011fcee00, fp=0xfe00110925a0) at vnode_if.h:109
 #7  0x8066a22f in kern_openat (td=0xfe011960f920,
 fd=value optimized out,
 path=0x801dbd580 Address 0x801dbd580 out of bounds,
 pathseg=UIO_USERSPACE, flags=1538, mode=value optimized out)
 at /usr/src/sys/kern/vfs_syscalls.c:1093
 #8  0x8085db47 in amd64_syscall (td=0xfe011960f920,
 traced=0)
 at subr_syscall.c:134
 #9  0x808475db in Xfast_syscall ()
 at /usr/src/sys/amd64/amd64/exception.S:391
 #10 0x0008013a5f2a in ?? ()
 Previous frame inner to this frame (corrupt stack?)
 Current language:  auto; currently minimal
 (kgdb)


 --
 Eric Camachat


Re: Kernel crash during heavy disk access

2013-07-09 Thread Eric Camachat
On Mon, 2013-07-08 at 23:05 -0700, Adrian Chadd wrote:
 Hi,
 
 Try doing a full, non-journal fsck.
 
 -adrian

Thank you, it fixed the problem!
Does it mean journal didn't work?

-- 
Eric Camachat




Re: Kernel crash during heavy disk access

2013-07-09 Thread Adrian Chadd
On 9 July 2013 09:24, Eric Camachat eric.camac...@gmail.com wrote:
 On Mon, 2013-07-08 at 23:05 -0700, Adrian Chadd wrote:
 Hi,

 Try doing a full, non-journal fsck.

 -adrian

 Thank you, it fixed the problem!
 Does it mean journal didn't work?

Yup :(



-adrian


Re: Kernel crash during heavy disk access

2013-07-09 Thread Benjamin Kaduk

On Tue, 9 Jul 2013, Adrian Chadd wrote:


On 9 July 2013 09:24, Eric Camachat eric.camac...@gmail.com wrote:

On Mon, 2013-07-08 at 23:05 -0700, Adrian Chadd wrote:

Hi,

Try doing a full, non-journal fsck.

-adrian


Thank you, it fixed the problem!
Does it mean journal didn't work?


Yup :(


So, you are going to tell Kirk about it?

-Ben


Re: Kernel crash during heavy disk access

2013-07-09 Thread Adrian Chadd
Well, best to tell kirk and jeffr.

Jeffr wrote the journaling stuff.

.. but I thought they knew there's still problems?



-adrian


On 9 July 2013 17:48, Benjamin Kaduk b...@freebsd.org wrote:
 On Tue, 9 Jul 2013, Adrian Chadd wrote:

 On 9 July 2013 09:24, Eric Camachat eric.camac...@gmail.com wrote:

 On Mon, 2013-07-08 at 23:05 -0700, Adrian Chadd wrote:

 Hi,

 Try doing a full, non-journal fsck.

 -adrian


 Thank you, it fixed the problem!
 Does it mean journal didn't work?


 Yup :(


 So, you are going to tell Kirk about it?

 -Ben