Re: XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358
On Thu, Jul 11, 2013 at 10:39:30PM -0400, Dave Jones wrote: > Just saw this during boot after an unclean shutdown. It hung afterwards. > > [ 97.162665] XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, > file: fs/xfs/xfs_dir2_sf.c, line: 358 > [ 97.164646] [ cut here ] I just reproduced this by chance. I know exactly what I was doing when it went bang, so I might be able to reduce it from scattergun blast testing to be able to reproduce it on demand. Cheers, Dave. -- Dave Chinner da...@fromorbit.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358
On Thu, Jul 11, 2013 at 10:39:30PM -0400, Dave Jones wrote: Just saw this during boot after an unclean shutdown. It hung afterwards. [ 97.162665] XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358 [ 97.164646] [ cut here ] I just reproduced this by chance. I know exactly what I was doing when it went bang, so I might be able to reduce it from scattergun blast testing to be able to reproduce it on demand. Cheers, Dave. -- Dave Chinner da...@fromorbit.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358
On Mon, Jul 15, 2013 at 09:42:44PM -0400, Dave Jones wrote: > On Tue, Jul 16, 2013 at 10:27:03AM +1000, Dave Chinner wrote: > > On Mon, Jul 15, 2013 at 03:58:00PM -0400, Dave Jones wrote: > > > On Sat, Jul 13, 2013 at 12:00:30PM +1000, Dave Chinner wrote: > > > > > > > > [ 97.162665] XFS: Assertion failed: xfs_dir2_sf_lookup(args) == > ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358 > > > > > > > > > > > > So, it's not clear what has caused this yet. Is it reproducable? > > > > > > Just hit it again on the same machine. > > > > > > > would be good to get a trace of lookup vs addname events from XFS, > > > > too (i.e. all the xfs_dir* and xfs_da* events) so we can see if the > > > > correct lookups were done prior to the failing addname operation... > > > > > > <- xfs tracing noob. Got a recipe I can cut-n-paste ? > > > Or a patch if that's necessary.. > > > > Just use your usual method of pulling tracepoints out of the kernel, > > be it with perf or (my preferred method) trace-cmd: > > > > # trace-cmd record -e xfs_dir\* -e xfs_da\* > > > > # trace-cmd report > trace.report.out > > > > The trace.report.out file is what I'd like to see. The report file > > could be large, so compressing it might be an idea... > > Hmm, will that only log traces from the command that I run ? No, it will capture kernel wide. The command being run just gives it a defined end-point to the capture. i.e. I often just use "sleep 30" as the command. $ sudo ~/trace-cmd/trace-cmd record -e xfs\* sleep 30 & [1] 4180 $ /sys/kernel/debug/tracing/events/xfs*/filter /sys/kernel/debug/tracing/events/*/xfs*/filter $ sudo mount /dev/vda /mnt/test $ sync $ sudo umount /mnt/test $ fg sudo ~/trace-cmd/trace-cmd record -e xfs\* sleep 30 trace-cmd: Interrupted system call recorder error in splice input offset=3eb000 Kernel buffer statistics: Note: "entries" are the entries left in the kernel ring buffer and are not recorded in the trace data. They should all be zero. CPU: 0 entries: 55 overrun: 0 commit overrun: 0 bytes: 3420 oldest event ts: 48589.221789 now ts: 48607.261966 dropped events: 0 read events: 352 $ sudo ~/trace-cmd/trace-cmd report > ~/t.t $ head -10 ~/t.t version = 6 cpus=1 mount-4185 [000] 48582.777997: xfs_buf_init: dev 253:0 bno 0x nblks 0x1 hold 1 pincount 0 lock 0 flags caller xfs_buf_get_uncached mount-4185 [000] 48582.778001: xfs_buf_get_uncached: dev 253:0 bno 0x nblks 0x1 hold 1 pincount 0 lock 0 flags PAGES caller xfs_buf_read_uncached mount-4185 [000] 48582.778002: xfs_buf_iorequest:dev 253:0 bno 0x0 nblks 0x1 hold 1 pincount 0 lock 0 flags READ|PAGES caller xfsbdstrat mount-4185 [000] 48582.778002: xfs_buf_hold: dev 253:0 bno 0x0 nblks 0x1 hold 1 pincount 0 lock 0 flags READ|PAGES caller xfs_buf_iorequest mount-4185 [000] 48582.778017: xfs_buf_rele: dev 253:0 bno 0x0 nblks 0x1 hold 2 pincount 0 lock 0 flags READ|PAGES caller xfs_buf_iorequest mount-4185 [000] 48582.778017: xfs_buf_iowait: dev 253:0 bno 0x0 nblks 0x1 hold 1 pincount 0 lock 0 flags READ|PAGES caller xfs_buf_read_uncached -0 [000] 48582.778692: xfs_buf_ioerror: dev 253:0 bno 0x0 len 0x200 hold 1 pincount 0 lock 0 error 0 flags READ|PAGES caller xfs_buf_bio_end_io -0 [000] 48582.778693: xfs_buf_iodone: dev 253:0 bno 0x0 nblks 0x1 hold 1 pincount 0 lock 0 flags READ|PAGES caller _xfs_buf_ioend $ > I might need to do some kind of system-wide tracing here, > because every time this has happened so far, it's happened in > a different process. (rpcbind the first [wtf?], and chrony the 2nd) > Can trace-cmd do that ? Yup. trace-cmd will trace global events unless you give it a specific process filter to limit what PIDs it gathers events from. Basically, trace-cmd is a much nicer front end to perf... Cheers, Dave. -- Dave Chinner da...@fromorbit.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358
On Tue, Jul 16, 2013 at 10:27:03AM +1000, Dave Chinner wrote: > On Mon, Jul 15, 2013 at 03:58:00PM -0400, Dave Jones wrote: > > On Sat, Jul 13, 2013 at 12:00:30PM +1000, Dave Chinner wrote: > > > > > > [ 97.162665] XFS: Assertion failed: xfs_dir2_sf_lookup(args) == > > ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358 > > > > > > > > > So, it's not clear what has caused this yet. Is it reproducable? > > > > Just hit it again on the same machine. > > > > > would be good to get a trace of lookup vs addname events from XFS, > > > too (i.e. all the xfs_dir* and xfs_da* events) so we can see if the > > > correct lookups were done prior to the failing addname operation... > > > > <- xfs tracing noob. Got a recipe I can cut-n-paste ? > > Or a patch if that's necessary.. > > Just use your usual method of pulling tracepoints out of the kernel, > be it with perf or (my preferred method) trace-cmd: > > # trace-cmd record -e xfs_dir\* -e xfs_da\* > > # trace-cmd report > trace.report.out > > The trace.report.out file is what I'd like to see. The report file > could be large, so compressing it might be an idea... Hmm, will that only log traces from the command that I run ? I might need to do some kind of system-wide tracing here, because every time this has happened so far, it's happened in a different process. (rpcbind the first [wtf?], and chrony the 2nd) Can trace-cmd do that ? Dave -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358
On Mon, Jul 15, 2013 at 03:58:00PM -0400, Dave Jones wrote: > On Sat, Jul 13, 2013 at 12:00:30PM +1000, Dave Chinner wrote: > > > > [ 97.162665] XFS: Assertion failed: xfs_dir2_sf_lookup(args) == > ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358 > > > > > > So, it's not clear what has caused this yet. Is it reproducable? > > Just hit it again on the same machine. > > > would be good to get a trace of lookup vs addname events from XFS, > > too (i.e. all the xfs_dir* and xfs_da* events) so we can see if the > > correct lookups were done prior to the failing addname operation... > > <- xfs tracing noob. Got a recipe I can cut-n-paste ? > Or a patch if that's necessary.. Just use your usual method of pulling tracepoints out of the kernel, be it with perf or (my preferred method) trace-cmd: # trace-cmd record -e xfs_dir\* -e xfs_da\* # trace-cmd report > trace.report.out The trace.report.out file is what I'd like to see. The report file could be large, so compressing it might be an idea... Cheers, Dave. -- Dave Chinner da...@fromorbit.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358
On Sat, Jul 13, 2013 at 12:00:30PM +1000, Dave Chinner wrote: > > [ 97.162665] XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, > > file: fs/xfs/xfs_dir2_sf.c, line: 358 > > > So, it's not clear what has caused this yet. Is it reproducable? Just hit it again on the same machine. > would be good to get a trace of lookup vs addname events from XFS, > too (i.e. all the xfs_dir* and xfs_da* events) so we can see if the > correct lookups were done prior to the failing addname operation... <- xfs tracing noob. Got a recipe I can cut-n-paste ? Or a patch if that's necessary.. Dave -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358
On Sat, Jul 13, 2013 at 12:00:30PM +1000, Dave Chinner wrote: [ 97.162665] XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358 So, it's not clear what has caused this yet. Is it reproducable? Just hit it again on the same machine. would be good to get a trace of lookup vs addname events from XFS, too (i.e. all the xfs_dir* and xfs_da* events) so we can see if the correct lookups were done prior to the failing addname operation... - xfs tracing noob. Got a recipe I can cut-n-paste ? Or a patch if that's necessary.. Dave -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358
On Mon, Jul 15, 2013 at 03:58:00PM -0400, Dave Jones wrote: On Sat, Jul 13, 2013 at 12:00:30PM +1000, Dave Chinner wrote: [ 97.162665] XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358 So, it's not clear what has caused this yet. Is it reproducable? Just hit it again on the same machine. would be good to get a trace of lookup vs addname events from XFS, too (i.e. all the xfs_dir* and xfs_da* events) so we can see if the correct lookups were done prior to the failing addname operation... - xfs tracing noob. Got a recipe I can cut-n-paste ? Or a patch if that's necessary.. Just use your usual method of pulling tracepoints out of the kernel, be it with perf or (my preferred method) trace-cmd: # trace-cmd record -e xfs_dir\* -e xfs_da\* command that dies # trace-cmd report trace.report.out The trace.report.out file is what I'd like to see. The report file could be large, so compressing it might be an idea... Cheers, Dave. -- Dave Chinner da...@fromorbit.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358
On Tue, Jul 16, 2013 at 10:27:03AM +1000, Dave Chinner wrote: On Mon, Jul 15, 2013 at 03:58:00PM -0400, Dave Jones wrote: On Sat, Jul 13, 2013 at 12:00:30PM +1000, Dave Chinner wrote: [ 97.162665] XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358 So, it's not clear what has caused this yet. Is it reproducable? Just hit it again on the same machine. would be good to get a trace of lookup vs addname events from XFS, too (i.e. all the xfs_dir* and xfs_da* events) so we can see if the correct lookups were done prior to the failing addname operation... - xfs tracing noob. Got a recipe I can cut-n-paste ? Or a patch if that's necessary.. Just use your usual method of pulling tracepoints out of the kernel, be it with perf or (my preferred method) trace-cmd: # trace-cmd record -e xfs_dir\* -e xfs_da\* command that dies # trace-cmd report trace.report.out The trace.report.out file is what I'd like to see. The report file could be large, so compressing it might be an idea... Hmm, will that only log traces from the command that I run ? I might need to do some kind of system-wide tracing here, because every time this has happened so far, it's happened in a different process. (rpcbind the first [wtf?], and chrony the 2nd) Can trace-cmd do that ? Dave -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358
On Mon, Jul 15, 2013 at 09:42:44PM -0400, Dave Jones wrote: On Tue, Jul 16, 2013 at 10:27:03AM +1000, Dave Chinner wrote: On Mon, Jul 15, 2013 at 03:58:00PM -0400, Dave Jones wrote: On Sat, Jul 13, 2013 at 12:00:30PM +1000, Dave Chinner wrote: [ 97.162665] XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358 So, it's not clear what has caused this yet. Is it reproducable? Just hit it again on the same machine. would be good to get a trace of lookup vs addname events from XFS, too (i.e. all the xfs_dir* and xfs_da* events) so we can see if the correct lookups were done prior to the failing addname operation... - xfs tracing noob. Got a recipe I can cut-n-paste ? Or a patch if that's necessary.. Just use your usual method of pulling tracepoints out of the kernel, be it with perf or (my preferred method) trace-cmd: # trace-cmd record -e xfs_dir\* -e xfs_da\* command that dies # trace-cmd report trace.report.out The trace.report.out file is what I'd like to see. The report file could be large, so compressing it might be an idea... Hmm, will that only log traces from the command that I run ? No, it will capture kernel wide. The command being run just gives it a defined end-point to the capture. i.e. I often just use sleep 30 as the command. $ sudo ~/trace-cmd/trace-cmd record -e xfs\* sleep 30 [1] 4180 $ /sys/kernel/debug/tracing/events/xfs*/filter /sys/kernel/debug/tracing/events/*/xfs*/filter $ sudo mount /dev/vda /mnt/test $ sync $ sudo umount /mnt/test $ fg sudo ~/trace-cmd/trace-cmd record -e xfs\* sleep 30 trace-cmd: Interrupted system call recorder error in splice input offset=3eb000 Kernel buffer statistics: Note: entries are the entries left in the kernel ring buffer and are not recorded in the trace data. They should all be zero. CPU: 0 entries: 55 overrun: 0 commit overrun: 0 bytes: 3420 oldest event ts: 48589.221789 now ts: 48607.261966 dropped events: 0 read events: 352 $ sudo ~/trace-cmd/trace-cmd report ~/t.t $ head -10 ~/t.t version = 6 cpus=1 mount-4185 [000] 48582.777997: xfs_buf_init: dev 253:0 bno 0x nblks 0x1 hold 1 pincount 0 lock 0 flags caller xfs_buf_get_uncached mount-4185 [000] 48582.778001: xfs_buf_get_uncached: dev 253:0 bno 0x nblks 0x1 hold 1 pincount 0 lock 0 flags PAGES caller xfs_buf_read_uncached mount-4185 [000] 48582.778002: xfs_buf_iorequest:dev 253:0 bno 0x0 nblks 0x1 hold 1 pincount 0 lock 0 flags READ|PAGES caller xfsbdstrat mount-4185 [000] 48582.778002: xfs_buf_hold: dev 253:0 bno 0x0 nblks 0x1 hold 1 pincount 0 lock 0 flags READ|PAGES caller xfs_buf_iorequest mount-4185 [000] 48582.778017: xfs_buf_rele: dev 253:0 bno 0x0 nblks 0x1 hold 2 pincount 0 lock 0 flags READ|PAGES caller xfs_buf_iorequest mount-4185 [000] 48582.778017: xfs_buf_iowait: dev 253:0 bno 0x0 nblks 0x1 hold 1 pincount 0 lock 0 flags READ|PAGES caller xfs_buf_read_uncached idle-0 [000] 48582.778692: xfs_buf_ioerror: dev 253:0 bno 0x0 len 0x200 hold 1 pincount 0 lock 0 error 0 flags READ|PAGES caller xfs_buf_bio_end_io idle-0 [000] 48582.778693: xfs_buf_iodone: dev 253:0 bno 0x0 nblks 0x1 hold 1 pincount 0 lock 0 flags READ|PAGES caller _xfs_buf_ioend $ I might need to do some kind of system-wide tracing here, because every time this has happened so far, it's happened in a different process. (rpcbind the first [wtf?], and chrony the 2nd) Can trace-cmd do that ? Yup. trace-cmd will trace global events unless you give it a specific process filter to limit what PIDs it gathers events from. Basically, trace-cmd is a much nicer front end to perf... Cheers, Dave. -- Dave Chinner da...@fromorbit.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358
On Sat, Jul 13, 2013 at 12:00:30PM +1000, Dave Chinner wrote: > On Thu, Jul 11, 2013 at 10:39:30PM -0400, Dave Jones wrote: > > Just saw this during boot after an unclean shutdown. It hung afterwards. > > > > [ 97.162665] XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, > > file: fs/xfs/xfs_dir2_sf.c, line: 358 > > > [ 97.173730] [] xfs_dir2_sf_addname+0x43/0x760 [xfs] > > [ 97.173743] [] xfs_dir_createname+0x15c/0x1b0 [xfs] > > [ 97.173754] [] xfs_create+0x4cc/0x710 [xfs] > > [ 97.173764] [] xfs_vn_mknod+0x9a/0x1c0 [xfs] > > [ 97.173773] [] xfs_vn_create+0x13/0x20 [xfs] > > [ 97.173776] [] vfs_create+0x9d/0x100 > > [ 97.173778] [] do_last+0x925/0xe00 > > [ 97.173780] [] path_openat+0xbe/0x6f0 > > [ 97.173783] [] ? local_clock+0x3f/0x50 > > [ 97.173785] [] ? __alloc_fd+0xaf/0x200 > > [ 97.173787] [] do_filp_open+0x3a/0x90 > > [ 97.173789] [] ? __alloc_fd+0xaf/0x200 > > [ 97.173790] [] do_sys_open+0x10b/0x200 > > [ 97.173792] [] ? syscall_trace_enter+0x18/0x290 > > [ 97.173794] [] SyS_open+0x1e/0x20 > > > > This trace repeated a few times, then the same assertion was triggered > > from sys_renameat. > > That's rather curious. What this means is that there is either an > EIO or EEXIST error being returned from xfs_dir2_sf_lookup() when a > we're about to add the new entry. There are two things here - EIO > can only be returned if a shutdown has occurred - are there any > signs of a shutdown in the logs? not that I recall (machine is off, and remote right now). 1st time it was during boot (above), and 2nd time I hit it the machine kept fuzzing for a while afterwards. I'll see if I can reproduce it again on monday, and if so I'll grab an xfsdump for you to stare at. Dave -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358
On Sat, Jul 13, 2013 at 12:00:30PM +1000, Dave Chinner wrote: On Thu, Jul 11, 2013 at 10:39:30PM -0400, Dave Jones wrote: Just saw this during boot after an unclean shutdown. It hung afterwards. [ 97.162665] XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358 [ 97.173730] [a0076953] xfs_dir2_sf_addname+0x43/0x760 [xfs] [ 97.173743] [a0067cfc] xfs_dir_createname+0x15c/0x1b0 [xfs] [ 97.173754] [a002f2dc] xfs_create+0x4cc/0x710 [xfs] [ 97.173764] [a00278ca] xfs_vn_mknod+0x9a/0x1c0 [xfs] [ 97.173773] [a0027a03] xfs_vn_create+0x13/0x20 [xfs] [ 97.173776] [811d100d] vfs_create+0x9d/0x100 [ 97.173778] [811d1995] do_last+0x925/0xe00 [ 97.173780] [811d1f2e] path_openat+0xbe/0x6f0 [ 97.173783] [8109e33f] ? local_clock+0x3f/0x50 [ 97.173785] [811e1b5f] ? __alloc_fd+0xaf/0x200 [ 97.173787] [811d2c3a] do_filp_open+0x3a/0x90 [ 97.173789] [811e1b5f] ? __alloc_fd+0xaf/0x200 [ 97.173790] [811c0ddb] do_sys_open+0x10b/0x200 [ 97.173792] [81010578] ? syscall_trace_enter+0x18/0x290 [ 97.173794] [811c0eee] SyS_open+0x1e/0x20 This trace repeated a few times, then the same assertion was triggered from sys_renameat. That's rather curious. What this means is that there is either an EIO or EEXIST error being returned from xfs_dir2_sf_lookup() when a we're about to add the new entry. There are two things here - EIO can only be returned if a shutdown has occurred - are there any signs of a shutdown in the logs? not that I recall (machine is off, and remote right now). 1st time it was during boot (above), and 2nd time I hit it the machine kept fuzzing for a while afterwards. I'll see if I can reproduce it again on monday, and if so I'll grab an xfsdump for you to stare at. Dave -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358
On Thu, Jul 11, 2013 at 10:39:30PM -0400, Dave Jones wrote: > Just saw this during boot after an unclean shutdown. It hung afterwards. > > [ 97.162665] XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, > file: fs/xfs/xfs_dir2_sf.c, line: 358 > [ 97.173730] [] xfs_dir2_sf_addname+0x43/0x760 [xfs] > [ 97.173743] [] xfs_dir_createname+0x15c/0x1b0 [xfs] > [ 97.173754] [] xfs_create+0x4cc/0x710 [xfs] > [ 97.173764] [] xfs_vn_mknod+0x9a/0x1c0 [xfs] > [ 97.173773] [] xfs_vn_create+0x13/0x20 [xfs] > [ 97.173776] [] vfs_create+0x9d/0x100 > [ 97.173778] [] do_last+0x925/0xe00 > [ 97.173780] [] path_openat+0xbe/0x6f0 > [ 97.173783] [] ? local_clock+0x3f/0x50 > [ 97.173785] [] ? __alloc_fd+0xaf/0x200 > [ 97.173787] [] do_filp_open+0x3a/0x90 > [ 97.173789] [] ? __alloc_fd+0xaf/0x200 > [ 97.173790] [] do_sys_open+0x10b/0x200 > [ 97.173792] [] ? syscall_trace_enter+0x18/0x290 > [ 97.173794] [] SyS_open+0x1e/0x20 > > This trace repeated a few times, then the same assertion was triggered from > sys_renameat. That's rather curious. What this means is that there is either an EIO or EEXIST error being returned from xfs_dir2_sf_lookup() when a we're about to add the new entry. There are two things here - EIO can only be returned if a shutdown has occurred - are there any signs of a shutdown in the logs? If there is a shutdown in progress, then this is just unlucky to shutdown with an inode in an inconsistent state in memory that triggers this validity check failure. And EEXIST means that the initial lookup of the name during the open failed to find the entry we are now trying to create. i.e. the initial path walk failed to do the correct lookup on the directory, and so never got down to xfs_dir2_sf_lookup() to find the directory entry (perhaps a problem with a cached negative dentry?). Hence it was decided during the open(O_CREATE) call that the directory entry needed to be created, we get down to XFS to create it, and then get EEXIST because the name already exists... So, it's not clear what has caused this yet. Is it reproducable? If would be good to get a trace of lookup vs addname events from XFS, too (i.e. all the xfs_dir* and xfs_da* events) so we can see if the correct lookups were done prior to the failing addname operation... Cheers, Dave. -- Dave Chinner da...@fromorbit.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358
On Thu, Jul 11, 2013 at 10:39:30PM -0400, Dave Jones wrote: Just saw this during boot after an unclean shutdown. It hung afterwards. [ 97.162665] XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358 [ 97.173730] [a0076953] xfs_dir2_sf_addname+0x43/0x760 [xfs] [ 97.173743] [a0067cfc] xfs_dir_createname+0x15c/0x1b0 [xfs] [ 97.173754] [a002f2dc] xfs_create+0x4cc/0x710 [xfs] [ 97.173764] [a00278ca] xfs_vn_mknod+0x9a/0x1c0 [xfs] [ 97.173773] [a0027a03] xfs_vn_create+0x13/0x20 [xfs] [ 97.173776] [811d100d] vfs_create+0x9d/0x100 [ 97.173778] [811d1995] do_last+0x925/0xe00 [ 97.173780] [811d1f2e] path_openat+0xbe/0x6f0 [ 97.173783] [8109e33f] ? local_clock+0x3f/0x50 [ 97.173785] [811e1b5f] ? __alloc_fd+0xaf/0x200 [ 97.173787] [811d2c3a] do_filp_open+0x3a/0x90 [ 97.173789] [811e1b5f] ? __alloc_fd+0xaf/0x200 [ 97.173790] [811c0ddb] do_sys_open+0x10b/0x200 [ 97.173792] [81010578] ? syscall_trace_enter+0x18/0x290 [ 97.173794] [811c0eee] SyS_open+0x1e/0x20 This trace repeated a few times, then the same assertion was triggered from sys_renameat. That's rather curious. What this means is that there is either an EIO or EEXIST error being returned from xfs_dir2_sf_lookup() when a we're about to add the new entry. There are two things here - EIO can only be returned if a shutdown has occurred - are there any signs of a shutdown in the logs? If there is a shutdown in progress, then this is just unlucky to shutdown with an inode in an inconsistent state in memory that triggers this validity check failure. And EEXIST means that the initial lookup of the name during the open failed to find the entry we are now trying to create. i.e. the initial path walk failed to do the correct lookup on the directory, and so never got down to xfs_dir2_sf_lookup() to find the directory entry (perhaps a problem with a cached negative dentry?). Hence it was decided during the open(O_CREATE) call that the directory entry needed to be created, we get down to XFS to create it, and then get EEXIST because the name already exists... So, it's not clear what has caused this yet. Is it reproducable? If would be good to get a trace of lookup vs addname events from XFS, too (i.e. all the xfs_dir* and xfs_da* events) so we can see if the correct lookups were done prior to the failing addname operation... Cheers, Dave. -- Dave Chinner da...@fromorbit.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358
Just saw this during boot after an unclean shutdown. It hung afterwards. [ 97.162665] XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358 [ 97.164646] [ cut here ] [ 97.165312] kernel BUG at fs/xfs/xfs_message.c:108! [ 97.165986] invalid opcode: [#1] PREEMPT SMP DEBUG_PAGEALLOC [ 97.166947] Modules linked in: snd_hda_codec_realtek pcspkr usb_debug snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm e1000e ptp snd_page_alloc pps_core snd_timer snd soundcore xfs libcrc32c [ 97.170782] CPU: 3 PID: 341 Comm: rpcbind Not tainted 3.10.0+ #49 [ 97.173684] task: 880235d14a40 ti: 880235dae000 task.ti: 880235dae000 [ 97.173701] RIP: 0010:[] [] assfail+0x22/0x30 [xfs] [ 97.173702] RSP: 0018:880235dafa98 EFLAGS: 00010296 [ 97.173703] RAX: 0060 RBX: 880235dafb20 RCX: 0006 [ 97.173703] RDX: 31e0 RSI: 880235d152a0 RDI: 880235d14a40 [ 97.173704] RBP: 880235dafa98 R08: 0002 R09: [ 97.173704] R10: 0001 R11: 0001 R12: 880235e54000 [ 97.173705] R13: 00035e92 R14: 880235dafc88 R15: 880235dafc28 [ 97.173706] FS: 7fa6aac79840() GS:880244e0() knlGS: [ 97.173706] CS: 0010 DS: ES: CR0: 80050033 [ 97.173707] CR2: 7fffb1905f18 CR3: 000235ce5000 CR4: 001407e0 [ 97.173707] DR0: DR1: DR2: [ 97.173708] DR3: DR6: fffe0ff0 DR7: 0400 [ 97.173708] Stack: [ 97.173712] 880235dafb00 a0076953 e8c066a0 880240e04ca8 [ 97.173713] 0250 0020 88023d05af40 880235dafb20 [ 97.173715] 880235a14a40 880235e54000 00035e92 880235dafc88 [ 97.173715] Call Trace: [ 97.173730] [] xfs_dir2_sf_addname+0x43/0x760 [xfs] [ 97.173743] [] xfs_dir_createname+0x15c/0x1b0 [xfs] [ 97.173754] [] xfs_create+0x4cc/0x710 [xfs] [ 97.173764] [] xfs_vn_mknod+0x9a/0x1c0 [xfs] [ 97.173773] [] xfs_vn_create+0x13/0x20 [xfs] [ 97.173776] [] vfs_create+0x9d/0x100 [ 97.173778] [] do_last+0x925/0xe00 [ 97.173780] [] path_openat+0xbe/0x6f0 [ 97.173783] [] ? local_clock+0x3f/0x50 [ 97.173785] [] ? __alloc_fd+0xaf/0x200 [ 97.173787] [] do_filp_open+0x3a/0x90 [ 97.173789] [] ? __alloc_fd+0xaf/0x200 [ 97.173790] [] do_sys_open+0x10b/0x200 [ 97.173792] [] ? syscall_trace_enter+0x18/0x290 [ 97.173794] [] SyS_open+0x1e/0x20 [ 97.173796] [] tracesys+0xdd/0xe2 [ 97.173813] Code: 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 f1 41 89 d0 48 89 e5 48 89 fa 48 c7 c6 a8 98 0b a0 31 ff 31 c0 e8 ce fb ff ff <0f> 0b 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 This trace repeated a few times, then the same assertion was triggered from sys_renameat. Dave -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358
Just saw this during boot after an unclean shutdown. It hung afterwards. [ 97.162665] XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358 [ 97.164646] [ cut here ] [ 97.165312] kernel BUG at fs/xfs/xfs_message.c:108! [ 97.165986] invalid opcode: [#1] PREEMPT SMP DEBUG_PAGEALLOC [ 97.166947] Modules linked in: snd_hda_codec_realtek pcspkr usb_debug snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm e1000e ptp snd_page_alloc pps_core snd_timer snd soundcore xfs libcrc32c [ 97.170782] CPU: 3 PID: 341 Comm: rpcbind Not tainted 3.10.0+ #49 [ 97.173684] task: 880235d14a40 ti: 880235dae000 task.ti: 880235dae000 [ 97.173701] RIP: 0010:[a002a362] [a002a362] assfail+0x22/0x30 [xfs] [ 97.173702] RSP: 0018:880235dafa98 EFLAGS: 00010296 [ 97.173703] RAX: 0060 RBX: 880235dafb20 RCX: 0006 [ 97.173703] RDX: 31e0 RSI: 880235d152a0 RDI: 880235d14a40 [ 97.173704] RBP: 880235dafa98 R08: 0002 R09: [ 97.173704] R10: 0001 R11: 0001 R12: 880235e54000 [ 97.173705] R13: 00035e92 R14: 880235dafc88 R15: 880235dafc28 [ 97.173706] FS: 7fa6aac79840() GS:880244e0() knlGS: [ 97.173706] CS: 0010 DS: ES: CR0: 80050033 [ 97.173707] CR2: 7fffb1905f18 CR3: 000235ce5000 CR4: 001407e0 [ 97.173707] DR0: DR1: DR2: [ 97.173708] DR3: DR6: fffe0ff0 DR7: 0400 [ 97.173708] Stack: [ 97.173712] 880235dafb00 a0076953 e8c066a0 880240e04ca8 [ 97.173713] 0250 0020 88023d05af40 880235dafb20 [ 97.173715] 880235a14a40 880235e54000 00035e92 880235dafc88 [ 97.173715] Call Trace: [ 97.173730] [a0076953] xfs_dir2_sf_addname+0x43/0x760 [xfs] [ 97.173743] [a0067cfc] xfs_dir_createname+0x15c/0x1b0 [xfs] [ 97.173754] [a002f2dc] xfs_create+0x4cc/0x710 [xfs] [ 97.173764] [a00278ca] xfs_vn_mknod+0x9a/0x1c0 [xfs] [ 97.173773] [a0027a03] xfs_vn_create+0x13/0x20 [xfs] [ 97.173776] [811d100d] vfs_create+0x9d/0x100 [ 97.173778] [811d1995] do_last+0x925/0xe00 [ 97.173780] [811d1f2e] path_openat+0xbe/0x6f0 [ 97.173783] [8109e33f] ? local_clock+0x3f/0x50 [ 97.173785] [811e1b5f] ? __alloc_fd+0xaf/0x200 [ 97.173787] [811d2c3a] do_filp_open+0x3a/0x90 [ 97.173789] [811e1b5f] ? __alloc_fd+0xaf/0x200 [ 97.173790] [811c0ddb] do_sys_open+0x10b/0x200 [ 97.173792] [81010578] ? syscall_trace_enter+0x18/0x290 [ 97.173794] [811c0eee] SyS_open+0x1e/0x20 [ 97.173796] [81717c14] tracesys+0xdd/0xe2 [ 97.173813] Code: 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 f1 41 89 d0 48 89 e5 48 89 fa 48 c7 c6 a8 98 0b a0 31 ff 31 c0 e8 ce fb ff ff 0f 0b 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 This trace repeated a few times, then the same assertion was triggered from sys_renameat. Dave -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/