On Tue, Aug 09, 2016 at 06:27:33PM +0000, Hugo Mills wrote:
> On Tue, Aug 09, 2016 at 02:26:14PM -0400, Chris Mason wrote:
> > On 08/09/2016 02:23 PM, Hugo Mills wrote:
> > >   Hi, Chris,
> > >
> > >On Tue, Aug 09, 2016 at 02:02:20PM -0400, Chris Mason wrote:
> > >>On 08/09/2016 01:27 PM, Hugo Mills wrote:
> > >>>   Over the weekend, I started doing some maintenance on my server: I
> > >>>upgraded to 4.7.0, and I started deleting a device from my array,
> > >>>preparatory to putting in a larger one. About halfway through the
> > >>>operation, several kernel threads hung up for a while (resulting in
> > >>>"blocked for 120s" messages), and then the delete process seems to
> > >>>have stopped entirely, although several kernel threads are at maximum
> > >>>usage.
> > >>>
> > >>>   After a few hours, I rebooted the machine, and left it for a day or
> > >>>so. I tried the delete again this afternoon, and it's done the same
> > >>>thing again. The full log is included below. I have a kworker and a
> > >>>btrfs-transaction pegged at close to 100% of a core each, and a
> > >>>btrfs-cleaner (and the btrfs dev del process) in D state.
> > >>>
> > >>>   The FS was not under load at the time of the failure, and it passes
> > >>>scrub. I haven't tried a btrfs check yet.
> > >>
> > >>Thanks Hugo, can you nail down which line of code belongs to:
> > >>
> > >>btrfs_async_run_delayed_refs+0xc6
> > >
> > >   I'm having a spot of trouble with this. The btrfs on this kernel is
> > >built-in, and I've lost the contents of the build directory (it's done
> > >by an overnight build script, and it's already built a 4.8-rc1 for one
> > >of my other machines).
> > >
> > >(gdb) file /boot/vmlinuz-4.7.0-dirty
> > >BFD: /boot/vmlinuz-4.7.0-dirty: Warning: Ignoring section flag
> > >IMAGE_SCN_MEM_NOT_PAGED in section .bss
> > >Reading symbols from /boot/vmlinuz-4.7.0-dirty...(no debugging symbols
> > >found)...done.
> > >(gdb) list *btrfs_async_run_delayed_refs+0xc6
> > >No symbol table is loaded.  Use the "file" command.
> > >
> > >   There must be a way of getting this info from here, but I'm not
> > >sure I know what it is. Build a new kernel from 4.7 with this
> > >machine's config and run gdb on the btrfs.o file? Not a problem to do,
> > >but it might take a little while.
> >
> > As long as you use the same gcc and config file, it'll almost always
> > generate the same offsets/code.  You can recompile with debug
> > symbols on and it'll be accurate.
>
>    OK. Back later.

(gdb) file fs/btrfs/btrfs.o
Reading symbols from fs/btrfs/btrfs.o...done.
(gdb) list *btrfs_async_run_delayed_refs+0xc6
0x13dae is in btrfs_async_run_delayed_refs (fs/btrfs/extent-tree.c:2915).
2910
2911        btrfs_queue_work(root->fs_info->extent_workers, &async->work);
2912
2913        if (wait) {
2914                wait_for_completion(&async->wait);
2915                ret = async->error;
2916                kfree(async);
2917                return ret;
2918        }
2919        return 0;

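   For anyone wanting to repeat the lookup above without an interactive
session, something like the following should do it. This is only a rough
sketch: resolve and sym_addr are made-up helper names, and it assumes the
object file was rebuilt with debug symbols from the same source, gcc and
config as the running kernel.

```shell
# Sketch only. Assumption: the object file passed in was rebuilt with debug
# info from the same source, gcc and .config as the kernel that produced
# the stack trace.

# resolve OBJ SYMBOL+OFFSET
# Hypothetical wrapper around gdb's batch mode; equivalent to typing
# "file OBJ" and then "list *SYMBOL+OFFSET" at the (gdb) prompt.
resolve() {
    gdb -batch -ex "list *$2" "$1"
}

# sym_addr BASE OFFSET
# Hypothetical helper: add a hex offset to a symbol's start address and
# print the sum in hex, to cross-check the address gdb reports.
sym_addr() {
    printf '0x%x\n' "$(( $1 + $2 ))"
}

# Example invocation (not run here):
#   resolve fs/btrfs/btrfs.o btrfs_async_run_delayed_refs+0xc6
```

   In this build gdb put the location at 0x13dae, so the function starts
at 0x13ce8 in btrfs.o; "sym_addr 0x13ce8 0xc6" prints 0x13dae. A different
gcc or config would move these addresses.
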
Hugo.
--
Hugo Mills | You shouldn't anthropomorphise computers. They
hugo@... carfax.org.uk | really don't like that.
http://carfax.org.uk/ |
PGP: E2AB1DE4 |
