Sorry it's been a bit since I reported back, but I got a chance to apply your patch and I have mixed results to report.

Basically, it looks like the patch may resolve the issues, but not entirely. After building the new kernel, I was initially still able to reproduce the problem in a subdirectory, but not at the top level. While it was happening, this is what showed up in the logs:

Feb  8 15:09:57 aufs kernel: [  998.810169] ------------[ cut here ]------------
Feb  8 15:09:57 aufs kernel: [  998.810177] WARNING: CPU: 0 PID: 2076 at /build/buildd/linux-3.16.0/fs/inode.c:282 drop_nlink+0x41/0x50()
Feb  8 15:09:57 aufs kernel: [  998.810178] Modules linked in: aufs serio_raw snd_hda_codec_generic kvm_amd kvm qxl crct10dif_pclmul ttm drm_kms_helper crc32_pclmul ghash_clmulni_intel aesni_intel drm aes_x86_64 lrw ppdev snd_hda_intel snd_hda_controller i2c_piix4 snd_hda_codec gf128mul glue_helper snd_hwdep snd_pcm snd_timer ablk_helper cryptd snd parport_pc soundcore pvpanic parport mac_hid btrfs xor raid6_pq psmouse floppy pata_acpi
Feb  8 15:09:57 aufs kernel: [  998.810200] CPU: 0 PID: 2076 Comm: rm Tainted: G     W   3.16.0-29-generic #39-Ubuntu
Feb  8 15:09:57 aufs kernel: [  998.810202] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_171129-lamiak 04/01/2014
Feb  8 15:09:57 aufs kernel: [  998.810204]  0000000000000009 ffff88003c7f3cc8 ffffffff8178218a 0000000000000000
Feb  8 15:09:57 aufs kernel: [  998.810206]  ffff88003c7f3d00 ffffffff8106fedd ffff88003bc1a658 ffff88003bc1a658
Feb  8 15:09:57 aufs kernel: [  998.810208]  0000000000000001 ffff88003cbcede0 ffff880039d91c00 ffff88003c7f3d10
Feb  8 15:09:57 aufs kernel: [  998.810210] Call Trace:
Feb  8 15:09:57 aufs kernel: [  998.810216]  [<ffffffff8178218a>] dump_stack+0x45/0x56
Feb  8 15:09:57 aufs kernel: [  998.810219]  [<ffffffff8106fedd>] warn_slowpath_common+0x7d/0xa0
Feb  8 15:09:57 aufs kernel: [  998.810221]  [<ffffffff8106ffba>] warn_slowpath_null+0x1a/0x20
Feb  8 15:09:57 aufs kernel: [  998.810224]  [<ffffffff811fcc31>] drop_nlink+0x41/0x50
Feb  8 15:09:57 aufs kernel: [  998.810231]  [<ffffffffc03d6c1d>] au_whtmp_rmdir+0x19d/0x1b0 [aufs]
Feb  8 15:09:57 aufs kernel: [  998.810236]  [<ffffffffc03d5e5e>] ? au_whtmp_ren+0x6e/0xe0 [aufs]
Feb  8 15:09:57 aufs kernel: [  998.810241]  [<ffffffffc03e52e7>] aufs_rmdir+0x2b7/0x430 [aufs]
Feb  8 15:09:57 aufs kernel: [  998.810244]  [<ffffffff811f8390>] ? prepend.constprop.25+0x30/0x30
Feb  8 15:09:57 aufs kernel: [  998.810246]  [<ffffffff811eede7>] vfs_rmdir+0xa7/0x100
Feb  8 15:09:57 aufs kernel: [  998.810248]  [<ffffffff811f20b9>] do_rmdir+0x1d9/0x1f0
Feb  8 15:09:57 aufs kernel: [  998.810250]  [<ffffffff811e2dbe>] ? ____fput+0xe/0x10
Feb  8 15:09:57 aufs kernel: [  998.810253]  [<ffffffff81091afc>] ? task_work_run+0xbc/0xf0
Feb  8 15:09:57 aufs kernel: [  998.810256]  [<ffffffff810131e7>] ? do_notify_resume+0x97/0xb0
Feb  8 15:09:57 aufs kernel: [  998.810258]  [<ffffffff811f3365>] SyS_unlinkat+0x25/0x40
Feb  8 15:09:57 aufs kernel: [  998.810261]  [<ffffffff8178a1ad>] system_call_fastpath+0x1a/0x1f
Feb  8 15:09:57 aufs kernel: [  998.810263] ---[ end trace 7e60ac9f449b2a85 ]---

Once I saw that it was still happening, I tried to clean things up so I could present a more complete example of how I was triggering it. I unmounted the aufs filesystem, removed the contents of the underlying filesystems (including the .wh* bits), and remounted. Once I saw that the aufs mountpoint correctly reflected that everything had been removed, I tried to reproduce the error again, and I have not been able to.

I can only assume that something in the old .wh* files was causing the issue. Ultimately, I cannot reproduce the problem at this time, so I guess we call it fixed?
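
For anyone who wants to re-test, here is a rough sketch of how the reproduction sequence from the quoted message (mkdir -p a/b; cd .; rm -rf a; ls) could be looped. The function name repro_loop and its arguments are made up for illustration, and the failure check (rm failing, or "a" still existing afterwards) is only an assumption about how the problem shows up; it does not look at the kernel log.

```shell
#!/bin/sh
# Hypothetical helper (not part of aufs): run the quoted reproduction
# sequence COUNT times inside DIR and print how many runs failed.
# A run counts as failed if any step errors out or if "a" still exists
# after "rm -rf a" (an assumed symptom, see the note above).
repro_loop() {
    dir=$1
    count=${2:-100}
    failures=0
    i=0
    while [ "$i" -lt "$count" ]; do
        (
            cd "$dir" || exit 2
            mkdir -p a/b || exit 2
            cd . || exit 2
            rm -rf a || exit 1
            # After a successful rm, "a" must be gone.
            [ ! -e a ]
        ) || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}
```

It would be called as `repro_loop <aufs mountpoint or subdirectory> 100`, and the kernel log can then be checked by hand for the drop_nlink warning.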

On Wed, Feb 4, 2015 at 5:41 AM, <sf...@users.sourceforge.net> wrote:

Michael Johnson - MJ:
> I've made it through the initial build and the problem does in fact still
> occur. I completed the build with CONFIG_AUFS_DEBUG=y and as one would
> expect, the problem still occurs.  Interestingly, I cannot reproduce 100%
> of the time in the way previously described. Here is what does work to
> reproduce for me 100% of the time currently:
>
> mkdir -p a/b; cd .; rm -rf a; ls;

Still I cannot reproduce...
But I will try more.

> For the most part I did not get anything in my kernel debug logs, but after
> hammering on it a bit I did get the following in my logs which appears to
> be related:
    :::
> Feb 3 23:54:38 aufs kernel: [ 3577.351206] WARNING: CPU: 0 PID: 3725 at
> /build/buildd/linux-3.16.0/fs/inode.c:282 drop_nlink+0x41/0x50()

It is one of good news.
It is caused by a strange link count of a dir on btrfs. At least this
problem will be solved by the patch I sent previously. But we may meet
another problem later.

J. R. Okajima

--
Michael Johnson - MJ