Re: aufs2.1 Monday release
Hi,

I am getting the compile errors below. I took the standalone git tree and integrated it with an aufs linux git tree (since I wanted to build aufs as a module only).

r...@saravanan-desktop:/MHACK/aufs2-2.6/aufs2-2.6# make M=fs/aufs modules
  CC [M]  fs/aufs/branch.o
fs/aufs/branch.c: In function ‘au_br_mod_files_ro’:
fs/aufs/branch.c:859: error: request for member ‘next’ in something not a structure or union
fs/aufs/branch.c:859: warning: comparison of distinct pointer types lacks a cast
make[1]: *** [fs/aufs/branch.o] Error 1
make: *** [_module_fs/aufs] Error 2
r...@saravanan-desktop:/MHACK/aufs2-2.6/aufs2-2.6# more .config | grep CONFIG_SMP
CONFIG_SMP=y

It appears that s_files in the super block is a per-CPU list when CONFIG_SMP is set, which may be what causes this error:

#ifdef CONFIG_SMP
	struct list_head __percpu *s_files;
#else
	struct list_head s_files;
#endif

Please let me know whether I am using the proper git branches. For standalone.git:

r...@saravanan-desktop:/MHACK/aufs2-standalone# git branch
* (no branch)
  master

Thx.

On Wed, Dec 15, 2010 at 7:44 PM, sf...@users.sourceforge.net wrote:
> Hi,
>
> Thayumanavar Sachithanantham:
> > Thanks for the patch. Nowadays I am no longer involved in work related to
> > AUFS, so I won't have access to very high multi-core CPUs. But I am going
> > to go through this patch along with the current aufs and will give it a
> > try on my virtual machine this weekend. Thanx.
>
> Here is another additional patch which will also be included in the next
> Monday release.
> May I say have a nice weekend? :-)
>
> J. R. Okajima
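The compile error above can be illustrated with a small userspace sketch. It is not aufs or kernel code: `sb_sketch`, `NR_CPUS_SKETCH`, and the helper functions are hypothetical stand-ins, and a plain array replaces the kernel's real per-CPU allocation. It only shows that, when `s_files` is a pointer to per-CPU list heads, code written as `sb->s_files.next` cannot compile and each CPU's list has to be walked separately.

```c
#include <assert.h>
#include <stddef.h>

/* Comment this out to get the non-SMP layout with one embedded list head. */
#define CONFIG_SMP 1

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for this sketch */

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical stand-in for the relevant part of struct super_block. */
struct sb_sketch {
#ifdef CONFIG_SMP
	struct list_head *s_files;	/* array of heads, one per CPU (assumption) */
#else
	struct list_head s_files;	/* single embedded head */
#endif
};

static void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Insert node n right after head h. */
static void list_add(struct list_head *n, struct list_head *h)
{
	n->next = h->next;
	n->prev = h;
	h->next->prev = n;
	h->next = n;
}

/* Count entries on every s_files list, whichever layout is compiled in. */
static size_t sb_count_files(struct sb_sketch *sb)
{
	size_t n = 0;
	struct list_head *p;

	assert(sb != NULL);
#ifdef CONFIG_SMP
	/* sb->s_files.next would not compile here: s_files is a pointer. */
	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		for (p = sb->s_files[cpu].next; p != &sb->s_files[cpu]; p = p->next)
			n++;
#else
	for (p = sb->s_files.next; p != &sb->s_files; p = p->next)
		n++;
#endif
	return n;
}
```

In the kernel itself the per-CPU heads are reached through the per-CPU accessors rather than by array indexing; the array here is only a simplification for the sketch.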
Re: Fwd: question on branch deletion in aufs - reg.
> I don't understand you well.
> If you apply the debug patch successfully, reproduce the problem, and
> get the debug log, then send me the debug log.

Sorry, I meant to say that I have been unable to reproduce the crash with the patch applied.

> If suse people manages the kernel source files by GIT and put it in
> public, let me know the URL.

I don't see any public URL providing the source of the SLES 11 kernel that I test in my labs.

> As you might know, the cause of your problem may exist anywhere, in
> aufs, in reiserfs, in vanilla kernel, in suse kernel, or in their
> combination. At first, we need to focus the first problem, crash in
> deleting a branch, and I need the source files of suse.

Yes, the bug could be in any of these components, but I am sorry that company policy doesn't allow me to provide the SLES 11 kernel source, because it is confidential and proprietary information of the company.

I downloaded the 2.6.27 git tree with the aufs source from aufs.sourcforge.net and have been unable to hit the issue so far. I will continue to test and will provide the info once I can recreate this issue. I am also unable to recreate the issue with the custom kernel.

Thx,
Thayumanavar S.
Re: Fwd: question on branch deletion in aufs - reg.
Hello Okajima,

I tried the patch you provided but do not seem to have hit the issue yet. Once I hit the issue, I will provide the details of the log. Regarding the diff between vanilla and SLES, I believe there is not much significant difference, but I will let you know about this later.

After branch deletion, when I try to umount the reiserfs partition, we seem to hit the BUG_ON in fs/dcache.c on line 666 (because of d_count, so I believe there may be some mismatch between dget/dput), as below:

Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413467] BUG: Dentry 8803b0d14640{i=0,n=448} still in use (1) [unmount of reiserfs sdu1]
Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413490] [ cut here ]
Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413499] kernel BUG at fs/dcache.c:666!
Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413503] invalid opcode: [1] SMP DEBUG_PAGEALLOC
Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413509] last sysfs file: /sys/fs/aufs/si_21194aace0e4a251/br6
Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413514] CPU 7
Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413517] Modules linked in: aufs(N) exportfs(N) reiserfs(N) ipv6(N) cpufreq_conservative(N) cpufreq_userspace(N) cpufreq_powersave(N) acpi_cpufreq(N) microcode(N) fuse(N) loop(N) dm_mod(N) rtc_cmos(N) rtc_core(N) i2c_i801(N) ses(N) serio_raw(N) rtc_lib(N) button(N) pcspkr(N) enclosure(N) i2c_core(N) bnx2(N) sg(N) usbhid(N) hid(N) ff_memless(N) uhci_hcd(N) sd_mod(N) crc_t10dif(N) ehci_hcd(N) mptsas(N) mptscsih(N) usbcore(N) mptbase(N) scsi_transport_sas(N) edd(N) ext3(N) mbcache(N) jbd(N) fan(N) ide_pci_generic(N) ide_core(N) ata_generic(N) ata_piix(N) libata(N) dock(N) thermal(N) processor(N) thermal_sys(N) hwmon(N) megaraid_sas(N) scsi_mod(N)
Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413579] Supported: No, Unsupported modules are loaded
Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413585] Pid: 26627, comm: umount Tainted: G M 2.6.27.45-0.1-4827 #9
Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413590] RIP: 0010:[802d3d6e] [802d3d6e] shrink_dcache_for_umount_subtree+0x1b2/0x2af
Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413602] RSP: 0018:8803781c9de8 EFLAGS: 00010296
Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413607] RAX: 006d RBX: 880437b77738 RCX:
Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413611] RDX: 8802cb77d000 RSI: 0001 RDI: 804c9709
Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413616] RBP: 8803781c9e18 R08: 0016 R09: 8025758b
Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413621] R10: 8803781c9c58 R11: 000a R12: 8803b0d14640
Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413626] R13: 000b R14: 8803b0d146c8 R15:
Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413631] FS: 7f39788e96f0() GS:88043f149cc0() knlGS:
Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413637] CS: 0010 DS: ES: CR0: 8005003b
Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413641] CR2: 7f39782501b0 CR3: 000378402000 CR4: 06e0
Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413646] DR0: DR1: DR2:
Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413651] DR3: DR6: 0ff0 DR7: 0400
Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413656] Process umount (pid: 26627, threadinfo 8803781c8000, task 880409566040)
Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413661] Stack: 88043ac8ec50 80256f35 88043ac8e800 a02c2b50
Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413669] 88043ac8e800 88043d5ffac0 8803781c9e38 802d3ea2
Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413677] 8803fb572d08 88043ac8e800 8803781c9e58 802c39f3
Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413685] Call Trace:
Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413700] [802d3ea2] shrink_dcache_for_umount+0x37/0x47
Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413712] [802c39f3] generic_shutdown_super+0x1a/0xf7
Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413723] [802c3ae5] kill_block_super+0x15/0x29
Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413745] [a02ac538] reiserfs_kill_sb+0x93/0x97 [reiserfs]
Jul 27 13:16:13 linux-k95z kernel: [26627][ 905.413769] [802c3bc8] deactivate_super+0x68/0x7d
Jul 27 13:16:13 linux-k95z kernel:
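The suspected dget/dput mismatch can be pictured with a toy reference count. This is a userspace sketch only, assuming a much simplified model of a counted object; `obj_sketch` and its helpers are hypothetical names and do not correspond to the kernel's dentry API. It shows how one extra "get" without a matching "put" leaves a nonzero count at teardown, which is the kind of state the "still in use (1)" message reports.

```c
#include <assert.h>

/* Hypothetical counted object standing in for a dentry. */
struct obj_sketch {
	int count;
};

/* Take a reference. */
static void obj_get(struct obj_sketch *o)
{
	o->count++;
}

/* Drop a reference; dropping more than was taken is a bug. */
static void obj_put(struct obj_sketch *o)
{
	assert(o->count > 0);
	o->count--;
}

/* A user that pairs every get with a put. */
static void balanced_user(struct obj_sketch *o)
{
	obj_get(o);
	/* ... use the object ... */
	obj_put(o);
}

/* A user that forgets one put, e.g. on an error or cleanup path. */
static void leaky_user(struct obj_sketch *o)
{
	obj_get(o);
	obj_get(o);
	/* ... use the object ... */
	obj_put(o);	/* second put is missing */
}

/* Nonzero means the object would be reported as still in use at teardown. */
static int obj_busy_at_teardown(const struct obj_sketch *o)
{
	return o->count != 0;
}
```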
question on branch deletion in aufs - reg.
Hi,

In the AUFS branch deletion code paths, it seems that s_inodes of the aufs superblock is walked without taking the inode_lock. Is it safe to walk this list and work on it without holding the inode_lock? The rest of the places in the kernel where the s_inodes list is manipulated do so while holding inode_lock. I also see that inode_lock doesn't seem to be an exported symbol, and VFS seems to be the only consumer of it.

When some I/O is happening on the aufs mount point, deleting one of the branches sometimes succeeds, and sometimes I hit a crash in the au_h_iptr code path (via au_refresh_hinode_self), where h_inode contains some invalid kernel virtual address.

I also see that au_ii could return NULL. Is the patch below necessary in the remount / branch deletion code path?

--- iinfo.c.orig	2010-08-06 09:08:59.0 -0400
+++ iinfo.c	2010-08-06 09:24:31.0 -0400
@@ -24,11 +24,14 @@
 struct inode *au_h_iptr(struct inode *inode, aufs_bindex_t bindex)
 {
-	struct inode *h_inode;
+	struct inode *h_inode = NULL;
+	struct au_iinfo *iinfo;
 	IiMustAnyLock(inode);
-	h_inode = au_ii(inode)->ii_hinode[0 + bindex].hi_inode;
+	iinfo = au_ii(inode);
+	if ( iinfo )
+		h_inode = au_ii(inode)->ii_hinode[0 + bindex].hi_inode;
 	AuDebugOn(h_inode && atomic_read(&h_inode->i_count) <= 0);
 	return h_inode;
 }

Thx,
Thayumanavar S.
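The idea behind the proposed NULL check can be sketched in plain C. This is a hedged illustration, not the aufs implementation: the `*_sketch` types, the `nslots` field, and the bounds check are assumptions added for the example. Note that a check like this only catches a NULL pointer; it cannot detect a non-NULL pointer to memory that has already been freed, which is what an "invalid kernel virtual address" in h_inode may indicate.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the lower inode and the per-branch slot. */
struct h_inode_sketch {
	int i_count;
};

struct hinode_slot_sketch {
	struct h_inode_sketch *hi_inode;
};

/* Hypothetical stand-in for the aufs inode info; nslots is an assumption. */
struct iinfo_sketch {
	struct hinode_slot_sketch *ii_hinode;
	int nslots;
};

/*
 * Return the lower inode for branch index bindex, or NULL when the
 * inode info, the slot array, or the index is not usable.
 */
static struct h_inode_sketch *h_iptr_sketch(const struct iinfo_sketch *iinfo,
					    int bindex)
{
	struct h_inode_sketch *h_inode;

	if (!iinfo || !iinfo->ii_hinode)
		return NULL;
	if (bindex < 0 || bindex >= iinfo->nslots)
		return NULL;

	h_inode = iinfo->ii_hinode[bindex].hi_inode;
	/* Mirrors the debug check: a live lower inode must hold a reference. */
	assert(!h_inode || h_inode->i_count > 0);
	return h_inode;
}
```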
Fwd: question on branch deletion in aufs - reg.
Adding the aufs-users list.

-- Forwarded message --
From: Thayumanavar Sachithanantham thay...@gmail.com
Date: Fri, Aug 6, 2010 at 7:50 PM
Subject: Re: question on branch deletion in aufs - reg.
To: sf...@users.sourceforge.net

Hello Okajima,

I don't have some of the info that you requested. The info I can provide is:

- All of the CONFIG_AUFS* options are set.
- The kernel is a 2.6.27.45 SLES kernel with aufs (a recent version taken from the git tree for 2.6.27) pulled into it.
- I do not have the .config and the /sys related info for the kernel on which the crash occurred.
- I have 4 branches, /dev/sd* disks formatted with the reiserfs filesystem.
- The issue is not reproduced frequently.
- The test case involves creating parallel threads (that do file creation (using dd), mkdir, unlink and rename) and a parallel script that tries to do branch deletion continuously in a loop until we succeed in deleting a branch (when no EBUSY comes up).
- The system is an 8 CPU x86_64 (2127 MHz) machine with 16 GB memory.

Here is the backtrace and disassembly for the crash (from the disassembly it appears that the dereference of h_inode->i_count in au_h_iptr led to the crash).

(Thanks for the quick response to my earlier mail.)

PID: 22513 TASK: 88046296e840 CPU: 3 COMMAND: mount
 #0 [8803c284f960] crash_kexec at 80267322
 #1 [8803c284fa30] __die at 8049add1
 #2 [8803c284fa50] do_page_fault at 8049c7b1
 #3 [8803c284fbb0] error_exit at 8049a1d9
    [exception RIP: au_h_iptr+69]
    RIP: a02a777d RSP: 8803c284fc60 RFLAGS: 00010286
    RAX: 001d RBX: RCX:
    RDX: ffe0 RSI: RDI: 8803ea4850c0
    RBP: 8801735f9080 R8: 8801735f9080 R9:
    R10: 0001 R11: R12: 735f9002
    R13: 0008 R14: 8801735f90c8 R15: 8801735f9080
    ORIG_RAX: CS: 0010 SS: 0018
 #4 [8803c284fc60] au_refresh_hinode_self at a02a817b
 #5 [8803c284fd00] aufs_remount_fs at a028f8a8
 #6 [8803c284fdd0] __do_remount_sb at 802b1670
 #7 [8803c284fe00] do_remount at 802c5b55
 #8 [8803c284fe50] do_mount at 802c78c7
 #9 [8803c284ff30] sys_mount at 802c79e4
#10 [8803c284ff80] system_call_fastpath at 8020bfcb
    RIP: 7fbcb5f7f88a RSP: 7fffc4a7a938 RFLAGS: 00010202
    RAX: 00a5 RBX: 8020bfcb RCX: c0ed0020
    RDX: 0061aec0 RSI: 0061aea0 RDI: 0061ae80
    RBP: 0061ae80 R8: 0061af80 R9: 7fffc4a7abdc
    R10: c0ed0020 R11: 0202 R12: 7fffc4a7abdc
    R13: 7fffc4a7af68 R14: c0ed0020 R15: 7fffc4a7ab80
    ORIG_RAX: 00a5 CS: 0033 SS: 002b

crash> dis au_h_iptr
0xa02a7738 au_h_iptr: lea 0xffb8(%rdi),%r8
0xa02a773c au_h_iptr+4: mov 0xfff0(%rdi),%rdi
0xa02a7740 au_h_iptr+8: xor %ecx,%ecx
0xa02a7742 au_h_iptr+10: mov %rcx,%rax
0xa02a7745 au_h_iptr+13: mov %sil,%dl
0xa02a7748 au_h_iptr+16: test %rdi,%rdi
0xa02a774b au_h_iptr+19: cmovne %r8,%rax
0xa02a774f au_h_iptr+23: cmpl $0x0,0x28(%rax)
0xa02a7753 au_h_iptr+27: jg 0xa02a775f
0xa02a7755 au_h_iptr+29: cmpl $0x0,0x2c(%rax)
0xa02a7759 au_h_iptr+33: jg 0xa02a775f
0xa02a775b au_h_iptr+35: ud2a
0xa02a775d au_h_iptr+37: jmp 0xa02a775d
0xa02a775f au_h_iptr+39: xor %eax,%eax
0xa02a7761 au_h_iptr+41: test %rdi,%rdi
0xa02a7764 au_h_iptr+44: movsbq %dl,%rdx
0xa02a7768 au_h_iptr+48: cmovne %r8,%rax
0xa02a776c au_h_iptr+52: shl $0x5,%rdx
0xa02a7770 au_h_iptr+56: mov 0x38(%rax),%rax
0xa02a7774 au_h_iptr+60: mov (%rax,%rdx,1),%rax
0xa02a7778 au_h_iptr+64: test %rax,%rax
0xa02a777b au_h_iptr+67: je 0xa02a7787
0xa02a777d au_h_iptr+69: cmpl $0x0,0x48(%rax)
0xa02a7781 au_h_iptr+73: jg 0xa02a7787
0xa02a7783 au_h_iptr+75: ud2a
0xa02a7785 au_h_iptr+77: jmp 0xa02a7785
0xa02a7787 au_h_iptr+79: retq

On Fri, Aug 6, 2010 at 7:19 PM, sf...@users.sourceforge.net wrote:
> Hello Thayumanavar,
>
> Thayumanavar Sachithanantham:
> > In AUFS branch deletion code paths, it seems that s_inodes of the aufs
> > superblock is walked through without taking the inode_lock. Is it a safe
> > to walk through this list and working on it without holding the
> > inode_lock? Rest of places in the kernel where s_inodes list is
>
> sb->s_umount should protect it in vfs and I don't think inode_lock is
> necessary.
>
> > Sometimes on case of branch deletion on one of branch succeeds when
[PATCH] aufs: Fix slab memory corruption when rmdir a directory of length 242 under certain circumstances.
From: thay...@gmail.com

Fix a slab memory corruption seen when rmdir is run on a directory whose name is longer than 242 characters. This happens because, in whout.c, when cnt++ reaches a large value and the directory name is long, we write past the end of the allocated memory.

Signed-off-by: Thayumanavar Sachithanantham thay...@gmail.com
---
--- aufs2-2.6/fs/aufs/whout.c.orig	2010-07-19 10:49:37.0 -0400
+++ aufs2-2.6/fs/aufs/whout.c	2010-07-19 10:52:42.0 -0400
@@ -150,7 +150,7 @@ struct dentry *au_whtmp_lkup(struct dent
 	qs.name = name;
 	for (i = 0; i < 3; i++) {
-		sprintf(p, "%.*d", AUFS_WH_TMP_LEN, cnt++);
+		snprintf(p, NAME_MAX - ( p - name ) + AUFS_WH_TMP_LEN + 1,"%.*d", AUFS_WH_TMP_LEN, cnt++);
 		dentry = au_sio_lkup_one(&qs, h_parent, br);
 		if (IS_ERR(dentry) || !dentry->d_inode)
 			goto out_name;
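The overflow described in this patch comes from the fact that a precision such as "%.*d" sets a minimum number of digits, not a maximum, so a large counter produces more characters than the reserved width. A minimal userspace sketch of bounding such a write follows. `write_suffix_sketch` and its parameters are hypothetical, and the size passed to snprintf here is simply the space left in the buffer, which is an assumption of this example rather than the expression used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Write cnt with at least `width` digits at position p inside buf,
 * never writing past buf + cap. Returns 0 on success, or -1 if p is
 * out of range or the formatted number did not fit (output truncated).
 */
static int write_suffix_sketch(char *buf, size_t cap, char *p,
			       int width, unsigned int cnt)
{
	size_t room;
	int n;

	assert(buf != NULL && p != NULL);
	if (p < buf || (size_t)(p - buf) >= cap)
		return -1;

	/* Bytes still available from p to the end of buf, including the NUL. */
	room = cap - (size_t)(p - buf);

	/* snprintf writes at most room bytes and always NUL-terminates. */
	n = snprintf(p, room, "%.*u", width, cnt);
	if (n < 0 || (size_t)n >= room)
		return -1;
	return 0;
}
```

With an 8-byte buffer holding "wh." and width 4, a counter of 7 fits as "wh.0007", while 123456 needs six digits and is reported as truncated instead of overrunning the buffer.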
[PATCH] aufs: Fix null pointer dereference on branch deletion
From: thay...@gmail.com

In au_br_del, we free br in au_br_do_del and later access br. Avoid the dereference of br to prevent a crash.

Signed-off-by: Thayumanavar Sachithanantham thay...@gmail.com
---
--- aufs2-2.6/fs/aufs/branch.c.orig	2010-07-19 11:32:41.0 -0400
+++ aufs2-2.6/fs/aufs/branch.c	2010-07-19 11:34:08.0 -0400
@@ -794,7 +794,7 @@ int au_br_del(struct super_block *sb, st
 	if (au_opt_test(mnt_flags, PLINK))
 		au_plink_half_refresh(sb, br_id);
-	if (au_xino_brid(sb) == br->br_id)
+	if (au_xino_brid(sb) == br_id)
 		au_xino_brid_set(sb, -1);
 	goto out; /* success */
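The pattern behind this fix is to copy any field that is still needed into a local variable before the structure is freed, and to use only the copy afterwards. A minimal userspace sketch, assuming simplified hypothetical types (`branch_sketch`, `sbinfo_sketch`) in place of the aufs structures:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an aufs branch. */
struct branch_sketch {
	int br_id;
};

/* Hypothetical stand-in for the superblock info that remembers one branch id. */
struct sbinfo_sketch {
	int xino_brid;
};

/* Release the branch; after this call the pointer must not be used. */
static void branch_free_sketch(struct branch_sketch *br)
{
	free(br);
}

/*
 * Delete a branch and clear the remembered id if it referred to it.
 * The id is copied out before the free, so nothing reads br afterwards.
 */
static void branch_del_sketch(struct sbinfo_sketch *sbi,
			      struct branch_sketch *br)
{
	int br_id;

	assert(sbi != NULL && br != NULL);
	br_id = br->br_id;		/* copy while br is still valid */
	branch_free_sketch(br);		/* br is dangling from here on */

	if (sbi->xino_brid == br_id)	/* compare against the saved copy */
		sbi->xino_brid = -1;
}
```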