Re: possible recursive locking detected - while running fs operations in loops - 2.6.18-rc2-git5
Hello,

On Sunday 30 July 2006 10:17, Jesper Juhl wrote:
> On 30/07/06, Hans Reiser <[EMAIL PROTECTED]> wrote:
> > Jesper Juhl wrote:
> > > Thanks. That's a nice little test suite.
> >
> > Yes, it is quite useful, our developers have added it to the
> > regression suite
>
> That's nice.
>
> Now how about that lock validator message I managed to tease out?
>
> Akpm said "... the reiserfs locking appears to be unneeded - this
> inode is going down and nobody else can look it up, so what is to be
> locked against?" - can you comment on that?

Thanks. it is correct.

Andrew, please apply the following patch:

i_mutex does not need to be locked in reiserfs_delete_inode.

Signed-off-by: Alexander Zarochentsev <[EMAIL PROTECTED]>

 fs/reiserfs/inode.c |   12 ++--
 1 files changed, 2 insertions(+), 10 deletions(-)

Index: linux-2.6-git/fs/reiserfs/inode.c
===
--- linux-2.6-git.orig/fs/reiserfs/inode.c
+++ linux-2.6-git/fs/reiserfs/inode.c
@@ -37,14 +37,10 @@ void reiserfs_delete_inode(struct inode
 
 	/* The = 0 happens when we abort creating a new inode for some reason like lack of space.. */
 	if (!(inode->i_state & I_NEW) && INODE_PKEY(inode)->k_objectid != 0) {	/* also handles bad_inode case */
-		mutex_lock(&inode->i_mutex);
-
 		reiserfs_delete_xattrs(inode);
 
-		if (journal_begin(&th, inode->i_sb, jbegin_count)) {
-			mutex_unlock(&inode->i_mutex);
+		if (journal_begin(&th, inode->i_sb, jbegin_count))
 			goto out;
-		}
 		reiserfs_update_inode_transaction(inode);
 
 		err = reiserfs_delete_object(&th, inode);
@@ -55,12 +51,8 @@ void reiserfs_delete_inode(struct inode
 		if (!err)
 			DQUOT_FREE_INODE(inode);
 
-		if (journal_end(&th, inode->i_sb, jbegin_count)) {
-			mutex_unlock(&inode->i_mutex);
+		if (journal_end(&th, inode->i_sb, jbegin_count))
 			goto out;
-		}
-
-		mutex_unlock(&inode->i_mutex);
 
 		/* check return value from reiserfs_delete_object after
 		 * ending the transaction
Re: possible recursive locking detected - while running fs operations in loops - 2.6.18-rc2-git5
On 30/07/06, Hans Reiser <[EMAIL PROTECTED]> wrote:
> Jesper Juhl wrote:
> > On 30/07/06, Hans Reiser <[EMAIL PROTECTED]> wrote:
> >> Jesper Juhl wrote:
> >> > Thanks. That's a nice little test suite.
> >>
> >> Yes, it is quite useful, our developers have added it to the regression
> >> suite
> >
> > That's nice.
> >
> > Now how about that lock validator message I managed to tease out?
> >
> > Akpm said "... the reiserfs locking appears to be unneeded - this inode
> > is going down and nobody else can look it up, so what is to be locked
> > against?" - can you comment on that?
>
> Err, how about Zam handles all locking issues and this is Sunday with
> the family? I know, lame, but he'll answer you on Monday Russian
> time.;-)

:-)

Not a problem at all.

-- 
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html
Re: possible recursive locking detected - while running fs operations in loops - 2.6.18-rc2-git5
Jesper Juhl wrote:
> On 30/07/06, Hans Reiser <[EMAIL PROTECTED]> wrote:
>> Jesper Juhl wrote:
>> > Thanks. That's a nice little test suite.
>>
>> Yes, it is quite useful, our developers have added it to the regression
>> suite
>
> That's nice.
>
> Now how about that lock validator message I managed to tease out?
>
> Akpm said "... the reiserfs locking appears to be unneeded - this inode
> is going down and nobody else can look it up, so what is to be locked
> against?" - can you comment on that?

Err, how about Zam handles all locking issues and this is Sunday with
the family? I know, lame, but he'll answer you on Monday Russian
time.;-)
Re: possible recursive locking detected - while running fs operations in loops - 2.6.18-rc2-git5
On 30/07/06, Hans Reiser <[EMAIL PROTECTED]> wrote:
> Jesper Juhl wrote:
> > Thanks. That's a nice little test suite.
>
> Yes, it is quite useful, our developers have added it to the
> regression suite

That's nice.

Now how about that lock validator message I managed to tease out?

Akpm said "... the reiserfs locking appears to be unneeded - this
inode is going down and nobody else can look it up, so what is to be
locked against?" - can you comment on that?

-- 
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html
Re: possible recursive locking detected - while running fs operations in loops - 2.6.18-rc2-git5
Jesper Juhl wrote:
> Thanks. That's a nice little test suite.

Yes, it is quite useful, our developers have added it to the
regression suite

Hans
Re: possible recursive locking detected - while running fs operations in loops - 2.6.18-rc2-git5
On 26/07/06, Andreas Dilger <[EMAIL PROTECTED]> wrote:
> On Jul 26, 2006 00:16 +0200, Jesper Juhl wrote:
> > What I did to provoke it was to run 6 different xterms (with a bash
> > shell) with the following loops in them in a test directory that was
> > initially empty :
> >
> > xterm1: while true; do mkdir a; done
> > xterm2: while true; do rmdir a; done
> > xterm3: while true; do touch a/foo; done
> > xterm4: while true; do find .; done
> > xterm5: while true; do sync; sleep 1; done
> > xterm6: while true; do rm -r a; done
>
> See racer test at ftp.lustre.org/pub/benchmarks/racer-lustre.tar.gz
> It does the above, but a bunch more things and is a truly pathological
> test script that does lots of "stupid user tricks", unlike normal tests
> which are only doing operations that expect to be successful.
>
> PS - during the racer.sh test run "rm" is known to segfault after
> hitting an internal assertion, nobody is sure why.
> PPS- I don't know who wrote this program, it was originally posted by
> someone not the author to linux-fsdevel or something.

Thanks. That's a nice little test suite.

-- 
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html
Re: possible recursive locking detected - while running fs operations in loops - 2.6.18-rc2-git5
On Jul 26, 2006 00:16 +0200, Jesper Juhl wrote:
> What I did to provoke it was to run 6 different xterms (with a bash
> shell) with the following loops in them in a test directory that was
> initially empty :
>
> xterm1: while true; do mkdir a; done
> xterm2: while true; do rmdir a; done
> xterm3: while true; do touch a/foo; done
> xterm4: while true; do find .; done
> xterm5: while true; do sync; sleep 1; done
> xterm6: while true; do rm -r a; done

See racer test at ftp.lustre.org/pub/benchmarks/racer-lustre.tar.gz
It does the above, but a bunch more things and is a truly pathological
test script that does lots of "stupid user tricks", unlike normal tests
which are only doing operations that expect to be successful.

PS - during the racer.sh test run "rm" is known to segfault after
hitting an internal assertion, nobody is sure why.
PPS- I don't know who wrote this program, it was originally posted by
someone not the author to linux-fsdevel or something.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
possible recursive locking detected - while running fs operations in loops - 2.6.18-rc2-git5
Hi,

I just got this from the lock validator :

=============================================
[ INFO: possible recursive locking detected ]
---------------------------------------------
rm/2498 is trying to acquire lock:
 (&inode->i_mutex){--..}, at: [] mutex_lock+0x1c/0x20

but task is already holding lock:
 (&inode->i_mutex){--..}, at: [] mutex_lock+0x1c/0x20

other info that might help us debug this:
2 locks held by rm/2498:
 #0:  (&inode->i_mutex/1){--..}, at: [] do_rmdir+0x73/0xe0
 #1:  (&inode->i_mutex){--..}, at: [] mutex_lock+0x1c/0x20

stack backtrace:
 [] show_trace_log_lvl+0x12a/0x150
 [] show_trace+0x12/0x20
 [] dump_stack+0x19/0x20
 [] print_deadlock_bug+0xb9/0xd0
 [] check_deadlock+0x6b/0x80
 [] __lock_acquire+0x354/0x990
 [] lock_acquire+0x75/0xa0
 [] __mutex_lock_slowpath+0x70/0x2a0
 [] mutex_lock+0x1c/0x20
 [] reiserfs_delete_inode+0x63/0xd0
 [] generic_delete_inode+0x61/0xe0
 [] generic_drop_inode+0xf/0x20
 [] iput+0x56/0x80
 [] dentry_iput+0x5e/0xc0
 [] dput+0xa8/0x170
 [] prune_one_dentry+0x6b/0x80
 [] prune_dcache+0x15b/0x170
 [] shrink_dcache_parent+0x10/0x20
 [] dentry_unhash+0x5a/0xc0
 [] vfs_rmdir+0x5f/0xc0
 [] do_rmdir+0xcd/0xe0
 [] sys_rmdir+0x10/0x20
 [] syscall_call+0x7/0xb
 [] 0xb7e79d7d

What I did to provoke it was to run 6 different xterms (with a bash
shell) with the following loops in them in a test directory that was
initially empty :

xterm1: while true; do mkdir a; done
xterm2: while true; do rmdir a; done
xterm3: while true; do touch a/foo; done
xterm4: while true; do find .; done
xterm5: while true; do sync; sleep 1; done
xterm6: while true; do rm -r a; done

I then left that alone for ~15 minutes and then lockdep complained.

This was on a reiserfs 3.6 filesystem.
My kernel version is 2.6.18-rc2-git5 (i386 build, not x86_64)
The CPU is an Athlon64 X2 4400+

Some details :

$ uname -a
Linux dragon 2.6.18-rc2-git5 #1 SMP PREEMPT Tue Jul 25 22:58:52 CEST 2006 i686 athlon-4 i386 GNU/Linux

[EMAIL PROTECTED]:~/download/kernel/linux-2.6.18-rc2-git5$ scripts/ver_linux
If some fields are empty or look unusual you may have an old version.
Compare to the current minimal requirements in Documentation/Changes.

Linux dragon 2.6.18-rc2-git5 #1 SMP PREEMPT Tue Jul 25 22:58:52 CEST 2006 i686 athlon-4 i386 GNU/Linux

Gnu C                  3.4.6
Gnu make               3.81
binutils               2.15.92.0.2
util-linux             2.12r
mount                  2.12r
module-init-tools      3.2.2
e2fsprogs              1.39
reiserfsprogs          3.6.19
quota-tools            3.13.
PPP                    2.4.4b1
Linux C Library        2.3.6
Dynamic linker (ldd)   2.3.6
Linux C++ Library      6.0.3
Procps                 3.2.7
Net-tools              1.60
Kbd                    1.12
Sh-utils               5.97
udev                   071
Modules Loaded         snd_seq_oss snd_seq_midi_event snd_seq snd_pcm_oss snd_mixer_oss uhci_hcd usbcore snd_emu10k1 snd_rawmidi snd_ac97_codec snd_ac97_bus snd_pcm snd_seq_device snd_timer snd_page_alloc snd_util_mem snd_hwdep snd agpgart

If more info is needed then just ask and I'll be happy to provide what I can.

-- 
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html
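
The six loops from the report can also be started from a single shell
instead of six xterms. What follows is only a minimal sketch, assuming a
POSIX shell: the function name stress_dir, its two arguments and the time
limit are illustrative assumptions and not part of the original report,
which simply left the loops running until lockdep complained (about 15
minutes). Whether the warning appears still depends on the kernel and
filesystem under test.

```shell
#!/bin/sh
# Sketch only: runs the six loops from the report concurrently in one
# directory, then stops them after a time limit.
# stress_dir, its arguments and the time limit are hypothetical; the
# report used six xterms and no time limit.

stress_dir() (
    dir=$1        # test directory, ideally empty as in the report
    seconds=$2    # how long to let the loops race each other
    cd "$dir" || exit 1

    pids=
    # Failures are expected because the loops race, so output is discarded.
    while true; do mkdir a;       done >/dev/null 2>&1 & pids="$pids $!"
    while true; do rmdir a;       done >/dev/null 2>&1 & pids="$pids $!"
    while true; do touch a/foo;   done >/dev/null 2>&1 & pids="$pids $!"
    while true; do find .;        done >/dev/null 2>&1 & pids="$pids $!"
    while true; do sync; sleep 1; done >/dev/null 2>&1 & pids="$pids $!"
    while true; do rm -r a;       done >/dev/null 2>&1 & pids="$pids $!"

    sleep "$seconds"
    kill $pids 2>/dev/null    # stop the six background loops
    wait $pids 2>/dev/null
    exit 0
)

# Run only when a directory is given on the command line;
# 900 seconds approximates the ~15 minutes mentioned in the report.
if [ "$#" -ge 1 ]; then
    stress_dir "$1" "${2:-900}"
fi
```

Saved under any file name, the script takes the test directory as its
first argument and an optional run time in seconds as its second. The
body runs in a subshell, so the caller's working directory is left
unchanged, and any lockdep output would show up in the kernel log rather
than on the terminal.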