Re: possible recursive locking detected - while running fs operations in loops - 2.6.18-rc2-git5

2006-07-31 Thread Alexander Zarochentsev
Hello,

On Sunday 30 July 2006 10:17, Jesper Juhl wrote:
> On 30/07/06, Hans Reiser <[EMAIL PROTECTED]> wrote:
> > Jesper Juhl wrote:
> > > Thanks. That's a nice little test suite.
> >
> > Yes, it is quite useful, our developers have added it to the
> > regression suite
>
> That's nice.
>
> Now how about that lock validator message I managed to tease out?
>
> Akpm said "... the reiserfs locking appears to be unneeded - this
> inode is going down and nobody else can look it up, so what is to be
> locked against?" - can you comment on that?

Thanks. It is correct.
Andrew, please apply the following patch: 

i_mutex does not need to be locked in reiserfs_delete_inode.

Signed-off-by: Alexander Zarochentsev <[EMAIL PROTECTED]>

fs/reiserfs/inode.c |   12 ++--
 1 files changed, 2 insertions(+), 10 deletions(-)

Index: linux-2.6-git/fs/reiserfs/inode.c
===================================================================
--- linux-2.6-git.orig/fs/reiserfs/inode.c
+++ linux-2.6-git/fs/reiserfs/inode.c
@@ -37,14 +37,10 @@ void reiserfs_delete_inode(struct inode 
 
 	/* The = 0 happens when we abort creating a new inode for some reason like lack of space.. */
 	if (!(inode->i_state & I_NEW) && INODE_PKEY(inode)->k_objectid != 0) {	/* also handles bad_inode case */
-		mutex_lock(&inode->i_mutex);
-
 		reiserfs_delete_xattrs(inode);
 
-		if (journal_begin(&th, inode->i_sb, jbegin_count)) {
-			mutex_unlock(&inode->i_mutex);
+		if (journal_begin(&th, inode->i_sb, jbegin_count))
 			goto out;
-		}
 		reiserfs_update_inode_transaction(inode);
 
 		err = reiserfs_delete_object(&th, inode);
@@ -55,12 +51,8 @@ void reiserfs_delete_inode(struct inode 
 		if (!err) 
 			DQUOT_FREE_INODE(inode);
 
-		if (journal_end(&th, inode->i_sb, jbegin_count)) {
-			mutex_unlock(&inode->i_mutex);
+		if (journal_end(&th, inode->i_sb, jbegin_count))
 			goto out;
-		}
-
-		mutex_unlock(&inode->i_mutex);
 
 		/* check return value from reiserfs_delete_object after
 		 * ending the transaction




Re: possible recursive locking detected - while running fs operations in loops - 2.6.18-rc2-git5

2006-07-30 Thread Jesper Juhl

On 30/07/06, Hans Reiser <[EMAIL PROTECTED]> wrote:
> Jesper Juhl wrote:
>
> > On 30/07/06, Hans Reiser <[EMAIL PROTECTED]> wrote:
> >
> >> Jesper Juhl wrote:
> >>
> >> >
> >> > Thanks. That's a nice little test suite.
> >> >
> >> Yes, it is quite useful, our developers have added it to the regression
> >> suite
> >>
> > That's nice.
> >
> > Now how about that lock validator message I managed to tease out?
> >
> > Akpm said "... the reiserfs locking appears to be unneeded - this inode
> > is going down and nobody else can look it up, so what is to be locked
> > against?" - can you comment on that?
> >
> >
> Err, how about Zam handles all locking issues and this is Sunday with
> the family?  I know, lame, but he'll answer you on Monday Russian
> time.;-)


:-) Not a problem at all.

--
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html


Re: possible recursive locking detected - while running fs operations in loops - 2.6.18-rc2-git5

2006-07-30 Thread Hans Reiser
Jesper Juhl wrote:

> On 30/07/06, Hans Reiser <[EMAIL PROTECTED]> wrote:
>
>> Jesper Juhl wrote:
>>
>> >
>> > Thanks. That's a nice little test suite.
>> >
>> Yes, it is quite useful, our developers have added it to the regression
>> suite
>>
> That's nice.
>
> Now how about that lock validator message I managed to tease out?
>
> Akpm said "... the reiserfs locking appears to be unneeded - this inode
> is going down and nobody else can look it up, so what is to be locked
> against?" - can you comment on that?
>
>
Err, how about Zam handles all locking issues and this is Sunday with
the family?  I know, lame, but he'll answer you on Monday Russian
time.;-)


Re: possible recursive locking detected - while running fs operations in loops - 2.6.18-rc2-git5

2006-07-29 Thread Jesper Juhl

On 30/07/06, Hans Reiser <[EMAIL PROTECTED]> wrote:
> Jesper Juhl wrote:
>
> >
> > Thanks. That's a nice little test suite.
> >
> Yes, it is quite useful, our developers have added it to the regression
> suite


That's nice.

Now how about that lock validator message I managed to tease out?

Akpm said "... the reiserfs locking appears to be unneeded - this inode
is going down and nobody else can look it up, so what is to be locked
against?" - can you comment on that?


--
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html


Re: possible recursive locking detected - while running fs operations in loops - 2.6.18-rc2-git5

2006-07-29 Thread Hans Reiser
Jesper Juhl wrote:

>
> Thanks. That's a nice little test suite.
>
Yes, it is quite useful, our developers have added it to the regression
suite

Hans


Re: possible recursive locking detected - while running fs operations in loops - 2.6.18-rc2-git5

2006-07-29 Thread Jesper Juhl

On 26/07/06, Andreas Dilger <[EMAIL PROTECTED]> wrote:
> On Jul 26, 2006  00:16 +0200, Jesper Juhl wrote:
> > What I did to provoke it was to run 6 different xterms (with a bash
> > shell) with the following loops in them in a test directory that was
> > initially empty :
> >
> > xterm1:   while true; do mkdir a; done
> > xterm2:   while true; do rmdir a; done
> > xterm3:   while true; do touch a/foo; done
> > xterm4:   while true; do find .; done
> > xterm5:   while true; do sync; sleep 1; done
> > xterm6:   while true; do rm -r a; done
>
> See racer test at ftp.lustre.org/pub/benchmarks/racer-lustre.tar.gz
>
> It does the above, but a bunch more things and is a truly pathological
> test script that does lots of "stupid user tricks", unlike normal tests
> which are only doing operations that expect to be successful.
>
> PS - during the racer.sh test run "rm" is known to segfault after hitting
>  an internal assertion, nobody is sure why.
> PPS- I don't know who wrote this program, it was originally posted by
>  someone not the author to linux-fsdevel or something.



Thanks. That's a nice little test suite.

--
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html


Re: possible recursive locking detected - while running fs operations in loops - 2.6.18-rc2-git5

2006-07-25 Thread Andreas Dilger
On Jul 26, 2006  00:16 +0200, Jesper Juhl wrote:
> What I did to provoke it was to run 6 different xterms (with a bash
> shell) with the following loops in them in a test directory that was
> initially empty :
> 
> xterm1:   while true; do mkdir a; done
> xterm2:   while true; do rmdir a; done
> xterm3:   while true; do touch a/foo; done
> xterm4:   while true; do find .; done
> xterm5:   while true; do sync; sleep 1; done
> xterm6:   while true; do rm -r a; done

See racer test at ftp.lustre.org/pub/benchmarks/racer-lustre.tar.gz

It does the above, but a bunch more things and is a truly pathological
test script that does lots of "stupid user tricks", unlike normal tests
which are only doing operations that expect to be successful.

PS - during the racer.sh test run "rm" is known to segfault after hitting
 an internal assertion, nobody is sure why.
PPS- I don't know who wrote this program, it was originally posted by
 someone not the author to linux-fsdevel or something.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.



possible recursive locking detected - while running fs operations in loops - 2.6.18-rc2-git5

2006-07-25 Thread Jesper Juhl

Hi,

I just got this from the lock validator :

=============================================
[ INFO: possible recursive locking detected ]
---------------------------------------------
rm/2498 is trying to acquire lock:
(&inode->i_mutex){--..}, at: [] mutex_lock+0x1c/0x20

but task is already holding lock:
(&inode->i_mutex){--..}, at: [] mutex_lock+0x1c/0x20

other info that might help us debug this:
2 locks held by rm/2498:
#0:  (&inode->i_mutex/1){--..}, at: [] do_rmdir+0x73/0xe0
#1:  (&inode->i_mutex){--..}, at: [] mutex_lock+0x1c/0x20

stack backtrace:
[] show_trace_log_lvl+0x12a/0x150
[] show_trace+0x12/0x20
[] dump_stack+0x19/0x20
[] print_deadlock_bug+0xb9/0xd0
[] check_deadlock+0x6b/0x80
[] __lock_acquire+0x354/0x990
[] lock_acquire+0x75/0xa0
[] __mutex_lock_slowpath+0x70/0x2a0
[] mutex_lock+0x1c/0x20
[] reiserfs_delete_inode+0x63/0xd0
[] generic_delete_inode+0x61/0xe0
[] generic_drop_inode+0xf/0x20
[] iput+0x56/0x80
[] dentry_iput+0x5e/0xc0
[] dput+0xa8/0x170
[] prune_one_dentry+0x6b/0x80
[] prune_dcache+0x15b/0x170
[] shrink_dcache_parent+0x10/0x20
[] dentry_unhash+0x5a/0xc0
[] vfs_rmdir+0x5f/0xc0
[] do_rmdir+0xcd/0xe0
[] sys_rmdir+0x10/0x20
[] syscall_call+0x7/0xb
[] 0xb7e79d7d


What I did to provoke it was to run 6 different xterms (with a bash
shell) with the following loops in them in a test directory that was
initially empty :

xterm1:   while true; do mkdir a; done
xterm2:   while true; do rmdir a; done
xterm3:   while true; do touch a/foo; done
xterm4:   while true; do find .; done
xterm5:   while true; do sync; sleep 1; done
xterm6:   while true; do rm -r a; done

I then left that alone for ~15 minutes and then lockdep complained.
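
For anyone who wants to try this without opening six xterms, the loops above could probably be driven from a single shell, roughly as sketched below. This is only a sketch: the function name run_racer, its two arguments (duration in seconds, test directory) and the final "stopped N loops" line are made up for illustration. The six commands themselves are the ones listed above. Errors from the individual commands are expected and discarded, since the loops race against each other on purpose.

```sh
# run_racer DURATION TESTDIR  (hypothetical helper, not part of the report)
# Starts the six loops from the report as background jobs in TESTDIR,
# lets them race for DURATION seconds, then stops them.
# The body runs in a subshell so the caller's working directory is untouched.
run_racer() (
    duration=$1
    testdir=$2
    cd "$testdir" || exit 1

    pids=""
    count=0
    for cmd in 'mkdir a' 'rmdir a' 'touch a/foo' 'find .' 'sync; sleep 1' 'rm -r a'; do
        # One endless loop per command; output and errors are thrown away
        # because most iterations are expected to fail.
        ( while true; do eval "$cmd"; done ) >/dev/null 2>&1 &
        pids="$pids $!"
        count=$((count + 1))
    done

    sleep "$duration"

    # $pids is deliberately unquoted so it splits into one PID per argument.
    kill $pids 2>/dev/null
    wait 2>/dev/null
    echo "stopped $count loops"
)
```

Something like `run_racer 900 /path/to/test/dir` would approximate the ~15 minute run (the path is a placeholder and should be an initially empty directory on the filesystem under test). Afterwards, `dmesg | grep 'possible recursive locking detected'` should show whether lockdep complained.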

This was on a reiserfs 3.6 filesystem.
My kernel version is 2.6.18-rc2-git5 (i386 build, not x86_64)
The CPU is an Athlon64 X2 4400+

Some details :

$ uname -a
Linux dragon 2.6.18-rc2-git5 #1 SMP PREEMPT Tue Jul 25 22:58:52 CEST
2006 i686 athlon-4 i386 GNU/Linux

[EMAIL PROTECTED]:~/download/kernel/linux-2.6.18-rc2-git5$ scripts/ver_linux
If some fields are empty or look unusual you may have an old version.
Compare to the current minimal requirements in Documentation/Changes.

Linux dragon 2.6.18-rc2-git5 #1 SMP PREEMPT Tue Jul 25 22:58:52 CEST
2006 i686 athlon-4 i386 GNU/Linux

Gnu C                  3.4.6
Gnu make               3.81
binutils               2.15.92.0.2
util-linux             2.12r
mount                  2.12r
module-init-tools      3.2.2
e2fsprogs              1.39
reiserfsprogs          3.6.19
quota-tools            3.13.
PPP                    2.4.4b1
Linux C Library        2.3.6
Dynamic linker (ldd)   2.3.6
Linux C++ Library      6.0.3
Procps                 3.2.7
Net-tools              1.60
Kbd                    1.12
Sh-utils               5.97
udev                   071
Modules Loaded         snd_seq_oss snd_seq_midi_event snd_seq
snd_pcm_oss snd_mixer_oss uhci_hcd usbcore snd_emu10k1 snd_rawmidi
snd_ac97_codec snd_ac97_bus snd_pcm snd_seq_device snd_timer
snd_page_alloc snd_util_mem snd_hwdep snd agpgart

If more info is needed then just ask and I'll be happy to provide what I can.

--
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html