Re: [Lustre-discuss] Looping in __d_lookup

2008-11-18 Thread Jakob Goldbach
On Mon, 2008-11-17 at 10:59 -0600, Andreas Dilger wrote: On Nov 17, 2008 16:07 +0100, Reto Gantenbein wrote: Why can't this bug be accessed? Even when I log in with a bugzilla account there is the message You are not authorized to access bug #15975. This bug was marked private by the

Re: [Lustre-discuss] Looping in __d_lookup

2008-11-17 Thread Andreas Dilger
On Nov 17, 2008 16:07 +0100, Reto Gantenbein wrote: Why can't this bug be accessed? Even when I log in with a bugzilla account there is the message You are not authorized to access bug #15975. This bug was marked private by the customer that filed it. The patch was included in the

Re: [Lustre-discuss] Looping in __d_lookup

2008-06-13 Thread Jakob Goldbach
Actually, _d_rehash() is not exported by the kernel, so I assume that you have patched the kernel. At least, that's what I understand from the OpenVZ ticket: http://bugzilla.openvz.org/show_bug.cgi?id=895#c27 Of course, we cannot do this for patchless clients :( True - I patched my

Re: [Lustre-discuss] Looping in __d_lookup

2008-06-04 Thread Jakob Goldbach
On Tue, 2008-06-03 at 17:05 -0600, Andreas Dilger wrote: Can you please file a bug with the original details, so that this gets fixed in the next release. https://bugzilla.lustre.org/show_bug.cgi?id=15975

Re: [Lustre-discuss] Looping in __d_lookup

2008-06-04 Thread Alex Lyashkov
On Tue, 2008-06-03 at 17:05 -0600, Andreas Dilger wrote: On Jun 04, 2008 00:19 +0200, Jakob Goldbach wrote: On Wed, 2008-05-21 at 21:05 +0200, Jakob Goldbach wrote: So the lockup in __d_lookup may just relate to newer patchless clients. I got rid of my dcache chain corruption

Re: [Lustre-discuss] Looping in __d_lookup

2008-06-04 Thread Jakob Goldbach
On Wed, 2008-06-04 at 12:10 +0300, Alex Lyashkov wrote: --- ./lustre/llite/dcache.c.xxx 2007-09-27 16:04:08.0 +0400 +++ ./lustre/llite/dcache.c 2008-05-29 11:53:07.0 +0400 @@ -470,8 +470,8 @@ revalidate_finish: spin_lock(dcache_lock);

Re: [Lustre-discuss] Looping in __d_lookup

2008-06-03 Thread Jakob Goldbach
On Wed, 2008-05-21 at 21:05 +0200, Jakob Goldbach wrote: So the lockup in __d_lookup may just relate to newer patchless clients. I got rid of my dcache chain corruption by adding the patch below and exporting _d_rehash from the kernel (of course, no longer patchless). Could this fix a race in

Re: [Lustre-discuss] Looping in __d_lookup

2008-06-03 Thread Andreas Dilger
On Jun 04, 2008 00:19 +0200, Jakob Goldbach wrote: On Wed, 2008-05-21 at 21:05 +0200, Jakob Goldbach wrote: So the lockup in __d_lookup may just relate to newer patchless clients. I got rid of my dcache chain corruption by adding the patch below and exporting _d_rehash from the kernel

Re: [Lustre-discuss] Looping in __d_lookup

2008-05-22 Thread Jakob Goldbach
This patch and backtrace say that the dcache chain was damaged _before_ entering Lustre: Lustre starts adding an entry at a new position in the dentry cache and finds a damaged entry in the list. All right. Any idea how to detect the bad entry upon insertion? Thanks, Jakob
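
For illustration only, one way to catch a bad insertion is to refuse to add a node that is still on a chain. The sketch below is a userspace model, not kernel or Lustre code: the names hnode, hhead, hnode_unhashed, hnode_add_head_checked and hnode_del_init are hypothetical, modeled loosely on the kernel's hlist helpers. It assumes the corruption comes from hashing an already hashed entry, which this thread does not establish.

```c
#include <stddef.h>

/* Hypothetical userspace model of a hash-chain node (not the kernel's struct). */
struct hnode {
    struct hnode *next;
    struct hnode **pprev;   /* address of the pointer that points at this node;
                               NULL while the node is not on any chain */
};

struct hhead {
    struct hnode *first;
};

/* A node is unhashed when nothing points at it. */
int hnode_unhashed(const struct hnode *n)
{
    return n->pprev == NULL;
}

/* Insert at the head of a chain, refusing nodes that are already hashed.
 * Returns 0 on success, -1 if the insertion was refused. */
int hnode_add_head_checked(struct hnode *n, struct hhead *h)
{
    if (!hnode_unhashed(n))
        return -1;          /* already on a chain: adding it again could create a loop */

    n->next = h->first;
    if (h->first != NULL)
        h->first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
    return 0;
}

/* Unlink a node and mark it unhashed so it may be inserted again later. */
void hnode_del_init(struct hnode *n)
{
    if (hnode_unhashed(n))
        return;
    *n->pprev = n->next;
    if (n->next != NULL)
        n->next->pprev = n->pprev;
    n->next = NULL;
    n->pprev = NULL;
}
```

Without the check, adding the current head node a second time would set its next pointer to itself, and any walk of that chain would then never reach NULL.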

Re: [Lustre-discuss] Looping in __d_lookup

2008-05-21 Thread Robin Humble
On Tue, May 20, 2008 at 10:24:25PM +0200, Jakob Goldbach wrote: Hm, so you actually have a circular loop? Yes - I've asked for help on the OpenVZ list as well - Pavel Emelyanov provided me with a debug patch. This patch has now confirmed the loop in __d_lookup. we're also seeing __d_lookup soft

Re: [Lustre-discuss] Looping in __d_lookup

2008-05-21 Thread Jakob Goldbach
kernel is 2.6.23.17 with patchless lustre 1.6.4.3, I'm running 1.6.4.3 patchless as well against a 2.6.18 vanilla kernel. Or at least that is what I thought. The OpenVZ patch effectively makes the kernel a 2.6.18++ kernel because they add features from newer kernels in their maintained 2.6.18

Re: [Lustre-discuss] Looping in __d_lookup

2008-05-21 Thread Jakob Goldbach
On Wed, 2008-05-21 at 20:04 -0600, Andreas Dilger wrote: Do you have the fixes for the statahead patches, disable statahead via echo 0 > /proc/fs/lustre/llite/*/statahead_max, or can you try out the v1_6_5_RC4 tag from CVS (which also contains those patches)? I'm running without statahead

Re: [Lustre-discuss] Looping in __d_lookup

2008-05-21 Thread Robin Humble
On Wed, May 21, 2008 at 08:04:27PM -0600, Andreas Dilger wrote: On May 21, 2008 21:05 +0200, Jakob Goldbach wrote: I'm running 1.6.4.3 patchless as well against a 2.6.18 vanilla kernel. Or at least that is what I thought. The OpenVZ patch effectively makes the kernel a 2.6.18++ kernel because

Re: [Lustre-discuss] Looping in __d_lookup

2008-05-21 Thread Jakob Goldbach
On Thu, 2008-05-22 at 05:46 +0200, Jakob Goldbach wrote: On Wed, 2008-05-21 at 20:04 -0600, Andreas Dilger wrote: Do you have the fixes for the statahead patches, disable statahead via echo 0 > /proc/fs/lustre/llite/*/statahead_max, or can you try out the v1_6_5_RC4 tag from CVS (which

Re: [Lustre-discuss] Looping in __d_lookup

2008-05-20 Thread Jakob Goldbach
Hm, so you actually have a circular loop? Yes - I've asked for help on the OpenVZ list as well - Pavel Emelyanov provided me with a debug patch. This patch has now confirmed the loop in __d_lookup. I wonder if you can e.g. run your (or similar setup) in vmware [snip] Hmm, I'm not

Re: [Lustre-discuss] Looping in __d_lookup

2008-05-16 Thread Oleg Drokin
Hello! On May 15, 2008, at 5:34 AM, Jakob Goldbach wrote: On a regular basis a process gets stuck in __d_lookup. When I dig in it seems I'm caught in the hlist_for_each_entry_rcu loop never satisfying the exit-from-loop condition. Hm, so you actually have a circular loop? I wonder if you can

[Lustre-discuss] Looping in __d_lookup

2008-05-15 Thread Jakob Goldbach
Hi, I run 1.6.4.3-patchless against a vanilla 2.6.18 - but with OpenVZ patches on my clients. OpenVZ is virtualization (like vserver, BSD jails etc.). On a regular basis a process gets stuck in __d_lookup. When I dig in it seems I'm caught in the hlist_for_each_entry_rcu loop never satisfying the
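
As a rough illustration of the symptom described in this message, a walk over a NULL-terminated chain never ends once some node's next pointer leads back into the chain. The sketch below is a userspace model with a hypothetical struct node and a hypothetical helper chain_has_cycle; it is not the kernel's hlist code and not the debug patch mentioned elsewhere in the thread. It uses Floyd's two-pointer technique to tell a terminating chain from a circular one.

```c
#include <stddef.h>

/* Hypothetical stand-in for a hash-chain node; not the kernel's struct hlist_node. */
struct node {
    struct node *next;
    int key;
};

/* Returns 1 if the chain starting at head loops back on itself,
 * 0 if it ends in NULL (Floyd's cycle detection). */
int chain_has_cycle(const struct node *head)
{
    const struct node *slow = head;
    const struct node *fast = head;

    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;          /* advances one node per iteration */
        fast = fast->next->next;    /* advances two nodes per iteration */
        if (slow == fast)
            return 1;               /* the pointers met: the chain is circular */
    }
    return 0;                       /* NULL was reached: the chain terminates */
}
```

For a chain a -> b -> c -> NULL the helper returns 0. If the tail is then pointed back at b, it returns 1, whereas a plain "follow next until NULL" walk over that same chain would spin forever.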