Re: [Lustre-discuss] The Lustre download page seems to be broken

2008-06-03 Thread Larvoire, Jean-Francois
FYI, I was never able to download anything with my initial account. Finally I created another one, and with the new one everything works flawlessly. So the Sun download server is definitely buggy! (And there's no way to delete an existing account. I had to recreate the new one using an alias

Re: [Lustre-discuss] performance issues in simple lustre setup

2008-06-03 Thread Isaac Huang
Since both the configuration and the IB link bandwidth looked fine, I'd suggest measuring lnet throughput with lnet selftest:
1. On both client and server: modprobe lnet_selftest
2. On the client:
export LST_SESSION=$$
lst new_session --timeo 10 test
lst add_group s [EMAIL PROTECTED]
lst

Re: [Lustre-discuss] performance issues in simple lustre setup

2008-06-03 Thread Isaac Huang
On Tue, Jun 03, 2008 at 08:32:09AM -0400, Murray Smigel wrote: Some additional information on the problem. I tried disconnecting the ethernet connection to the server machine (192.168.1.94) and tried running a disk test on the client (192.168.1.156 via ethernet), writing to

Re: [Lustre-discuss] Lustre Mount Crashing

2008-06-03 Thread Andreas Dilger
On Jun 02, 2008 19:51 -0400, Charles Taylor wrote: Wow, you are one powerful witch doctor. So we rebuilt our system disk (just to be sure) and that made no difference; we still panicked as soon as we mounted the MDT. The -o abort_recov did not help either. However, your recipe below

Re: [Lustre-discuss] Lustre Mount Crashing

2008-06-03 Thread Andreas Dilger
On Jun 03, 2008 16:37 -0400, Charles Taylor wrote: I'm sorry, I should have updated you. You are right, it was misleading. The MDS/MDT was fine and after about twenty minutes or so everything became active and we now have a working file system with data that we can access so we

Re: [Lustre-discuss] Looping in __d_lookup

2008-06-03 Thread Jakob Goldbach
On Wed, 2008-05-21 at 21:05 +0200, Jakob Goldbach wrote: So the lockup in __d_lookup may just relate to newer patchless clients. I got rid of my dcache chain corruption by adding patch below and exporting _d_rehash from kernel (of course, no longer patchless). Could this fix a race in

Re: [Lustre-discuss] Looping in __d_lookup

2008-06-03 Thread Andreas Dilger
On Jun 04, 2008 00:19 +0200, Jakob Goldbach wrote: On Wed, 2008-05-21 at 21:05 +0200, Jakob Goldbach wrote: So the lockup in __d_lookup may just relate to newer patchless clients. I got rid of my dcache chain corruption by adding patch below and exporting _d_rehash from kernel

Re: [Lustre-discuss] performance issues in simple lustre setup

2008-06-03 Thread murray
nasnu3:/slut# lst run bw;lst stat c
bw is running now
[LNet Rates of c]
[W] Avg: 800 RPC/s Min: 800 RPC/s Max: 800 RPC/s
[R] Avg: 1596 RPC/s Min: 1596 RPC/s Max: 1596 RPC/s
[LNet Bandwidth of c]
[W] Avg: 0.13 MB/s Min: 0.13 MB/s Max: 0.13 MB/s
[R] Avg:
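When comparing several selftest runs, it can help to pull the averages out of the `lst stat` text instead of reading them by eye. The following is only a rough sketch that assumes the output keeps the layout shown above; the function name `parse_lst_stat` and the returned dictionary shape are made up for illustration and are not part of the lst tool.

```python
import re

# One pattern with two alternatives:
#   - a section header such as "[LNet Rates of c]"
#   - a data row such as "[W] Avg: 800 RPC/s Min: 800 RPC/s Max: 800 RPC/s"
# The layout is assumed from the pasted output, not from lst documentation.
TOKEN = re.compile(
    r"\[LNet (?P<section>Rates|Bandwidth) of (?P<group>[^\]]+)\]"
    r"|\[(?P<dir>[RW])\]\s+Avg:\s*(?P<avg>[\d.]+)\s+(?P<unit>\S+)"
    r"\s+Min:\s*(?P<min>[\d.]+)\s+\S+\s+Max:\s*(?P<max>[\d.]+)\s+\S+"
)


def parse_lst_stat(text):
    """Return {"rates"|"bandwidth": {"W"|"R": {avg, min, max, unit}}}.

    Incomplete rows (for example a cut-off "[R] Avg:") are skipped.
    """
    results = {}
    current = None
    for match in TOKEN.finditer(text):
        if match.group("section"):
            # Start collecting rows under this section.
            current = match.group("section").lower()
            results.setdefault(current, {})
        elif current is not None:
            results[current][match.group("dir")] = {
                "avg": float(match.group("avg")),
                "min": float(match.group("min")),
                "max": float(match.group("max")),
                "unit": match.group("unit"),
            }
    return results


sample = """[LNet Rates of c]
[W] Avg: 800 RPC/s Min: 800 RPC/s Max: 800 RPC/s
[R] Avg: 1596 RPC/s Min: 1596 RPC/s Max: 1596 RPC/s
[LNet Bandwidth of c]
[W] Avg: 0.13 MB/s Min: 0.13 MB/s Max: 0.13 MB/s
[R] Avg:"""

stats = parse_lst_stat(sample)
print(stats["bandwidth"]["W"]["avg"], stats["bandwidth"]["W"]["unit"])
```

Because the pattern scans the whole text rather than line by line, it also copes with output that has been pasted onto a single line.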

Re: [Lustre-discuss] performance issues in simple lustre setup

2008-06-03 Thread murray
OK, I have a workaround. I removed the tcp option from the lnet configuration so that it now reads:
options lnet networks=o2ib
options mlx4_core msi_x=1
alias ib0 ib_ipoib
alias ib1 ib_ipoib
Now I am seeing write bw 70 MB/sec. Thanks, murray
Murray Smigel wrote: Hi, I have built a
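A quick way to double-check which LNet networks a node is configured for is to read the `networks=` value back out of the module options. This is a minimal sketch, assuming the options live in a modprobe-style text file and that the value contains no spaces; the helper name `lnet_networks` is hypothetical.

```python
import re


def lnet_networks(conf_text):
    """Return the networks listed on the 'options lnet' line, or [] if none.

    Assumes a modprobe-style line such as 'options lnet networks=o2ib'.
    """
    match = re.search(r"options\s+lnet\s+.*?networks=(\S+)", conf_text)
    if not match:
        return []
    value = match.group(1).strip("\"'")
    # Split on commas that are not inside an interface list like tcp0(eth0,eth1).
    return re.split(r",(?![^()]*\))", value)


conf = """options lnet networks=o2ib
options mlx4_core msi_x=1
alias ib0 ib_ipoib
alias ib1 ib_ipoib
"""

print(lnet_networks(conf))
```

With the configuration quoted above, only `o2ib` is listed, which matches the intent of removing the tcp entry.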

Re: [Lustre-discuss] performance issues in simple lustre setup

2008-06-03 Thread Murray Smigel
Some additional information on the problem. I tried disconnecting the ethernet connection to the server machine (192.168.1.94) and tried running a disk test on the client (192.168.1.156 via ethernet), writing to what I thought was the IB mounted file system (mount -t lustre [EMAIL

Re: [Lustre-discuss] performance issues in simple lustre setup

2008-06-03 Thread Murray Smigel
More data. When I disconnect the IB connection on the server ([EMAIL PROTECTED]) I see, on the client:
Jun 3 08:55:30 nasnu3 kernel: LustreError: 3634:0:(events.c:55:request_out_callback()) @@@ type 4, status -5 [EMAIL PROTECTED] x602471/t0 o400-[EMAIL PROTECTED]@o2ib_0:26 lens 128/128 ref 2