Re: [Lustre-discuss] 1.6.5.1 OSS crashes

2008-07-28 Thread Brian J. Murrell
On Mon, 2008-07-28 at 09:17 -0500, Troy Benjegerdes wrote: Maybe a lot less painful for Sun support ;) No. I specifically meant less painful on the implementer. It's not really any more difficult on Sun for you to use whatever kernel you want to use. But what if you need a new RDMA NIC

Re: [Lustre-discuss] 1.6.5.1 OSS crashes

2008-07-27 Thread Mag Gam
Robin, Thank you very much for helping with this. I want to try kernel 2.6.25 or even 2.6.26. But it's not a big deal, I just patched my distro kernel and everything seems to work well. I am hoping in the future Lustre will become a daemon or a module instead of patching the actual kernel source

Re: [Lustre-discuss] 1.6.5.1 OSS crashes

2008-07-25 Thread Robin Humble
On Sun, Jul 20, 2008 at 08:40:19AM -0400, Mag Gam wrote:
> I am trying to understand. What was the problem? How does SD_IOSTATS affect the crash? How did you disable this?

the comments describe the bug: https://bugzilla.lustre.org/show_bug.cgi?id=16404#c22
which from a quick look seems like an SMP

Re: [Lustre-discuss] 1.6.5.1 OSS crashes

2008-07-24 Thread Mag Gam
I am trying to understand. What was the problem? How does SD_IOSTATS affect the crash? How did you disable this? Sorry for a newbie question. TIA

On Sun, Jul 20, 2008 at 4:54 AM, Robin Humble [EMAIL PROTECTED] wrote:
> On Fri, Jul 18, 2008 at 09:02:36AM -0400, Brian J. Murrell wrote:
>> On Fri,

Re: [Lustre-discuss] 1.6.5.1 OSS crashes

2008-07-21 Thread Brian J. Murrell
On Sun, 2008-07-20 at 04:54 -0400, Robin Humble wrote:
> done. I rebuilt using the stock kernel's InfiniBand stack and
>   # CONFIG_SD_IOSTATS is not set
>
> % cexec -p oss: uptime
> oss x17: 18:45:07 up 1 day, 30 min, 1 user, load average: 4.97, 7.00, 6.27
> oss x18: 18:45:07 up 1 day, 23 min,
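
The "# CONFIG_SD_IOSTATS is not set" line quoted above is how a kernel .config file records a disabled option. As a rough illustration only (not something posted in the thread), the Python sketch below reads .config-style text and reports the state of one option. The function name kconfig_state and the sample text are hypothetical, and the other option names in the sample are placeholders rather than the poster's actual configuration.

```python
import re


def kconfig_state(config_text, option):
    """Return the option's value (e.g. 'y' or 'm'), 'not set', or None if absent."""
    set_re = re.compile(r"^%s=(.*)$" % re.escape(option))
    unset_re = re.compile(r"^# %s is not set$" % re.escape(option))
    for line in config_text.splitlines():
        line = line.strip()
        match = set_re.match(line)
        if match:
            return match.group(1)
        if unset_re.match(line):
            return "not set"
    return None


# Hypothetical sample; a real check would read the kernel build's .config file.
sample = """\
CONFIG_SCSI=y
# CONFIG_SD_IOSTATS is not set
CONFIG_INFINIBAND=m
"""

print(kconfig_state(sample, "CONFIG_SD_IOSTATS"))
```

A result of None means the option is not mentioned at all, which is different from being explicitly disabled, so the two cases are kept separate.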

Re: [Lustre-discuss] 1.6.5.1 OSS crashes

2008-07-20 Thread Robin Humble
On Fri, Jul 18, 2008 at 09:02:36AM -0400, Brian J. Murrell wrote:
> On Fri, 2008-07-18 at 05:52 -0400, Robin Humble wrote:
>> Hi,
>> I'm seeing coordinated OSS crashes with Lustre 1.6.5.1.
>> our RHEL4 OSS have been stable for ~months with these kernels:

Re: [Lustre-discuss] 1.6.5.1 OSS crashes

2008-07-18 Thread Brian J. Murrell
On Fri, 2008-07-18 at 05:52 -0400, Robin Humble wrote:
> Hi,
> I'm seeing coordinated OSS crashes with Lustre 1.6.5.1.
> our RHEL4 OSS have been stable for ~months with these kernels:
>   kernel-lustre-smp-2.6.9-67.0.4.EL_lustre.1.6.4.3
>   kernel-lustre-smp-2.6.9-55.0.9.EL_lustre.1.6.4.2
> but