Re: [Lustre-discuss] Announce: Lustre 1.8.3 is available!

2010-05-03 Thread Heiko Schröter
On Saturday, 01 May 2010, at 01:49:14, Terry Rutledge wrote: Hello, is there no longer a 1.8.3 src.tgz file? It would be nice to have one. Thanks and regards, Heiko. Hi all, Lustre 1.8.3 is available on the Sun Download Center site. http://www.sun.com/software/products/lustre/get.jsp Our forwarding link has not

[Lustre-discuss] Problems with fstab entry on sles11

2010-05-03 Thread Matthias Wolf
Hello folks, I have problems with the fstab entry. The Lustre filesystem doesn't start up on boot. A mount -a after reboot starts the Lustre filesystem. What's the problem? Is there something special about SLES11? Here is some information about our client configuration. snowball003:~ # cat /etc/fstab

Re: [Lustre-discuss] Problems with fstab entry on sles11

2010-05-03 Thread Brian J. Murrell
On Mon, 2010-05-03 at 13:18 +0200, Matthias Wolf wrote: Hello folks, I have problems with the fstab entry. The Lustre filesystem doesn't start up on boot. A mount -a after reboot starts the Lustre filesystem. What's the problem? Is there something special about SLES11? Yes, there is. Lustre,

Re: [Lustre-discuss] Rolling client upgrades

2010-05-03 Thread Brian J. Murrell
On Fri, 2010-04-30 at 16:39 -0400, Erik Froese wrote: I ask because 2 of my OSSes just panicked while I was doing the upgrade. A failover pair, to boot! Sorry for attaching a picture, but it's the only way I can get the text from a panic with the remote console. I can provide logs if necessary.

Re: [Lustre-discuss] configure --with-o2ib

2010-05-03 Thread Brian J. Murrell
On Mon, 2010-05-03 at 09:00 +0200, Fourie Joubert wrote: Hi Folks Hi, Could someone provide clarification regarding where the --with-o2ib= option should be pointing to? It should be pointing to /usr/src/ofa_kernel (and not /usr/src/ofa_kernel-1.4.2) for a stock OFED installation. b.
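For readers skimming the archive, the reply above can be sketched as a small shell helper. This is only an illustration, assuming a stock OFED installation with its kernel sources at /usr/src/ofa_kernel as stated in the reply; the helper name o2ib_configure_arg is hypothetical and is not part of Lustre or OFED.

```shell
# Sketch only: build the --with-o2ib argument for Lustre's configure script.
# The default path is the one named in the reply above (stock OFED install).
# The function name o2ib_configure_arg is hypothetical, chosen for illustration.
o2ib_configure_arg() {
    ofed_src="${1:-/usr/src/ofa_kernel}"
    printf '%s\n' "--with-o2ib=$ofed_src"
}

# Possible usage from the top of a Lustre source tree (left commented out so
# that sourcing this sketch never starts a build):
# ./configure "$(o2ib_configure_arg)"
```

Passing a different directory as the first argument overrides the default, which may help on systems where OFED was not installed in the stock location.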

Re: [Lustre-discuss] Problems with fstab entry on sles11

2010-05-03 Thread Dardo D Kleiner - CONTRACTOR
On 5/3/10 10:43 AM, Ken Hornstein wrote: Perhaps other SLES users here can post ideas on how they handle this situation. I guess for those using heartbeat, this is a non-issue as heartbeat handles the mounting and heartbeat is started after networking. So, as a SLES user, let me offer my

Re: [Lustre-discuss] Problems with fstab entry on sles11

2010-05-03 Thread Brian J. Murrell
On Mon, 2010-05-03 at 10:43 -0400, Ken Hornstein wrote: _netdev still works in the sense that it prevents the mount command from mounting those filesystems at the normal time. Yes, I believe this was covered in one of the bugs I pointed to, but worth mentioning here. But there's nothing

Re: [Lustre-discuss] Problems with fstab entry on sles11

2010-05-03 Thread Brian J. Murrell
On Mon, 2010-05-03 at 10:55 -0400, Dardo D Kleiner - CONTRACTOR wrote: Another alternative is to automount on demand (man 5 autofs), though there are some issues with that Yeah, as discussed here not that long ago, there are other issues with this even. There might be a bug in BZ about it

[Lustre-discuss] lock callback timer expired, lock on destroyed export, locks stolen, busy with active RPCs, operation 400 on unconnected MDS

2010-05-03 Thread Thomas Roth
Hi all, just want to share my recent insight and increase the number of Google hits for those who suffer from - MDT / filesystem becoming suddenly unusable - LustreError: ... lock callback timer expired ... - LustreError: ... lock on destroyed export ... - Lustre: ... Stealing 1 locks ... -

Re: [Lustre-discuss] lock callback timer expired, lock on destroyed export, locks stolen, busy with active RPCs, operation 400 on unconnected MDS

2010-05-03 Thread Oleg Drokin
Hello! On May 3, 2010, at 11:49 AM, Thomas Roth wrote: We found a user job submission script that probably caused all this by starting - several hundred (900) jobs simultaneously - all of them opening one and the same file for batch system errors and one and the same file for its output.