[Lustre-discuss] Repairing a filesystem with lfsck output

2008-06-27 Thread Chad Kerner
Hello, I have a filesystem that went south in a fairly bad way. The lfsck (run in read-only mode) listed 31216 files that were 'not created', and the lfsck output log contained error messages about dangling inodes; when they were all added up, they came to 31216 dangling
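
For reference, the usual lfsck repair workflow on Lustre 1.x is two-staged: build databases with e2fsck on the servers, then run lfsck from a client. A rough sketch (the device names and mount point are placeholders, not taken from the post):

    # On the MDS, with the MDT device unmounted, build the MDS database:
    e2fsck -n -v --mdsdb /tmp/mdsdb /dev/mdtdev

    # On each OSS, build an OST database against the MDS database:
    e2fsck -n -v --mdsdb /tmp/mdsdb --ostdb /tmp/ostdb /dev/ostdev

    # From a client with the filesystem mounted, a read-only check first:
    lfsck -n -v --mdsdb /tmp/mdsdb --ostdb /tmp/ostdb /mnt/lustre

    # -c recreates the missing OST objects (as zero-length) that the
    # dangling MDS inodes reference, making the affected files visible:
    lfsck -c -v --mdsdb /tmp/mdsdb --ostdb /tmp/ostdb /mnt/lustre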

Re: [Lustre-discuss] Confusion with failover

2008-06-27 Thread Dhruv
Actually, with my 2.6.9-22 kernel, Lustre 1.4.5.1 fits, and I am not in a position to change the OS itself. I tried failover of OSTs without Linux HA and it worked fairly well. I am now testing the same rigorously to see whether I am correct. But failover of the MDS without HA didn't work

Re: [Lustre-discuss] Lustre failback

2008-06-27 Thread Dhruv
Lustre supports failover/failback for sure. Which Lustre and kernel versions are you using? I tried configuring failover of OSTs and it worked fine. Still going to test once more. I am trying failover of the MDS and have also posted a query; currently, without Linux HA, it is not working. Or maybe it might work in
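
For what it's worth, failover partners are declared when the targets are formatted, and "failover" is ultimately just mounting the target on the standby node; an HA framework such as Heartbeat only automates that mount. The sketch below uses Lustre 1.6-style commands (1.4.x configured this through lmc/lconf instead), and every hostname/NID is a placeholder:

    # Format an OST with a standby server declared:
    mkfs.lustre --fsname=testfs --ost --mgsnode=mgs1@tcp0 \
        --failnode=oss2@tcp0 /dev/sdb

    # Clients list both candidate MGS NIDs so the mount survives failover:
    mount -t lustre mgs1@tcp0:mgs2@tcp0:/testfs /mnt/testfs

    # Manual failover: after oss1 dies, mount the shared device on oss2:
    mount -t lustre /dev/sdb /mnt/ost0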

Re: [Lustre-discuss] lustre file system with failover

2008-06-27 Thread Dhruv
On Jun 13, 1:53 pm, Johann Lombardi [EMAIL PROTECTED] wrote: On Wed, Jun 11, 2008 at 01:07:56PM +0100, trupti shete wrote: I have the following Lustre file system scenario -- MDT: /dev/sdc on node1 and failnode node2 [...] If client1 opens a file to write and at that time if I

Re: [Lustre-discuss] lustre file system with failover

2008-06-27 Thread Kalpak Shah
On Thu, 2008-06-26 at 21:25 -0700, Dhruv wrote: On Jun 13, 1:53 pm, Johann Lombardi [EMAIL PROTECTED] wrote: On Wed, Jun 11, 2008 at 01:07:56PM +0100, trupti shete wrote: I have the following Lustre file system scenario -- MDT: /dev/sdc on node1 and failnode node2 [...] If

[Lustre-discuss] Reclaiming ext3 reserved space failed

2008-06-27 Thread Timh Bergström
Hi all, I'm trying to reclaim the 5% space reserved for root on my OSTs, which the manual says I should be able to do with tune2fs. So I downloaded the latest version, tune2fs 1.40.11 (17-June-2008), and I'm trying to run it as stated in the manual: $ tune2fs -m 1 /dev/sdXY. This gives me: tune2fs:
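
The documented invocation, for reference (the device name is a placeholder; the OST is best taken offline first, since the target is an ldiskfs device):

    # Reduce the root reservation from the default 5% to 1%:
    umount /mnt/ost0                 # stop the OST first
    tune2fs -m 1 /dev/sdXY

    # Verify the new reservation:
    tune2fs -l /dev/sdXY | grep -i 'Reserved block count'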

Re: [Lustre-discuss] Reclaiming ext3 reserved space failed

2008-06-27 Thread Kalpak Shah
On Fri, 2008-06-27 at 13:56 +0200, Timh Bergström wrote: 2008/6/27 Kalpak Shah [EMAIL PROTECTED]: On Fri, 2008-06-27 at 12:23 +0200, Timh Bergström wrote: Hi all, I'm trying to reclaim the 5% space reserved for root on my OSTs, which the manual says I should be able to do with tune2fs.

Re: [Lustre-discuss] lctl peer/conn_list

2008-06-27 Thread Scott Atchley
On Jun 26, 2008, at 9:03 AM, Bastian Tweddell wrote: On 25.Jun.08 23:28 -0600, Andreas Dilger wrote: On Jun 25, 2008 20:04 -0400, Scott Atchley wrote: When I test Lustre over myri10ge, I do not use myri10ge as the network interface name. I use the actual ethX that myri10ge is providing:
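
The point being made: LNET wants the Linux interface name, not the driver name. A minimal sketch (eth2 stands in for whichever interface myri10ge actually created):

    # /etc/modprobe.conf on the node:
    options lnet networks=tcp0(eth2)

    # After the Lustre modules load, confirm the resulting NID:
    lctl list_nids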

Re: [Lustre-discuss] MDT MGS migration

2008-06-27 Thread Brian J. Murrell
On Fri, 2008-06-27 at 10:27 +0200, Enrico Morelli wrote: Dear all, is it possible to migrate an MDT/MGS partition to another filesystem? Sure. I want to migrate it to a RAID 1 SAS partition on the same server. Is it possible? You probably want to look in the Ops manual at
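
The device-level variant from the manual is the simplest when the new partition is at least as large as the old one; the file-level alternative uses tar plus getfattr/setfattr to preserve the Lustre extended attributes. A sketch with placeholder device names:

    # With the MDT/MGS stopped (unmounted), copy the device contents:
    dd if=/dev/old_mdt of=/dev/md0 bs=1M

    # Mount the new device where the old one was mounted:
    mount -t lustre /dev/md0 /mnt/mdt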

[Lustre-discuss] OSS load in the roof

2008-06-27 Thread Brock Palen
Our OSS went crazy today. It is attached to two OSTs. The load is normally around 2-4; right now it is 123. I noticed this to be the cause: root 6748 0.0 0.0 0 0 ? D May27 8:57 [ll_ost_io_123] All of them are stuck in uninterruptible sleep. Has anyone seen this
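
A quick way to see what such threads are blocked on (a generic sketch, not taken from the thread):

    # Tasks in uninterruptible sleep and the kernel symbol they wait in:
    ps -eo pid,stat,wchan:30,comm | awk '$2 ~ /^D/'

    # Dump all kernel stacks to the kernel log for a closer look:
    echo t > /proc/sysrq-trigger
    dmesg | tail -200

    # Signs of a failing disk under the OSTs:
    dmesg | grep -iE 'i/o error|medium error|scsi'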

Re: [Lustre-discuss] OSS load in the roof

2008-06-27 Thread Bernd Schubert
On Fri, Jun 27, 2008 at 01:07:32PM -0400, Brian J. Murrell wrote: On Fri, 2008-06-27 at 12:44 -0400, Brock Palen wrote: All of them are stuck in uninterruptible sleep. Has anyone seen this happen before? Is this caused by a pending disk failure? Well, they are certainly stuck

Re: [Lustre-discuss] OSS load in the roof

2008-06-27 Thread Brock Palen
On Jun 27, 2008, at 1:39 PM, Bernd Schubert wrote: On Fri, Jun 27, 2008 at 01:07:32PM -0400, Brian J. Murrell wrote: On Fri, 2008-06-27 at 12:44 -0400, Brock Palen wrote: All of them are stuck in uninterruptible sleep. Has anyone seen this happen before? Is this caused by a pending disk

Re: [Lustre-discuss] OSS load in the roof

2008-06-27 Thread Bernd Schubert
On Fri, Jun 27, 2008 at 01:44:13PM -0400, Brock Palen wrote: On Jun 27, 2008, at 1:39 PM, Bernd Schubert wrote: On Fri, Jun 27, 2008 at 01:07:32PM -0400, Brian J. Murrell wrote: On Fri, 2008-06-27 at 12:44 -0400, Brock Palen wrote: All of them are stuck in uninterruptible sleep. Has anyone

Re: [Lustre-discuss] OSS load in the roof

2008-06-27 Thread Brock Palen
On Jun 27, 2008, at 1:07 PM, Brian J. Murrell wrote: On Fri, 2008-06-27 at 12:44 -0400, Brock Palen wrote: All of them are stuck in uninterruptible sleep. Has anyone seen this happen before? Is this caused by a pending disk failure? Well,

Re: [Lustre-discuss] Repairing a filesystem with lfsck output

2008-06-27 Thread Andreas Dilger
On Jun 27, 2008 01:23 -0500, Chad Kerner wrote: I have a filesystem that went south in a fairly bad way. The lfsck (run in read-only mode) listed 31216 files that were 'not created', and the lfsck output log contained error messages about dangling inodes; when they were all