Re: [Lustre-discuss] ll_ost_creat_* goes bersek (100% cpu used - OST disabled)

2010-08-14 Thread Adrian Ulrich
> The journal will prevent inconsistencies in the filesystem in case of a crash.
> It cannot prevent corruption of the on-disk data, inconsistencies caused by
> cache enabled on the disks or in a RAID controller, software bugs, memory
> corruption, bad cables, etc.

The OSS is part of a 'Snowbird' installation, so the RAID/disk part should be fine.
I hope that we 'just' hit a small software bug :-/


> That is why it is still a good idea for users to run e2fsck periodically on a 
> filesystem.

Ok, we will keep this in mind (e2fsck was surprisingly fast anyway!)


Regards,
 Adrian

-- 
 RFC 1925:
   (11) Every old idea will be proposed again with a different name and
a different presentation, regardless of whether it works.

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] O_DIRECT

2010-08-14 Thread Andreas Dilger
On 2010-08-14, at 1:32, Michael Kluge wrote:
> How does Lustre handle write() requests to files opened with O_DIRECT?
> Does the OSS enforce that the OST has physically written the data to the 
> OST before the op is completed or does the write() call return on the 
> client before this?

The write will be submitted directly from the client to the OST, and the OST 
always does synchronous writes, regardless of whether it is O_DIRECT or not. It 
cannot return from the syscall until the write is complete, because those pages 
are shared from userspace. 
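
For illustration, here is a minimal Python sketch of the client side of such a write. It assumes Linux (where os.O_DIRECT is available) and a 4096-byte alignment, which is an assumption and not a Lustre-specific value; the helper names round_up and write_direct are hypothetical. It only shows how an aligned buffer is handed to a blocking write(); it does not reproduce anything Lustre does internally.

```python
import mmap
import os

ALIGN = 4096  # assumed alignment; the real requirement depends on filesystem and device


def round_up(n, align=ALIGN):
    """Round n up to the next multiple of align."""
    return (n + align - 1) // align * align


def write_direct(path, payload):
    """Write payload to path with O_DIRECT, zero-padded to a multiple of ALIGN.

    Returns the number of bytes accepted by write(), padding included.
    """
    size = round_up(max(len(payload), 1))
    # An anonymous mmap is page-aligned, which satisfies the buffer
    # alignment that O_DIRECT typically requires.
    buf = mmap.mmap(-1, size)
    try:
        buf[:len(payload)] = payload
        flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_DIRECT
        fd = os.open(path, flags, 0o644)
        try:
            # The call blocks until the kernel no longer needs the user pages.
            return os.write(fd, buf)
        finally:
            os.close(fd)
    finally:
        buf.close()
```

Some filesystems (tmpfs on older kernels, for example) reject O_DIRECT with EINVAL, so callers should be prepared to catch OSError.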

Cheers, Andreas


Re: [Lustre-discuss] ll_ost_creat_* goes bersek (100% cpu used - OST disabled)

2010-08-14 Thread Andreas Dilger
On 2010-08-14, at 2:28, Adrian Ulrich wrote:
>> - the on-disk structure of the object directory for this OST is corrupted.
>>  Run "e2fsck -fp /dev/{ostdev}" on the unmounted OST filesystem.
> 
> e2fsck fixed it: the OST has now been running for 40 minutes without problems.
> 
> But shouldn't the journal of ext3/ldiskfs make running e2fsck unnecessary?

The journal will prevent inconsistencies in the filesystem in case of a crash. 
It cannot prevent corruption of the on-disk data, inconsistencies caused by 
cache enabled on the disks or in a RAID controller, software bugs, memory 
corruption, bad cables, etc. 

That is why it is still a good idea for users to run e2fsck periodically on a 
filesystem. If you are using LVM there is an lvcheck script I wrote that can 
check a filesystem snapshot on a running system, but otherwise you should do it 
whenever the opportunity arises. 
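
If such periodic checks are scripted, the e2fsck exit status can be decoded to decide whether a repair run is needed. The sketch below is only an illustration: it assumes e2fsck is on the PATH, the function names decode_e2fsck_status and check_readonly are hypothetical, and the bit meanings are taken from the e2fsck(8) man page. It is not the lvcheck script mentioned above.

```python
import subprocess

# Exit-status bits as documented in the e2fsck(8) man page.
E2FSCK_FLAGS = {
    1: "file system errors corrected",
    2: "file system errors corrected, system should be rebooted",
    4: "file system errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "e2fsck canceled by user request",
    128: "shared library error",
}


def decode_e2fsck_status(status):
    """Return the list of conditions encoded in an e2fsck exit status."""
    if status == 0:
        return ["no errors"]
    return [text for bit, text in sorted(E2FSCK_FLAGS.items()) if status & bit]


def check_readonly(device):
    """Run a forced, read-only check (-f -n) on an unmounted device.

    -n opens the filesystem read-only and answers "no" to every question,
    so nothing on the device is modified.
    """
    result = subprocess.run(
        ["e2fsck", "-fn", device], capture_output=True, text=True
    )
    return decode_e2fsck_status(result.returncode), result.stdout
```

A status containing "file system errors left uncorrected" after a read-only run indicates that a repairing run, such as the "e2fsck -fp" invocation quoted earlier in this thread, is still needed.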

Cheers, Andreas


Re: [Lustre-discuss] ll_ost_creat_* goes bersek (100% cpu used - OST disabled)

2010-08-14 Thread Adrian Ulrich

> - the on-disk structure of the object directory for this OST is corrupted.
>   Run "e2fsck -fp /dev/{ostdev}" on the unmounted OST filesystem.

e2fsck fixed it: the OST has now been running for 40 minutes without problems:

e2fsck 1.41.6.sun1 (30-May-2009)
lustre1-OST0005: recovering journal
lustre1-OST0005 has been mounted 72 times without being checked, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Directory inode 440696867, block 493, offset 0: directory corrupted
Salvage? yes

Directory inode 440696853, block 517, offset 0: directory corrupted
Salvage? yes

Directory inode 440696842, block 560, offset 0: directory corrupted
Salvage? yes

Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Unattached inode 17769156
Connect to /lost+found? yes

Inode 17769156 ref count is 2, should be 1.  Fix? yes

Unattached zero-length inode 17883901.  Clear? yes

Pass 5: Checking group summary information

lustre1-OST0005: * FILE SYSTEM WAS MODIFIED *
lustre1-OST0005: 44279/488382464 files (15.4% non-contiguous), 280329314/1953524992 blocks



But shouldn't the journal of ext3/ldiskfs make running e2fsck unnecessary?


Have a nice weekend and thanks a lot for the fast reply!

Regards,
 Adrian




[Lustre-discuss] O_DIRECT

2010-08-14 Thread Michael Kluge
Hi all,

How does Lustre handle write() requests to files opened with O_DIRECT?
Does the OSS ensure that the data has physically been written to the
OST before the operation completes, or does the write() call return on the
client before that? I do not see the whole file content passing through
the FC port of the RAID controller, but it could also be that my
measurement is wrong.


Michael


-- 
Michael Kluge, M.Sc.

Technische Universität Dresden
Center for Information Services and
High Performance Computing (ZIH)
D-01062 Dresden
Germany

Contact:
Willersbau, Room WIL A 208
Phone:  (+49) 351 463-34217
Fax:    (+49) 351 463-37773
e-mail: michael.kl...@tu-dresden.de
WWW:    http://www.tu-dresden.de/zih