On May 15, 2007 16:03 -0400, pauln wrote:
> Andreas Dilger wrote:
> >On May 15, 2007 13:03 -0400, pauln wrote:
> >>I've attached a spreadsheet containing data from a lustre create test
> >>which I ran some time ago. The purpose of the test was to determine how
> >>different hardware configs affected create performance. As you'll see
> >>from the data, the ost is actually the slowest component in the create
> >>chain. I tested several OST and MDS configs and found that every
> >>disk-based OST configuration was susceptible to lengthy operation times
> >>interspersed throughout the test. This periodic slowness was correlated
> >>with disk activity on the OST - at the time I suspected that the
> >>activity was on behalf of the journal. Moving the entire OST onto a
> >>ramdisk increased the performance substantially.
> >
> >What version of Lustre were you testing? How large are the ext3 inodes on
> >the OSTs (can be seen with "dumpe2fs -h /dev/{ostdev}")? What is the
> >default stripe count?
> >
> >If you are running 1.4.6 and the ext3 inode size is 128 bytes then there
> >can be a significant performance hit due to extra metadata being stored
> >on the OSTs. This is not an issue with filesystems using a newer Lustre.
>
> The lustre version was 1.4.6.1 (rhel kernel 2.6.9-34). I used the
> default inode size and only had 1 OST. Can you briefly describe the
> problems with 128 byte inodes and suggest a more optimal size?

In 1.4.6 a feature was added to store extra EA data with each OST object
to allow object recovery in the face of serious data corruption. If the
directory that references an object is corrupted, then the data may still
be present on the disk, but is not identifiable. Each OST object now
keeps the object ID, along with the MDS inode number and stripe index.

We found that this can cause a slowdown during rapid object creates,
but for most normal uses this is not noticeable because there are
multiple OSTs per filesystem, and the MDS precreates sufficient objects
to avoid a slowdown.

In 1.4.7 the default OST inode size was increased to 256 bytes to avoid
the slowdown.
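
To check which case an existing OST falls into, here is a minimal Python
sketch that pulls the inode size out of saved "dumpe2fs -h" output. The
function names and the sample text are hypothetical. The 128-byte
threshold rests on an assumption about ext3: the original inode layout
is 128 bytes, so only larger inodes have spare room to hold EA data
inside the inode itself.

```python
import re

# Assumption: 128 bytes is the original fixed ext2/ext3 inode layout,
# so only larger inodes can hold EA data inside the inode body.
GOOD_OLD_INODE_SIZE = 128


def parse_inode_size(dumpe2fs_header):
    """Return the 'Inode size' value from `dumpe2fs -h` text, or None."""
    match = re.search(r"^Inode size:\s*(\d+)\s*$", dumpe2fs_header,
                      re.MULTILINE)
    return int(match.group(1)) if match else None


def has_in_inode_ea_space(inode_size):
    """True if the inode is larger than the original 128-byte layout."""
    return inode_size > GOOD_OLD_INODE_SIZE


# Hypothetical excerpt of `dumpe2fs -h /dev/{ostdev}` output.
sample = """\
Filesystem volume name:   <none>
Inode count:              1310720
Inode size:               128
"""

size = parse_inode_size(sample)
print("inode size:", size)
print("room for in-inode EA:", has_in_inode_ea_space(size))
```

An OST that reports 128 here would need to be reformatted with a larger
inode size to pick up the new default; the sketch only reads the header
and changes nothing on disk.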
Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.