On 2010-10-18, at 10:40, Johann Lombardi wrote:
> On Mon, Oct 18, 2010 at 01:58:40PM +0200, Michael Kluge wrote:
>> dd if=/dev/zero of=$RAM_DEV bs=1M count=1000
>> mke2fs -O journal_dev -b 4096 $RAM_DEV
>> 
>> mkfs.lustre  --device-size=$((7*1024*1024*1024)) --ost --fsname=luram
>> --mgsnode=$MDS_NID --mkfsoptions="-E stride=32,stripe-width=256 -b 4096
>> -j -J device=$RAM_DEV" /dev/disk/by-path/...
>> 
>> mount -t ldiskfs /dev/disk/by-path/... /mnt/ost_1
> 
> In fact, Lustre uses additional mount options (see "Persistent mount opts" in 
> tunefs.lustre output).
> If your ldiskfs module is based on ext3, you should add the extents and 
> mballoc options which are known to improve performance.
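
For reference, a dry-run sketch of the mount with those two options added. It assumes an ext3-based ldiskfs and only prints the command. OST_DEV and MNT_OPTS are illustrative names, and the device path stays elided as in the original:

```shell
#!/bin/sh
# Dry run: build and print the mount command, do not execute it.
# OST_DEV is a placeholder; the real by-path name is elided in the original post.
OST_DEV=${OST_DEV:-/dev/disk/by-path/...}
# Options named in the quoted advice (only needed for ext3-based ldiskfs).
MNT_OPTS="extents,mballoc"
cmd="mount -t ldiskfs -o $MNT_OPTS $OST_DEV /mnt/ost_1"
echo "$cmd"
```

Compare against the "Persistent mount opts" line in the tunefs.lustre output before running it for real.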

Even then, the IO submission path from userspace through ext3 is inefficient, 
so a performance difference like this is not unexpected.  IO submitted from 
userspace to ext3/ldiskfs is done in 4kB blocks, and each block is allocated 
separately (regardless of mballoc, unfortunately).  When Lustre does IO from 
the kernel, the client aggregates the IO into 1MB chunks and the entire 1MB 
write is allocated in one operation.
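
To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes the 4096-byte block size from the quoted mkfs options and simply counts allocator calls; the variable names are made up for illustration:

```shell
#!/bin/sh
# Count block allocations needed to write 1MB on each path.
block_size=4096                  # from "-b 4096" in the quoted mkfs options
chunk_size=$((1024 * 1024))      # 1MB chunk aggregated by the Lustre client
# Userspace path: one allocation per 4kB block.
userspace_allocs=$((chunk_size / block_size))
# Kernel path: the whole 1MB write is allocated in one operation.
kernel_allocs=1
echo "userspace: $userspace_allocs allocations per 1MB"
echo "kernel:    $kernel_allocs allocation per 1MB"
```

This only counts allocations; it says nothing about how much time each one costs.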

That is why we developed the "delalloc" code for ext4 - so that userspace could 
also get better IO performance and use the multi-block allocation (mballoc) 
routines that have been in ldiskfs for ages but were only accessible from the 
kernel.

For Lustre performance testing, I would suggest looking at lustre-iokit: in 
particular "sgpdd-survey" to test the underlying block device, and then 
"obdfilter-survey" to test the local Lustre IO submission path.

Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.

