> For your final filesystem you still probably want to enable async
> journals (unless you are willing to enable the S2A unmirrored device cache).

OK, thanks. We'll give this a try.

Michael

> Most obdecho/obdfilter-survey bugs are gone in 1.8.4, except your ctrl+c
> problem, for which a patch exists:
>
> https://bugzilla.lustre.org/show_bug.cgi?id=21745
>
> Cheers,
> Bernd
>
>
> On Wednesday, October 20, 2010, Michael Kluge wrote:
>> Thanks a lot for all the replies. sgpdd shows 700+ MB/s for the device.
>> We ran into one or two bugs with obdfilter-survey, as lctl has at
>> least one bug in 1.8.3 when it uses multiple threads, and
>> obdfilter-survey also causes an LBUG when you CTRL+C it. We see 600+
>> MB/s for obdfilter-survey over a reasonable parameter space after we
>> changed to the ext4 based ldiskfs. So that seems to be the trick.
>>
>> Michael
>>
>> On Monday, 18.10.2010, at 14:04 -0600, Andreas Dilger wrote:
>>> On 2010-10-18, at 10:40, Johann Lombardi wrote:
>>>> On Mon, Oct 18, 2010 at 01:58:40PM +0200, Michael Kluge wrote:
>>>>> dd if=/dev/zero of=$RAM_DEV bs=1M count=1000
>>>>> mke2fs -O journal_dev -b 4096 $RAM_DEV
>>>>>
>>>>> mkfs.lustre --device-size=$((7*1024*1024*1024)) --ost --fsname=luram
>>>>> --mgsnode=$MDS_NID --mkfsoptions="-E stride=32,stripe-width=256 -b
>>>>> 4096 -j -J device=$RAM_DEV" /dev/disk/by-path/...
>>>>>
>>>>> mount -t ldiskfs /dev/disk/by-path/... /mnt/ost_1
>>>>
>>>> In fact, Lustre uses additional mount options (see "Persistent mount
>>>> opts" in tunefs.lustre output). If your ldiskfs module is based on
>>>> ext3, you should add the extents and mballoc options which are known
>>>> to improve performance.
>>>
>>> Even then, the IO submission path of ext3 from userspace is not very
>>> good, and such a performance difference is not unexpected. When
>>> submitting IO from userspace to ext3/ldiskfs it is being done in 4kB
>>> blocks, and each block is allocated separately (regardless of mballoc,
>>> unfortunately). When Lustre is doing IO from the kernel, the client is
>>> aggregating the IO into 1MB chunks and the entire 1MB write is allocated
>>> in one operation.
>>>
>>> That is why we developed the "delalloc" code for ext4 - so that userspace
>>> could also get better IO performance, and utilize the multi-block
>>> allocation (mballoc) routines that have been in ldiskfs for ages, but
>>> only accessible from the kernel.
>>>
>>> For Lustre performance testing, I would suggest looking at lustre-iokit,
>>> and in particular "sgpdd" to test the underlying block device, and then
>>> obdfilter-survey to test the local Lustre IO submission path.
>>>
>>> Cheers, Andreas
>>> --
>>> Andreas Dilger
>>> Lustre Technical Lead
>>> Oracle Corporation Canada Inc.
>
>

--
Michael Kluge, M.Sc.

Technische Universität Dresden
Center for Information Services and
High Performance Computing (ZIH)
D-01062 Dresden
Germany

Contact:
Willersbau, Room WIL A 208
Phone: (+49) 351 463-34217
Fax: (+49) 351 463-37773
e-mail: [email protected]
WWW: http://www.tu-dresden.de/zih
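
For anyone reading this thread later: the numbers behind the quoted mkfs options and behind Andreas's 4kB-versus-1MB explanation can be checked with a little shell arithmetic. This is only a sketch. The variable names are made up for illustration, and the figure of 8 data disks is inferred from stride and stripe-width (assuming the usual mke2fs meaning of those options); nobody in the thread states the RAID layout.

```shell
#!/bin/sh
# Values copied from the quoted --mkfsoptions: -b 4096, stride=32, stripe-width=256
block_size=4096      # bytes per filesystem block
stride=32            # blocks written to one disk before moving to the next
stripe_width=256     # blocks in one full stripe of data

# Per-disk chunk size and full-stripe size implied by those options
chunk_kib=$(( stride * block_size / 1024 ))
stripe_kib=$(( stripe_width * block_size / 1024 ))

# Number of data disks: an inference from the options, not stated in the thread
data_disks=$(( stripe_width / stride ))

# Allocations needed for 1 MB of data: one per 4kB block when written from
# userspace on ext3-based ldiskfs, versus one for the whole 1MB chunk when
# Lustre submits the IO from the kernel (as described by Andreas)
allocs_userspace=$(( 1024 * 1024 / block_size ))
allocs_kernel=1

echo "chunk per disk:       ${chunk_kib} KiB"
echo "full stripe:          ${stripe_kib} KiB"
echo "data disks (assumed): ${data_disks}"
echo "allocations per 1 MB: userspace=${allocs_userspace} kernel=${allocs_kernel}"
```

So a full stripe under these options is the same size as the 1MB chunks the Lustre client aggregates, while the userspace path on ext3 splits the same data into 256 separate block allocations.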
