From: Andreas Dilger <[EMAIL PROTECTED]>
Date: Fri, 18 May 2007 12:43:48 -0600
On May 18, 2007 07:56 -0400, John R. Dunning wrote:
> I'm using a 2.6.15 kernel, and qlogic 2462 HBAs with the 8.01.07 driver. Using
> the anticipatory scheduler, and tweaking up the readahead size for the
> blockdev, I
For a DDN you should probably use the noop or deadline scheduler. Anticipatory
is really tuned for desktop workloads.
Yes, others have said the same thing. I've tried them both, but so far there's
not much difference. The evidence is that something in the block layer is
breaking up read requests, which seems to negate any effect I might be getting
from the iosched.
I found /proc/sys/vm/block_dump, added some extra instrumentation to it, and
turned it on. On the write side I'm seeing nice big requests (though the
sizes are a bit all over the place), but on the read side it's willing to go
up to 32 elements in the bio and never any higher. So far that holds
regardless of what readahead values I use, what scheduler tuning params I
give it, what request size the higher level thinks it's issuing, etc. It's
behaving like something has an arbitrary limit on the size of a read
request, but I haven't yet figured out what that is.
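One place an arbitrary per-request ceiling can come from is the queue's sysfs limits: a single request cannot exceed `max_sectors_kb` no matter what readahead or the scheduler does. They can be inspected like this (standard 2.6 sysfs paths; the `sda` in the comments is just an example device):

```shell
# Print each block device's request-size limits; a single request can't
# exceed max_sectors_kb regardless of readahead or scheduler settings.
for q in /sys/block/*/queue; do
    [ -r "$q/max_sectors_kb" ] || continue
    echo "$q: max_sectors_kb=$(cat "$q/max_sectors_kb")" \
         "max_hw_sectors_kb=$(cat "$q/max_hw_sectors_kb")"
done

# To raise the cap (as root), e.g. to allow 1MB requests on sda:
#   echo 1024 > /sys/block/sda/queue/max_sectors_kb
# Readahead is tuned separately, in 512-byte sectors:
#   blockdev --setra 8192 /dev/sda
```

If `max_sectors_kb` reads back as 256, that would line up with the 256K reads the DDN is reporting.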
> can get around 300MB/s by using 4 threads on a port, or about 3/4 of the
> expected max. Writes max out easily. The DDN's stats say that the large
> majority of my reads are only 256K, even though the requests are larger
> than that.
What tool are you using to measure performance?
Various. Mostly iozone, timing dd, and the like. I'm not (yet) running
Lustre against the DDN.
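For the dd timings, a direct-IO read keeps the page cache out of the numbers, assuming a coreutils dd that supports `iflag=direct` (device name and sizes are illustrative):

```shell
# Time a 1GB sequential read straight off the block device, bypassing
# the page cache so the result reflects the disk path itself.
dd if=/dev/sda of=/dev/null bs=1M count=1024 iflag=direct
```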
I'd strongly suggest using the lustre-iokit, which has several components
for testing the bare-disk, local-filesystem, network, and Lustre-filesystem
layers independently.
Ok. I tried an older version of it last year, and it didn't seem to tell me
anything I hadn't already found out by other means. EEB shipped me a newer
version, which I've unpacked and am currently trying to figure out how to
build. It seems to be set up such that I have to autoconf it, but trying to
do that causes errors. Hints?
Lustre can consistently generate 1MB IOs to the underlying filesystem because
it submits the IO in 1MB chunks, unlike the kernel's read() and write() calls,
which submit IO in 4kB chunks and hope the elevator can merge them.
See also the DDN tuning section in the Lustre manual.
Ok, will do. Thanks.
_______________________________________________
Lustre-discuss mailing list
[email protected]
https://mail.clusterfs.com/mailman/listinfo/lustre-discuss