On 04.09.24 15:59, Henrik Cednert wrote:
Thanks Uwe. I will have to digest what you write for a while; at this time of day some of it flies over my head (...and some probably will no matter the time of day). =)

But to add some info about the file system:

File system attributes for /dev/mmfs1:
======================================
flag  value     description
----  --------  -------------------------------------------------------
 -f   8192      Minimum fragment (subblock) size in bytes (system pool)
      131072    Minimum fragment (subblock) size in bytes (other pools)
 -i   512       Inode size in bytes
 -I   32768     Indirect block size in bytes
 -m   2         Default number of metadata replicas
 -M   2         Maximum number of metadata replicas
 -r   1         Default number of data replicas
 -R   2         Maximum number of data replicas
 -j   scatter   Block allocation type
 -D   nfs4      File locking semantics in effect
 -k   nfs4      ACL semantics in effect
 -n   32        Estimated number of nodes that will mount file system
 -B   524288    Block size (system pool)
      8388608   Block size (other pools)

Regarding

> The iohist snippet for reads comprises 74 IOs in about 0.854s, this relates to roughly 690MiB/s, far from the 10-fold value you reported.

I'm not sure about that 10-fold value you refer to here. 690 MiB/s is pretty much exactly what I saw reported in the disk speed test I was running when extracting that data.
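As a quick sanity check on that figure (assuming those 74 reads are full 8 MiB data-pool blocks, per the -B value above):

    74 IOs x 8 MiB = 592 MiB;  592 MiB / 0.854 s ≈ 693 MiB/s

so the iohist numbers and the reported disk speed do line up.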
I referred to numbers like in

> Job: seqrw-10gb-1mb-t4
>   • Write: 549 MB/s (7 ms)
>   • Read: 6987 MB/s (1 ms)

Was I mistaken?
I will re-run my fio tests on the other systems so that I have fresh values. Whether I'm sure they are trustworthy...? Can one ever be? Network graphs and reported fio results are all I have to lean on. Attaching a few lines of --iohist output for one of those 10GbE clients that is currently running my batch fio test.

<--------------------------------------snip------------------------------------>
There is always a bit of uncertainty, but one can try to minimize it and look for signs of it. Caching effects are a notorious source of errors in benchmarks. I am not sure we have them here, but I thought there were serious indicators; then again, I might have mixed up your tests.
Also, I see you do not run concurrent reads and writes, but only read or write tests one at a time.
But it remains from your waiters list that GPFS doesn't seem to be (held) busy (by the test app). If there are 74 IO requests in 0.854 s and each one takes less than 6 ms, that accounts for less than 74 x 6 ms = 0.444 s; about half of the time GPFS is idling. With full IO queues you should see at least about 1.3 GiB/s, i.e. the same 74 x 8 MiB = 592 MiB delivered in 0.444 s instead of 0.854 s (which is still not 3 GiB/s, but would be better than what you have now). The next thing to explore is: while many read IOs take about 5..6 ms (mind: that is suspiciously low!), what is causing the others to take 20..40 ms (or even more)? You could also check the IO history on the NSD servers, where you would see the times the IO requests issued by the NSD servers take against the storage backend. But you should not post that output to the list; you may send it as an attachment (but maybe not to the entire mailing list :-).
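Just as a sketch of how I would collect that (command only, nodes and timing are yours to pick): run, on each NSD server while the client test is in flight,

    mmdiag --iohist

and compare the service times reported there with what you see on the client.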
Can you start your tests with multiple threads (or maybe multiple tests in parallel)?
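Something along these lines might do; this is only a sketch, with the size and block size guessed from your seqrw-10gb-1mb-t4 naming and the target directory made up, so adjust it to your setup:

    fio --name=seqread-multi --directory=/gpfs/mmfs1/fiotest \
        --rw=read --bs=1M --size=10g --numjobs=4 --iodepth=16 \
        --ioengine=libaio --direct=1 --group_reporting

With --numjobs (and a deeper --iodepth via libaio) several IOs should be in flight at once, which is what we want to see here.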
Uwe

--
Karlsruhe Institute of Technology (KIT)
Scientific Computing Centre (SCC)
Scientific Data Management (SDM)

Uwe Falke

Hermann-von-Helmholtz-Platz 1, Building 442, Room 187
D-76344 Eggenstein-Leopoldshafen

Tel: +49 721 608 28024
Email: [email protected]
www.scc.kit.edu

Registered office: Kaiserstraße 12, 76131 Karlsruhe, Germany

KIT – The Research University in the Helmholtz Association
