> disk I/Os in flight
>            read           |       write
>             ios   % cum % |       ios   % cum %
> 1:    211177215  61    61 |  29305564  97    97
> 2:     41332944  11    72 |    498260   1    99
> [..]
> Do these lines mean:
> Since the last snapshot there were 211177215x1 and 41332944x2 read I/Os in flight?

It means that, since the last time the statistics were cleared:
* 61% of the time (more precisely, for 61% of the I/Os counted in the "ios" column), only 1 READ I/O request was in flight to disk
* 11% of the time, 2 READ I/O requests were in flight to disk, meaning 2 I/Os had been sent to the disks and not yet committed/acknowledged

The same principle applies to writes.

What this means here is that your workload is not feeding the disks with many concurrent writes (97% with 1 I/O in flight), but with somewhat more concurrent reads.

Disks, and especially disk arrays, reorder I/Os and distribute them across the drives they are composed of in order to optimize bandwidth. To really benefit from all the bandwidth/throughput your hardware can offer, you often need lots of big I/Os and, if possible, multiple I/Os in flight.

Few I/Os in flight could mean:
* your workload is not really big
* your hardware is fast compared to the throughput coming to this server (ratio of disk bandwidth to network bandwidth, for example)

This could also help you make sense of bad performance numbers and find where the bottleneck comes from.

From: lustre-discuss <[email protected]> on behalf of Louis Bailleul <[email protected]>
Date: Tuesday 16 July 2019 at 17:49
To: lustre-discuss <[email protected]>
Subject: Re: [lustre-discuss] [External] Re: obdfilter/mdt stats meaning ?

Hi Aurélien,

Thanks for the prompt reply.
For the OST stats, any idea what preprw and commitrw mean?
And why are there two entries with different values for statfs?

For brw_stats, even with the doc I still struggle to read this.
For example, how do you make sense of disk I/Os in flight?

disk I/Os in flight
           read           |       write
            ios   % cum % |       ios   % cum %
1:    211177215  61    61 |  29305564  97    97
2:     41332944  11    72 |    498260   1    99
[..]

Do these lines mean:
Since the last snapshot there were 211177215x1 and 41332944x2 read I/Os in flight?

Best regards,
Louis

On 16/07/2019 15:50, Degremont, Aurelien wrote:

Hi Louis,

About brw_stats, there is a bit of explanation in the Lustre documentation (not that detailed, but still):
http://doc.lustre.org/lustre_manual.xhtml#dbdoclet.50438271_55057

> Last thing, is there any way to get the name of the filesystem an OST is part
> of by using lctl ?

I don't know what you want exactly, but the OST names are self-explanatory. They always look like:
fsname-OSTXXXX
where fsname is the Lustre filesystem they are part of.

For obdfilter stats, these are mostly actions on OST objects or client connection management RPCs.

setattr: changing an OST object's attributes (owner, group, ...)
punch: mostly used for truncate (theoretically it can punch holes in files, like a truncate with a start and a length)
sync: straightforward, sync the OST to disk
destroy: delete an OST object (mostly when a file is deleted)
create: create an OST object
statfs: like 'df' for this specific OST (used by 'lfs df', for example)
(re)connect: when a client connects/reconnects to this OST
ping: when a client pings this OST

Aurélien

From: lustre-discuss <[email protected]> on behalf of Louis Bailleul <[email protected]>
Date: Tuesday 16 July 2019 at 16:38
To: lustre-discuss <[email protected]>
Subject: [lustre-discuss] obdfilter/mdt stats meaning ?

Hi all,

I am trying to make sense of some of the OST/MDT stats for 2.12.
Can anybody point me to the doc that explains what the metrics are?
The wiki only mentions read/write/get_info:
http://wiki.lustre.org/Lustre_Monitoring_and_Statistics_Guide
But the list I get is quite different:

obdfilter.OST001.stats=
snapshot_time   1563285450.647120173 secs.nsecs
read_bytes      340177708 samples [bytes] 4096 4194304 396712660910080
write_bytes     30008856 samples [bytes] 24 4194304 78618271501667
setattr         1755 samples [reqs]
punch           73463 samples [reqs]
sync            50606 samples [reqs]
destroy         31990 samples [reqs]
create          956 samples [reqs]
statfs          75378743 samples [reqs]
connect         5798 samples [reqs]
reconnect       3242 samples [reqs]
disconnect      5820 samples [reqs]
statfs          3737980 samples [reqs]
preprw          370186566 samples [reqs]
commitrw        370186557 samples [reqs]
ping            882096292 samples [reqs]

For the MDT, most are pretty much self-explanatory, but I'll still be happy to be pointed to some doc.

mdt.MDT0000.md_stats=
snapshot_time   1563287416.006001068 secs.nsecs
open            3174644054 samples [reqs]
close           3174494603 samples [reqs]
mknod           107564 samples [reqs]
unlink          99625 samples [reqs]
mkdir           199643 samples [reqs]
rmdir           45021 samples [reqs]
rename          12728 samples [reqs]
getattr         50227431 samples [reqs]
setattr         103435 samples [reqs]
getxattr        9051470 samples [reqs]
setxattr        14 samples [reqs]
statfs          7525513 samples [reqs]
sync            20597 samples [reqs]
samedir_rename  207 samples [reqs]
crossdir_rename 12521 samples [reqs]

And does anyone know how to read the OST brw_stats?

obdfilter.OST0014.brw_stats=
snapshot_time: 1563287631.511085465 (secs.nsecs)

pages per bulk r/w
           read           |       write
           rpcs   % cum % |      rpcs   % cum %
1:    231699298  66    66 |    180944   0     0
2:       855611   0    67 |    322359   1     1
4:       541749   0    67 |   5539716  18    20
8:      1281219   0    67 |     67837   0    20
16:      637808   0    67 |    114546   0    20
32:     1342813   0    68 |   3099780  10    31
64:     1559834   0    68 |    173166   0    31
128:    1583127   0    69 |    211512   0    32
256:   10627583   3    72 |    499978   1    34
512:    3909601   1    73 |   1029686   3    37
1K:    92141161  26   100 |  18788597  62   100

discontiguous pages
           read           |       write
           rpcs   % cum % |      rpcs   % cum %
0:    346179839 100   100 |    180946   0     0
1:            0   0   100 |    322363   1     1
2:            0   0   100 |   5521062  18    20
3:            0   0   100 |     18650   0    20
4:            0   0   100 |     18159   0    20
5:            0   0   100 |     26664   0    20
6:            0   0   100 |     10830   0    20
7:            0   0   100 |     12189   0    20
8:            0   0   100 |     11365   0    20
9:            0   0   100 |     10253   0    20
10:           0   0   100 |      8810   0    20
11:           0   0   100 |      9825   0    20
12:           0   0   100 |     16740   0    20
13:           0   0   100 |     14421   0    20
14:           0   0   100 |     10513   0    20
15:           0   0   100 |     32655   0    20
16:           0   0   100 |   1418677   4    25
17:           0   0   100 |   1477077   4    30
18:           0   0   100 |      6227   0    30
19:           0   0   100 |      7071   0    30
20:           0   0   100 |      7297   0    30
21:           0   0   100 |      8478   0    30
22:           0   0   100 |     34591   0    30
23:           0   0   100 |     35591   0    30
24:           0   0   100 |      8378   0    30
25:           0   0   100 |      8724   0    30
26:           0   0   100 |     52300   0    30
27:           0   0   100 |     14038   0    30
28:           0   0   100 |      4734   0    30
29:           0   0   100 |      4878   0    31
30:           0   0   100 |      6232   0    31
31:           0   0   100 |  20708383  68   100

disk I/Os in flight
           read           |       write
            ios   % cum % |       ios   % cum %
1:    211177215  61    61 |  29305564  97    97
2:     41332944  11    72 |    498260   1    99
3:     22250410   6    79 |     86831   0    99
4:     15524737   4    83 |     34513   0    99
5:     12049717   3    87 |     19442   0    99
6:      8904108   2    89 |     13107   0    99
7:      5955503   1    91 |      8748   0    99
8:      3943444   1    92 |      6869   0    99
9:      3115034   0    93 |      5447   0    99
10:     2553941   0    94 |      4593   0    99
11:     2121217   0    95 |      3828   0    99
12:     1709040   0    95 |      3264   0    99
13:     1418541   0    95 |      2800   0    99
14:     1184247   0    96 |      2454   0    99
15:     1047397   0    96 |      2153   0    99
16:      875229   0    96 |      1871   0    99
17:      752555   0    97 |      1643   0    99
18:      656424   0    97 |      1531   0    99
19:      584066   0    97 |      1375   0    99
20:      529630   0    97 |      1267   0    99
21:      477143   0    97 |      1144   0    99
22:      426303   0    97 |      1067   0    99
23:      385707   0    97 |       984   0    99
24:      354584   0    98 |       959   0    99
25:      328332   0    98 |       899   0    99
26:      305886   0    98 |       828   0    99
27:      281444   0    98 |       786   0    99
28:      261958   0    98 |       734   0    99
29:      242335   0    98 |       711   0    99
30:      227010   0    98 |       692   0    99
31:     5203738   1   100 |     13757   0   100

I/O time (1/1000s)
           read           |       write
            ios   % cum % |       ios   % cum %
1:     34363647  26    26 |         0   0     0
2:      9013233   7    33 |         0   0     0
4:      3381561   2    36 |         0   0     0
8:      2194196   1    38 |         0   0     0
16:     8767687   6    45 |         0   0     0
32:    25062401  19    64 |         0   0     0
64:    27196704  21    85 |         0   0     0
128:   10760610   8    94 |         0   0     0
256:    4203334   3    97 |         0   0     0
512:    2002196   1    99 |         0   0     0
1K:      785539   0    99 |         0   0     0
2K:      340525   0    99 |         0   0     0
4K:      140336   0    99 |         0   0     0
8K:        6875   0    99 |         0   0     0
16K:        161   0   100 |         0   0     0

disk I/O size
           read           |       write
            ios   % cum % |       ios   % cum %
8:            4   0     0 |         0   0     0
16:           0   0     0 |         0   0     0
32:           1   0     0 |         4   0     0
64:           1   0     0 |      5703   0     0
128:       3061   0     0 |      2853   0     0
256:          1   0     0 |      3340   0     0
512:          1   0     0 |       309   0     0
1K:           0   0     0 |      3697   0     0
2K:           2   0     0 |     38311   0     0
4K:   231696225  66    66 |    126727   0     0
8K:      855613   0    67 |    322359   1     1
16K:     541749   0    67 |   5539716  18    20
32K:    1281219   0    67 |     67837   0    20
64K:     637808   0    67 |    114546   0    20
128K:   1342813   0    68 |   3099780  10    31
256K:   1559834   0    68 |    173166   0    31
512K:   1583127   0    69 |    211512   0    32
1M:    10627583   3    72 |    499978   1    34
2M:     3909601   1    73 |   1029686   3    37
4M:    92141161  26   100 |  18788597  62   100

Last thing, is there any way to get the name of the filesystem an OST is part of by using lctl ?

Best regards,
Louis
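
For anyone reading the brw_stats histograms in this thread, here is a minimal Python sketch of how the "%" and "cum %" columns appear to relate to the raw counts. It assumes each percentage is the bucket count (or running total) divided by the column total and truncated to a whole number. That is an assumption, not taken from the Lustre source, but it matches the write column of "pages per bulk r/w" quoted above. The names histogram_percentages, buckets and write_rpcs are only illustrative.

```python
def histogram_percentages(counts):
    """Return (pct, cum_pct) lists for a list of per-bucket counts.

    Assumption: percentages are truncated to whole numbers, which is
    consistent with the figures in the brw_stats output in this thread.
    """
    total = sum(counts)
    pct, cum_pct = [], []
    running = 0
    for count in counts:
        running += count
        # Integer division truncates, e.g. 18.45% is reported as 18.
        pct.append(100 * count // total if total else 0)
        cum_pct.append(100 * running // total if total else 0)
    return pct, cum_pct


# Write column of "pages per bulk r/w" for OST0014 (buckets 1 to 1K pages).
buckets = ["1", "2", "4", "8", "16", "32", "64", "128", "256", "512", "1K"]
write_rpcs = [180944, 322359, 5539716, 67837, 114546, 3099780,
              173166, 211512, 499978, 1029686, 18788597]

pct, cum_pct = histogram_percentages(write_rpcs)
for bucket, count, p, c in zip(buckets, write_rpcs, pct, cum_pct):
    print(f"{bucket:>4}: {count:>10} {p:>3} {c:>5}")
```

The same helper can be fed the "ios" counts of the "disk I/Os in flight" section to see which share of the I/Os was issued at each queue depth.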
_______________________________________________ lustre-discuss mailing list [email protected] http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
