Hi all, I am testing a Lustre system that includes 1 MGS, 2 MDSs, and 8 OSSs with 8 OSTs running RAID 6 (8d+2p). Each OST's performance is approximately 16 GB/s for WRITE and 33 GB/s for READ (measured with a sequential fio test: blocksize=1m, iodepth=64, numjobs=2). The system has 16 clients.
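For reference, the per-OST baseline above corresponds to an fio invocation along these lines (a sketch only; /dev/mapper/ost0 is a placeholder for the actual OST backing device, and the real job file may differ):

```
# Sequential-write baseline against one OST backing device.
# /dev/mapper/ost0 is a placeholder; use --rw=read for the READ test.
fio --name=ost-baseline --filename=/dev/mapper/ost0 --direct=1 \
    --ioengine=libaio --rw=write --bs=1m --iodepth=64 --numjobs=2 \
    --runtime=60 --time_based --group_reporting
```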
I am encountering issues with performance testing using IOR with the following options:

```
mpirun --allow-run-as-root --mca pml ucx \
    -x UCX_TLS=rc_mlx5,ud_mlx5,self -x UCX_NET_DEVICES=mlx5_0:1 \
    --mca btl ^openib --hostfile mphost10 -np <number_of_process> -map-by node \
    ior -w -r -b 2m -t 2m -C -s 4000 -k -e -o /lustre/testFS/ior/iortest
```

The stripe_count is set equal to the number of processes (overstriping), and the stripe_size is equal to the IOR block size (2m). The issues I am facing are:

1. Performance does not increase beyond 2 processes per client. With 1 client and 1 OST, I achieve approximately 2 GB/s for WRITE. With 2 clients and 4 processes, I achieve 4 GB/s. To reach 16 GB/s, I need to use 16 clients with 2 processes per client.

```
Stripe count    NP    Write (MB/s)    Read (MB/s)
           1     1         1843.57        1618.57
           1     2         2079.28        1914.32
           2     2         2579.28        2298.19
           2     4         1337.38        1310.23
          16    16         1313.24        1345.24
          16    32         1455.45        1398.23
          32    32         1477.75        1410.68
         800    32         1326.41        1210.13
```

2. Performance does not improve when adding more OSTs. With 2 OSTs and 2 clients, the performance stays at 4 GB/s, and with 16 clients it is only equivalent to that of 1 OST.

I am wondering why performance does not scale beyond 2 processes per client. Could it be that overstriping alone is not sufficient to improve performance in Single Shared File mode? Are there any additional settings I should consider beyond overstriping? The results of obdfilter-survey and LNet testing do not show any bottleneck.

I am using Lustre 2.15.4 on Rocky Linux 8.9 with kernel 4.18.0-513.9.1.el8_lustre.x86_64.

- MGS/MDS/OSS nodes: 16 CPUs, 32 GB RAM.
- Client nodes: 2x AMD EPYC 7662 (64 cores each), 512 GB RAM.

The network connection is InfiniBand with 400 Gbps of bandwidth.

Other settings on the Lustre cluster:

```
# Clients:
options lnet networks="o2ib(ib0)"
options ko2iblnd peer_credits=32 peer_credits_hiw=16 credits=256 concurrent_sends=64
lctl set_param osc.*.max_pages_per_rpc=4096
lctl set_param osc.*.checksums=0
lctl set_param osc.*.max_rpcs_in_flight=16

# OSSs:
options lnet networks="o2ib(ib0)"
options libcfs cpu_npartitions=1
options ko2iblnd peer_credits=32 peer_credits_hiw=16 credits=256 concurrent_sends=64 nscheds=8
options ost oss_num_threads=128
lctl set_param *.*.brw_size=16
lctl set_param osd-ldiskfs.*.writethrough_cache_enable=0
lctl set_param osd-ldiskfs.*.read_cache_enable=0

# MGS/MDSs:
options lnet networks="o2ib(ib0)"
options libcfs cpu_npartitions=1
options ko2iblnd peer_credits=32 peer_credits_hiw=16 credits=256 concurrent_sends=64 nscheds=8
```

Thank you for your help.
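P.S. In case it is relevant, the overstriped layout is created on the target directory before each run roughly like this (a sketch; the path is the one from the IOR command above, and the -C value follows the process count of the run):

```
# Overstripe: -C lets stripe_count exceed the number of OSTs (Lustre 2.13+).
# 32 is illustrative; it is set equal to the number of IOR processes.
lfs setstripe -C 32 -S 2M /lustre/testFS/ior
# Verify the layout the shared file actually received:
lfs getstripe /lustre/testFS/ior/iortest
```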
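Also, for context on the client-side osc settings listed above, the amount of data each client can keep in flight per OST works out as follows (assuming the standard 4 KiB page size on x86_64; each RPC is 4096 x 4 KiB = 16 MiB, matching the brw_size=16 setting on the OSSs):

```
max_rpcs_in_flight x max_pages_per_rpc x page_size
  = 16 x 4096 x 4 KiB
  = 256 MiB in flight per OSC (per OST, per client)
```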
