Re: [lustre-discuss] Lustre poor performance
for Qlogic the script works, but then there is some other parameter to change in the peer credits value, otherwise Lustre will complain and it will not work. At least this is the case for my old Qlogic QDR cards; I do not know if this applies to newer Qlogic cards too. I'll write a patch to the script that will work for Mellanox cards (ConnectX-3 family). I can't speak for ConnectX-4 because I have no experience with those right now.

On 8/23/17 4:36 PM, Dilger, Andreas wrote: > On Aug 23, 2017, at 08:39, Mohr Jr, Richard Frank (Rick Mohr) wrote: >> >>> On Aug 22, 2017, at 7:14 PM, Riccardo Veraldi wrote: >>> >>> On 8/22/17 9:22 AM, Mannthey, Keith wrote: You may want to file a jira ticket if the ko2iblnd-opa settings were being automatically used on your Mellanox setup. That is not expected. >>> yes they are automatically used on my Mellanox and the script >>> ko2iblnd-probe does not seem to be working properly. >> The ko2iblnd-probe script looks in /sys/class/infiniband for device names >> starting with “hfi” or “qib”. If it detects those, it decides that the >> “profile” it should use is “opa” so then it basically invokes the >> ko2iblnd-opa modprobe line. But the script has no logic to detect other >> types of card (i.e. - mellanox), so in those cases, no ko2iblnd options are >> used and you end up with the default module parameters being used. >> >> If you want to use the script, you will need to modify ko2iblnd-probe to add >> a new case for your brand of HCA and then add an appropriate >> ko2iblnd- line to ko2iblnd.conf. >> >> Or just do what I did and comment out all the lines in ko2iblnd.conf and add >> your own lines. > If there are significantly different options needed for newer Mellanox HCAs > (e.g. as between Qlogic/OPA and MLX) it would be great to get a patch to > ko2iblnd-probe and ko2iblnd.conf that adds those options as the default for > the new type of card, so that Lustre works better out of the box. That helps > transfer the experience of veteran IB users to users that may not have the > background to get the best LNet IB performance. > > Cheers, Andreas > -- > Andreas Dilger > Lustre Principal Architect > Intel Corporation

___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
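A rough sketch of what such a ko2iblnd-probe patch might look like, assuming the script keeps its existing pattern of picking a profile name from the device names found under /sys/class/infiniband. The "mlx" profile name, the mlx4_*/mlx5_* matching, and the option values below are assumptions for illustration, not the actual patch:

# hypothetical extra case in /usr/sbin/ko2iblnd-probe
profile=""
for dev in /sys/class/infiniband/*; do
    case "$(basename "$dev")" in
        hfi*|qib*)     profile="opa" ;;   # existing OPA/TrueScale detection
        mlx4_*|mlx5_*) profile="mlx" ;;   # assumed new Mellanox profile
    esac
done
# ko2iblnd.conf would then carry a matching alias/options pair, e.g.:
#   alias ko2iblnd-mlx ko2iblnd
#   options ko2iblnd-mlx peer_credits=128 peer_credits_hiw=64 credits=1024 \
#       concurrent_sends=256 ntx=2048 map_on_demand=32
# the real script then loads ko2iblnd with the options for the selected profile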
Re: [lustre-discuss] Lustre poor performance
On 8/23/17 7:39 AM, Mohr Jr, Richard Frank (Rick Mohr) wrote: >> On Aug 22, 2017, at 7:14 PM, Riccardo Veraldi wrote: >> >> On 8/22/17 9:22 AM, Mannthey, Keith wrote: >>> You may want to file a jira ticket if the ko2iblnd-opa settings were being automatically used on your Mellanox setup. That is not expected. >>> >> yes they are automatically used on my Mellanox and the script ko2iblnd-probe >> does not seem to be working properly. > The ko2iblnd-probe script looks in /sys/class/infiniband for device names > starting with “hfi” or “qib”. If it detects those, it decides that the > “profile” it should use is “opa” so then it basically invokes the > ko2iblnd-opa modprobe line. But the script has no logic to detect other > types of card (i.e. - mellanox), so in those cases, no ko2iblnd options are > used and you end up with the default module parameters being used. > > If you want to use the script, you will need to modify ko2iblnd-probe to add > a new case for your brand of HCA and then add an appropriate > ko2iblnd- line to ko2iblnd.conf. > > Or just do what I did and comment out all the lines in ko2iblnd.conf and add > your own lines.

Yes, what I did was to disable the module alias and keep just the "options ko2iblnd ..." and "install ko2iblnd ..." lines, and it worked. I may modify the script as well, as you mentioned. Thank you.

> -- > Rick Mohr > Senior HPC System Administrator > National Institute for Computational Sciences > http://www.nics.tennessee.edu

___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
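For reference, the working ko2iblnd.conf described above would look roughly like this; the option values are the ones quoted later in this thread, so treat it as a sketch and adjust for your own hardware:

# /etc/modprobe.d/ko2iblnd.conf -- alias removed so the options apply to
# ko2iblnd directly rather than to the unused ko2iblnd-opa alias
#alias ko2iblnd-opa ko2iblnd
options ko2iblnd peer_credits=128 peer_credits_hiw=64 credits=1024 \
    concurrent_sends=256 ntx=2048 map_on_demand=32 fmr_pool_size=2048 \
    fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4
install ko2iblnd /usr/sbin/ko2iblnd-probe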
Re: [lustre-discuss] Lustre poor performance
On Aug 23, 2017, at 08:39, Mohr Jr, Richard Frank (Rick Mohr) wrote: > > >> On Aug 22, 2017, at 7:14 PM, Riccardo Veraldi >> wrote: >> >> On 8/22/17 9:22 AM, Mannthey, Keith wrote: >>> You may want to file a jira ticket if the ko2iblnd-opa settings were being automatically used on your Mellanox setup. That is not expected. >>> >> yes they are automatically used on my Mellanox and the script ko2iblnd-probe >> does not seem to be working properly. > > The ko2iblnd-probe script looks in /sys/class/infiniband for device names > starting with “hfi” or “qib”. If it detects those, it decides that the > “profile” it should use is “opa” so then it basically invokes the > ko2iblnd-opa modprobe line. But the script has no logic to detect other > types of card (i.e. - mellanox), so in those cases, no ko2iblnd options are > used and you end up with the default module parameters being used. > > If you want to use the script, you will need to modify ko2iblnd-probe to add > a new case for your brand of HCA and then add an appropriate > ko2iblnd- line to ko2iblnd.conf. > > Or just do what I did and comment out all the lines in ko2iblnd.conf and add > your own lines.

If there are significantly different options needed for newer Mellanox HCAs (e.g. as between Qlogic/OPA and MLX) it would be great to get a patch to ko2iblnd-probe and ko2iblnd.conf that adds those options as the default for the new type of card, so that Lustre works better out of the box. That helps transfer the experience of veteran IB users to users that may not have the background to get the best LNet IB performance.

Cheers, Andreas -- Andreas Dilger Lustre Principal Architect Intel Corporation

___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
Re: [lustre-discuss] Lustre poor performance
> On Aug 22, 2017, at 7:14 PM, Riccardo Veraldi wrote: > > On 8/22/17 9:22 AM, Mannthey, Keith wrote: >> You may want to file a jira ticket if the ko2iblnd-opa settings were being automatically used on your Mellanox setup. That is not expected. >> > yes they are automatically used on my Mellanox and the script ko2iblnd-probe > does not seem to be working properly.

The ko2iblnd-probe script looks in /sys/class/infiniband for device names starting with “hfi” or “qib”. If it detects those, it decides that the “profile” it should use is “opa” so then it basically invokes the ko2iblnd-opa modprobe line. But the script has no logic to detect other types of card (i.e. - mellanox), so in those cases, no ko2iblnd options are used and you end up with the default module parameters being used.

If you want to use the script, you will need to modify ko2iblnd-probe to add a new case for your brand of HCA and then add an appropriate ko2iblnd- line to ko2iblnd.conf.

Or just do what I did and comment out all the lines in ko2iblnd.conf and add your own lines.

-- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu

___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
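A quick way to see which case (if any) the probe script would hit on a given node; this is just a sketch using the standard sysfs path, and the device names depend on the driver (Mellanox mlx4/mlx5 HCAs typically show up as mlx4_0, mlx5_0, ..., while OPA/TrueScale show up as hfi1_0 or qib0):

ls /sys/class/infiniband
# If nothing listed here starts with "hfi" or "qib", ko2iblnd-probe falls
# through and ko2iblnd is loaded with its default module parameters.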
Re: [lustre-discuss] Lustre poor performance
On 8/22/17 9:22 AM, Mannthey, Keith wrote: > > You may want to file a jira ticket if ko2iblnd-opa setting were being > automatically used on your Mellanox setup. That is not expected. > yes they are automatically used on my Mellanox and the script ko2iblnd-probe seems like not working properly. > > > > On another note: As you note you NVMe backend is much faster than QRD > link speed. You may want to look at using the new Multi-rall lnet > feature to boost network bandwidth. You can add a 2^nd QRD HCA/Port > and get more Lnet bandwith from your OSS server. It is a new feature > that is a bit of work to use but if you are chasing bandwith it might > be worth the effort. > I have a dual infiniband card so I was thinking to bond them to have more bandwidth. Is this that you mean when you are talking about the Muti-rail feature boost ? thanks Rick > > > Thanks, > > Keith > > > > *From:*lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] > *On Behalf Of *Chris Horn > *Sent:* Monday, August 21, 2017 12:40 PM > *To:* Riccardo Veraldi <riccardo.vera...@cnaf.infn.it>; Arman > Khalatyan <arm2...@gmail.com> > *Cc:* lustre-discuss@lists.lustre.org > *Subject:* Re: [lustre-discuss] Lustre poor performance > > > > The ko2iblnd-opa settings are tuned specifically for Intel OmniPath. > Take a look at the /usr/sbin/ko2iblnd-probe script to see how OPA > hardware is detected and the “ko2iblnd-opa” settings get used. > > > > Chris Horn > > > > *From: *lustre-discuss <lustre-discuss-boun...@lists.lustre.org > <mailto:lustre-discuss-boun...@lists.lustre.org>> on behalf of > Riccardo Veraldi <riccardo.vera...@cnaf.infn.it > <mailto:riccardo.vera...@cnaf.infn.it>> > *Date: *Saturday, August 19, 2017 at 5:00 PM > *To: *Arman Khalatyan <arm2...@gmail.com <mailto:arm2...@gmail.com>> > *Cc: *"lustre-discuss@lists.lustre.org > <mailto:lustre-discuss@lists.lustre.org>" > <lustre-discuss@lists.lustre.org <mailto:lustre-discuss@lists.lustre.org>> > *Subject: *Re: [lustre-discuss] Lustre poor performance > > > > I ran again my Lnet self test and this time adding --concurrency=16 > I can use all of the IB bandwith (3.5GB/sec). > > the only thing I do not understand is why ko2iblnd.conf is not loaded > properly and I had to remove the alias in the config file to allow > the proper peer_credit settings to be loaded. > > thanks to everyone for helping > > Riccardo > > On 8/19/17 8:54 AM, Riccardo Veraldi wrote: > > > I found out that ko2iblnd is not getting settings from > /etc/modprobe/ko2iblnd.conf > alias ko2iblnd-opa ko2iblnd > options ko2iblnd-opa peer_credits=128 peer_credits_hiw=64 > credits=1024 concurrent_sends=256 ntx=2048 map_on_demand=32 > fmr_pool_size=2048 fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4 > > install ko2iblnd /usr/sbin/ko2iblnd-probe > > but if I modify ko2iblnd.conf like this, then settings are loaded: > > options ko2iblnd peer_credits=128 peer_credits_hiw=64 credits=1024 > concurrent_sends=256 ntx=2048 map_on_demand=32 fmr_pool_size=2048 > fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4 > > install ko2iblnd /usr/sbin/ko2iblnd-probe > > Lnet tests show better behaviour but still I Would expect more > than this. > Is it possible to tune parameters in /etc/modprobe/ko2iblnd.conf > so that Mellanox ConnectX-3 will work more efficiently ? 
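For what it's worth, LNet Multi-Rail (new in Lustre 2.10) uses multiple interfaces at the LNet level rather than bonding them at the IPoIB level. A minimal lnetctl sketch follows; the second interface name and the extra NID are placeholders, not values from this cluster:

lnetctl lnet configure
lnetctl net add --net o2ib5 --if ib0,ib1     # one LNet network over two HCA ports
lnetctl net show -v
# peers can be given several NIDs so traffic is spread across the rails:
lnetctl peer add --prim_nid 172.21.52.83@o2ib5 --nid 172.21.53.83@o2ib5
lnetctl export > /etc/lnet.conf              # persist the configuration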
> > [LNet Rates of servers] > [R] Avg: 2286 RPC/s Min: 0RPC/s Max: 4572 RPC/s > [W] Avg: 3322 RPC/s Min: 0RPC/s Max: 6643 RPC/s > [LNet Bandwidth of servers] > [R] Avg: 625.23 MiB/s Min: 0.00 MiB/s Max: 1250.46 MiB/s > [W] Avg: 1035.85 MiB/s Min: 0.00 MiB/s Max: 2071.69 MiB/s > [LNet Rates of servers] > [R] Avg: 2286 RPC/s Min: 1RPC/s Max: 4571 RPC/s > [W] Avg: 3321 RPC/s Min: 1RPC/s Max: 6641 RPC/s > [LNet Bandwidth of servers] > [R] Avg: 625.55 MiB/s Min: 0.00 MiB/s Max: 1251.11 MiB/s > [W] Avg: 1035.05 MiB/s Min: 0.00 MiB/s Max: 2070.11 MiB/s > [LNet Rates of servers] > [R] Avg: 2291 RPC/s Min: 0RPC/s Max: 4581 RPC/s > [W] Avg: 3329 RPC/s Min: 0RPC/s Max: 6657 RPC/s > [LNet Bandwidth of servers] > [R] Avg: 626.55 MiB/s Min: 0.00 MiB/s Max: 1253.11 MiB/s > [W] Avg: 1038.05 MiB/s Min: 0.00 MiB/s Max: 2076.11 MiB/s > session is ended > ./lnet_test.sh: line 17: 23394 Terminated lst stat > servers > > > > > On 8/19/17 4:
Re: [lustre-discuss] Lustre poor performance
You may want to file a jira ticket if the ko2iblnd-opa settings were being automatically used on your Mellanox setup. That is not expected.

On another note: as you note, your NVMe backend is much faster than QDR link speed. You may want to look at using the new Multi-Rail LNet feature to boost network bandwidth. You can add a 2nd QDR HCA/port and get more LNet bandwidth from your OSS server. It is a new feature that is a bit of work to use, but if you are chasing bandwidth it might be worth the effort.

Thanks, Keith

From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] On Behalf Of Chris Horn Sent: Monday, August 21, 2017 12:40 PM To: Riccardo Veraldi <riccardo.vera...@cnaf.infn.it>; Arman Khalatyan <arm2...@gmail.com> Cc: lustre-discuss@lists.lustre.org Subject: Re: [lustre-discuss] Lustre poor performance

The ko2iblnd-opa settings are tuned specifically for Intel OmniPath. Take a look at the /usr/sbin/ko2iblnd-probe script to see how OPA hardware is detected and the “ko2iblnd-opa” settings get used.

Chris Horn

From: lustre-discuss <lustre-discuss-boun...@lists.lustre.org> on behalf of Riccardo Veraldi <riccardo.vera...@cnaf.infn.it> Date: Saturday, August 19, 2017 at 5:00 PM To: Arman Khalatyan <arm2...@gmail.com> Cc: "lustre-discuss@lists.lustre.org" <lustre-discuss@lists.lustre.org> Subject: Re: [lustre-discuss] Lustre poor performance

I ran again my LNet self test and this time, adding --concurrency=16, I can use all of the IB bandwidth (3.5GB/sec). The only thing I do not understand is why ko2iblnd.conf is not loaded properly and I had to remove the alias in the config file to allow the proper peer_credits settings to be loaded. Thanks to everyone for helping. Riccardo

On 8/19/17 8:54 AM, Riccardo Veraldi wrote: I found out that ko2iblnd is not getting settings from /etc/modprobe/ko2iblnd.conf

alias ko2iblnd-opa ko2iblnd
options ko2iblnd-opa peer_credits=128 peer_credits_hiw=64 credits=1024 concurrent_sends=256 ntx=2048 map_on_demand=32 fmr_pool_size=2048 fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4
install ko2iblnd /usr/sbin/ko2iblnd-probe

but if I modify ko2iblnd.conf like this, then settings are loaded:

options ko2iblnd peer_credits=128 peer_credits_hiw=64 credits=1024 concurrent_sends=256 ntx=2048 map_on_demand=32 fmr_pool_size=2048 fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4
install ko2iblnd /usr/sbin/ko2iblnd-probe

LNet tests show better behaviour but still I would expect more than this. Is it possible to tune parameters in /etc/modprobe/ko2iblnd.conf so that Mellanox ConnectX-3 will work more efficiently?
[LNet Rates of servers] [R] Avg: 2286 RPC/s Min: 0RPC/s Max: 4572 RPC/s [W] Avg: 3322 RPC/s Min: 0RPC/s Max: 6643 RPC/s [LNet Bandwidth of servers] [R] Avg: 625.23 MiB/s Min: 0.00 MiB/s Max: 1250.46 MiB/s [W] Avg: 1035.85 MiB/s Min: 0.00 MiB/s Max: 2071.69 MiB/s [LNet Rates of servers] [R] Avg: 2286 RPC/s Min: 1RPC/s Max: 4571 RPC/s [W] Avg: 3321 RPC/s Min: 1RPC/s Max: 6641 RPC/s [LNet Bandwidth of servers] [R] Avg: 625.55 MiB/s Min: 0.00 MiB/s Max: 1251.11 MiB/s [W] Avg: 1035.05 MiB/s Min: 0.00 MiB/s Max: 2070.11 MiB/s [LNet Rates of servers] [R] Avg: 2291 RPC/s Min: 0RPC/s Max: 4581 RPC/s [W] Avg: 3329 RPC/s Min: 0RPC/s Max: 6657 RPC/s [LNet Bandwidth of servers] [R] Avg: 626.55 MiB/s Min: 0.00 MiB/s Max: 1253.11 MiB/s [W] Avg: 1038.05 MiB/s Min: 0.00 MiB/s Max: 2076.11 MiB/s session is ended ./lnet_test.sh: line 17: 23394 Terminated lst stat servers On 8/19/17 4:20 AM, Arman Khalatyan wrote: just minor comment, you should push up performance of your nodes,they are not running in the max cpu frequencies.Al tests might be inconsistent. in order to get most of ib run following: tuned-adm profile latency-performance for more options use: tuned-adm list It will be interesting to see the difference. Am 19.08.2017 3:57 vorm. schrieb "Riccardo Veraldi" <riccardo.vera...@cnaf.infn.it<mailto:riccardo.vera...@cnaf.infn.it>>: Hello Keith and Dennis, these are the test I ran. * obdfilter-survey, shows that I Can saturate disk performance, the NVMe/ZFS backend is performing very well and it is faster then my Infiniband network pool alloc free read write read write - - - - - - drpffb-ost01 3.31T 3.19T 3 35.7K 16.0K 7.03G raidz1 3.31T 3.19T 3 35.7K 16.0K 7.03G nvme0n1 - - 1 5.95K 7.99K 1.17G nvme1n1 - - 0 6.01K 0 1.18G nvme2n1 - - 0 5.93K
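For the CPU frequency tuning suggested above, a short sketch of the commands involved (tuned and the cpufreq sysfs interface are standard on CentOS 7; the profile name is the one given above):

tuned-adm list                        # show available profiles
tuned-adm profile latency-performance
tuned-adm active                      # confirm the profile took effect
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor   # expect "performance"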
Re: [lustre-discuss] Lustre poor performance
The ko2iblnd-opa settings are tuned specifically for Intel OmniPath. Take a look at the /usr/sbin/ko2iblnd-probe script to see how OPA hardware is detected and the “ko2iblnd-opa” settings get used. Chris Horn From: lustre-discuss <lustre-discuss-boun...@lists.lustre.org> on behalf of Riccardo Veraldi <riccardo.vera...@cnaf.infn.it> Date: Saturday, August 19, 2017 at 5:00 PM To: Arman Khalatyan <arm2...@gmail.com> Cc: "lustre-discuss@lists.lustre.org" <lustre-discuss@lists.lustre.org> Subject: Re: [lustre-discuss] Lustre poor performance I ran again my Lnet self test and this time adding --concurrency=16 I can use all of the IB bandwith (3.5GB/sec). the only thing I do not understand is why ko2iblnd.conf is not loaded properly and I had to remove the alias in the config file to allow the proper peer_credit settings to be loaded. thanks to everyone for helping Riccardo On 8/19/17 8:54 AM, Riccardo Veraldi wrote: I found out that ko2iblnd is not getting settings from /etc/modprobe/ko2iblnd.conf alias ko2iblnd-opa ko2iblnd options ko2iblnd-opa peer_credits=128 peer_credits_hiw=64 credits=1024 concurrent_sends=256 ntx=2048 map_on_demand=32 fmr_pool_size=2048 fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4 install ko2iblnd /usr/sbin/ko2iblnd-probe but if I modify ko2iblnd.conf like this, then settings are loaded: options ko2iblnd peer_credits=128 peer_credits_hiw=64 credits=1024 concurrent_sends=256 ntx=2048 map_on_demand=32 fmr_pool_size=2048 fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4 install ko2iblnd /usr/sbin/ko2iblnd-probe Lnet tests show better behaviour but still I Would expect more than this. Is it possible to tune parameters in /etc/modprobe/ko2iblnd.conf so that Mellanox ConnectX-3 will work more efficiently ? [LNet Rates of servers] [R] Avg: 2286 RPC/s Min: 0RPC/s Max: 4572 RPC/s [W] Avg: 3322 RPC/s Min: 0RPC/s Max: 6643 RPC/s [LNet Bandwidth of servers] [R] Avg: 625.23 MiB/s Min: 0.00 MiB/s Max: 1250.46 MiB/s [W] Avg: 1035.85 MiB/s Min: 0.00 MiB/s Max: 2071.69 MiB/s [LNet Rates of servers] [R] Avg: 2286 RPC/s Min: 1RPC/s Max: 4571 RPC/s [W] Avg: 3321 RPC/s Min: 1RPC/s Max: 6641 RPC/s [LNet Bandwidth of servers] [R] Avg: 625.55 MiB/s Min: 0.00 MiB/s Max: 1251.11 MiB/s [W] Avg: 1035.05 MiB/s Min: 0.00 MiB/s Max: 2070.11 MiB/s [LNet Rates of servers] [R] Avg: 2291 RPC/s Min: 0RPC/s Max: 4581 RPC/s [W] Avg: 3329 RPC/s Min: 0RPC/s Max: 6657 RPC/s [LNet Bandwidth of servers] [R] Avg: 626.55 MiB/s Min: 0.00 MiB/s Max: 1253.11 MiB/s [W] Avg: 1038.05 MiB/s Min: 0.00 MiB/s Max: 2076.11 MiB/s session is ended ./lnet_test.sh: line 17: 23394 Terminated lst stat servers On 8/19/17 4:20 AM, Arman Khalatyan wrote: just minor comment, you should push up performance of your nodes,they are not running in the max cpu frequencies.Al tests might be inconsistent. in order to get most of ib run following: tuned-adm profile latency-performance for more options use: tuned-adm list It will be interesting to see the difference. Am 19.08.2017 3:57 vorm. schrieb "Riccardo Veraldi" <riccardo.vera...@cnaf.infn.it<mailto:riccardo.vera...@cnaf.infn.it>>: Hello Keith and Dennis, these are the test I ran. 
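Independent of which file the options end up in, it is easy to check what ko2iblnd actually loaded with, since module parameters are exposed in sysfs (standard module-parameter paths, shown here as a sketch):

grep . /sys/module/ko2iblnd/parameters/peer_credits \
       /sys/module/ko2iblnd/parameters/concurrent_sends \
       /sys/module/ko2iblnd/parameters/map_on_demand
# or dump them all at once:
grep . /sys/module/ko2iblnd/parameters/*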
* obdfilter-survey, shows that I Can saturate disk performance, the NVMe/ZFS backend is performing very well and it is faster then my Infiniband network pool alloc free read write read write - - - - - - drpffb-ost01 3.31T 3.19T 3 35.7K 16.0K 7.03G raidz1 3.31T 3.19T 3 35.7K 16.0K 7.03G nvme0n1 - - 1 5.95K 7.99K 1.17G nvme1n1 - - 0 6.01K 0 1.18G nvme2n1 - - 0 5.93K 0 1.17G nvme3n1 - - 0 5.88K 0 1.16G nvme4n1 - - 1 5.95K 7.99K 1.17G nvme5n1 - - 0 5.96K 0 1.17G - - - - - - this are the tests results Fri Aug 18 16:54:48 PDT 2017 Obdfilter-survey for case=disk from drp-tst-ffb01 ost 1 sz 10485760K rsz 1024K obj1 thr1 write 7633.08 SHORT rewrite 7558.78 SHORT read 3205.24 [3213.70, 3226.78] ost 1 sz 10485760K rsz 1024K obj1 thr2 write 7996.89 SHORT rewrite 7903.42 SHORT read 5264.70 SHORT ost 1 sz 10485760K rsz 1024K obj2 thr2 write 7718.94 SHORT rewrite 7977.84 SHORT read 5802.17 SHORT * Lnet self test, and here I see the problems. For reference 172.21.52.[83,84] are the two OSSes 172.21.52.86 is the reader/writer. Here is the script that I ran #!/bin/bash export LST_SESSION=$$ lst new_session read_write lst add_group servers 172.21.52.[83,84]@o2ib5 lst add_group readers 172
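The lnet_test.sh script referenced above is cut off in the archive. A generic LNet self-test script along the same lines is sketched below; the group names and NIDs follow the fragment above, while the batch name, test size, and run time are illustrative and based on the LNet self-test chapter of the manual, including the --concurrency 16 setting mentioned earlier in the thread:

#!/bin/bash
# generic LNet selftest sketch -- not the original lnet_test.sh
export LST_SESSION=$$
lst new_session read_write
lst add_group servers 172.21.52.[83,84]@o2ib5
lst add_group readers 172.21.52.86@o2ib5
lst add_group writers 172.21.52.86@o2ib5
lst add_batch bulk_rw
lst add_test --batch bulk_rw --concurrency 16 --from readers --to servers brw read size=1M
lst add_test --batch bulk_rw --concurrency 16 --from writers --to servers brw write size=1M
lst run bulk_rw
lst stat servers &      # background stats, as in the output above
sleep 30
lst end_session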
Re: [lustre-discuss] Lustre poor performance
>> >> --- >> >> RDMA modules are loaded >> >> rpcrdma90366 0 >> rdma_ucm 26837 0 >> ib_uverbs 51854 2 ib_ucm,rdma_ucm >> rdma_cm53755 5 >> rpcrdma,ko2iblnd,ib_iser,rdma_ucm,ib_isert >> ib_cm 47149 5 >> rdma_cm,ib_srp,ib_ucm,ib_srpt,ib_ipoib >> iw_cm 46022 1 rdma_cm >> ib_core 210381 15 >> >> rdma_cm,ib_cm,iw_cm,rpcrdma,ko2iblnd,mlx4_ib,ib_srp,ib_ucm,ib_iser,ib_srpt,ib_umad,ib_uverbs,rdma_ucm,ib_ipoib,ib_isert >> sunrpc334343 17 >> nfs,nfsd,rpcsec_gss_krb5,auth_rpcgss,lockd,nfsv4,rpcrdma,nfs_acl >> >> I do not know where to look to have Lnet performing faster. I am >> running my ib0 interface in connected mode with 65520 MTU size. >> >> Any hint will be much appreciated >> >> thank you >> >> Rick >> >> >> >> >> On 8/18/17 9:05 AM, Mannthey, Keith wrote: >>> I would suggest you a few other tests to help isolate where the issue >>> might be. >>> >>> 1. What is the single thread "DD" write speed? >>> >>> 2. Lnet_selfttest: Please see " Chapter 28. Testing Lustre Network >>> Performance (LNet Self-Test)" in the Lustre manual if this is a new test >>> for you. >>> This will help show how much Lnet bandwith you have from your single >>> client. There are tunable in the lnet later that can affect things. Which >>> QRD HCA are you using? >>> >>> 3. OBDFilter_survey : Please see " 29.3. Testing OST Performance >>> (obdfilter-survey)" in the Lustre manual. This test will help demonstrate >>> what the backed NVMe/ZFS setup can do at the OBD layer in Lustre. >>> >>> Thanks, >>> Keith >>> -Original Message- >>> From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org >>> <mailto:lustre-discuss-boun...@lists.lustre.org>] On Behalf Of Riccardo >>> Veraldi >>> Sent: Thursday, August 17, 2017 10:48 PM >>> To: Dennis Nelson <dnel...@ddn.com> <mailto:dnel...@ddn.com>; >>> lustre-discuss@lists.lustre.org >>> <mailto:lustre-discuss@lists.lustre.org> >>> Subject: Re: [lustre-discuss] Lustre poor performance >>> >>> this is my lustre.conf >>> >>> [drp-tst-ffb01:~]$ cat /etc/modprobe.d/lustre.conf options lnet >>> networks=o2ib5(ib0),tcp5(enp1s0f0) >>> >>> data transfer is over infiniband >>> >>> ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 65520 >>> inet 172.21.52.83 netmask 255.255.252.0 broadcast >>> 172.21.55.255 >>> >>> >>> On 8/17/17 10:45 PM, Riccardo Veraldi wrote: >>>> On 8/17/17 9:22 PM, Dennis Nelson wrote: >>>>> It appears that you are running iozone on a single client? What kind >>>>> of network is tcp5? Have you looked at the network to make sure it is >>>>> not the bottleneck? >>>>> >>>> yes the data transfer is on ib0 interface and I did a memory to memory >>>> test through InfiniBand QDR resulting in 3.7GB/sec. >>>> tcp is used to connect to the MDS. It is tcp5 to differentiate it from >>>> my other many Lustre clusters. I could have called it tcp but it does >>>> not make any difference performance wise. >>>> I ran the test from one single node yes, I ran the same test also >>>> locally on a zpool identical to the one on the Lustre OSS. 
>>>> Ihave 4 identical servers each of them with the aame nvme disks: >>>> >>>> server1: OSS - OST1 Lustre/ZFS raidz1 >>>> >>>> server2: OSS - OST2 Lustre/ZFS raidz1 >>>> >>>> server3: local ZFS raidz1 >>>> >>>> server4: Lustre client >>>> >>>> >>>> >>>> ___ >>>> lustre-discuss mailing list >>>> lustre-discuss@lists.lustre.org >>>> <mailto:lustre-discuss@lists.lustre.org> >>>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org >>>> <http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org> >>> ___ >>> lustre-discuss mailing list >>> lustre-discuss@lists.lustre.org >>> <mailto:lustre-discuss@lists.lustre.org> >>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org >>> <http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org> >>> >> ___ lustre-discuss >> mailing list lustre-discuss@lists.lustre.org >> <mailto:lustre-discuss@lists.lustre.org> >> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org >> <http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org> >> ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
Re: [lustre-discuss] Lustre poor performance
Frequency is not max. > 1281000 0.00 845.45 > 6.925918 > Conflicting CPU frequency values detected: 1469.703000 != > 1362.257000. CPU Frequency is not max. > 2561000 0.00 1746.93 > 7.155406 > Conflicting CPU frequency values detected: 1469.703000 != > 1362.257000. CPU Frequency is not max. > 5121000 0.00 2766.93 > 5.82 > Conflicting CPU frequency values detected: 1296.714000 != > 1204.675000. CPU Frequency is not max. > 1024 1000 0.00 3516.26 > 3.600646 > Conflicting CPU frequency values detected: 1296.714000 != > 1325.535000. CPU Frequency is not max. > 2048 1000 0.00 3630.93 > 1.859035 > Conflicting CPU frequency values detected: 1296.714000 != > 1331.312000. CPU Frequency is not max. > 4096 1000 0.00 3702.39 > 0.947813 > Conflicting CPU frequency values detected: 1296.714000 != > 1200.027000. CPU Frequency is not max. > 8192 1000 0.00 3724.82 > 0.476777 > Conflicting CPU frequency values detected: 1384.902000 != > 1314.113000. CPU Frequency is not max. > 16384 1000 0.00 3731.21 > 0.238798 > Conflicting CPU frequency values detected: 1578.078000 != > 1200.027000. CPU Frequency is not max. > 32768 1000 0.00 3735.32 > 0.119530 > Conflicting CPU frequency values detected: 1578.078000 != > 1200.027000. CPU Frequency is not max. > 65536 1000 0.00 3736.98 > 0.059792 > Conflicting CPU frequency values detected: 1578.078000 != > 1200.027000. CPU Frequency is not max. > 131072 1000 0.00 3737.80 > 0.029902 > Conflicting CPU frequency values detected: 1578.078000 != > 1200.027000. CPU Frequency is not max. > 262144 1000 0.00 3738.43 > 0.014954 > Conflicting CPU frequency values detected: 1570.507000 != > 1200.027000. CPU Frequency is not max. > 524288 1000 0.00 3738.50 > 0.007477 > Conflicting CPU frequency values detected: 1457.019000 != > 1236.152000. CPU Frequency is not max. > 10485761000 0.00 3738.65 > 0.003739 > Conflicting CPU frequency values detected: 1411.597000 != > 1234.957000. CPU Frequency is not max. > 20971521000 0.00 3738.65 > 0.001869 > Conflicting CPU frequency values detected: 1369.828000 != > 1516.851000. CPU Frequency is not max. > 41943041000 0.00 3738.80 > 0.000935 > Conflicting CPU frequency values detected: 1564.664000 != > 1247.574000. CPU Frequency is not max. > 83886081000 0.00 3738.76 > 0.000467 > > --- > > RDMA modules are loaded > > rpcrdma90366 0 > rdma_ucm 26837 0 > ib_uverbs 51854 2 ib_ucm,rdma_ucm > rdma_cm53755 5 > rpcrdma,ko2iblnd,ib_iser,rdma_ucm,ib_isert > ib_cm 47149 5 rdma_cm,ib_srp,ib_ucm,ib_srpt,ib_ipoib > iw_cm 46022 1 rdma_cm > ib_core 210381 15 > > rdma_cm,ib_cm,iw_cm,rpcrdma,ko2iblnd,mlx4_ib,ib_srp,ib_ucm,ib_iser,ib_srpt,ib_umad,ib_uverbs,rdma_ucm,ib_ipoib,ib_isert > sunrpc334343 17 > nfs,nfsd,rpcsec_gss_krb5,auth_rpcgss,lockd,nfsv4,rpcrdma,nfs_acl > > I do not know where to look to have Lnet performing faster. I am > running my ib0 interface in connected mode with 65520 MTU size. > > Any hint will be much appreciated > > thank you > > Rick > > > > > On 8/18/17 9:05 AM, Mannthey, Keith wrote: >> I would suggest you a few other tests to help isolate where the issue >> might be. >> >> 1. What is the single thread "DD" write speed? >> >> 2. Lnet_selfttest: Please see " Chapter 28. Testing Lustre Network >> Performance (LNet Self-Test)" in the Lustre manual if this is a new test for >> you. >> This will help show how much Lnet
Re: [lustre-discuss] Lustre poor performance
, ib_uverbs,rdma_ucm,ib_ipoib,ib_isert sunrpc334343 17 nfs,nfsd,rpcsec_gss_krb5,auth_ rpcgss,lockd,nfsv4,rpcrdma,nfs_acl I do not know where to look to have Lnet performing faster. I am running my ib0 interface in connected mode with 65520 MTU size. Any hint will be much appreciated thank you Rick On 8/18/17 9:05 AM, Mannthey, Keith wrote: I would suggest you a few other tests to help isolate where the issue might be. 1. What is the single thread "DD" write speed? 2. Lnet_selfttest: Please see " Chapter 28. Testing Lustre Network Performance (LNet Self-Test)" in the Lustre manual if this is a new test for you. This will help show how much Lnet bandwith you have from your single client. There are tunable in the lnet later that can affect things. Which QRD HCA are you using? 3. OBDFilter_survey : Please see " 29.3. Testing OST Performance (obdfilter-survey)" in the Lustre manual. This test will help demonstrate what the backed NVMe/ZFS setup can do at the OBD layer in Lustre. Thanks, Keith -Original Message- From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org <lustre-discuss-boun...@lists.lustre.org>] On Behalf Of Riccardo Veraldi Sent: Thursday, August 17, 2017 10:48 PM To: Dennis Nelson <dnel...@ddn.com> <dnel...@ddn.com>; lustre-discuss@lists.lustre.org Subject: Re: [lustre-discuss] Lustre poor performance this is my lustre.conf [drp-tst-ffb01:~]$ cat /etc/modprobe.d/lustre.conf options lnet networks=o2ib5(ib0),tcp5(enp1s0f0) data transfer is over infiniband ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 65520 inet 172.21.52.83 netmask 255.255.252.0 broadcast 172.21.55.255 On 8/17/17 10:45 PM, Riccardo Veraldi wrote: On 8/17/17 9:22 PM, Dennis Nelson wrote: It appears that you are running iozone on a single client? What kind of network is tcp5? Have you looked at the network to make sure it is not the bottleneck? yes the data transfer is on ib0 interface and I did a memory to memory test through InfiniBand QDR resulting in 3.7GB/sec. tcp is used to connect to the MDS. It is tcp5 to differentiate it from my other many Lustre clusters. I could have called it tcp but it does not make any difference performance wise. I ran the test from one single node yes, I ran the same test also locally on a zpool identical to the one on the Lustre OSS. Ihave 4 identical servers each of them with the aame nvme disks: server1: OSS - OST1 Lustre/ZFS raidz1 server2: OSS - OST2 Lustre/ZFS raidz1 server3: local ZFS raidz1 server4: Lustre client ___ lustre-discuss mailing listlustre-discuss@lists.lustre.orghttp://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org ___ lustre-discuss mailing listlustre-discuss@lists.lustre.orghttp://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
Re: [lustre-discuss] Lustre poor performance
.21 >> 0.238798 >> Conflicting CPU frequency values detected: 1578.078000 != >> 1200.027000. CPU Frequency is not max. >> 32768 1000 0.00 3735.32 >> 0.119530 >> Conflicting CPU frequency values detected: 1578.078000 != >> 1200.027000. CPU Frequency is not max. >> 65536 1000 0.00 3736.98 >> 0.059792 >> Conflicting CPU frequency values detected: 1578.078000 != >> 1200.027000. CPU Frequency is not max. >> 131072 1000 0.00 3737.80 >> 0.029902 >> Conflicting CPU frequency values detected: 1578.078000 != >> 1200.027000. CPU Frequency is not max. >> 262144 1000 0.00 3738.43 >> 0.014954 >> Conflicting CPU frequency values detected: 1570.507000 != >> 1200.027000. CPU Frequency is not max. >> 524288 1000 0.00 3738.50 >> 0.007477 >> Conflicting CPU frequency values detected: 1457.019000 != >> 1236.152000. CPU Frequency is not max. >> 10485761000 0.00 3738.65 >> 0.003739 >> Conflicting CPU frequency values detected: 1411.597000 != >> 1234.957000. CPU Frequency is not max. >> 20971521000 0.00 3738.65 >> 0.001869 >> Conflicting CPU frequency values detected: 1369.828000 != >> 1516.851000. CPU Frequency is not max. >> 41943041000 0.00 3738.80 >> 0.000935 >> Conflicting CPU frequency values detected: 1564.664000 != >> 1247.574000. CPU Frequency is not max. >> 83886081000 0.00 3738.76 >> 0.000467 >> --- >> >> RDMA modules are loaded >> >> rpcrdma90366 0 >> rdma_ucm 26837 0 >> ib_uverbs 51854 2 ib_ucm,rdma_ucm >> rdma_cm53755 5 >> rpcrdma,ko2iblnd,ib_iser,rdma_ucm,ib_isert >> ib_cm 47149 5 rdma_cm,ib_srp,ib_ucm,ib_srpt,ib_ipoib >> iw_cm 46022 1 rdma_cm >> ib_core 210381 15 >> rdma_cm,ib_cm,iw_cm,rpcrdma,ko2iblnd,mlx4_ib,ib_srp,ib_ucm,ib_iser,ib_srpt,ib_umad,ib_uverbs,rdma_ucm,ib_ipoib,ib_isert >> sunrpc334343 17 >> nfs,nfsd,rpcsec_gss_krb5,auth_rpcgss,lockd,nfsv4,rpcrdma,nfs_acl >> >> I do not know where to look to have Lnet performing faster. I am >> running my ib0 interface in connected mode with 65520 MTU size. >> >> Any hint will be much appreciated >> >> thank you >> >> Rick >> >> >> >> >> On 8/18/17 9:05 AM, Mannthey, Keith wrote: >>> I would suggest you a few other tests to help isolate where the issue might >>> be. >>> >>> 1. What is the single thread "DD" write speed? >>> >>> 2. Lnet_selfttest: Please see " Chapter 28. Testing Lustre Network >>> Performance (LNet Self-Test)" in the Lustre manual if this is a new test >>> for you. >>> This will help show how much Lnet bandwith you have from your single >>> client. There are tunable in the lnet later that can affect things. Which >>> QRD HCA are you using? >>> >>> 3. OBDFilter_survey : Please see " 29.3. Testing OST Performance >>> (obdfilter-survey)" in the Lustre manual. This test will help demonstrate >>> what the backed NVMe/ZFS setup can do at the OBD layer in Lustre. 
>>> >>> Thanks, >>> Keith >>> -Original Message- >>> From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] On >>> Behalf Of Riccardo Veraldi >>> Sent: Thursday, August 17, 2017 10:48 PM >>> To: Dennis Nelson <dnel...@ddn.com>; lustre-discuss@lists.lustre.org >>> Subject: Re: [lustre-discuss] Lustre poor performance >>> >>> this is my lustre.conf >>> >>> [drp-tst-ffb01:~]$ cat /etc/modprobe.d/lustre.conf options lnet >>> networks=o2ib5(ib0),tcp5(enp1s0f0) >>> >>> data transfer is over infiniband >>> >>> ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 65520 >>> inet 172.21.52.83 netmask 255.255.252.0 broadcast 172.21.55.255 >>> >>> >>> On 8/17/17 10:45 PM, Riccardo Veraldi wrote: >>>> On 8/17/17 9:22 PM, Dennis Nelson wrote: >>>>> It appears that you are running iozone on a single client? What kind of >>>>> network is tcp5? Have you looked at the network to make sure it is not >>>>> the bottleneck? >>>>> >>>> yes the data transfer is on ib0 interface and I did a memory to memory >>>> test through InfiniBand QDR resulting in 3.7GB/sec. >>>> tcp is used to connect to the MDS. It is tcp5 to differentiate it from >>>> my other many Lustre clusters. I could have called it tcp but it does >>>> not make any difference performance wise. >>>> I ran the test from one single node yes, I ran the same test also >>>> locally on a zpool identical to the one on the Lustre OSS. >>>> Ihave 4 identical servers each of them with the aame nvme disks: >>>> >>>> server1: OSS - OST1 Lustre/ZFS raidz1 >>>> >>>> server2: OSS - OST2 Lustre/ZFS raidz1 >>>> >>>> server3: local ZFS raidz1 >>>> >>>> server4: Lustre client >>>> >>>> >>>> >>>> ___ >>>> lustre-discuss mailing list >>>> lustre-discuss@lists.lustre.org >>>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org >>> ___ >>> lustre-discuss mailing list >>> lustre-discuss@lists.lustre.org >>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org >>> >> ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
Re: [lustre-discuss] Lustre poor performance
crdma,nfs_acl I do not know where to look to have Lnet performing faster. I am running my ib0 interface in connected mode with 65520 MTU size. Any hint will be much appreciated thank you Rick On 8/18/17 9:05 AM, Mannthey, Keith wrote: I would suggest you a few other tests to help isolate where the issue might be. 1. What is the single thread "DD" write speed? 2. Lnet_selfttest: Please see " Chapter 28. Testing Lustre Network Performance (LNet Self-Test)" in the Lustre manual if this is a new test for you. This will help show how much Lnet bandwith you have from your single client. There are tunable in the lnet later that can affect things. Which QRD HCA are you using? 3. OBDFilter_survey : Please see " 29.3. Testing OST Performance (obdfilter-survey)" in the Lustre manual. This test will help demonstrate what the backed NVMe/ZFS setup can do at the OBD layer in Lustre. Thanks, Keith -Original Message- From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] On Behalf Of Riccardo Veraldi Sent: Thursday, August 17, 2017 10:48 PM To: Dennis Nelson <dnel...@ddn.com><mailto:dnel...@ddn.com>; lustre-discuss@lists.lustre.org<mailto:lustre-discuss@lists.lustre.org> Subject: Re: [lustre-discuss] Lustre poor performance this is my lustre.conf [drp-tst-ffb01:~]$ cat /etc/modprobe.d/lustre.conf options lnet networks=o2ib5(ib0),tcp5(enp1s0f0) data transfer is over infiniband ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 65520 inet 172.21.52.83 netmask 255.255.252.0 broadcast 172.21.55.255 On 8/17/17 10:45 PM, Riccardo Veraldi wrote: On 8/17/17 9:22 PM, Dennis Nelson wrote: It appears that you are running iozone on a single client? What kind of network is tcp5? Have you looked at the network to make sure it is not the bottleneck? yes the data transfer is on ib0 interface and I did a memory to memory test through InfiniBand QDR resulting in 3.7GB/sec. tcp is used to connect to the MDS. It is tcp5 to differentiate it from my other many Lustre clusters. I could have called it tcp but it does not make any difference performance wise. I ran the test from one single node yes, I ran the same test also locally on a zpool identical to the one on the Lustre OSS. Ihave 4 identical servers each of them with the aame nvme disks: server1: OSS - OST1 Lustre/ZFS raidz1 server2: OSS - OST2 Lustre/ZFS raidz1 server3: local ZFS raidz1 server4: Lustre client ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org<mailto:lustre-discuss@lists.lustre.org> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org<mailto:lustre-discuss@lists.lustre.org> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
Re: [lustre-discuss] Lustre poor performance
> I would suggest you a few other tests to help isolate where the issue might > be. > > 1. What is the single thread "DD" write speed? > > 2. Lnet_selfttest: Please see " Chapter 28. Testing Lustre Network > Performance (LNet Self-Test)" in the Lustre manual if this is a new test for > you. > This will help show how much Lnet bandwith you have from your single client. > There are tunable in the lnet later that can affect things. Which QRD HCA > are you using? > > 3. OBDFilter_survey : Please see " 29.3. Testing OST Performance > (obdfilter-survey)" in the Lustre manual. This test will help demonstrate > what the backed NVMe/ZFS setup can do at the OBD layer in Lustre. > > Thanks, > Keith > -Original Message- > From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] On > Behalf Of Riccardo Veraldi > Sent: Thursday, August 17, 2017 10:48 PM > To: Dennis Nelson <dnel...@ddn.com>; lustre-discuss@lists.lustre.org > Subject: Re: [lustre-discuss] Lustre poor performance > > this is my lustre.conf > > [drp-tst-ffb01:~]$ cat /etc/modprobe.d/lustre.conf options lnet > networks=o2ib5(ib0),tcp5(enp1s0f0) > > data transfer is over infiniband > > ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 65520 > inet 172.21.52.83 netmask 255.255.252.0 broadcast 172.21.55.255 > > > On 8/17/17 10:45 PM, Riccardo Veraldi wrote: >> On 8/17/17 9:22 PM, Dennis Nelson wrote: >>> It appears that you are running iozone on a single client? What kind of >>> network is tcp5? Have you looked at the network to make sure it is not the >>> bottleneck? >>> >> yes the data transfer is on ib0 interface and I did a memory to memory >> test through InfiniBand QDR resulting in 3.7GB/sec. >> tcp is used to connect to the MDS. It is tcp5 to differentiate it from >> my other many Lustre clusters. I could have called it tcp but it does >> not make any difference performance wise. >> I ran the test from one single node yes, I ran the same test also >> locally on a zpool identical to the one on the Lustre OSS. >> Ihave 4 identical servers each of them with the aame nvme disks: >> >> server1: OSS - OST1 Lustre/ZFS raidz1 >> >> server2: OSS - OST2 Lustre/ZFS raidz1 >> >> server3: local ZFS raidz1 >> >> server4: Lustre client >> >> >> >> ___ >> lustre-discuss mailing list >> lustre-discuss@lists.lustre.org >> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org > > ___ > lustre-discuss mailing list > lustre-discuss@lists.lustre.org > http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org > ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
Re: [lustre-discuss] Lustre poor performance
On 8/18/17 1:13 PM, Mannthey, Keith wrote: > Is Selinux enabled on the client or server? the first thing I always to is to disable SElinux. it's not running. > > Thanks, > Keith > -Original Message- > From: Riccardo Veraldi [mailto:riccardo.vera...@cnaf.infn.it] > Sent: Friday, August 18, 2017 11:31 AM > To: Mannthey, Keith <keith.mannt...@intel.com>; Dennis Nelson > <dnel...@ddn.com>; lustre-discuss@lists.lustre.org > Subject: Re: [lustre-discuss] Lustre poor performance > > > thank you Keith, > I will do all this. the single thread dd tests shows 1GB/sec. I will do the > other tests > > > On 8/18/17 9:05 AM, Mannthey, Keith wrote: >> I would suggest you a few other tests to help isolate where the issue might >> be. >> >> 1. What is the single thread "DD" write speed? >> >> 2. Lnet_selfttest: Please see " Chapter 28. Testing Lustre Network >> Performance (LNet Self-Test)" in the Lustre manual if this is a new test for >> you. >> This will help show how much Lnet bandwith you have from your single client. >> There are tunable in the lnet later that can affect things. Which QRD HCA >> are you using? >> >> 3. OBDFilter_survey : Please see " 29.3. Testing OST Performance >> (obdfilter-survey)" in the Lustre manual. This test will help demonstrate >> what the backed NVMe/ZFS setup can do at the OBD layer in Lustre. >> >> Thanks, >> Keith >> -Original Message- >> From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] >> On Behalf Of Riccardo Veraldi >> Sent: Thursday, August 17, 2017 10:48 PM >> To: Dennis Nelson <dnel...@ddn.com>; lustre-discuss@lists.lustre.org >> Subject: Re: [lustre-discuss] Lustre poor performance >> >> this is my lustre.conf >> >> [drp-tst-ffb01:~]$ cat /etc/modprobe.d/lustre.conf options lnet >> networks=o2ib5(ib0),tcp5(enp1s0f0) >> >> data transfer is over infiniband >> >> ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 65520 >> inet 172.21.52.83 netmask 255.255.252.0 broadcast >> 172.21.55.255 >> >> >> On 8/17/17 10:45 PM, Riccardo Veraldi wrote: >>> On 8/17/17 9:22 PM, Dennis Nelson wrote: >>>> It appears that you are running iozone on a single client? What kind of >>>> network is tcp5? Have you looked at the network to make sure it is not >>>> the bottleneck? >>>> >>> yes the data transfer is on ib0 interface and I did a memory to >>> memory test through InfiniBand QDR resulting in 3.7GB/sec. >>> tcp is used to connect to the MDS. It is tcp5 to differentiate it >>> from my other many Lustre clusters. I could have called it tcp but it >>> does not make any difference performance wise. >>> I ran the test from one single node yes, I ran the same test also >>> locally on a zpool identical to the one on the Lustre OSS. >>> Ihave 4 identical servers each of them with the aame nvme disks: >>> >>> server1: OSS - OST1 Lustre/ZFS raidz1 >>> >>> server2: OSS - OST2 Lustre/ZFS raidz1 >>> >>> server3: local ZFS raidz1 >>> >>> server4: Lustre client >>> >>> >>> >>> ___ >>> lustre-discuss mailing list >>> lustre-discuss@lists.lustre.org >>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org >> ___ >> lustre-discuss mailing list >> lustre-discuss@lists.lustre.org >> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org >> > ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
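For completeness, the usual quick checks that SELinux really is off on both client and servers (standard CentOS 7 tools):

getenforce                           # Enforcing / Permissive / Disabled
sestatus
grep ^SELINUX= /etc/selinux/config   # configured mode at boot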
Re: [lustre-discuss] Lustre poor performance
Is Selinux enabled on the client or server? Thanks, Keith -Original Message- From: Riccardo Veraldi [mailto:riccardo.vera...@cnaf.infn.it] Sent: Friday, August 18, 2017 11:31 AM To: Mannthey, Keith <keith.mannt...@intel.com>; Dennis Nelson <dnel...@ddn.com>; lustre-discuss@lists.lustre.org Subject: Re: [lustre-discuss] Lustre poor performance thank you Keith, I will do all this. the single thread dd tests shows 1GB/sec. I will do the other tests On 8/18/17 9:05 AM, Mannthey, Keith wrote: > I would suggest you a few other tests to help isolate where the issue might > be. > > 1. What is the single thread "DD" write speed? > > 2. Lnet_selfttest: Please see " Chapter 28. Testing Lustre Network > Performance (LNet Self-Test)" in the Lustre manual if this is a new test for > you. > This will help show how much Lnet bandwith you have from your single client. > There are tunable in the lnet later that can affect things. Which QRD HCA > are you using? > > 3. OBDFilter_survey : Please see " 29.3. Testing OST Performance > (obdfilter-survey)" in the Lustre manual. This test will help demonstrate > what the backed NVMe/ZFS setup can do at the OBD layer in Lustre. > > Thanks, > Keith > -Original Message- > From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] > On Behalf Of Riccardo Veraldi > Sent: Thursday, August 17, 2017 10:48 PM > To: Dennis Nelson <dnel...@ddn.com>; lustre-discuss@lists.lustre.org > Subject: Re: [lustre-discuss] Lustre poor performance > > this is my lustre.conf > > [drp-tst-ffb01:~]$ cat /etc/modprobe.d/lustre.conf options lnet > networks=o2ib5(ib0),tcp5(enp1s0f0) > > data transfer is over infiniband > > ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 65520 > inet 172.21.52.83 netmask 255.255.252.0 broadcast > 172.21.55.255 > > > On 8/17/17 10:45 PM, Riccardo Veraldi wrote: >> On 8/17/17 9:22 PM, Dennis Nelson wrote: >>> It appears that you are running iozone on a single client? What kind of >>> network is tcp5? Have you looked at the network to make sure it is not the >>> bottleneck? >>> >> yes the data transfer is on ib0 interface and I did a memory to >> memory test through InfiniBand QDR resulting in 3.7GB/sec. >> tcp is used to connect to the MDS. It is tcp5 to differentiate it >> from my other many Lustre clusters. I could have called it tcp but it >> does not make any difference performance wise. >> I ran the test from one single node yes, I ran the same test also >> locally on a zpool identical to the one on the Lustre OSS. >> Ihave 4 identical servers each of them with the aame nvme disks: >> >> server1: OSS - OST1 Lustre/ZFS raidz1 >> >> server2: OSS - OST2 Lustre/ZFS raidz1 >> >> server3: local ZFS raidz1 >> >> server4: Lustre client >> >> >> >> ___ >> lustre-discuss mailing list >> lustre-discuss@lists.lustre.org >> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org > > ___ > lustre-discuss mailing list > lustre-discuss@lists.lustre.org > http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org > ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
Re: [lustre-discuss] Lustre poor performance
thank you Keith, I will do all this. the single thread dd tests shows 1GB/sec. I will do the other tests On 8/18/17 9:05 AM, Mannthey, Keith wrote: > I would suggest you a few other tests to help isolate where the issue might > be. > > 1. What is the single thread "DD" write speed? > > 2. Lnet_selfttest: Please see " Chapter 28. Testing Lustre Network > Performance (LNet Self-Test)" in the Lustre manual if this is a new test for > you. > This will help show how much Lnet bandwith you have from your single client. > There are tunable in the lnet later that can affect things. Which QRD HCA > are you using? > > 3. OBDFilter_survey : Please see " 29.3. Testing OST Performance > (obdfilter-survey)" in the Lustre manual. This test will help demonstrate > what the backed NVMe/ZFS setup can do at the OBD layer in Lustre. > > Thanks, > Keith > -Original Message- > From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] On > Behalf Of Riccardo Veraldi > Sent: Thursday, August 17, 2017 10:48 PM > To: Dennis Nelson <dnel...@ddn.com>; lustre-discuss@lists.lustre.org > Subject: Re: [lustre-discuss] Lustre poor performance > > this is my lustre.conf > > [drp-tst-ffb01:~]$ cat /etc/modprobe.d/lustre.conf options lnet > networks=o2ib5(ib0),tcp5(enp1s0f0) > > data transfer is over infiniband > > ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 65520 > inet 172.21.52.83 netmask 255.255.252.0 broadcast 172.21.55.255 > > > On 8/17/17 10:45 PM, Riccardo Veraldi wrote: >> On 8/17/17 9:22 PM, Dennis Nelson wrote: >>> It appears that you are running iozone on a single client? What kind of >>> network is tcp5? Have you looked at the network to make sure it is not the >>> bottleneck? >>> >> yes the data transfer is on ib0 interface and I did a memory to memory >> test through InfiniBand QDR resulting in 3.7GB/sec. >> tcp is used to connect to the MDS. It is tcp5 to differentiate it from >> my other many Lustre clusters. I could have called it tcp but it does >> not make any difference performance wise. >> I ran the test from one single node yes, I ran the same test also >> locally on a zpool identical to the one on the Lustre OSS. >> Ihave 4 identical servers each of them with the aame nvme disks: >> >> server1: OSS - OST1 Lustre/ZFS raidz1 >> >> server2: OSS - OST2 Lustre/ZFS raidz1 >> >> server3: local ZFS raidz1 >> >> server4: Lustre client >> >> >> >> ___ >> lustre-discuss mailing list >> lustre-discuss@lists.lustre.org >> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org > > ___ > lustre-discuss mailing list > lustre-discuss@lists.lustre.org > http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org > ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
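The exact dd command used is not shown; a typical single-stream test of this kind looks roughly like the sketch below (the file path is the client mount point from this thread, size and flags are illustrative):

dd if=/dev/zero of=/drpffb/ddtest bs=1M count=16384 oflag=direct   # single-thread write
dd if=/drpffb/ddtest of=/dev/null bs=1M iflag=direct               # read it back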
Re: [lustre-discuss] Lustre poor performance
I would suggest a few other tests to help isolate where the issue might be.

1. What is the single-thread "DD" write speed?

2. Lnet_selftest: Please see "Chapter 28. Testing Lustre Network Performance (LNet Self-Test)" in the Lustre manual if this is a new test for you. This will help show how much LNet bandwidth you have from your single client. There are tunables in the LNet layer that can affect things. Which QDR HCA are you using?

3. OBDFilter_survey: Please see "29.3. Testing OST Performance (obdfilter-survey)" in the Lustre manual. This test will help demonstrate what the backend NVMe/ZFS setup can do at the OBD layer in Lustre.

Thanks, Keith

-Original Message- From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] On Behalf Of Riccardo Veraldi Sent: Thursday, August 17, 2017 10:48 PM To: Dennis Nelson <dnel...@ddn.com>; lustre-discuss@lists.lustre.org Subject: Re: [lustre-discuss] Lustre poor performance

this is my lustre.conf

[drp-tst-ffb01:~]$ cat /etc/modprobe.d/lustre.conf options lnet networks=o2ib5(ib0),tcp5(enp1s0f0)

data transfer is over infiniband

ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 65520 inet 172.21.52.83 netmask 255.255.252.0 broadcast 172.21.55.255

On 8/17/17 10:45 PM, Riccardo Veraldi wrote: > On 8/17/17 9:22 PM, Dennis Nelson wrote: >> It appears that you are running iozone on a single client? What kind of >> network is tcp5? Have you looked at the network to make sure it is not the >> bottleneck? >> > yes the data transfer is on ib0 interface and I did a memory to memory > test through InfiniBand QDR resulting in 3.7GB/sec. > tcp is used to connect to the MDS. It is tcp5 to differentiate it from > my other many Lustre clusters. I could have called it tcp but it does > not make any difference performance wise. > I ran the test from one single node yes, I ran the same test also > locally on a zpool identical to the one on the Lustre OSS. > I have 4 identical servers, each of them with the same NVMe disks: > > server1: OSS - OST1 Lustre/ZFS raidz1 > > server2: OSS - OST2 Lustre/ZFS raidz1 > > server3: local ZFS raidz1 > > server4: Lustre client > > ___ > lustre-discuss mailing list > lustre-discuss@lists.lustre.org > http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
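For point 3, a typical obdfilter-survey invocation on the OSS looks roughly like this (the script ships with the lustre-iokit package; the size/thread/object limits below are examples chosen to match the survey output quoted later in the thread, not a prescription):

size=10240 nobjhi=2 thrhi=2 case=disk obdfilter-survey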
Re: [lustre-discuss] Lustre poor performance
On 8/17/17 8:56 PM, Jones, Peter A wrote: > Riccardo > > I expect that it will be useful to know which version of ZFS you are using apologies for not telling this I Am running 0.7.1 > > Peter > > > > > On 8/17/17, 8:21 PM, "lustre-discuss on behalf of Riccardo Veraldi" >riccardo.vera...@cnaf.infn.it> wrote: > >> Hello, >> >> I am running Lustre 2.10.0 on Centos 7.3 >> I have one MDS and two OSSes, each with one OST >> each OST is a ZFS raidz1 with 6 nvme disks each. >> The configuration of ZFS is done in a way to allow maximum write >> performances: >> >> zfs set sync=disabled drpffb-ost02 >> zfs set atime=off drpffb-ost02 >> zfs set redundant_metadata=most drpffb-ost02 >> zfs set xattr=sa drpffb-ost02 >> zfs set recordsize=1M drpffb-ost02 >> >> every NVMe disk has 4K byte sector, zfs -o ashift=12 >> >> In a LOCAL raidz1 configuration I get 3.6GB/sec writings and 5GB/sec >> readings. >> >> The same configuration thru Lustre has very poor performances, 1.3GB/sec >> writes and 2GB/sec reads >> >> There must be something else to look for having better performances but >> a local ZFS raidz1 is working pretty good. >> >> this is the Lustre partition client side: >> >> 172.21.42.159@tcp5:/drpffb 10T 279G 9.8T 3% /drpffb >> >> UUID bytesUsed Available Use% Mounted on >> drpffb-MDT_UUID19.1G2.1M 19.1G 0% /drpffb[MDT:0] >> drpffb-OST0001_UUID 5.0T 142.2G4.9T 3% /drpffb[OST:1] >> drpffb-OST0002_UUID 5.0T 136.4G4.9T 3% /drpffb[OST:2] >> >> filesystem_summary:10.0T 278.6G9.7T 3% /drpffb >> >> Tests both on Lustre/ZFS and local ZFS are based on 50 threads writing >> 4GB of data each and 50 threads reading using iozone: >> >> iozone -i 0 -t 50 -i 1 -t 50 -s4g >> >> I do not know what else I can do to improve performances >> >> here some details on the OSSes >> >> OSS01: >> >> NAME USED AVAIL REFER MOUNTPOINT >> drpffb-ost0139.4G 4.99T 153K none >> drpffb-ost01/ost01 39.4G 4.99T 39.4G none >> >> pool: drpffb-ost01 >> state: ONLINE >> scan: none requested >> config: >> >>NAME STATE READ WRITE CKSUM >>drpffb-ost01 ONLINE 0 0 0 >> raidz1-0 ONLINE 0 0 0 >>nvme0n1 ONLINE 0 0 0 >>nvme1n1 ONLINE 0 0 0 >>nvme2n1 ONLINE 0 0 0 >>nvme3n1 ONLINE 0 0 0 >>nvme4n1 ONLINE 0 0 0 >>nvme5n1 ONLINE 0 0 0 >> >> OSS02: >> >> NAME USED AVAIL REFER MOUNTPOINT >> drpffb-ost0262.2G 4.97T 153K none >> drpffb-ost02/ost02 62.2G 4.97T 62.2G none >> >> pool: drpffb-ost02 >> state: ONLINE >> scan: none requested >> config: >> >>NAME STATE READ WRITE CKSUM >>drpffb-ost02 ONLINE 0 0 0 >> raidz1-0 ONLINE 0 0 0 >>nvme0n1 ONLINE 0 0 0 >>nvme1n1 ONLINE 0 0 0 >>nvme2n1 ONLINE 0 0 0 >>nvme3n1 ONLINE 0 0 0 >>nvme4n1 ONLINE 0 0 0 >>nvme5n1 ONLINE 0 0 0 >> >> thanks to anyone who may help giving hints. >> >> Rick >> >> >> >> ___ >> lustre-discuss mailing list >> lustre-discuss@lists.lustre.org >> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
Re: [lustre-discuss] Lustre poor performance
It appears that you are running iozone on a single client? What kind of network is tcp5? Have you looked at the network to make sure it is not the bottleneck? -- Dennis Nelson Mobile: 817-233-6116 Applications Support Engineer DataDirect Networks, Inc. dnel...@ddn.com On 8/17/17, 10:22 PM, "lustre-discuss on behalf of Riccardo Veraldi"wrote: Hello, I am running Lustre 2.10.0 on Centos 7.3 I have one MDS and two OSSes, each with one OST each OST is a ZFS raidz1 with 6 nvme disks each. The configuration of ZFS is done in a way to allow maximum write performances: zfs set sync=disabled drpffb-ost02 zfs set atime=off drpffb-ost02 zfs set redundant_metadata=most drpffb-ost02 zfs set xattr=sa drpffb-ost02 zfs set recordsize=1M drpffb-ost02 every NVMe disk has 4K byte sector, zfs -o ashift=12 In a LOCAL raidz1 configuration I get 3.6GB/sec writings and 5GB/sec readings. The same configuration thru Lustre has very poor performances, 1.3GB/sec writes and 2GB/sec reads There must be something else to look for having better performances but a local ZFS raidz1 is working pretty good. this is the Lustre partition client side: 172.21.42.159@tcp5:/drpffb 10T 279G 9.8T 3% /drpffb UUID bytesUsed Available Use% Mounted on drpffb-MDT_UUID19.1G2.1M 19.1G 0% /drpffb[MDT:0] drpffb-OST0001_UUID 5.0T 142.2G4.9T 3% /drpffb[OST:1] drpffb-OST0002_UUID 5.0T 136.4G4.9T 3% /drpffb[OST:2] filesystem_summary:10.0T 278.6G9.7T 3% /drpffb Tests both on Lustre/ZFS and local ZFS are based on 50 threads writing 4GB of data each and 50 threads reading using iozone: iozone -i 0 -t 50 -i 1 -t 50 -s4g I do not know what else I can do to improve performances here some details on the OSSes OSS01: NAME USED AVAIL REFER MOUNTPOINT drpffb-ost0139.4G 4.99T 153K none drpffb-ost01/ost01 39.4G 4.99T 39.4G none pool: drpffb-ost01 state: ONLINE scan: none requested config: NAME STATE READ WRITE CKSUM drpffb-ost01 ONLINE 0 0 0 raidz1-0 ONLINE 0 0 0 nvme0n1 ONLINE 0 0 0 nvme1n1 ONLINE 0 0 0 nvme2n1 ONLINE 0 0 0 nvme3n1 ONLINE 0 0 0 nvme4n1 ONLINE 0 0 0 nvme5n1 ONLINE 0 0 0 OSS02: NAME USED AVAIL REFER MOUNTPOINT drpffb-ost0262.2G 4.97T 153K none drpffb-ost02/ost02 62.2G 4.97T 62.2G none pool: drpffb-ost02 state: ONLINE scan: none requested config: NAME STATE READ WRITE CKSUM drpffb-ost02 ONLINE 0 0 0 raidz1-0 ONLINE 0 0 0 nvme0n1 ONLINE 0 0 0 nvme1n1 ONLINE 0 0 0 nvme2n1 ONLINE 0 0 0 nvme3n1 ONLINE 0 0 0 nvme4n1 ONLINE 0 0 0 nvme5n1 ONLINE 0 0 0 thanks to anyone who may help giving hints. Rick ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
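The memory-to-memory InfiniBand test mentioned in the reply is typically done with the perftest tools; a sketch (the device name mlx4_0 is an assumption for a ConnectX-3 HCA, and the hostname is a placeholder):

ib_write_bw -d mlx4_0 -a                      # on one node (server side)
ib_write_bw -d mlx4_0 -a <other-node>         # on the other node (client side)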
Re: [lustre-discuss] Lustre poor performance
Riccardo I expect that it will be useful to know which version of ZFS you are using Peter On 8/17/17, 8:21 PM, "lustre-discuss on behalf of Riccardo Veraldi"wrote: >Hello, > >I am running Lustre 2.10.0 on Centos 7.3 >I have one MDS and two OSSes, each with one OST >each OST is a ZFS raidz1 with 6 nvme disks each. >The configuration of ZFS is done in a way to allow maximum write >performances: > >zfs set sync=disabled drpffb-ost02 >zfs set atime=off drpffb-ost02 >zfs set redundant_metadata=most drpffb-ost02 >zfs set xattr=sa drpffb-ost02 >zfs set recordsize=1M drpffb-ost02 > >every NVMe disk has 4K byte sector, zfs -o ashift=12 > >In a LOCAL raidz1 configuration I get 3.6GB/sec writings and 5GB/sec >readings. > >The same configuration thru Lustre has very poor performances, 1.3GB/sec >writes and 2GB/sec reads > >There must be something else to look for having better performances but >a local ZFS raidz1 is working pretty good. > >this is the Lustre partition client side: > >172.21.42.159@tcp5:/drpffb 10T 279G 9.8T 3% /drpffb > >UUID bytesUsed Available Use% Mounted on >drpffb-MDT_UUID19.1G2.1M 19.1G 0% /drpffb[MDT:0] >drpffb-OST0001_UUID 5.0T 142.2G4.9T 3% /drpffb[OST:1] >drpffb-OST0002_UUID 5.0T 136.4G4.9T 3% /drpffb[OST:2] > >filesystem_summary:10.0T 278.6G9.7T 3% /drpffb > >Tests both on Lustre/ZFS and local ZFS are based on 50 threads writing >4GB of data each and 50 threads reading using iozone: > >iozone -i 0 -t 50 -i 1 -t 50 -s4g > >I do not know what else I can do to improve performances > >here some details on the OSSes > >OSS01: > >NAME USED AVAIL REFER MOUNTPOINT >drpffb-ost0139.4G 4.99T 153K none >drpffb-ost01/ost01 39.4G 4.99T 39.4G none > > pool: drpffb-ost01 > state: ONLINE > scan: none requested >config: > >NAME STATE READ WRITE CKSUM >drpffb-ost01 ONLINE 0 0 0 > raidz1-0 ONLINE 0 0 0 >nvme0n1 ONLINE 0 0 0 >nvme1n1 ONLINE 0 0 0 >nvme2n1 ONLINE 0 0 0 >nvme3n1 ONLINE 0 0 0 >nvme4n1 ONLINE 0 0 0 >nvme5n1 ONLINE 0 0 0 > >OSS02: > >NAME USED AVAIL REFER MOUNTPOINT >drpffb-ost0262.2G 4.97T 153K none >drpffb-ost02/ost02 62.2G 4.97T 62.2G none > > pool: drpffb-ost02 > state: ONLINE > scan: none requested >config: > >NAME STATE READ WRITE CKSUM >drpffb-ost02 ONLINE 0 0 0 > raidz1-0 ONLINE 0 0 0 >nvme0n1 ONLINE 0 0 0 >nvme1n1 ONLINE 0 0 0 >nvme2n1 ONLINE 0 0 0 >nvme3n1 ONLINE 0 0 0 >nvme4n1 ONLINE 0 0 0 >nvme5n1 ONLINE 0 0 0 > >thanks to anyone who may help giving hints. > >Rick > > > >___ >lustre-discuss mailing list >lustre-discuss@lists.lustre.org >http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
[lustre-discuss] Lustre poor performance
Hello, I am running Lustre 2.10.0 on Centos 7.3 I have one MDS and two OSSes, each with one OST each OST is a ZFS raidz1 with 6 nvme disks each. The configuration of ZFS is done in a way to allow maximum write performances: zfs set sync=disabled drpffb-ost02 zfs set atime=off drpffb-ost02 zfs set redundant_metadata=most drpffb-ost02 zfs set xattr=sa drpffb-ost02 zfs set recordsize=1M drpffb-ost02 every NVMe disk has 4K byte sector, zfs -o ashift=12 In a LOCAL raidz1 configuration I get 3.6GB/sec writings and 5GB/sec readings. The same configuration thru Lustre has very poor performances, 1.3GB/sec writes and 2GB/sec reads There must be something else to look for having better performances but a local ZFS raidz1 is working pretty good. this is the Lustre partition client side: 172.21.42.159@tcp5:/drpffb 10T 279G 9.8T 3% /drpffb UUID bytesUsed Available Use% Mounted on drpffb-MDT_UUID19.1G2.1M 19.1G 0% /drpffb[MDT:0] drpffb-OST0001_UUID 5.0T 142.2G4.9T 3% /drpffb[OST:1] drpffb-OST0002_UUID 5.0T 136.4G4.9T 3% /drpffb[OST:2] filesystem_summary:10.0T 278.6G9.7T 3% /drpffb Tests both on Lustre/ZFS and local ZFS are based on 50 threads writing 4GB of data each and 50 threads reading using iozone: iozone -i 0 -t 50 -i 1 -t 50 -s4g I do not know what else I can do to improve performances here some details on the OSSes OSS01: NAME USED AVAIL REFER MOUNTPOINT drpffb-ost0139.4G 4.99T 153K none drpffb-ost01/ost01 39.4G 4.99T 39.4G none pool: drpffb-ost01 state: ONLINE scan: none requested config: NAME STATE READ WRITE CKSUM drpffb-ost01 ONLINE 0 0 0 raidz1-0 ONLINE 0 0 0 nvme0n1 ONLINE 0 0 0 nvme1n1 ONLINE 0 0 0 nvme2n1 ONLINE 0 0 0 nvme3n1 ONLINE 0 0 0 nvme4n1 ONLINE 0 0 0 nvme5n1 ONLINE 0 0 0 OSS02: NAME USED AVAIL REFER MOUNTPOINT drpffb-ost0262.2G 4.97T 153K none drpffb-ost02/ost02 62.2G 4.97T 62.2G none pool: drpffb-ost02 state: ONLINE scan: none requested config: NAME STATE READ WRITE CKSUM drpffb-ost02 ONLINE 0 0 0 raidz1-0 ONLINE 0 0 0 nvme0n1 ONLINE 0 0 0 nvme1n1 ONLINE 0 0 0 nvme2n1 ONLINE 0 0 0 nvme3n1 ONLINE 0 0 0 nvme4n1 ONLINE 0 0 0 nvme5n1 ONLINE 0 0 0 thanks to anyone who may help giving hints. Rick ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
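A consolidated sketch of the OST pool and dataset setup described above (pool name, devices and properties are the ones from this message; note that ashift is set when the pool is created, not afterwards):

zpool create -o ashift=12 drpffb-ost02 raidz1 nvme0n1 nvme1n1 nvme2n1 nvme3n1 nvme4n1 nvme5n1
zfs set sync=disabled drpffb-ost02
zfs set atime=off drpffb-ost02
zfs set redundant_metadata=most drpffb-ost02
zfs set xattr=sa drpffb-ost02
zfs set recordsize=1M drpffb-ost02
# verify:
zfs get sync,atime,redundant_metadata,xattr,recordsize drpffb-ost02
zpool get ashift drpffb-ost02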