Hi Michal,

How are you?
Can you also tell:
1. How many LUNs are allocated to the GPFS file system from the storage (minimum is 16)?
2. What block size is defined in the GPFS file system?
3. How many pools do you have in the file system?
4. Are all tests run on a single server?

Regards,

Yaron Daniel
Storage and Cloud Consultant
Technology Services, IBM Technology Lifecycle Service
IBM Israel
94 Em Ha'Moshavot Rd, Petach Tiqva, 49527, Israel
Phone: +972-3-916-5672 | Fax: +972-3-916-5672 | Mobile: +972-52-8395593
e-mail: [email protected]
Webex: https://ibm.webex.com/meet/yard

From: gpfsug-discuss <[email protected]> On Behalf Of Michal Hruška
Sent: Wednesday, 14 February 2024 20:29
To: [email protected]
Subject: [EXTERNAL] Re: [gpfsug-discuss] sequential I/O write - performance

Dear friends,

Thank you all for your time and thoughts/ideas!

The main goal of sharing our test results comparing XFS and GPFS was to show that the storage subsystem is able to do better if the I/O is delivered in a different way. We were not trying to compare XFS and GPFS directly; we understand that there will be some performance drop with GPFS (compared to “raw” performance), but we are surprised by the ~20-25% drop. We tried changing multiple suggested parameters but got no performance gain. As there was no change, we did more troubleshooting with different configurations.

To better understand what we tried, I have to describe our environment a bit more: our underlying storage system is an IBM FS7300 (each controller has 384 GB of cache). There are 8 DRAIDs (8+2+1). Each DRAID has its own pool and each pool has one volume (LUN). Every FE server (we have 3 of them) is connected directly to this storage using two 32 Gb FC connections. The 3 client servers and the FE servers are connected to a LAN switch with 100 GbE connections.

Testing results (metadata are located on an NVMe SSD DRAID):
1. We used a second, identical storage system to test the performance, but we get almost the same results as with the first storage. In iohist we can see that one LUN (dm-device) is probably overloaded, as its I/O time is high – from 300 to 500 ms.
2. Using both storage systems together in one big GPFS file system, only one dm-device is slow at any given time (according to the iohist output), but the “problematic” dm-device changes over time.
3. During our tests we also tried a synchronous fio test, but we observed a significant performance drop.
4. We compared single-LUN performance of GPFS against XFS: GPFS reached 435 MB/s compared to 485 MB/s for XFS, from a single server. That drop is not so significant, but when we added more LUNs to the comparison, the performance drop was more painful. (An illustrative fio command for this kind of sequential-write test is sketched below.)
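The exact fio job used in these tests is not quoted in the thread; a minimal sketch of a comparable sequential-write run – 2 MiB blocks with direct I/O, where the target directory, file size, queue depth and job count are assumptions for illustration only – might look like:

fio --name=seqwrite --directory=/gpfs/fs0/fiotest \
    --rw=write --bs=2m --size=32g --direct=1 \
    --ioengine=libaio --iodepth=16 --numjobs=8 --group_reporting

Dropping --direct=1 (or adding --sync=1) would approximate the synchronous variant mentioned in item 3.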
For this testing “session” we were able to gather data with Storage Insights to check the storage performance:
1. There is no problematic HDD – the worst latency seen is 42 ms across all 176 drives in the two storage systems. The average latency is 15 ms.
2. CPU usage was at 25% max.
3. “Problematic” DRAID latency – the average is 16 ms, the worst is 430 ms. I cannot tell whether there was the same latency peak during the XFS tests, but I think not (or at least not as bad), as XFS is able to perform better than GPFS.
4. During our tests the write cache for all pools was fully allocated, both for the XFS and the GPFS tests. This is the expected state, as the cache is much faster than the HDDs and it should help organize writes before they are forwarded to the RAID groups.

Do you see some other possible problems we missed? I do not want to leave this “unfinished”, but I am out of ideas. 😊

Best,
Michal

From: Michal Hruška
Sent: Thursday, February 8, 2024 3:59 PM
To: '[email protected]' <[email protected]>
Subject: Re: [gpfsug-discuss] sequential I/O write - performance

@Aaron
Yes, I can confirm that 2 MB blocks are transferred over.

@Jan-Frode
We tried to change multiple parameters, but if you know the best combination for sequential I/O, please let me know.

#mmlsconfig
autoload no
dmapiFileHandleSize 32
minReleaseLevel 5.1.9.0
tscCmdAllowRemoteConnections no
ccrEnabled yes
cipherList AUTHONLY
sdrNotifyAuthEnabled yes
pagepool 64G
maxblocksize 16384K
maxMBpS 40000
maxReceiverThreads 32
nsdMaxWorkerThreads 512
nsdMinWorkerThreads 8
nsdMultiQueue 256
nsdSmallThreadRatio 0
nsdThreadsPerQueue 3
prefetchAggressiveness 2
adminMode central

/dev/fs0

@Uwe
Using iohist we found out that GPFS is overloading one dm-device (it took about 500 ms to finish I/Os). We replaced the “problematic” dm-device with a new one (as we have enough drives to play with), but the overloading issue just jumped to another dm-device. We believe that this behaviour is caused by GPFS, but we are unable to locate its root cause.

Best,
Michal
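For anyone reproducing the per-device observations above, a short command sketch (assuming the file system is fs0, as in the mmlsconfig output above; the exact invocations used in the thread are not shown):

mmdiag --iohist   # recent I/O history with per-I/O service times and target devices
mmlsnsd -m        # mapping of NSDs to local block devices (the dm-* names seen in iohist)
mmdf fs0          # per-NSD capacity and usage, to check that data is spread evenly across LUNs
mmlsfs fs0 -B     # the file system block size in effect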
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
