Hi Uwe

Thanks.

Worth noting is that we have Windows 10 LTSC, Windows Server 2019 and (as a test) single-CPU Windows 11 22H2 clients that all perform as expected. Those machines are older, though, and connected with 10-40GbE; those clients max out their NICs in both read and write.

Let me know if I missed something important here. Thanks again.

The setup is:

Client:

  *   Supermicro Workstation
  *   Intel(R) Xeon(R) Gold 6418H   2.10 GHz  (2 processors)
  *   Mellanox ConnectX-6 Dx, connected at 100GbE over a dedicated VLAN via a Mellanox SN2100
  *   Windows 11 Pro for Workstations, 22H2


Storage setup:

  *   3 x 84-bay Seagate chassis with spinning disks
  *   Storage connected with redundant 12Gb SAS to 2 x storage node servers
  *   2 x Mellanox SN2100
  *   The 2 storage node servers are, for this VLAN, each connected with 100GbE to both switches, so 4 x 100GbE in total, and the switches are interconnected with 2 x 100GbE


I tested the commands you suggested. They are both new to me, so I'm not sure what the output is supposed to look like; it seems --iohistory isn't available on Windows. I ran --waiters a few times, as seen below, but I'm not sure what the expected output from that is either.


mmdiag --waiters

=== mmdiag: waiters ===
Waiting 0.0000 sec since 2024-09-04_10:05:14, monitored, thread 18616 
MsgHandler@getData: for In function sendMessage
Waiting 0.0000 sec since 2024-09-04_10:05:14, monitored, thread 25084 
WritebehindWorkerThread: on ThCond 0x31A7C360 (MsgRecordCondvar), reason 'RPC 
wait' for NSD I/O completion on node 192.168.45.213 <c0n0>

C:\Users\m5-tkd01>mmdiag --waiters

=== mmdiag: waiters ===
Waiting 0.0009 sec since 2024-09-04_10:05:17, monitored, thread 16780 
FsyncHandlerThread: on ThCond 0x37FFDAB0 (MsgRecordCondvar), reason 'RPC wait' 
for NSD I/O completion on node 192.168.45.214 <c0n1>
Waiting 0.0009 sec since 2024-09-04_10:05:17, monitored, thread 30308 
MsgHandler@getData: for In function sendMessage

C:\Users\m5-tkd01>mmdiag --waiters

=== mmdiag: waiters ===
Waiting 0.0055 sec since 2024-09-04_10:05:21, monitored, thread 16780 
FileBlockReadFetchHandlerThread: on ThCond 0x37A25FF0 (MsgRecordCondvar), 
reason 'RPC wait' for NSD I/O completion on node 192.168.45.213 <c0n0>

C:\Users\m5-tkd01>mmdiag --waiters

=== mmdiag: waiters ===
Waiting 0.0029 sec since 2024-09-04_10:05:23, monitored, thread 16780 
FileBlockReadFetchHandlerThread: on ThCond 0x38281DE0 (MsgRecordCondvar), 
reason 'RPC wait' for NSD I/O completion on node 192.168.45.213 <c0n0>

C:\Users\m5-tkd01>mmdiag --waiters

=== mmdiag: waiters ===
Waiting 0.0019 sec since 2024-09-04_10:05:25, monitored, thread 11832 
PrefetchWorkerThread: on ThCond 0x38278D20 (MsgRecordCondvar), reason 'RPC 
wait' for NSD I/O completion on node 192.168.45.214 <c0n1>
Waiting 0.0009 sec since 2024-09-04_10:05:25, monitored, thread 16780 
AcquireBRTHandlerThread: on ThCond 0x37A324E0 (MsgRecordCondvar), reason 'RPC 
wait' for tmMsgBRRevoke on node 192.168.45.161 <c0n11>
Waiting 0.0009 sec since 2024-09-04_10:05:25, monitored, thread 2576 
RangeRevokeWorkerThread: on ThCond 0x5419DAA0 (BrlObjCondvar), reason 'waiting 
because of local byte range lock conflict'

C:\Users\m5-tkd01>






C:\Users\m5-tkd01>mmdiag --iohistory
Unrecognized option: --iohistory.
Run mmdiag --help for the option list
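
In case it is useful, this is roughly how I could sample the waiters continuously during a write test instead of the few manual runs above (a minimal sketch assuming mmdiag is on PATH; file name and duration are arbitrary). I also wonder whether the I/O history option is perhaps spelled --iohist rather than --iohistory, but I haven't verified that on Windows.

@echo off
setlocal enabledelayedexpansion
:: sample-waiters.bat: hypothetical helper that logs GPFS waiters once per second
:: for 60 seconds while a write test runs in another window.
for /l %%i in (1,1,60) do (
    echo ===== !time! ===== >> waiters.log
    mmdiag --waiters >> waiters.log
    timeout /t 1 /nobreak > nul
)

Longer-lived waiters in that log (rather than the sub-millisecond ones above) would hopefully point at where the writes stall.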





--

Henrik Cednert  /  + 46 704 71 89 54  /  CTO  /  OnePost (formerly Filmlance 
Post)

☝️ OnePost, formerly Filmlance's post-production, is now an independent part of 
the Banijay Group.
New name, same team – business as usual at OnePost.



________________________________
From: gpfsug-discuss <[email protected]> on behalf of Uwe Falke 
<[email protected]>
Sent: Tuesday, 3 September 2024 17:35
To: [email protected] <[email protected]>
Subject: Re: [gpfsug-discuss] GPFS 5.1.9.4 on Windows 11 Pro. Performance 
issues, write.


Hi, Henrik,


While I am not using Windows, I'd start investigating the usual things (see below).


But first you should describe your set-up better.

Where are the NSDs: locally attached to the Windows box, or in some NSD servers?

If the latter, what is the link to the NSD servers? Via your GbE link? FC? IB? A separate Ethernet?

What type of storage? Spinning Disks? Flash?


How long are your I/Os waiting on the client (compare that to the waiting times 
on the NSD server if applicable)?

I am not sure whether that is available on Windows, but

mmdiag --waiters

mmdiag --iohistory

might be of use.


Somewhere in the chain from your application to the storage backend there is a delay, and I think you should first find out where that occurs.


Bye

Uwe



On 03.09.24 14:10, Henrik Cednert wrote:
Still no solution here regarding this.

Have tested other cables.
Have tested changing the TCP window size, no change.
Played with NUMA in the BIOS, no change.
Played with hyperthreading in the BIOS, no change.

Has anyone managed to get some speed out of Windows 11 and GPFS?



--

Henrik Cednert  /  + 46 704 71 89 54  /  CTO  /  OnePost (formerly Filmlance 
Post)

☝️ OnePost, formerly Filmlance's post-production, is now an independent part of 
the Banijay Group.
New name, same team – business as usual at OnePost.



________________________________
From: gpfsug-discuss <[email protected]> on behalf of Henrik Cednert <[email protected]>
Sent: Friday, 9 August 2024 17:25
To: [email protected] <[email protected]>
Subject: [gpfsug-discuss] GPFS 5.1.9.4 on Windows 11 Pro. Performance issues, 
write.



Hello

I have some issues with write performance on a Windows 11 Pro system and I'm out of ideas. Hopefully someone here has some bright ideas and/or experience with GPFS on Windows 11?

The system is a:

Windows 11 Pro 22H2
2 x Intel(R) Xeon(R) Gold 6418H   2.10 GHz
512 GB RAM
GPFS 5.1.9.4
Mellanox ConnectX-6 Dx
100GbE, connected to a Mellanox switch with a 5m Mellanox DAC.

Before deploying this workstation we had a single-socket system as a test bench, where we got 60 Gbit/s in both directions with iPerf and around 6 GB/s write and 3 GB/s read from the system over GPFS (fio tests, the same tests as further down here).

With that system I had loads of issues before getting to that point, though. MS Defender had to be forcefully disabled via regedit, among some other tweaks. All those tweaks have been applied to this new system as well, but I can't get the proper speed out of it.


On this new system, iPerf to the storage servers gives around 50-60 Gbit/s in both directions, send and receive.
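
For reference, the iPerf runs I mean here are roughly of this kind (iperf3 syntax; the server address and stream count are just examples, and -R reverses the direction):

:: forward (client sends) and reverse (server sends) tests against one storage server
iperf3 -c 192.168.45.213 -P 8 -t 30
iperf3 -c 192.168.45.213 -P 8 -t 30 -R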

If I mount the storage over SMB at 100GbE via the storage gateway servers, I get around 3 GB/s read and write with Blackmagic's Disk Speed Test. I have not tweaked the system for Samba performance; it was just a test to see what it would give, as part of the troubleshooting.

If I run Blackmagic's Disk Speed Test against the GPFS mount, I instead get around 700 MB/s write and 400 MB/s read.

I'm starting to think that the Blackmagic test might not run properly on this machine with these CPUs, though. Or maybe it's related to the mmfsd process and how it threads, or doesn't...?

But let's instead look at fio. I have a .bat script that loops through a bunch of fio tests, a set I have been using over the years so that we can easily benchmark all deployed systems with the exact same tests. The tests are named like:

seqrw-<filesize>gb-<blocksize>mb-t<threads>
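
To make the naming concrete, one job from that loop corresponds roughly to a fio call like the one below (a hypothetical sketch, not the exact flags from my .bat; the drive letter and path are placeholders):

:: seqrw-10gb-1mb-t4: sequential mixed read/write, 1 MiB blocks, 10 GiB per job, 4 jobs
fio --name=seqrw-10gb-1mb-t4 --filename=G\:\fio\testfile --rw=rw ^
    --bs=1M --size=10g --numjobs=4 --direct=1 --ioengine=windowsaio ^
    --thread --group_reporting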

The results when I run this are listed below. The number in parentheses is the latency reported by fio.

Job                  Write               Read
seqrw-40gb-1mb-t1    162 MB/s (6 ms)     1940 MB/s (1 ms)
seqrw-20gb-1mb-t2    286 MB/s (7 ms)     3952 MB/s (1 ms)
seqrw-10gb-1mb-t4    549 MB/s (7 ms)     6987 MB/s (1 ms)
seqrw-05gb-1mb-t8    989 MB/s (8 ms)     7721 MB/s (1 ms)
seqrw-40gb-2mb-t1    161 MB/s (12 ms)    2261 MB/s (0 ms)
seqrw-20gb-2mb-t2    348 MB/s (11 ms)    4266 MB/s (1 ms)
seqrw-10gb-2mb-t4    626 MB/s (13 ms)    4949 MB/s (1 ms)
seqrw-05gb-2mb-t8    1154 MB/s (14 ms)   7007 MB/s (2 ms)
seqrw-40gb-4mb-t1    161 MB/s (25 ms)    2083 MB/s (1 ms)
seqrw-20gb-4mb-t2    352 MB/s (23 ms)    4317 MB/s (2 ms)
seqrw-10gb-4mb-t4    696 MB/s (23 ms)    7358 MB/s (2 ms)
seqrw-05gb-4mb-t8    1251 MB/s (25 ms)   6707 MB/s (5 ms)


So with fio I get a very nice read speed, but the write is horrendous and I cannot find what causes it. I have looked at affinity settings for the mmfsd process, but I'm not sure I fully understand them, and no matter what I set them to I see no difference.
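
As a read-only sanity check, I can at least inspect the affinity mask mmfsd currently has (assuming PowerShell is available; on a box with more than 64 logical processors, like this dual-socket 6418H with hyperthreading, the mask only covers a single processor group, which is part of why I'm unsure the setting does what I expect):

:: show mmfsd's process id and its current processor affinity bitmask
powershell -Command "Get-Process mmfsd | Select-Object Id, ProcessorAffinity"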

I have "played" with the bios and tried with/without hyperthreading, numa and 
so on. And nothing affects atleast the blackmagic disk speed test.

The current settings for this host are listed below. I write "current" because I have tested a few different settings here, but nothing affects the write speed. maxTcpConnsPerNodeConn did bump the read speed, though.

nsdMaxWorkerThreads 16
prefetchPct 60
maxTcpConnsPerNodeConn 8
maxMBpS 14000
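
For completeness, this is roughly how I have been inspecting and changing these values (the node name is a placeholder; as I understand it, -i applies the change immediately where the parameter allows it):

:: show what the daemon is actually running with, then change a value for this node
mmdiag --config | findstr /i "maxMBpS prefetchPct nsdMaxWorkerThreads maxTcpConnsPerNodeConn"
mmchconfig maxTcpConnsPerNodeConn=8,maxMBpS=14000 -N <this-client> -i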


Does anyone have any suggestions or ideas on how to troubleshoot this?

Thanks




--

Henrik Cednert  /  + 46 704 71 89 54  /  CTO  /  OnePost (formerly Filmlance 
Post)

☝️ OnePost, formerly Filmlance's post-production, is now an independent part of 
the Banijay Group.
New name, same team – business as usual at OnePost.





_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org


--
Karlsruhe Institute of Technology (KIT)
Scientific Computing Centre (SCC)
Scientific Data Management (SDM)

Uwe Falke

Hermann-von-Helmholtz-Platz 1, Building 442, Room 187
D-76344 Eggenstein-Leopoldshafen

Tel: +49 721 608 28024
Email: [email protected]
www.scc.kit.edu

Registered office:
Kaiserstraße 12, 76131 Karlsruhe, Germany

KIT – The Research University in the Helmholtz Association

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
