Re: [gpfsug-discuss] Poor client performance with high cpu usage of mmfsd process

2020-11-20 Thread Czauz, Kamil
What lookups do you see (check in the trcrpt file; they could be like
gpfs_i_lookup or gpfs_v_lookup)?
Lookups are metadata ops; do you have a separate pool for your metadata?
How is that pool set up (down to the physical block devices)?
Your trsum run revealed 36 lookups, each one on average taking >30 ms. That is a
lot (albeit the respective waiters won't show up at first glance as suspicious
...).
So, which waiters did you see (hope you saved them; if not, do it next time).

What are the node you see this on, and the whole cluster, used for? What is the
MaxFilesToCache setting (for that node and for others)? What HW is that, and how
big are your nodes (memory, CPU)?
To check the unreasonably short trace capture time: how large are the trcrpt
files you obtain?
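
Most of this can be collected on the affected node with standard commands; a
rough sketch (assuming a filesystem device name of 'gpfs0' - adjust to yours -
and assuming the trcrpt files land in the default /tmp/mmfs directory):

  mmlsconfig maxFilesToCache                # configured MaxFilesToCache (cluster default and per-node overrides)
  mmdiag --config | grep -i filesToCache    # value in effect on this node
  mmlsdisk gpfs0 -L                         # which NSDs hold metadata (usage metadataOnly/dataAndMetadata) and their pools
  ls -lh /tmp/mmfs/trcrpt.*                 # size of the trcrpt files from the last capture
  free -g; lscpu                            # node memory and CPU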


Mit freundlichen Grüßen / Kind regards

Dr. Uwe Falke
IT Specialist
Hybrid Cloud Infrastructure / Technology Consulting & Implementation Services
+49 175 575 2877 Mobile
Rathausstr. 7, 09111 Chemnitz, Germany
uwefa...@de.ibm.com

IBM Services

IBM Data Privacy Statement

IBM Deutschland Business & Technology Services GmbH
Geschäftsführung: Sven Schooss, Stefan Hierl Sitz der Gesellschaft: Ehningen
Registergericht: Amtsgericht Stuttgart, HRB 17122



From:   "Czauz, Kamil" 
To: gpfsug main discussion list 
Date:   13/11/2020 14:33
Subject:        [EXTERNAL] Re: [gpfsug-discuss] Poor client performance
        with high cpu usage of mmfsd process
Sent by:        gpfsug-discuss-boun...@spectrumscale.org



Hi Uwe -

Regarding your previous message - waiters were coming / going with just
1-2 waiters when I ran the mmdiag command, with very low wait times (<0.01s).

We are running version 4.2.3

I did another capture today while the client is functioning normally and this 
was the header result:

Overwrite trace parameters:
buffer size: 134217728
64 kernel trace streams, indices 0-63 (selected by low bits of processor ID)
128 daemon trace streams, indices 64-191 (selected by low bits of thread ID)
Interval for calibrating clock rate was 25.996957 seconds and 67592121252 cycles
Measured cycle count update rate to be 261271 per second <-- using this value
OS reported cycle count update rate as 259000 per second
Trace milestones:
kernel trace enabled Fri Nov 13 08:20:01.800558000 2020 (TOD 1605273601.800558, cycles 20807897445779444)
daemon trace enabled Fri Nov 13 08:20:01.910017000 2020 (TOD 1605273601.910017, cycles 20807897730372442)
all streams included Fri Nov 13 08:20:26.423085049 2020 (TOD 1605273626.423085, cycles 20807961464381068) <-- useful part of trace extends from here
trace quiesced Fri Nov 13 08:20:27.797515000 2020 (TOD 1605273627.000797, cycles 20807965037900696) <-- to here
Approximate number of times the trace buffer was filled: 14.631

Still a very small capture (1.3s), but the trsum.awk output was not filled with 
lookup commands / large lookup times. Can you help debug what those long lookup 
operations mean?

Unfinished operations:

27967 * pagein ** 1.362382116
27967 * readpage ** 1.362381516
139130 1.362448448 * Unfinished IO: buffer/disk 3002F67 20:107498951168^\archive_data_16
104686 1.362022068 * Unfinished IO: buffer/disk 50011878000 1:47169618944^\archive_data_1
0 0.0 * Unfinished IO: buffer/disk 5003CEB8000 4:23073390592^\FFFE
0 0.0 * Unfinished IO: buffer/disk 5003CEB8000 4:23073390592^\
0 0.0 * Unfinished IO: buffer/disk 2000EE78000 5:47631127040^\FFFE
341710 1.362423815 * Unfinished IO: buffer/disk 20022218000 19:107498951680^\archive_data_15
0 0.0 * Unfinished IO: buffer/disk 2000EE78000 5:47631127040^\
0 0.0 * Unfinished IO: buffer/disk 206B000 18:3452986648^\FFFE
0 0.0 * Unfinished IO: buffer/disk 206B000 18:3452986648^\
0 0.0 * Unfinished IO: buffer/disk 206B000 18:3452986648^\
139150 1.361122006 * Unfinished IO: buffer/disk 50012018000 2:47169622016^\archive_data_2
0 0.0 * Unfinished IO: buffer/disk 5003CEB8000 4:23073390592^\
95782 1.361112791 * Unfinished IO: buffer/disk 4001630 20:107498950656^\archive_data_16
0 0.0 * Unfinished IO: buffer/disk 2000EE78000 5:47631127040^\
271076 1.361579585 * Unfinished IO: buffer/disk 20023DB8000 4:47169606656^\archive_data_4
341676 1.362018599 * Unfinished IO: buffer/disk 4003814 5:47169614336^\archive_data_5
139150 1.361131599 MSG FSnd: nsdMsgReadExt msg_id 2930654492 Sduration 13292.382 + us
341676 1.362027104 MSG FSnd: nsdMsgReadExt msg_id 2930654495 Sduration 12396.877 + us
95782 1.361124739 MSG FSnd: nsdMsgReadExt msg_id 2930654491 Sduration 13299.242 + us
271076 1.361587653 MSG FSnd: nsdMsgReadExt msg_id 2930654493 Sduration 12836.328 + us
92182 0.0
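
To see where long lookups in a capture like this come from, one option is to
pull the lookup records straight out of the formatted trcrpt file and look at
which threads issue them. The exact record layout differs between releases, so
treat the following as a sketch only (the thread ID is assumed to be the second
field, and 'trcrpt.*' stands in for your actual trace report file names):

  # list the raw lookup records (names as in the hint above: gpfs_i_lookup / gpfs_v_lookup)
  grep -E 'gpfs_i_lookup|gpfs_v_lookup' trcrpt.* | less
  # rough count of lookup records per thread, to spot the busiest threads
  grep -E 'gpfs_i_lookup|gpfs_v_lookup' trcrpt.* | awk '{print $2}' | sort | uniq -c | sort -rn | head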

Re: [gpfsug-discuss] Poor client performance with high cpu usage of mmfsd process

2020-11-13 Thread Czauz, Kamil



From:   Uwe Falke/Germany/IBM
To: gpfsug main discussion list 
Date:   13/11/2020 10:21
Subject:        Re: [EXTERNAL] Re: [gpfsug-discuss] Poor client
        performance with high cpu usage of mmfsd process


Hi, Kamil,
looks like your trace file setting has been too low:
all streams included Thu Nov 12 20:58:19.950515266 2020 (TOD 1605232699.950515, cycles 20701552715873212) <-- useful part of trace extends from here
trace quiesced Thu Nov 12 20:58:20.133134000 2020 (TOD 1605232700.000133, cycles 20701553190681534) <-- to here
This means you effectively captured a period of about 5 ms only ... you can't see much from that.

I'd assumed the default trace file size would be sufficient here, but it doesn't
seem to be.
Try running with something like
 mmtracectl --start --trace-file-size=512M --trace=io --tracedev-write-mode=overwrite -N .
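
A possible end-to-end sequence for the next capture (a sketch; trsum.awk is
assumed to live under /usr/lpp/mmfs/samples/debugtools, trcrpt files are
assumed to land in the default /tmp/mmfs, and <node> stands for the affected
node name - adjust to your setup):

  mmtracectl --start --trace-file-size=512M --trace=io --tracedev-write-mode=overwrite -N <node>
  # ... reproduce / wait while the problem is visible ...
  mmtracectl --stop -N <node>          # stops tracing and formats the trace into trcrpt files
  awk -f /usr/lpp/mmfs/samples/debugtools/trsum.awk details=0 /tmp/mmfs/trcrpt.*   # decompress first if gzipped
  mmtracectl --off -N <node>           # switch tracing off again when done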

However, if you say "no major waiters": how many waiters did you see at any
time? What kind of waiters were the oldest, and how long had they waited?
It could indeed well be that some job is just creating a killer workload.
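
To answer the waiter questions next time, a simple loop on the affected node
while the problem is visible can help (a sketch):

  # sample the current mmfsd waiters every 2 seconds and keep a log
  while true; do
      date
      mmdiag --waiters | head -25
      sleep 2
  done | tee /tmp/waiters.$(hostname).log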

The very short cycle time of the trace points, OTOH, to high activity; OTOH, the
trace file setting appears quite low (trace=io doesn't collect much trace
information, just basic IO stuff).
If I might ask: what version of GPFS are you running?

Mit freundlichen Grüßen / Kind regards

Dr. Uwe Falke
IT Specialist
Hybrid Cloud Infrastructure / Technology Consulting & Implementation Services
+49 175 575 2877 Mobile
Rathausstr. 7, 09111 Chemnitz, Germany
uwefa...@de.ibm.com

IBM Services

IBM Data Privacy Statement

IBM Deutschland Business & Technology Services GmbH
Geschäftsführung: Sven Schooss, Stefan Hierl Sitz der Gesellschaft: Ehningen
Registergericht: Amtsgericht Stuttgart, HRB 17122




From:   "Czauz, Kamil" 
To: gpfsug main discussion list 
Date:   13/11/2020 03:33
Subject:        [EXTERNAL] Re: [gpfsug-discuss] Poor client performance
        with high cpu usage of mmfsd process
Sent by:        gpfsug-discuss-boun...@spectrumscale.org



Hi Uwe -

I hit the issue again today, no major waiters, and nothing useful from the 
iohist report. Nothing interesting in the logs either.

I was able to get a trace today while the issue was happening. I took two traces
a few minutes apart.

The beginning of the traces look something like this:
Overwrite trace parameters:
buffer size: 134217728
64 kernel trace streams, indices 0-63 (selected by low bits of processor ID)
128 daemon trace streams, indices 64-191 (selected by low bits of thread ID)
Interval for calibrating clock rate was 100.019054 seconds and 260049296314 cycles
Measured cycle count update rate to be 257559 per second <-- using this value
OS reported cycle count update rate as 259000 per second
Trace milestones:
kernel trace enabled Thu Nov 12 20:56:40.11408 2020 (TOD 1605232600.114080, cycles 20701293141385220)
daemon trace enabled Thu Nov 12 20:56:40.24743 2020 (TOD 1605232600.247430, cycles 20701293488095152)
all streams included Thu Nov 12 20:58:19.950515266 2020 (TOD 1605232699.950515, cycles 20701552715873212) <-- useful part of trace extends from here
trace quiesced Thu Nov 12 20:58:20.133134000 2020 (TOD 1605232700.000133, cycles 20701553190681534) <-- to here
Approximate number of times the trace buffer was filled: 553.529


Here is the output of
trsum.awk details=0
I'm not quite sure what to make of it; can you help me decipher it? The
'lookup' operations are taking a hell of a long time - what does that mean?

Capture 1

Unfinished operations:

21234 * lookup ** 0.165851604
290020 * lookup ** 0.151032241
302757 * lookup ** 0.168723402
301677 * lookup ** 0.070016530
230983 * lookup ** 0.127699082
21233 * lookup ** 0.060357257
309046 * lookup ** 0.157124551
301643 * lookup ** 0.165543982
304042 * lookup ** 0.172513838
167794 * lookup ** 0.056056815
189680 * lookup ** 0.062022237
362216 * lookup ** 0.072063619
406314 * lookup ** 0.114121838
167776 * lookup ** 0.114899642
303016 * lookup ** 0.144491120
290021 * lookup ** 0.142311603
167762 * lookup ** 0.144240366
248530 * lookup ** 0.168728131
0 0.0 * Unfinished IO: buffer/disk 30018014000 14:48493092752^\
0 0.0 * Unfinished IO: buffer/disk 30018014000 14:48493092752^\FFFE
0 0.0 * Unfinished IO: buffer/disk 206B000 2:6058336^\
0 0

[gpfsug-discuss] Poor client performance with high cpu usage of mmfsd process

2020-11-11 Thread Czauz, Kamil
We regularly run into performance issues on our clients where the client seems
to hang when accessing any GPFS mount; even something simple like an ls can take
a few minutes to complete. This affects every GPFS mount on the client, but
other clients are working just fine. Also, the mmfsd process at this point is
spinning at something like 300-500% CPU.

The only way I have found to solve this is by killing processes that may be
doing heavy I/O to the GPFS mounts - but this is more of an art than a science.
I often end up killing many processes before finding the offending one.

My question is really about finding the offending process more easily. Is there
something similar to iotop, or a trace I can enable, that can tell me what
files/processes are being heavily used by the mmfsd process on the client?
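
(For context, a rough way to approximate an iotop-style view on a client is to
combine standard Linux tools with mmdiag; a sketch, assuming mmfsd is the
locally running GPFS daemon process:)

  top -H -b -n 1 -p $(pidof mmfsd) | head -40   # which mmfsd threads are burning CPU
  mmdiag --iohist | tail -50                    # recent I/O history as seen by GPFS on this node
  mmdiag --waiters | head -25                   # current waiters and their wait times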

-Kamil
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss