You'll find the size of the log in your file system configuration: mmlsfs <dev> -L
Since release 4.x you can change it, but you need to re-mount the FS on every client for the change to take effect.
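For example (a sketch; "gpfs0" is a placeholder device name, and using mmchfs -L to resize the log on 4.x is my assumption here, please check the man pages for your release):

  mmlsfs gpfs0 -L          # show the internal log file size
  mmchfs gpfs0 -L 32M      # change it; takes effect only after re-mount on each client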

When a client initiates writes/changes to GPFS, it needs to record those changes in the log. When the log reaches a certain fill level, GPFS triggers so-called logWrapThreads to write content to disk and thus free up space.
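You can watch this happening on a node with mmdiag, e.g.:

  mmdiag --waiters     # current waiters (this is where SGExceptionLogBufferFullThread shows up)
  mmdiag --iohist      # recent I/O history, including the logData/logWrap I/Os and their times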

With your given numbers, double-digit [ms] waiter times, your FS probably gets slowed down, and something looks suspect with the storage, because log I/Os are rather small and should not take that long.

To give you an example from a healthy environment: the I/O times are so small that you usually don't see waiters for this at all:

I/O start time RW    Buf type disk:sectorNum     nSec  time ms      tag1      tag2           Disk UID typ      NSD node context   thread
--------------- -- ----------- ----------------- -----  ------- --------- --------- ------------------ --- --------------- --------- ----------

06:23:32.358851  W     logData    2:524306424        8    0.439         0         0  C0A70D08:57CF40D1 cli   192.167.20.17 LogData   SGExceptionLogBufferFullThread
06:23:33.576367  W     logData    1:524257280        8    0.646         0         0  C0A70D08:57CF40D0 cli   192.167.20.16 LogData   SGExceptionLogBufferFullThread

06:23:32.212426  W   iallocSeg    1:524490048       64    0.733         2       245  C0A70D08:57CF40D0 cli   192.167.20.16 Logwrap   LogWrapHelperThread
06:23:32.212412  W     logWrap    2:524552192        8    0.755         0    179200  C0A70D08:57CF40D1 cli   192.167.20.17 Logwrap   LogWrapHelperThread
06:23:32.212432  W     logWrap    2:525162760        8    0.737         0    125473  C0A70D08:57CF40D1 cli   192.167.20.17 Logwrap   LogWrapHelperThread
06:23:32.212416  W   iallocSeg    2:524488384       64    0.763         2       347  C0A70D08:57CF40D1 cli   192.167.20.17 Logwrap   LogWrapHelperThread
06:23:32.212414  W     logWrap    2:525266944        8    2.160         0    177664  C0A70D08:57CF40D1 cli   192.167.20.17 Logwrap   LogWrapHelperThread



Hope this helps.


Mit freundlichen Grüßen / Kind regards

 
Olaf Weiser

EMEA Storage Competence Center Mainz, Germany / IBM Systems, Storage Platform
-------------------------------------------------------------------------------------------------------------------------------------------
IBM Deutschland
IBM Allee 1
71139 Ehningen
Phone: +49-170-579-44-66
E-Mail: olaf.wei...@de.ibm.com




From:        Aaron Knister <aaron.s.knis...@nasa.gov>
To:        gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Date:        10/15/2016 07:23 AM
Subject:        [gpfsug-discuss] SGExceptionLogBufferFullThread waiter
Sent by:        gpfsug-discuss-boun...@spectrumscale.org




I've got a node that's got some curious waiters on it (see below). Could
someone explain what the "SGExceptionLogBufferFullThread" waiter means?

Thanks!

-Aaron

=== mmdiag: waiters ===
0x7FFFF040D600 waiting 0.038822715 seconds,
SGExceptionLogBufferFullThread: on ThCond 0x7FFFDBB07628
(0x7FFFDBB07628) (parallelWaitCond), reason 'wait for parallel write'
for NSD I/O completion on node 10.1.53.5 <c0n20>
0x7FFFE83F3D60 waiting 0.039629116 seconds, CleanBufferThread: on ThCond
0x17B1488 (0x17B1488) (MsgRecordCondvar), reason 'RPC wait' for NSD I/O
completion on node 10.1.53.7 <c0n22>
0x7FFFE8373A90 waiting 0.038921480 seconds, CleanBufferThread: on ThCond
0x7FFFCD2B4E30 (0x7FFFCD2B4E30) (LogFileBufferDescriptorCondvar), reason
'force wait on force active buffer write'
0x42CD9B0 waiting 0.028227004 seconds, CleanBufferThread: on ThCond
0x7FFFCD2B4E30 (0x7FFFCD2B4E30) (LogFileBufferDescriptorCondvar), reason
'force wait for buffer write to complete'
0x7FFFE0F0EAD0 waiting 0.027864343 seconds, CleanBufferThread: on ThCond
0x7FFFDC0EEA88 (0x7FFFDC0EEA88) (MsgRecordCondvar), reason 'RPC wait'
for NSD I/O completion on node 10.1.53.7 <c0n22>
0x1575560 waiting 0.028045975 seconds, RemoveHandlerThread: on ThCond
0x18020CE4E08 (0xFFFFC90020CE4E08) (LkObjCondvar), reason 'waiting for
LX lock'
0x1570560 waiting 0.038724949 seconds, CreateHandlerThread: on ThCond
0x18020CE50A0 (0xFFFFC90020CE50A0) (LkObjCondvar), reason 'waiting for
LX lock'
0x1563D60 waiting 0.073919918 seconds, RemoveHandlerThread: on ThCond
0x180235F6440 (0xFFFFC900235F6440) (LkObjCondvar), reason 'waiting for
LX lock'
0x1561560 waiting 0.054854513 seconds, RemoveHandlerThread: on ThCond
0x1802292D200 (0xFFFFC9002292D200) (LkObjCondvar), reason 'waiting for
LX lock'
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss



