Re: [gpfsug-discuss] Hanging file-systems

2018-11-27 Thread Simon Thompson
It was an NSD server … we’d already shut down all the clients in the remote clusters!

And Tomer has already agreed to do a talk on memory (but I’m still looking for a user talk if anyone is interested!)

Simon

From: gpfsug-discuss-boun...@spectrumscale.org on behalf of "oeh...@gmail.com"
Reply-To: "gpfsug-discuss@spectrumscale.org"
Date: Tuesday, 27 November 2018 at 20:44
To: "gpfsug-discuss@spectrumscale.org"
Subject: Re: [gpfsug-discuss] Hanging file-systems

and i already talked about NUMA stuff at the CIUK usergroup meeting, i won't volunteer for a 2nd advanced topic :-D

On Tue, Nov 27, 2018 at 12:43 PM Sven Oehme <oeh...@gmail.com> wrote:
was the node you rebooted a client or a server that was running kswapd at 100%?

sven

[gpfsug-discuss] Introduction

2018-11-27 Thread Constance M Rice
Hello,

I am a new member here. I work for IBM in the Washington Systems Center 
supporting Spectrum Scale and ESS across North America. I live in 
Leesburg, Virginia, USA, northwest of Washington, DC.

  Connie Rice
  Storage Specialist
  Washington Systems Center
  Mobile: 202-821-6747
  E-mail: constance.r...@us.ibm.com






Re: [gpfsug-discuss] Hanging file-systems

2018-11-27 Thread Sven Oehme
and i already talked about NUMA stuff at the CIUK usergroup meeting, i won't
volunteer for a 2nd advanced topic :-D




Re: [gpfsug-discuss] Hanging file-systems

2018-11-27 Thread Sven Oehme
was the node you rebooted a client or a server that was running kswapd at 100%?

sven




Re: [gpfsug-discuss] Hanging file-systems

2018-11-27 Thread Simon Thompson
The NSD nodes were running 5.0.1-2 (though we’re just now rolling out 5.0.2-1, I 
think).

So is this memory pressure on the NSD nodes then? I thought it was documented 
somewhere that GPFS won’t use more than 50% of the host memory.

And actually if you look at the values for maxStatCache and maxFilesToCache, 
the memory footprint is quite small.
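
For reference, a quick way to compare those configured values against what the daemon actually reports on a node is something like the following (standard Scale commands; the exact output layout differs between releases):

# configured cache limits and pagepool for this node
mmlsconfig maxFilesToCache
mmlsconfig maxStatCache
mmlsconfig pagepool
# what mmfsd itself reports it is currently using
mmdiag --memory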

Sure, on these NSD servers we had a pretty big pagepool (which we’ve since dropped 
somewhat), but there still should have been quite a lot of free memory on the 
nodes …

If only someone was going to do a talk in December at the CIUK SSUG on memory 
usage …

Simon



Re: [gpfsug-discuss] Hanging file-systems

2018-11-27 Thread Simon Thompson
Yes, but we’d upgraded all our HPC client nodes to 5.0.2-1 last week as well, 
when this first happened …

Unless it’s necessary to upgrade the NSD servers as well for this?

Simon



Re: [gpfsug-discuss] Hanging file-systems

2018-11-27 Thread Tomer Perry
"paging to disk" sometimes means mmap as well - there were several issues 
around that recently as well.


Regards,

Tomer Perry
Scalable I/O Development (Spectrum Scale)
email: t...@il.ibm.com
1 Azrieli Center, Tel Aviv 67021, Israel
Global Tel: +1 720 3422758
Israel Tel: +972 3 9188625
Mobile: +972 52 2554625






Re: [gpfsug-discuss] Hanging file-systems

2018-11-27 Thread Dwayne.Hart
Hi Simon,

Was there a reason behind swap being disabled?

Best,
Dwayne
—
Dwayne Hart | Systems Administrator IV

CHIA, Faculty of Medicine
Memorial University of Newfoundland
300 Prince Philip Drive
St. John’s, Newfoundland | A1B 3V6
Craig L Dobbin Building | 4M409
T 709 864 6631



Re: [gpfsug-discuss] Hanging file-systems

2018-11-27 Thread Skylar Thompson
Despite its name, kswapd isn't directly involved in paging to disk; it's
the kernel process responsible for finding committed memory that can be
reclaimed for use (either immediately, or possibly by flushing dirty pages
to disk). If kswapd is using a lot of CPU, it's a sign that the kernel is
spending a lot of time finding free pages to allocate to processes.
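
If you want to see that reclaim activity directly, the kernel's own counters are usually enough (standard Linux interfaces; exact counter names vary a little between kernel versions):

# scan vs. steal counters: pgscan_* growing much faster than pgsteal_*
# means the kernel is scanning hard for very little reclaimed memory
grep -E 'pgscan|pgsteal' /proc/vmstat
# per-second view of scanning/freeing alongside free memory
vmstat 1 5
# which slab caches (dentry, *_inode_cache, ...) are holding the memory
slabtop -o | head -15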



-- 
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: [gpfsug-discuss] Hanging file-systems

2018-11-27 Thread Sven Oehme
Hi,

now i need to swap back in a lot of information about GPFS that i tried to swap
out :-)

i bet kswapd is not doing what you think the name suggests here, which is
handling swap space. i claim the kswapd thread is trying to throw dentries out
of the cache, and what it actually tries to get rid of are entries for
directories very high up in the tree which GPFS still has a refcount on, so it
can't free them. when it does this there is a single thread (unfortunately this
was never implemented with multiple threads) walking down the tree to find some
entries to steal; if it can't find any it goes to the next, and the next, etc.,
and on a busy system it can take forever to free anything up. there have been
multiple fixes in this area in 5.0.1.x and 5.0.2 which i pushed for in the weeks
before i left IBM. you never see this in a trace with default trace levels,
which is why nobody would have ever suspected this; you need to set special
trace levels to even see it.

i don't know the exact version the changes went into, but it was somewhere in
the 5.0.1.x timeframe. the change was separating the cache list to prefer
stealing files before directories, and also keeping a minimum percentage of
directories in the cache (10% by default) before it would ever try to get rid
of a directory. it also tries to keep a list of free entries at all times
(meaning it cleans them proactively) and also allows going over the hard limit
instead of just blocking as in previous versions. so i assume you run a version
prior to 5.0.1.x, and what you see is kswapd desperately trying to get rid of
entries, but it can't find one because it's already at the limit, so it blocks
and doesn't allow a new entry to be created or promoted from the statcache.

again, all this is without source code access and is speculation on my part
based on experience :-)

what version are you running, and can you also share 'mmdiag --stats' output from that node?

sven








Re: [gpfsug-discuss] mmfsck output

2018-11-27 Thread IBM Spectrum Scale
This means that the files with the inode numbers shown below (38422 and 281057) 
are orphan files (i.e. files not referenced by any directory/folder), and they 
will be moved to the lost+found folder of the fileset owning these files by an 
mmfsck repair.
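
For reference, a minimal repair sequence could look like the following (assuming the file system device is named fs0; an offline mmfsck requires the file system to be unmounted on all nodes, and -n only reports while -y repairs):

mmumount fs0 -a   # unmount everywhere
mmfsck fs0 -n     # report-only pass
mmfsck fs0 -y     # repair pass; orphans are reattached under lost+found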

Regards, The Spectrum Scale (GPFS) team

--
If you feel that your question can benefit other users of Spectrum Scale 
(GPFS), then please post it to the public IBM developerWorks Forum at 
https://www.ibm.com/developerworks/community/forums/html/forum?id=----0479
. 

If your query concerns a potential software error in Spectrum Scale (GPFS) 
and you have an IBM software maintenance contract please contact 
1-800-237-5511 in the United States or your local IBM Service Center in 
other countries. 

The forum is informally monitored as time permits and should not be used 
for priority messages to the Spectrum Scale (GPFS) team.



From:    Kenneth Waegeman 
To:      gpfsug main discussion list 
Date:    11/26/2018 10:10 PM
Subject: [gpfsug-discuss] mmfsck output
Sent by: gpfsug-discuss-boun...@spectrumscale.org



Hi all,

We had some leftover files with IO errors on a GPFS FS, so we ran an mmfsck.

Does someone know what these mmfsck errors mean:

Error in inode 38422 snap 0: has nlink field as 1

Error in inode 281057 snap 0: is unreferenced
  Attach inode to lost+found of fileset root filesetId 0? no



Thanks!

Kenneth









Re: [gpfsug-discuss] Hanging file-systems

2018-11-27 Thread Simon Thompson
Thanks Sven …

We found a node with kswapd running 100% (and swap was off)…

Killing that node made access to the FS spring into life.

Simon



Re: [gpfsug-discuss] Hanging file-systems

2018-11-27 Thread Sven Oehme
if this happens you should check a couple of things:

1. are you under memory pressure, or even worse, have you started swapping?
2. is there any core running at ~0% idle - run top, press 1 and check the
idle column.
3. is there any single thread running at ~100% - run top, press shift-H
and check what the CPU % shows for the top 5 processes.

if you want to go the extra mile, you could run perf top -p $PID_OF_MMFSD
and check what the top cpu consumers are.
confirming and providing data for any of the above being true could be the
missing piece why nobody was able to find it, as this is stuff that
unfortunately nobody ever looks at. even a trace won't help if any of the above
is true, as all you see is that the system behaves correctly according to the
trace - it just doesn't appear busy.
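
a rough way to collect all of the above in one pass on a suspect node (plain Linux tools; mpstat needs the sysstat package, and the GPFS daemon process is assumed to be named mmfsd):

# 1. memory pressure / swap activity
free -m; vmstat 1 3
# 2. any core sitting at ~0% idle
mpstat -P ALL 1 3
# 3. hottest individual threads on the box
top -b -H -n 1 | head -20
# 4. where mmfsd itself burns cpu (needs perf installed)
perf top -p "$(pgrep -o mmfsd)"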

Sven







Re: [gpfsug-discuss] Hanging file-systems

2018-11-27 Thread Oesterlin, Robert
I have seen something like this in the past, and I have resorted to a cluster 
restart as well.  :-( IBM and I could never really track it down, because I 
could not get a dump at the time of occurrence. However, you might take a look 
at your NSD servers, one at a time. As I recall, we thought it was a stuck 
thread on one of the NSD servers, and when we restarted the “right” one it 
cleared the block.

The other thing I’ve done in the past to isolate problems like this (since this 
is related to tokens) is to look at the “token revokes” on each node, looking 
for ones that are sticking around for a long time. I tossed together a quick 
script and ran it via mmdsh on all the nodes. Not pretty, but it got the job 
done. Run this a few times and see if any of the revokes are sticking around 
for a long time:

#!/bin/sh
# Grab any pending token revoke requests from the local token manager dump.
rm -f /tmp/revokelist
/usr/lpp/mmfs/bin/mmfsadm dump tokenmgr | grep -A 2 'revokeReq list' > /tmp/revokelist 2> /dev/null
if [ $? -eq 0 ]; then
  # Revokes are pending: dump the comms layer once, then print the most
  # recent tscomm line for each message id referenced by a revoke request.
  /usr/lpp/mmfs/bin/mmfsadm dump tscomm > /tmp/tscomm.out
  for n in $(grep msgHdr /tmp/revokelist | awk '{print $5}'); do
    grep "$n" /tmp/tscomm.out | tail -1
  done
  rm -f /tmp/tscomm.out
fi
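
To run it everywhere at once, something like this works, assuming the script has been copied to the same path on every node (the /root/revokes.sh path here is just an example):

mmdsh -N all /root/revokes.sh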


Bob Oesterlin
Sr Principal Storage Engineer, Nuance





[gpfsug-discuss] Hanging file-systems

2018-11-27 Thread Simon Thompson
I have a file-system which keeps hanging over the past few weeks. Right now, 
it’s offline and has taken a bunch of services out with it.

(I have a ticket with IBM open about this as well)

We see for example:
Waiting 305.0391 sec since 15:17:02, monitored, thread 24885 SharedHashTabFetchHandlerThread: on ThCond 0x7FE3B408 (MsgRecordCondvar), reason 'RPC wait' for tmMsgTellAcquire1 on node 10.10.12.42

and on that node:
Waiting 292.4581 sec since 15:17:22, monitored, thread 20368 SharedHashTabFetchHandlerThread: on ThCond 0x7F3C29297198 (TokenCondvar), reason 'wait for SubToken to become stable'

On this node, if you dump tscomm, you see entries like:
Pending messages:
  msg_id 376617, service 13.1, msg_type 20 'tmMsgTellAcquire1', n_dest 1, n_pending 1
  this 0x7F3CD800B930, n_xhold 1, cl 0, cbFn 0x0, age 303 sec
sent by 'SharedHashTabFetchHandlerThread' (0x7F3DD800A6C0)
dest   status pending   , err 0, reply len 0 by TCP connection

c0n9 is itself.
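
The same waiter information can be pulled from every node at once with something like this (assuming mmdsh works across the cluster; the sort field may need tweaking for your output format):

mmdsh -N all /usr/lpp/mmfs/bin/mmdiag --waiters | sort -k3 -rn | head -20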

This morning when this happened, the only way to get the FS back online was to 
shut down the entire cluster.

Any pointers for next place to look/how to fix?

Simon