and I already talked about NUMA stuff at the CIUK user group meeting, I won't volunteer for a 2nd advanced topic :-D
On Tue, Nov 27, 2018 at 12:43 PM Sven Oehme <[email protected]> wrote:

> was the node you rebooted a client, or a server that was running kswapd at 100%?
>
> sven
>
> On Tue, Nov 27, 2018 at 12:09 PM Simon Thompson <[email protected]> wrote:
>
>> The NSD nodes were running 5.0.1-2 (though we are just now rolling out 5.0.2-1, I think).
>>
>> So is this memory pressure on the NSD nodes then? I thought it was documented somewhere that GPFS won't use more than 50% of the host memory.
>>
>> And actually, if you look at the values for maxStatCache and maxFilesToCache, the memory footprint is quite small.
>>
>> Sure, on these NSD servers we had a pretty big pagepool (which we have since dropped somewhat), but there still should have been quite a lot of memory space on the nodes …
>>
>> If only someone was going to do a talk in December at the CIUK SSUG on memory usage …
>>
>> Simon
>>
>> *From: *<[email protected]> on behalf of "[email protected]" <[email protected]>
>> *Reply-To: *"[email protected]" <[email protected]>
>> *Date: *Tuesday, 27 November 2018 at 18:19
>> *To: *"[email protected]" <[email protected]>
>> *Subject: *Re: [gpfsug-discuss] Hanging file-systems
>>
>> Hi,
>>
>> now I need to swap back in a lot of information about GPFS that I tried to swap out :-)
>>
>> I bet kswapd is not doing anything you would think the name suggests here, which is handling swap space. I claim the kswapd thread is trying to throw dentries out of the cache, and what it actually tries to get rid of are entries for directories very high up in the tree which GPFS still holds a refcount on, so it can't free them. When it does this, there is a single thread (unfortunately this was never implemented with multiple threads) walking down the tree to find some entries to steal; if it can't find any, it goes to the next, and the next, etc., and on a busy system it can take forever to free anything up. There have been multiple fixes in this area in 5.0.1.x and 5.0.2 which I pushed for in the weeks before I left IBM. You never see this in a trace with default trace levels, which is why nobody would ever have suspected it; you need to set special trace levels to even see it.
>>
>> I don't know the exact version the changes went into, but it was somewhere in the 5.0.1.x timeframe. The change was to separate the cache lists so that file entries are stolen before directories, and to keep a minimum percentage of directories in the cache (10% by default) before it would ever try to get rid of a directory. It also tries to keep a list of free entries at all times (that is, it cleans them proactively), and it allows going over the hard limit instead of just blocking as in previous versions. So I assume you are running a version prior to 5.0.1.x, and what you see is kswapd desperately trying to get rid of entries but unable to find one; it is already at the limit, so it blocks and doesn't allow a new entry to be created or promoted from the stat cache.
>>
>> Again, all this is without source code access and is speculation on my part based on experience :-)
>>
>> What version are you running? Also, please share mmdiag --stats from that node.
>>
>> sven
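Purely to illustrate the steal policy sketched above (separate lists for file and directory entries, files stolen first, a minimum fraction of directories kept in cache, a proactively maintained pool of free entries, and admission allowed to briefly overshoot the hard limit instead of blocking), here is a minimal toy model in Python. It is not GPFS code, and every name, default and constant in it is hypothetical:

    from collections import deque

    class CacheStealPolicy:
        """Toy model of the steal policy described above (not GPFS code)."""

        def __init__(self, hard_limit, dir_floor=0.10, free_target=64):
            self.hard_limit = hard_limit      # nominal cap on cached entries
            self.dir_floor = dir_floor        # keep at least this fraction of directories
            self.free_target = free_target    # entries to keep pre-freed at all times
            self.files = deque()              # LRU list of (name, refcount) for files
            self.dirs = deque()               # LRU list of (name, refcount) for directories
            self.free_pool = []

        def _steal_from(self, lru):
            # Walk from the cold end; skip entries that are still referenced.
            for _ in range(len(lru)):
                name, refcount = lru.popleft()
                if refcount == 0:
                    return name
                lru.append((name, refcount))  # still in use: rotate and keep walking
            return None

        def steal_one(self):
            # Prefer file entries; touch directories only above the configured floor.
            victim = self._steal_from(self.files)
            total = len(self.files) + len(self.dirs)
            if victim is None and len(self.dirs) > self.dir_floor * max(total, 1):
                victim = self._steal_from(self.dirs)
            return victim

        def replenish(self):
            # Proactive cleaning: top up the free pool in the background so that
            # creating or promoting an entry never has to wait at the hard limit.
            while len(self.free_pool) < self.free_target:
                victim = self.steal_one()
                if victim is None:
                    break
                self.free_pool.append(victim)

        def admit(self, name, is_dir=False, refcount=1):
            # Unlike the older behaviour, admit even if the total briefly exceeds
            # hard_limit; the background replenish() brings it back down later.
            if self.free_pool:
                self.free_pool.pop()
            (self.dirs if is_dir else self.files).append((name, refcount))

The point of the two lists and the directory floor is that a single stealer thread no longer has to walk past a long run of still-referenced directory entries before it finds something it is allowed to free.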
>> On Tue, Nov 27, 2018 at 9:54 AM Simon Thompson <[email protected]> wrote:
>>
>> Thanks Sven …
>>
>> We found a node with kswapd running at 100% (and swap was off) …
>>
>> Killing that node made access to the FS spring into life.
>>
>> Simon
>>
>> *From: *<[email protected]> on behalf of "[email protected]" <[email protected]>
>> *Reply-To: *"[email protected]" <[email protected]>
>> *Date: *Tuesday, 27 November 2018 at 16:14
>> *To: *"[email protected]" <[email protected]>
>> *Subject: *Re: [gpfsug-discuss] Hanging file-systems
>>
>> 1. Are you under memory pressure, or even worse, have you started swapping?
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
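A quick way to spot the symptom reported in the thread (a kswapd thread pinned at 100% CPU) without a full monitoring stack is to sample the cumulative CPU time of the kswapd* kernel threads from /proc twice and compute the utilisation over the interval. The sketch below is my own addition, not something posted in the thread; the script name, interval and output format are arbitrary:

    #!/usr/bin/env python3
    """Report CPU usage of kswapd* kernel threads by sampling /proc twice."""
    import os
    import time

    CLK_TCK = os.sysconf("SC_CLK_TCK")   # clock ticks per second

    def kswapd_cpu_ticks():
        """Return {pid: cumulative utime+stime ticks} for all kswapd* tasks."""
        ticks = {}
        for pid in filter(str.isdigit, os.listdir("/proc")):
            try:
                with open(f"/proc/{pid}/comm") as f:
                    if not f.read().strip().startswith("kswapd"):
                        continue
                with open(f"/proc/{pid}/stat") as f:
                    # Split after the "(comm)" field; utime and stime are then
                    # at indices 11 and 12 of the remaining fields.
                    fields = f.read().rsplit(")", 1)[1].split()
                ticks[int(pid)] = int(fields[11]) + int(fields[12])
            except (FileNotFoundError, ProcessLookupError):
                continue                  # process went away between listing and reading
        return ticks

    def main(interval=5.0):
        before = kswapd_cpu_ticks()
        time.sleep(interval)
        after = kswapd_cpu_ticks()
        for pid, end in after.items():
            delta = end - before.get(pid, end)
            pct = 100.0 * delta / (CLK_TCK * interval)
            print(f"kswapd pid {pid}: {pct:.1f}% CPU over the last {interval:.0f}s")

    if __name__ == "__main__":
        main()

If one of these threads sits near 100% for minutes at a time while the node otherwise has free memory, that matches the behaviour described above.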
