Aaron, hold off on the upgrade for a bit. I just got word that while 4.2.1+ most likely addresses the issues I mentioned, there was a defect in the initial release of the parallel log recovery code. I will get the exact minimum version you need to deploy and send another update to this thread.
Sven

On Mon, Jan 23, 2017 at 5:03 AM Sven Oehme <[email protected]> wrote:

> Then I would suggest moving up to at least 4.2.1.LATEST; there is a high
> chance your problem has already been fixed.
>
> I see 2 potential areas that got significant improvements, Token Manager
> recovery and Log Recovery; both are enabled in the latest 4.2.1 code:
>
> 2 significant improvements on Token Recovery in 4.2.1:
>
> 1. Extendible hashing for the token hash table. This speeds up token
> lookup and thereby reduces tcMutex hold times for configurations with a
> large ratio of clients to token servers.
> 2. Cleaning up tokens held by failed nodes was making multiple passes
> over the whole token table, one for each failed node. The loops are now
> inverted, so it makes a single pass over the table and, for each token
> found, does cleanup for all failed nodes.
>
> There are multiple smaller enhancements beyond 4.2.1, but that's the
> minimum level you want to be at. I have seen token recovery of tens of
> minutes, similar to what you described, go down to a minute with this
> change.
>
> On Log Recovery: in the case of an unclean unmount/shutdown of a node,
> prior to 4.2.1 the filesystem manager would recover only one log file at
> a time, using a single thread; with 4.2.1 this is now done with multiple
> threads and multiple log files in parallel.
>
> Sven
>
> On Mon, Jan 23, 2017 at 4:22 AM Aaron Knister <[email protected]>
> wrote:
>
> > It's at 4.1.1.10.
> >
> > On 1/22/17 11:12 PM, Sven Oehme wrote:
> > What version of Scale/GPFS code is this cluster on?
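The loop inversion described in item 2 above can be sketched roughly like this. This is a hypothetical model, not the actual GPFS token-server code (those structures are not public); `token_table` here is just a list of `{"holder": node}` records:

```python
# Hypothetical sketch of the token-cleanup change described above.

def cleanup_per_node(token_table, failed_nodes):
    """Pre-4.2.1 shape: one full pass over the token table per failed node."""
    touches = 0
    for node in failed_nodes:            # outer loop: failed nodes
        for token in token_table:        # full table scan each time
            touches += 1
            if token["holder"] == node:
                token["holder"] = None   # release the dead node's token
    return touches

def cleanup_single_pass(token_table, failed_nodes):
    """4.2.1 shape: loops inverted, one pass handles all failed nodes."""
    failed = set(failed_nodes)           # O(1) membership test
    touches = 0
    for token in token_table:            # single pass over the table
        touches += 1
        if token["holder"] in failed:
            token["holder"] = None
    return touches

table = [{"holder": f"node{i % 10}"} for i in range(100)]
print(cleanup_single_pass(table, ["node3", "node7"]))  # 100 touches, not 200
```

With many failed nodes the work drops from O(failed nodes x tokens) to O(tokens), which lines up with the tens-of-minutes-to-one-minute recovery times Sven mentions.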
> >
> > ------------------------------------------
> > Sven Oehme
> > Scalable Storage Research
> > email: [email protected]
> > Phone: +1 (408) 824-8904
> > IBM Almaden Research Lab
> > ------------------------------------------
> >
> > From: Aaron Knister <[email protected]>
> > To: <[email protected]>
> > Date: 01/23/2017 01:31 AM
> > Subject: Re: [gpfsug-discuss] forcibly panic stripegroup everywhere?
> > Sent by: [email protected]
> >
> > I was afraid someone would ask :)
> >
> > One possible use would be testing how monitoring reacts to and/or
> > corrects stale filesystems.
> >
> > The use in my case is that there's an issue we see quite often where a
> > filesystem won't unmount when trying to shut down GPFS. Linux insists
> > it's still busy despite just about every process except init being
> > killed on the node. It's a real pain because it complicates maintenance,
> > requiring a reboot of some nodes prior to patching, for example.
> >
> > I dug into it, and it appears as though when this happens the
> > filesystem's mnt_count is ridiculously high (300,000+ in one case). I'm
> > trying to debug it further, but I need to actually be able to make the
> > condition happen a few more times to debug it. A stripegroup panic isn't
> > a surefire way, but it's the only way I've found so far to trigger this
> > behavior somewhat on demand.
> >
> > One way I've found to trigger a mass stripegroup panic is to induce what
> > I call a "301 error":
> >
> > loremds07: Sun Jan 22 00:30:03.367 2017: [X] File System ttest unmounted
> > by the system with return code 301 reason code 0
> > loremds07: Sun Jan 22 00:30:03.368 2017: Invalid argument
> >
> > and tickle a known race condition between nodes being expelled from the
> > cluster and a manager node joining the cluster. When this happens it
> > seems to cause a mass stripegroup panic that's over in a few minutes.
> > The trick is that it doesn't happen every time I go through the
> > exercise, and when it does, there's no guarantee the filesystem that
> > panics is the one in use. If it's not an fs in use, then it doesn't help
> > me reproduce the error condition. I was trying to use the "mmfsadm test
> > panic" command to take a more direct approach.
> >
> > Hope that helps shed some light.
> >
> > -Aaron
> >
> > On 1/22/17 8:16 PM, Andrew Beattie wrote:
> >> Out of curiosity -- why would you want to?
> >>
> >> Andrew Beattie
> >> Software Defined Storage - IT Specialist
> >> Phone: 614-2133-7927
> >> E-mail: [email protected]
> >>
> >> ----- Original message -----
> >> From: Aaron Knister <[email protected]>
> >> Sent by: [email protected]
> >> To: gpfsug main discussion list <[email protected]>
> >> Subject: [gpfsug-discuss] forcibly panic stripegroup everywhere?
> >> Date: Mon, Jan 23, 2017 11:11 AM
> >>
> >> This is going to sound like a ridiculous request, but is there a way to
> >> cause a filesystem to panic everywhere in one "swell foop"? I'm assuming
> >> the answer will come with an appropriate disclaimer of "don't ever do
> >> this, we don't support it, it might eat your data, summon Cthulhu,
> >> etc.". I swear I've seen the fs manager initiate this type of operation
> >> before.
> >>
> >> I can seem to do it on a per-node basis with "mmfsadm test panic <fs>
> >> <error code>", but if I do that over all 1k nodes in my test cluster at
> >> once, it results in about 45 minutes of almost total deadlock while
> >> each panic is processed by the fs manager.
> >>
> >> -Aaron
> >>
> >> --
> >> Aaron Knister
> >> NASA Center for Climate Simulation (Code 606.2)
> >> Goddard Space Flight Center
> >> (301) 286-2776
> >> _______________________________________________
> >> gpfsug-discuss mailing list
> >> gpfsug-discuss at spectrumscale.org
> >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
> >
> > --
> > Aaron Knister
> > NASA Center for Climate Simulation (Code 606.2)
> > Goddard Space Flight Center
> > (301) 286-2776
>
> --
> Aaron Knister
> NASA Center for Climate Simulation (Code 606.2)
> Goddard Space Flight Center
> (301) 286-2776
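For what it's worth, the 45-minute deadlock Aaron describes suggests pacing the panics rather than firing all 1k at once. Below is a rough, hypothetical driver sketch, dry-run only: it builds the per-node commands instead of executing them. The node names, batch size, and error code 719 are made-up placeholders (the real error code and node list would come from your cluster), and the usual "never on a production cluster" disclaimer applies:

```python
# Hypothetical pacing sketch: batch the per-node "mmfsadm test panic"
# invocations so the fs manager can drain each wave before the next.
# Everything here is a placeholder -- real node names would come from
# the cluster configuration, and 719 is an arbitrary stand-in error code.

NODES = [f"node{i:04d}" for i in range(1000)]

def build_batches(nodes, fs="ttest", err=719, batch_size=50):
    """Return (batch_index, ssh_command) pairs without executing anything."""
    return [
        (i // batch_size, f"ssh {node} mmfsadm test panic {fs} {err}")
        for i, node in enumerate(nodes)
    ]

cmds = build_batches(NODES)
# A real driver would run one batch at a time, sleeping between batches:
#   for idx in range(max(b for b, _ in cmds) + 1):
#       run the commands in batch idx, then time.sleep(...)
print(len(cmds), max(b for b, _ in cmds) + 1)  # 1000 commands in 20 batches
```

Whether batches of 50 are small enough to avoid the deadlock is something only experimentation on the test cluster can answer.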
