We tend to get the maintenance company to downgrade the firmware to match what 
we have on our aging hardware before sending it to us.
I assume this isn’t an option?

Paul Ward
Technical Solutions Infrastructure Architect
Natural History Museum
T: 02079426450
E: p.w...@nhm.ac.uk

From: gpfsug-discuss-boun...@spectrumscale.org On Behalf Of Buterbaugh, Kevin L
Sent: 08 February 2018 16:00
To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Subject: [gpfsug-discuss] mmchdisk suspend / stop

Hi All,

We are in a bit of a difficult situation right now with one of our non-IBM 
hardware vendors (I know, I know, I KNOW - buy IBM hardware! <grin>) and are 
looking for some advice on how to deal with it.

We have a non-IBM FC storage array with dual-“redundant” controllers.  One of 
those controllers is dead and the vendor is sending us a replacement.  However, 
the replacement controller will have mis-matched firmware with the surviving 
controller and - long story short - the vendor says there is no way to resolve 
that without taking the storage array down for firmware upgrades.  Needless to 
say there’s more to that story than what I’ve included here, but I won’t bore 
everyone with unnecessary details.

The storage array has 5 NSDs on it, but fortunately enough they are part of our 
“capacity” pool … i.e. the only way a file lands here is if an mmapplypolicy 
scan moved it there because the *access* time is greater than 90 days.  
Filesystem data replication is set to one.

So … what I was wondering is whether I could use mmchdisk to either suspend or 
(preferably) stop those NSDs, do the firmware upgrade, and then resume the NSDs.  
The problem I see is that suspend doesn’t stop I/O, it only prevents the 
allocation of new blocks … so, in theory, if a user suddenly decided to start 
using a file they hadn’t needed for 3 months then I’ve got a problem.  Stopping 
all I/O to the disks is what I really want to do.  However, according to the 
mmchdisk man page stop cannot be used on a filesystem with replication set to 
one.
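
For illustration only, the suspend / upgrade / resume sequence described above 
might be sketched as follows. This is a minimal sketch, not a tested procedure: 
the file system name (gpfs_fs) and the NSD names (capnsd1 to capnsd5) are 
hypothetical placeholders, and the Python wrapper only builds and prints the 
command lines rather than running them. It relies only on the mmchdisk suspend 
and resume actions and on mmlsdisk for checking disk status. As noted above, 
suspend only prevents new block allocation and does not stop I/O to the disks.

```python
import shlex


def build_maintenance_commands(device, nsds):
    """Return (before, after) lists of argv lists for a suspend/resume window.

    device and nsds are placeholders supplied by the caller; nothing is executed.
    """
    if not nsds:
        raise ValueError("at least one NSD name is required")
    # mmchdisk and mmlsdisk take a semicolon-separated disk list after -d
    disks = ";".join(nsds)
    before = [
        ["mmchdisk", device, "suspend", "-d", disks],  # no new block allocation
        ["mmlsdisk", device, "-d", disks],             # check status afterwards
    ]
    after = [
        ["mmchdisk", device, "resume", "-d", disks],   # allow allocation again
        ["mmlsdisk", device, "-d", disks],
    ]
    return before, after


if __name__ == "__main__":
    # Hypothetical names: substitute the real file system and NSD names.
    nsds = [f"capnsd{i}" for i in range(1, 6)]
    before, after = build_maintenance_commands("gpfs_fs", nsds)
    print("# before the firmware upgrade")
    for argv in before:
        print(shlex.join(argv))
    print("# after the firmware upgrade")
    for argv in after:
        print(shlex.join(argv))
```

Printing the commands instead of executing them keeps the sketch safe to run 
anywhere; the quoting added by shlex.join protects the semicolons in the disk 
list from the shell.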

There are over 250 TB of data on those 5 NSDs, so restriping off of them or 
setting replication to two are not options.

It is very unlikely that anyone would try to access a file on those NSDs during 
the hour or so I’d need to do the firmware upgrades, but how would GPFS itself 
react to those (suspended) disks going away for a while?  I’m thinking I could 
be OK if there was just a way to actually stop them rather than suspend them.  
Any undocumented options to mmchdisk that I’m not aware of???

Are there other options - besides buying IBM hardware - that I am overlooking?  
Thanks...
—
Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and Education
kevin.buterba...@vanderbilt.edu - (615)875-9633



_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
