Re: [gpfsug-discuss] mounts taking longer in 4.2 vs 4.1?

2018-02-08 Thread Bryan Banister
It may be related to this issue with using the root-squash file system option; here are some edited comments from my colleague, who stumbled upon this while chatting with a friend at a CUG: "Something I learned last week: apparently the libmount code from util-linux (used by /bin/mount) will

[gpfsug-discuss] Call for presentations - US Spring 2018 Meeting - Boston, May 16-17th

2018-02-08 Thread Oesterlin, Robert
We’re finalizing the details for the Spring 2018 User Group meeting, and we need your help! If you’re interested in presenting at this meeting (it will be a full 2 days), then contact me and let me know what you’d like to talk about. We’re always looking for presentations on how you are

Re: [gpfsug-discuss] mounts taking longer in 4.2 vs 4.1?

2018-02-08 Thread Aaron Knister
Hi Loic, Thank you for that information! I have two follow-up questions: 1. Are you using CCR? 2. Do you happen to have mmsdrserv disabled in your environment? (e.g. what's the output of "mmlsconfig mmsdrservPort" on your cluster?) -Aaron On Thu, 8 Feb 2018, Loic Tortay wrote: On
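A minimal way to check both points from a cluster node might look like the following; the exact mmlscluster output wording varies by release, so treat the grep as a rough check:

    # Does the cluster use CCR? Newer releases print a "Repository type" line.
    mmlscluster | grep -i "repository type"

    # Show any explicit mmsdrservPort setting (this is the command quoted above).
    mmlsconfig mmsdrservPort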

Re: [gpfsug-discuss] mmchdisk suspend / stop

2018-02-08 Thread Edward Wahl
I'm with Richard on this one. Sounds dubious to me. Even older style stuff could start a new controller in a 'failed' or 'service' state and push firmware back in the 20th century... ;) Ed On Thu, 8 Feb 2018 16:23:33 + "Sobey, Richard A" wrote: > Sorry I

Re: [gpfsug-discuss] hdisk suspend / stop (Buterbaugh, Kevin L)

2018-02-08 Thread Buterbaugh, Kevin L
Hi again all, It sounds like doing the “mmchconfig unmountOnDiskFail=meta -i” suggested by Steve and Bob, followed by using mmchdisk to stop the disks temporarily, is the way we need to go. We will, as an aside, also run an mmapplypolicy first to pull any files users have started accessing again
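For illustration only, a pre-migration pass of the kind described above might be sketched like this; the pool names, file system name, and access-time window are all made up and would depend on how the affected NSDs map to storage pools:

    # Hypothetical policy: migrate recently accessed files off the pool backed
    # by the failing array before its disks are stopped.
    printf '%s\n' \
      "RULE 'pullback' MIGRATE FROM POOL 'failing_pool' TO POOL 'system'" \
      "  WHERE (CURRENT_TIMESTAMP - ACCESS_TIME) < INTERVAL '7' DAYS" \
      > /tmp/pullback.pol

    # Dry run first; drop "-I test" to actually move data.
    mmapplypolicy gpfs0 -P /tmp/pullback.pol -I test -L 2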

Re: [gpfsug-discuss] hdisk suspend / stop (Buterbaugh, Kevin L)

2018-02-08 Thread Bryan Banister
I don't know or care who the hardware vendor is, but they can DEFINITELY ship you a controller with the right firmware! Just demand it, which is what I do and they have basically always complied with the request. There is the risk associated with running even longer with a single point of

Re: [gpfsug-discuss] hdisk suspend / stop (Buterbaugh, Kevin L)

2018-02-08 Thread Steve Xiao
You can change the cluster configuration to unmount the file system when there is an error accessing metadata. This can be done by running the following command: mmchconfig unmountOnDiskFail=meta -i After this configuration change, you should be able to stop all 5 NSDs with mmchdisk stop
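A sketch of that sequence, with a hypothetical file system name (gpfs0) and placeholder NSD names; mmchdisk takes a semicolon-separated disk list:

    # Unmount only on metadata I/O errors; -i applies the change immediately
    # and permanently, without restarting the daemons.
    mmchconfig unmountOnDiskFail=meta -i

    # Stop the five NSDs on the affected array.
    mmchdisk gpfs0 stop -d "nsd1;nsd2;nsd3;nsd4;nsd5"

    # Once the controller work is finished, bring them back.
    mmchdisk gpfs0 start -d "nsd1;nsd2;nsd3;nsd4;nsd5"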

Re: [gpfsug-discuss] Inode scan optimization - (tomasz.wol...@ts.fujitsu.com )

2018-02-08 Thread Marc A Kaplan
Let's give Fujitsu an opportunity to answer with some facts and re-pose their questions. When I first read the complaint, I kinda assumed they were using mmbackup and TSM -- but then I noticed words about some gpfs_XXX APIs. So it looks like this Fujitsu fellow is "rolling his own"... NOT

Re: [gpfsug-discuss] mmchdisk suspend / stop

2018-02-08 Thread valdis . kletnieks
On Thu, 08 Feb 2018 16:25:33 +, "Oesterlin, Robert" said: > unmountOnDiskFail > The unmountOnDiskFail specifies how the GPFS daemon responds when a disk > failure is detected. The valid values of this parameter are yes, no, and meta. > The default value is no. I suspect that the only

Re: [gpfsug-discuss] mmchdisk suspend / stop

2018-02-08 Thread Oesterlin, Robert
Check out “unmountOnDiskFail” config parameter perhaps? https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/bl1adm_tuningguide.htm unmountOnDiskFail The unmountOnDiskFail specifies how the GPFS daemon responds when a disk failure is detected. The valid
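To see how the parameter is currently set on a given cluster (assuming it has been explicitly configured at some point), a quick check is:

    # Valid values are yes, no, and meta; the default is no.
    mmlsconfig unmountOnDiskFail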

Re: [gpfsug-discuss] mmchdisk suspend / stop

2018-02-08 Thread Sobey, Richard A
Sorry I can’t help… the only thing going round and round my head right now is why on earth the existing controller cannot push the required firmware to the new one when it comes online. Never heard of anything else! Feel free to name and shame so I can avoid  Richard From:

[gpfsug-discuss] mmchdisk suspend / stop

2018-02-08 Thread Buterbaugh, Kevin L
Hi All, We are in a bit of a difficult situation right now with one of our non-IBM hardware vendors (I know, I know, I KNOW - buy IBM hardware! ) and are looking for some advice on how to deal with this unfortunate situation. We have a non-IBM FC storage array with dual-“redundant”

Re: [gpfsug-discuss] Inode scan optimization - (tomasz.wol...@ts.fujitsu.com )

2018-02-08 Thread valdis . kletnieks
On Thu, 08 Feb 2018 10:33:13 -0500, "Marc A Kaplan" said: > Please clarify and elaborate When you write "a full backup ... takes > 60 days" - that seems very poor indeed. > BUT you haven't stated how much data is being copied to what kind of > backup media nor how much equipment or what

Re: [gpfsug-discuss] Inode scan optimization - (tomasz.wol...@ts.fujitsu.com )

2018-02-08 Thread Marc A Kaplan
Please clarify and elaborate. When you write "a full backup ... takes 60 days" - that seems very poor indeed. BUT you haven't stated how much data is being copied to what kind of backup media, nor how much equipment or what types you are using... Nor which backup software... We have

Re: [gpfsug-discuss] Inode scan optimization

2018-02-08 Thread Marc A Kaplan
Recall that many years ago we demonstrated a billion files scanned with mmapplypolicy in under 20 minutes... And that was on ordinary (for the time) spinning disks (not SSD!)... Granted, we packed about 1000 files per directory and made some other choices that might not be typical usage. OTOH
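For a rough idea of what such a scan looks like, here is a hedged sketch of a list-only policy run spread across several nodes; the node names, file system name, and shared work directory are placeholders:

    # List-only policy: nothing is moved, mmapplypolicy just enumerates files.
    printf '%s\n' \
      "RULE EXTERNAL LIST 'allfiles' EXEC ''" \
      "RULE 'list-everything' LIST 'allfiles'" \
      > /tmp/listall.pol

    # Run the scan in parallel on three nodes, keeping the generated file
    # lists (-I defer) in a work directory visible to all of them.
    mmapplypolicy gpfs0 -P /tmp/listall.pol \
      -N node1,node2,node3 \
      -g /gpfs0/tmp/policy-work \
      -I defer -L 1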

Re: [gpfsug-discuss] Inode scan optimization

2018-02-08 Thread Frederick Stock
You mention that all the NSDs are metadata and data, but you do not say how many NSDs are defined or the type of storage used, that is, are these on SAS or NL-SAS storage? I'm assuming they are not on SSDs/flash storage. Have you considered moving the metadata to separate NSDs, preferably
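A minimal sketch of what adding dedicated metadata NSDs can look like, with made-up device, server, and file system names; a real change would need careful planning around failure groups, replication, and a follow-up restripe:

    # Stanza for a new metadata-only NSD on flash (all names are placeholders).
    printf '%s\n' \
      "%nsd: nsd=md_nsd1" \
      "  device=/dev/mapper/ssd_lun1" \
      "  servers=nsdserver1,nsdserver2" \
      "  usage=metadataOnly" \
      "  failureGroup=10" \
      "  pool=system" \
      > /tmp/meta_nsd.stanza

    mmcrnsd -F /tmp/meta_nsd.stanza
    mmadddisk gpfs0 -F /tmp/meta_nsd.stanza

    # After the existing NSDs are changed to dataOnly, a restripe (for example
    # mmrestripefs gpfs0 -r) is what actually moves the metadata.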

[gpfsug-discuss] Inode scan optimization

2018-02-08 Thread tomasz.wol...@ts.fujitsu.com
Hello All, A full backup of a 2-billion-inode Spectrum Scale file system on V4.1.1.16 takes 60 days. We are trying to optimize this, and using inode scans seems to help, even when we use a directory scan and the inode scan only to get better stat performance (using