Re: [gpfsug-discuss] immutable folder

2022-02-23 Thread Frederick Stock
Paul, what version of Spectrum Scale are you using?
Fred
___
Fred Stock | Spectrum Scale Development Advocacy | 720-430-8821 | sto...@us.ibm.com
 
 
- Original message -
From: "Paul Ward"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: "gpfsug main discussion list"
Subject: [EXTERNAL] Re: [gpfsug-discuss] immutable folder
Date: Wed, Feb 23, 2022 7:17 AM
Thanks, I couldn’t recreate that test:
 
# mkdir "it/stu'pid name"
mkdir: cannot create directory ‘it/stu'pid name’: No such file or directory
[Removing the / ]
 
# mkdir "itstu'pid name"
 
# mmchattr -i yes itstu\'pid\ name/
itstu'pid name/: Change immutable flag failed: Invalid argument.
Can not set directory to be immutable or appendOnly under current fileset mode!
 
Which begs the question, how do I have an immutable folder!
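For reference, a hedged way to check what is actually set (paths and names below are placeholders): mmlsattr -L shows whether the immutable/appendOnly flags are on the directory itself, and the IAM mode of the owning fileset governs whether they can be changed.

# show the flags on the directory itself
mmlsattr -L "/gpfs/fs1/path/to/Nick Foster's sample"
# show the integrated archive manager (IAM) mode of the fileset that contains it
mmlsfileset fs1 somefileset --iam-mode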
 
Kindest regards,
Paul
 
Paul Ward
TS Infrastructure Architect
Natural History Museum
T: 02079426450
E: p.w...@nhm.ac.uk

 
From: gpfsug-discuss-boun...@spectrumscale.org On Behalf Of Hannappel, Juergen
Sent: 23 February 2022 11:49
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] immutable folder
 
While the apostrophe is evil it's not the problem:
 
[root@it-gti-02 test1]# mkdir "it/stu'pid name"
[root@it-gti-02 test1]# mmchattr -i yes it/stu\'pid\ name
[root@it-gti-02 test1]# mmchattr -i no it/stu\'pid\ name

From: "Paul Ward"
To: "gpfsug main discussion list"
Sent: Wednesday, 23 February, 2022 12:03:37
Subject: Re: [gpfsug-discuss] immutable folder
It's not a fileset, it's just a folder, well, a subfolder…
 
[filesystem/[fileset]/share/data/iac/[user] 2004-2014/Laboratory Impact experiments/LGG shots/Kent LGG/Kent aerogel LGG shots/Lizardite in aerogel/Nick Foster's sample
 
It’s the “Nick Foster's sample” folder I want to delete, but it says it is immutable and I can’t disable that.
 
I suspect it’s the apostrophe confusing things.
 
 
 
 
Kindest regards,
Paul
 
Paul Ward
TS Infrastructure Architect
Natural History Museum
T: 02079426450
E: p.w...@nhm.ac.uk

 
From: gpfsug-discuss-boun...@spectrumscale.org On Behalf Of IBM Spectrum Scale
Sent: 22 February 2022 14:17
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] immutable folder
 
Scale disallows deleting a fileset junction using rmdir, so I suggested mmunlinkfileset.

Regards, The Spectrum Scale (GPFS) team
 


___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] R: Question on changing mode on many files

2021-12-07 Thread Frederick Stock
Yes
Fred
___
Fred Stock | Spectrum Scale Development Advocacy | 720-430-8821 | sto...@us.ibm.com
 
 
- Original message -
From: "Dorigo Alvise (PSI)"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: "gpfsug main discussion list"
Subject: [EXTERNAL] [gpfsug-discuss] R: Question on changing mode on many files
Date: Tue, Dec 7, 2021 9:10 AM
I have 5.0.4 for the moment (planned to be updated next year) and what I see is:
 
[root@sf-dss-1 tmp]# locate mmfind
/usr/lpp/mmfs/samples/ilm/mmfind
/usr/lpp/mmfs/samples/ilm/mmfind.README
/usr/lpp/mmfs/samples/ilm/mmfindUtil_processOutputFile.c
/usr/lpp/mmfs/samples/ilm/mmfindUtil_processOutputFile.sampleMakefile
 
Is that what you are talking about ?
 
Thanks,
 
   Alvise
 
From: gpfsug-discuss-boun...@spectrumscale.org On Behalf Of Frederick Stock
Sent: Tuesday, 7 December 2021 15:02
To: gpfsug-discuss@spectrumscale.org
Cc: gpfsug-discuss@spectrumscale.org
Subject: Re: [gpfsug-discuss] Question on changing mode on many files
 
If you are running on a more recent version of Scale you might want to look at the mmfind command.  It provides a find-like wrapper around the execution of policy rules.
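As a hedged sketch of that route (the sample has to be built first, and the exact set of find predicates supported varies by release, so check mmfind.README before relying on the flags below; the target path and mode are hypothetical):

cd /usr/lpp/mmfs/samples/ilm
make -f mmfindUtil_processOutputFile.sampleMakefile   # builds the helper utility mmfind uses
# list the regular files under the tree, then chmod them in large parallel batches
./mmfind /gpfs/fs1/data -type f > /tmp/files.to.fix
xargs -a /tmp/files.to.fix -d '\n' -n 1000 -P 8 chmod 0640   # assumes no newlines in path names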
Fred
___
Fred Stock | Spectrum Scale Development Advocacy | 720-430-8821 | sto...@us.ibm.com
 
 
- Original message -
From: "Dorigo Alvise (PSI)"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: "'gpfsug main discussion list'"
Subject: [EXTERNAL] [gpfsug-discuss] Question on changing mode on many files
Date: Tue, Dec 7, 2021 8:53 AM
Dear users/developers/support,
I’d like to ask if there is a fast way to manipulate the permission mask of many files (millions).
 
I tried on 900k files, and a recursive chmod (chmod 0### -R path) takes about 1,000 seconds, with roughly 50% CPU usage by the mmfsd daemon.
I tried Perl's built-in chmod function, which can operate on an array of files, and it takes about a third of the time of the previous method, which is already a good result.
 
I've seen that it is possible to run a policy that executes commands, but I would rather avoid running an external command through mmxargs a million times; wouldn't you?
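For reference, a hedged sketch of the policy-list route that avoids per-file execs: write a deferred file list with mmapplypolicy and feed it to chmod in batches (rule, paths and mode are hypothetical, and the ' -- ' field separator assumed below is the usual list-file layout, so check one record before trusting the awk):

cat > /tmp/fixmode.pol <<'EOF'
/* collect every regular file under the target tree whose mode is not the desired one */
RULE EXTERNAL LIST 'needchmod' EXEC ''
RULE 'fixmode' LIST 'needchmod'
  WHERE PATH_NAME LIKE '/gpfs/fs1/data/%' AND MODE LIKE '-%' AND MODE NOT LIKE '-rw-r-----'
EOF
mmapplypolicy /gpfs/fs1/data -P /tmp/fixmode.pol -I defer -f /tmp/fixmode
# records look like "inode gen snapid -- /full/path"; keep the path and chmod in batches
awk -F ' -- ' '{print $2}' /tmp/fixmode.list.needchmod | xargs -d '\n' -n 1000 -P 8 chmod 0640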
 
Does anybody have a suggestion for doing this operation with minimum disruption to the system?
 
Thank you,
 
    Alvise
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Question on changing mode on many files

2021-12-07 Thread Frederick Stock
If you are running on a more recent version of Scale you might want to look at the mmfind command.  It provides a find-like wrapper around the execution of policy rules.
Fred
___
Fred Stock | Spectrum Scale Development Advocacy | 720-430-8821 | sto...@us.ibm.com
 
 
- Original message -
From: "Dorigo Alvise (PSI)"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: "'gpfsug main discussion list'"
Subject: [EXTERNAL] [gpfsug-discuss] Question on changing mode on many files
Date: Tue, Dec 7, 2021 8:53 AM
Dear users/developers/support,
I’d like to ask if there is a fast way to manipulate the permission mask of many files (millions).
 
I tried on 900k files, and a recursive chmod (chmod 0### -R path) takes about 1,000 seconds, with roughly 50% CPU usage by the mmfsd daemon.
I tried Perl's built-in chmod function, which can operate on an array of files, and it takes about a third of the time of the previous method, which is already a good result.
 
I've seen that it is possible to run a policy that executes commands, but I would rather avoid running an external command through mmxargs a million times; wouldn't you?
 
Does anybody have a suggestion for doing this operation with minimum disruption to the system?
 
Thank you,
 
    Alvise
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] mmsysmon exception with pmcollector socket being absent

2021-11-10 Thread Frederick Stock
I am curious to know if you upgraded by manually applying rpms or if you used the Spectrum Scale deployment tool (spectrumscale command) to apply the upgrade?
Fred
___
Fred Stock | Spectrum Scale Development Advocacy | 720-430-8821 | sto...@us.ibm.com
 
 
- Original message -
From: "Ragho Mahalingam"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: "gpfsug main discussion list"
Subject: [EXTERNAL] Re: [gpfsug-discuss] mmsysmon exception with pmcollector socket being absent
Date: Wed, Nov 10, 2021 9:00 AM
Hi Frederick,

In our case the issue started appearing after upgrading from 5.0.4 to 5.1.1. If you've recently upgraded, then the following may be useful.

It turns out that mmsysmon (gpfs-base package) requires the new gpfs.gss.pmcollector (from the zimon packages) to function correctly (the AF_INET -> AF_UNIX switch seems to have happened between 5.0 and 5.1). In our case we had upgraded all the mandatory packages but had not upgraded the optional ones; from what I can tell, the mmsysmonc Python libraries are updated by the pmcollector package.
 
If you're running >5.1, I'd suggest checking the versions of gpfs.gss.* packages installed.  If gpfs.gss.pmcollector isn't installed, you'd definitely need that to make this runaway logging stop.
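For example (assuming an rpm-based install and the standard package/service names shipped with the zimon rpms):

rpm -qa | grep gpfs.gss                      # the zimon packages should match the base GPFS level
systemctl status pmcollector                 # the collector must be running after the upgrade
ls -l /var/run/perfmon/pmcollector.socket    # the unix socket mmsysmon is trying to connect to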
 
Hope that helps!
 
Ragu 

On Wed, Nov 10, 2021 at 5:40 AM Frederik Ferner  wrote:
Hi Ragu,

have you ever received any reply to this or managed to solve it? We are seeing exactly the same error and it's filling up our logs. It seems all the monitoring data is still extracted, so I'm not sure when it started, and so not sure if this is related to any upgrade on our side, but it may have been going on for a while. We only noticed because the log file is now filling up the local log partition.

Kind regards,
Frederik

On 26/08/2021 11:49, Ragho Mahalingam wrote:
> We've been working on setting up mmperfmon; after creating a new configuration with the new collector on the same manager node, mmsysmon keeps throwing exceptions.
>
>   File "/usr/lpp/mmfs/lib/mmsysmon/container/PerfmonController.py", line 123, in _getDataFromZimonSocket
>     sock.connect(SOCKET_PATH)
> FileNotFoundError: [Errno 2] No such file or directory
>
> Tracing this a bit, it appears that SOCKET_PATH is /var/run/perfmon/pmcollector.socket and this unix domain socket is absent, even though pmcollector has started and is running successfully.
>
> Under what scenarios is pmcollector supposed to create this socket?  I don't see any configuration for this in /opt/IBM/zimon/ZIMonCollector.cfg, so I'm assuming the socket is automatically created when pmcollector starts.
>
> Any thoughts on how to debug and resolve this?
>
> Thanks, Ragu

--
Frederik Ferner (he/him)
Senior Computer Systems Administrator (storage)    phone: +44 1235 77 8624
Diamond Light Source Ltd.                          mob:   +44 7917 08 5110
SciComp Help Desk can be reached on x8596
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] alphafold and mmap performance

2021-10-19 Thread Frederick Stock
FYI Scale 4.2.3 went out of support in September 2020.  DDN may still support it but there are no updates/enhancements being made to that code stream.  It is quite old.  Scale 5.0.x goes end of support at the end of April 2022.  Scale 5.1.2 was just released and if possible I suggest you upgrade your entire cluster to the Scale 5.1.x release stream.
 
The change to the prefetchAggressivenessRead variable should help, but performance may still not be as you require.
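If you do try it, a hedged sequence would be (DEFAULT puts the setting back to the shipped default):

mmlsconfig prefetchAggressivenessRead            # note the current value first
mmchconfig prefetchAggressivenessRead=0 -i       # -i applies immediately and persists across restarts
mmchconfig prefetchAggressivenessRead=DEFAULT -i # revert if it does not help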
Fred
___
Fred Stock | Spectrum Scale Development Advocacy | 720-430-8821 | sto...@us.ibm.com
 
 
- Original message -
From: "Stuart Barkley"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug-discuss@spectrumscale.org
Subject: [EXTERNAL] [gpfsug-discuss] alphafold and mmap performance
Date: Tue, Oct 19, 2021 1:25 PM
Over the years there have been several discussions about performance problems with mmap() on GPFS/Spectrum Scale. We are currently having problems with mmap() performance on our systems with the new alphafold protein folding software. Things look similar to previous times we have had mmap() problems.

The software component "hhblits" appears to mmap a large file with genomic data and then does random reads throughout the file. GPFS appears to be doing 4K reads for each block, limiting the performance. The first run takes 20+ hours. Subsequent identical runs complete in just 1-2 hours. After clearing the Linux system cache (echo 3 > /proc/sys/vm/drop_caches) the slow performance returns for the next run.

The GPFS server is 4.2.3-5 running on DDN hardware, CentOS 7.3. The default GPFS client is 4.2.3-22, CentOS 7.9. We have tried a number of things including Spectrum Scale client version 5.0.5-9, which should have Sven's recent mmap performance improvements. Are the recent mmap performance improvements in the client code or the server code?

Only now do I notice a suggestion:

    mmchconfig prefetchAggressivenessRead=0 -i

I did not use this. Would a performance change be expected? Would the pagepool size be involved in this?

Stuart Barkley
--
I've never been lost; I was once bewildered for three days, but never lost!
                                        -- Daniel Boone
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] mmbackup with own policy

2021-06-23 Thread Frederick Stock
The only requirement for your own backup policy is that it finds the files you want to back up and skips those that you do not want to back up.  It is no different than any policy that you would use with the GPFS policy engine.
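As a hedged illustration, starting from the auto-generated rules quoted further down and only extending the WHERE clause (the excluded path is hypothetical; keep the EXTERNAL LIST/EXEC stanza and the SHOW clause exactly as mmbackup generated them):

RULE EXTERNAL LIST 'mmbackup.1.back5_2' EXEC '/net/gpfso/fs1/.mmbackupCfg/BAexecScript.gpfsofs1' OPTS '...'  /* keep as generated */
RULE 'BackupRule' LIST 'mmbackup.1.back5_2' DIRECTORIES_PLUS
     SHOW(...)                                          /* keep as generated */
     WHERE ( /* generated exclusions, unchanged */ )
       AND (MISC_ATTRIBUTES LIKE '%u%')
       AND (PATH_NAME NOT LIKE '/net/gpfso/fs1/scratch/%')   /* your own exclusion */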
Fred
___
Fred Stock | Spectrum Scale Development Advocacy | 720-430-8821 | sto...@us.ibm.com
 
 
- Original message -
From: "T.A. Yeep"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug main discussion list
Subject: [EXTERNAL] Re: [gpfsug-discuss] mmbackup with own policy
Date: Wed, Jun 23, 2021 7:08 AM
Hi Dr. Martin,
 
You can refer to the Administrator Guide > Chapter 30 > Policies for automating file management, or access via the link below. If you downloaded a PDF, it starts with page 487.
https://www.ibm.com/docs/en/spectrum-scale/5.1.0?topic=management-policy-rules
 
There are quite a number of examples in that chapter too, which can help you build a good understanding of how to write one yourself.

On Wed, Jun 23, 2021 at 6:10 PM Ulrich Sibiller  wrote:
Hallo,

mmbackup offers -P to specify your own policy. Unfortunately I cannot seem to find documentation on what that policy has to look like. I mean, if I grab the policy generated automatically by mmbackup it looks like this:

/* Auto-generated GPFS policy rules file
 * Generated on Sat May 29 15:10:46 2021
 */
/* Server rules for backup server 1  ***  back5_2  *** */
RULE EXTERNAL LIST 'mmbackup.1.back5_2' EXEC '/net/gpfso/fs1/.mmbackupCfg/BAexecScript.gpfsofs1' OPTS '"/net/gpfso/fs1/.mmbackupShadow.1.back5_2.filesys.update" "-servername=back5_2" "-auditlogname=/net/gpfso/fs1/mmbackup.audit.gpfsofs1.back5_2" "NONE"'
RULE 'BackupRule' LIST 'mmbackup.1.back5_2' DIRECTORIES_PLUS
     SHOW(VARCHAR(MODIFICATION_TIME) || ' ' || VARCHAR(CHANGE_TIME) || ' ' ||
          VARCHAR(FILE_SIZE) || ' ' || VARCHAR(FILESET_NAME) || ' ' || 'resdnt')
     WHERE
       (
         NOT
         ( (PATH_NAME LIKE '/%/.mmbackup%') OR
           (PATH_NAME LIKE '/%/.mmLockDir' AND MODE LIKE 'd%') OR
           (PATH_NAME LIKE '/%/.mmLockDir/%') OR
           (MODE LIKE 's%')
         )
       )
       AND
         (MISC_ATTRIBUTES LIKE '%u%')
       AND
...

If I want to use my own policy, which parts of all that are required for mmbackup to find the information it needs?

Uli

 --

Best regards 
T.A. Yeep
 
 
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Quick delete of huge tree

2021-04-20 Thread Frederick Stock
Assuming your metadata storage is not already at its limit of throughput you may get improved performance by temporarily increasing the value of maxBackgroundDeletionThreads.  You can check its current value with "mmlsconfig maxBackgroundDeletionThreads" and change it with the command "mmchconfig maxBackgroundDeletionThreads=N -I".  Once the removal process is complete you can reset the value back to its previous value.
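A hedged example of that sequence (the numbers are placeholders; record your own current value before changing it):

mmlsconfig maxBackgroundDeletionThreads          # note the current value
mmchconfig maxBackgroundDeletionThreads=16 -I    # -I takes effect immediately but is not persisted
# ... run the tree removal ...
mmchconfig maxBackgroundDeletionThreads=4 -I     # restore the value noted above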
Fred
___
Fred Stock | Spectrum Scale Development Advocacy | 720-430-8821 | sto...@us.ibm.com
 
 
- Original message -
From: Ulrich Sibiller
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug-discuss@spectrumscale.org
Subject: [EXTERNAL] Re: [gpfsug-discuss] Quick delete of huge tree
Date: Tue, Apr 20, 2021 8:10 AM
On 4/20/21 1:52 PM, Jan-Frode Myklebust wrote:
> A couple of ideas.
>
> The KC recommends adding WEIGHT(DIRECTORY_HASH) to group deletions within a directory. Then maybe also do it as a 2-step process, in the same policy run, where you delete all non-directories first, and then delete the directories in a depth-first order using WEIGHT(Length(PATH_NAME)):
>
> RULE 'delnondir' DELETE
>       WEIGHT(DIRECTORY_HASH)
>       DIRECTORIES_PLUS
>       WHERE PATH_NAME LIKE '/mypath/%' AND NOT MISC_ATTRIBUTES LIKE '%D%'
>
> RULE 'deldir' DELETE
>       DIRECTORIES_PLUS
>       WEIGHT(Length(PATH_NAME))
>       WHERE PATH_NAME LIKE '/mypath/%' AND MISC_ATTRIBUTES LIKE '%D%'

Thanks, I am aware of that but it will not really help with my speed concerns.

Uli
--
Dipl.-Inf. Ulrich Sibiller           science + computing ag
System Administration                Hagellocher Weg 73
Hotline +49 7071 9457 681            72070 Tuebingen, Germany
                                     https://atos.net/de/deutschland/sc
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Policy scan of symbolic links with contents?

2021-03-08 Thread Frederick Stock
Presumably the only feature that would help here is if policy could determine that the end location pointed to by a symbolic link is within the current file system.  I am not aware of any such feature or attribute which policy could check so I think all you can do is run policy to find the symbolic links and then check each link to see if it points into the same file system.  You might find the mmfind command useful for this purpose.  I expect it would eliminate the need to create a policy to find the symbolic links.
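A hedged sketch of that two-step approach, using the sample mmfind built from /usr/lpp/mmfs/samples/ilm (file system paths are hypothetical; it also assumes your mmfind level supports -type l and that no path names contain newlines):

./mmfind /gpfs/fs0 -type l | while IFS= read -r link; do
    case "$(readlink -- "$link")" in
        /fs1/*) printf '%s\n' "$link" ;;   # target is under /fs1: report it
    esac
done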
Fred
___
Fred Stock | Spectrum Scale Development Advocacy | 720-430-8821 | sto...@us.ibm.com
 
 
- Original message -
From: "Oesterlin, Robert"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug main discussion list
Subject: [EXTERNAL] Re: [gpfsug-discuss] Policy scan of symbolic links with contents?
Date: Mon, Mar 8, 2021 10:35 AM
Well - the case here is that the file system has, let’s say, 100M files. Some percentage of these are sym-links to a location that’s not in this file system. I want a report of all these off file system links. However, not all of the sym-links off file system are of interest, just some of them.
 
I can’t say for sure where in the file system they are (and I don’t care).
 
 
Bob Oesterlin
Sr Principal Storage Engineer, Nuance
 
 
From: on behalf of Frederick Stock
Reply-To: gpfsug main discussion list
Date: Monday, March 8, 2021 at 9:29 AM
To: "gpfsug-discuss@spectrumscale.org"
Cc: "gpfsug-discuss@spectrumscale.org"
Subject: [EXTERNAL] Re: [gpfsug-discuss] Policy scan of symbolic links with contents?
 

Could you use a PATH_NAME LIKE clause to limit the location to the files of interest?
Fred
___
Fred Stock | Spectrum Scale Development Advocacy | 720-430-8821 | sto...@us.ibm.com
 
 
- Original message -
From: "Oesterlin, Robert"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug main discussion list
Subject: [EXTERNAL] [gpfsug-discuss] Policy scan of symbolic links with contents?
Date: Mon, Mar 8, 2021 10:12 AM
Looking to craft a policy scan that pulls out symbolic links to a particular destination. For instance:
 
file1.py -> /fs1/patha/pathb/file1.py (I want to include these)
file2.py -> /fs2/patha/pathb/file2.py (exclude these)
 
The easy way would be to pull out all sym-links and just grep for the ones I want but was hoping for a more elegant solution…
 
 
Bob Oesterlin
Sr Principal Storage Engineer, Nuance
 
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Policy scan of symbolic links with contents?

2021-03-08 Thread Frederick Stock
Could you use a PATH_NAME LIKE clause to limit the location to the files of interest?
Fred
___
Fred Stock | Spectrum Scale Development Advocacy | 720-430-8821 | sto...@us.ibm.com
 
 
- Original message -
From: "Oesterlin, Robert"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug main discussion list
Subject: [EXTERNAL] [gpfsug-discuss] Policy scan of symbolic links with contents?
Date: Mon, Mar 8, 2021 10:12 AM
Looking to craft a policy scan that pulls out symbolic links to a particular destination. For instance:
 
file1.py -> /fs1/patha/pathb/file1.py (I want to include these)
file2.py -> /fs2/patha/pathb/file2.py (exclude these)
 
The easy way would be to pull out all sym-links and just grep for the ones I want but was hoping for a more elegant solution…
 
 
Bob Oesterlin
Sr Principal Storage Engineer, Nuance
 
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] TSM errors restoring files with ACL's

2021-03-05 Thread Frederick Stock
I was referring to this flash: https://www.ibm.com/support/pages/node/6381354
 
Spectrum Protect 8.1.11 client has the fix so this should not be an issue for Jonathan.  Probably best to open a help case against Spectrum Protect and begin the investigation there.
Fred
___
Fred Stock | Spectrum Scale Development Advocacy | 720-430-8821 | sto...@us.ibm.com
 
 
- Original message -
From: "Grunenberg, Renar"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: "gpfsug-discuss@spectrumscale.org"
Subject: [EXTERNAL] Re: [gpfsug-discuss] TSM errors restoring files with ACL's
Date: Fri, Mar 5, 2021 1:14 PM
Hallo All,
the mentioned problem with Protect was this: https://www.ibm.com/support/pages/node/6415985

Regards Renar

Renar Grunenberg
Abteilung Informatik - Betrieb
HUK-COBURG
Bahnhofsplatz
96444 Coburg
Telefon: 09561 96-44110
Telefax: 09561 96-44104
E-Mail: renar.grunenb...@huk-coburg.de
Internet: www.huk.de

-----Original Message-----
From: gpfsug-discuss-boun...@spectrumscale.org On Behalf Of Jonathan Buzzard
Sent: Friday, 5 March 2021 14:08
To: gpfsug-discuss@spectrumscale.org
Subject: Re: [gpfsug-discuss] TSM errors restoring files with ACL's

On 05/03/2021 12:15, Frederick Stock wrote:
> Have you checked to see if Spectrum Protect (TSM) has addressed this problem? There recently was an issue with Protect and how it used the GPFS API for ACLs. If I recall, Protect was not properly handling a return code. I do not know if it is relevant to your problem but it seemed worth mentioning.

As far as I am aware 8.1.11.0 is the most recent version of the Spectrum Protect/TSM client. There is nothing newer showing on the IBM FTP site
ftp://ftp.software.ibm.com/storage/tivoli-storage-management/maintenance/client/v8r1/Linux/LinuxX86/BA/

Checking on Fix Central also seems to show that 8.1.11.0 is the latest version, and the only fix over 8.1.10.0 is a security update to do with the client web user interface.

JAB.

--
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] TSM errors restoring files with ACL's

2021-03-05 Thread Frederick Stock
Have you checked to see if Spectrum Protect (TSM) has addressed this problem?  There recently was an issue with Protect and how it used the GPFS API for ACLs.  If I recall correctly, Protect was not properly handling a return code.  I do not know if it is relevant to your problem but it seemed worth mentioning.
Fred
___
Fred Stock | Spectrum Scale Development Advocacy | 720-430-8821 | sto...@us.ibm.com
 
 
- Original message -
From: Jonathan Buzzard
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug main discussion list
Subject: [EXTERNAL] [gpfsug-discuss] TSM errors restoring files with ACL's
Date: Fri, Mar 5, 2021 5:04 AM
I am seeing that whenever I try and restore a file with an ACL I get the ANS1589W error in /var/log/dsmerror.log:

ANS1589W Unable to write extended attributes for ** due to errno: 13, reason: Permission denied

But bizarrely the ACL is actually restored, at least as far as I can tell. This is the 8.1.11-0 TSM client with GPFS version 5.0.5-1 against an 8.1.10-0 TSM server, running on RHEL 7.7 to match the DSS-G 2.7b install. The backup node makes the third quorum node for the cluster, being as it runs genuine RHEL (unlike all the compute nodes, which run CentOS).

Googling, I can't find any references to this being fixed in a later version of the GPFS software, though being on RHEL 7 and its derivatives I am stuck on 5.0.5.

Surely root has permission to write the extended attributes for anyone? It would seem perverse if you have to be the owner of a file to restore its ACLs.

JAB.

--
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Using setfacl vs. mmputacl

2021-03-01 Thread Frederick Stock
To add to Olaf's response, Scale 4.2 is now out of support, as of October 1, 2020.  I do not know if this behavior would change with a more recent release of Scale but it is worth giving that a try if you can.  The most current release of Scale is 5.1.0.2.
Fred
___
Fred Stock | Spectrum Scale Development Advocacy | 720-430-8821 | sto...@us.ibm.com
 
 
- Original message -
From: "Olaf Weiser"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug-discuss@spectrumscale.org
Cc: gpfsug-discuss@spectrumscale.org
Subject: [EXTERNAL] Re: [gpfsug-discuss] Using setfacl vs. mmputacl
Date: Mon, Mar 1, 2021 7:46 AM
Hallo Stephen,
 
Behavior, or rather predictable behavior, for chmod and ACLs is not an easy thing unless you stay entirely in either the POSIX world or the NFSv4 world.
 
To be POSIX compliant, a chmod overwrites ACLs.
 
GPFS was enhanced to ignore ACL overwrites on chmod via a parameter. I can't remember exactly when, but in your very old version (blink blink, please update) it should already be there. Then (later) it was enhanced further to better mediate between the two worlds.
 
Keep in mind that if you use kernel NFS, the Linux kernel NFS server required a so-called lossy mapping: at the time the kernel NFS server was written there was no Linux file system available that supported native NFSv4 ACLs, so there was no other way than to "lossily" map NFSv4 ACLs onto POSIX ACLs (a long time ago).
 
but as always.. everything in IT business has some history.. ;-)
 
Later in GPFS we introduced fine-grained control over whether chmod, ACL operations (NFSv4 or POSIX), or both are allowed, and you can set that per fileset:
 
--allow-permission-change PermissionChangeMode
    Specifies the new permission change mode. This mode controls how chmod and ACL operations are handled on objects in the fileset. Valid modes are as follows:

    chmodOnly
        Specifies that only the UNIX change mode operation (chmod) is allowed to change access permissions (ACL commands and API will not be accepted).
    setAclOnly
        Specifies that permissions can be changed using ACL commands and API only (chmod will not be accepted).
    chmodAndSetAcl
        Specifies that chmod and ACL operations are permitted. If the chmod command (or setattr file operation) is issued, the result depends on the type of ACL that was previously controlling access to the object:
        * If the object had a POSIX ACL, it will be modified accordingly.
        * If the object had an NFSv4 ACL, it will be replaced by the given UNIX mode bits.
        Note: This is the default setting when a fileset is created.
    chmodAndUpdateAcl
        Specifies that chmod and ACL operations are permitted. If chmod is issued, the ACL will be updated by privileges derived from UNIX mode bits.
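For example, a hedged per-fileset change (the file system and fileset names are placeholders):

mmchfileset gpfs01 userdata --allow-permission-change setAclOnly          # only ACL commands/API may change permissions
mmchfileset gpfs01 userdata --allow-permission-change chmodAndUpdateAcl   # chmod updates the ACL instead of replacing it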
hope this helps ..
 
 
- Original message -
From: "Losen, Stephen C (scl)"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug main discussion list
Subject: [EXTERNAL] [gpfsug-discuss] Using setfacl vs. mmputacl
Date: Mon, Mar 1, 2021 1:31 PM
Hi folks,

Experimenting with POSIX ACLs on GPFS 4.2 and noticed that the Linux command setfacl clears "c" permissions that were set with mmputacl. So if I have this:

...
group:group1:rwxc
mask::rwxc
...

and I modify a different entry with:

setfacl -m group:group2:r-x dirname

then the "c" permissions above get cleared and I end up with

...
group:group1:rwx-
mask::rwx-
...

I discovered that chmod does not clear the "c" mode. Is there any filesystem option to change this behavior to leave "c" modes in place?

Steve Losen
Research Computing
University of Virginia
s...@virginia.edu   434-924-0640
  

 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] policy ilm features?

2021-02-02 Thread Frederick Stock
Hello Ed.  Jordan contacted me about the question you are posing so I am responding to your message.  Could you please provide clarification as to why the existence of the unbalanced flag is of a concern, or why you would want to know all the files that have this flag set?  The flag would be cleared once the file was rebalanced either through normal access or through the execution of the mmrestripefs/mmrestripefile commands.
Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821 | sto...@us.ibm.com
 
 
- Original message -
From: "Wahl, Edward"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug main discussion list
Subject: [EXTERNAL] Re: [gpfsug-discuss] policy ilm features?
Date: Tue, Feb 2, 2021 11:52 AM
Replying to a 3 year old message I sent, hoping that in the last couple of years that Scale has added some ILM extensions into the policy engine that I have missed, or somehow didn't notice?
Just ran into a file with an 'unbalanced' flag and I REALLY don't want to have to mmlsattr everything. AGAIN. /facepalm  
 
IBM?  Bueller? Bueller?  
 
When everyone answers: "No", I'm guessing this needs to be a request for improvement/enhancement?  
 
Ed Wahl
Ohio Supercomputer Center
 
 
From: gpfsug-discuss-boun...@spectrumscale.org on behalf of Edward Wahl
Sent: Friday, February 2, 2018 3:23 PM
To: John Hearns
Cc: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] policy ilm features?
 
Thanks John, this was the path I was HOPING to go down as I do similar things already, but there appears to be no extended attribute in ILM for what I want. A data block replication flag exists in the ILM, but not metadata, or balance. Yet these states ARE reported by mmlsattr, so there must be a flag somewhere.

Bad MD replication & balance example:

mmlsattr -L /fs/scratch/sysp/ed/180days.pol
file name:            /fs/scratch/sysp/ed/180days.pol
metadata replication: 1 max 2
data replication:     1 max 2
flags:                illreplicated,unbalanced
Encrypted:            yes

File next to it for comparison. Note proper MD replication and balance.

mmlsattr -L /fs/scratch/sysp/ed/120days.pol
file name:            /fs/scratch/sysp/ed/120days.pol
metadata replication: 2 max 2
data replication:     1 max 2
flags:
Encrypted:            yes

misc_attributes flags from a policy run showing no difference in status:
FJAEu -- /fs/scratch/sysp/ed/180days.pol
FJAEu -- /fs/scratch/sysp/ed/120days.pol

The file system has MD replication enabled, but not data, so ALL files show the "J" ilm flag:

mmlsfs scratch -m
flag    value    description
------- -------- -----------------------------------
 -m     2        Default number of metadata replicas

mmlsfs scratch -r
flag    value    description
------- -------- -----------------------------------
 -r     1        Default number of data replicas

I poked around a little trying to find out if perhaps using GetXattr would work and show me what I wanted; it does not. All I seem to be able to get is the File Encryption Key.

I was hoping perhaps someone had found a cheaper way for this to work rather than hundreds of millions of 'mmlsattr' execs. :-(

On the plus side, I've only run across a few of these and all appear to be from before we did the MD replication and re-striping. On the minus, I have NO idea where they are, and they appear to be on both of our filesystems. So several hundred million files to check.

Ed

On Mon, 22 Jan 2018 08:29:42, John Hearns wrote:
> Ed,
> This is not a perfect answer. You need to look at policies for this. I have been doing something similar recently.
>
> Something like:
>
> RULE 'list_file' EXTERNAL LIST 'all-files' EXEC '/var/mmfs/etc/mmpolicyExec-list'
> RULE 'listall' list 'all-files'
>   SHOW( varchar(kb_allocated) || '  ' || varchar(file_size) || ' ' || varchar(misc_attributes) || ' ' || name || ' ' || fileset_name )
>   WHERE REGEX(misc_attributes,'[J]')
>
> So this policy shows the kbytes allocated, file size, the miscellaneous attributes, name and fileset name for all files with miscellaneous attributes of 'J', which means 'Some data blocks might be ill replicated'.
>
> -----Original Message-----
> From: gpfsug-discuss-boun...@spectrumscale.org [mailto:gpfsug-discuss-boun...@spectrumscale.org] On Behalf Of Edward Wahl
> Sent: Friday, January 19, 2018 10:38 PM
> To: gpfsug-discuss@spectrumscale.org
> Subject: [gpfsug-discuss] policy ilm features?
>
> This one has been on my list a long time so I figured I'd ask here first before I open an apar or request an enhancement (most likely).
>
> Is there a way using the policy engine to determine the following?
>
> -metadata replication total/current
> -unbalanced file
>
> Looking to catch things like this that stand out on my filesystem without having to run several hundred million 'mmlsattr's.
>
> metadata replication: 1 max 2

Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-11-16 Thread Frederick Stock
Have you considered using the AFM feature of Spectrum Scale?  I doubt it will provide any speed improvement but it would allow for data to be accessed as it was being migrated.
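A hedged sketch of what that could look like (names, export path and mount point are placeholders; prefetch with --directory needs a level that supports it, otherwise use --list-file):

# local-update AFM fileset backed by the Isilon NFS export
mmcrfileset gpfs01 isilon_migr --inode-space new -p afmMode=lu,afmTarget=nfs://isilon/ifs/data/project
mmlinkfileset gpfs01 isilon_migr -J /gpfs/gpfs01/isilon_migr
# optionally pull the data in bulk instead of waiting for first access
mmafmctl gpfs01 prefetch -j isilon_migr --directory /gpfs/gpfs01/isilon_migr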
Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821 | sto...@us.ibm.com
 
 
- Original message -
From: Andi Christiansen
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: "gpfsug-discuss@spectrumscale.org"
Subject: [EXTERNAL] [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?
Date: Mon, Nov 16, 2020 2:44 PM
Hi all,
 
I have got a case where a customer wants 700 TB migrated from Isilon to Scale, and the only way for him is exporting the same directory over NFS from two different nodes...
 
As of now we are using multiple rsync processes on different parts of the folders within the main directory. This is really slow and will take forever. Right now there are 14 rsync processes spread across 3 nodes, fetching from 2.
 
Does anyone know of a way to speed it up? Right now we see from 1 Gbit to 3 Gbit if we are lucky (total bandwidth), and there is a total of 30 Gbit from the Scale nodes and 20 Gbit from the Isilon, so we should be able to reach just under 20 Gbit...
 
 
If anyone has any ideas, they are welcome! Thanks in advance.
Andi Christiansen
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Poor client performance with high cpu usage of mmfsd process

2020-11-13 Thread Frederick Stock
Scale 4.2.3 was end of service as of September 30, 2020.  As for waiters the mmdiag --waiters command only shows waiters on the node upon which the command is executed.  You should use the command, mmlsnode -N waiters -L, to see all the waiters in the cluster, which may be more revealing as to the source of your problem.
Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821 | sto...@us.ibm.com
 
 
- Original message -
From: "Czauz, Kamil"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug main discussion list
Subject: [EXTERNAL] Re: [gpfsug-discuss] Poor client performance with high cpu usage of mmfsd process
Date: Fri, Nov 13, 2020 8:32 AM

Hi Uwe -

Regarding your previous message - waiters were coming / going with just 1-2 waiters when I ran the mmdiag command, with very low wait times (<0.01s). We are running version 4.2.3.

I did another capture today while the client is functioning normally and this was the header result:

Overwrite trace parameters:
buffer size: 134217728
64 kernel trace streams, indices 0-63 (selected by low bits of processor ID)
128 daemon trace streams, indices 64-191 (selected by low bits of thread ID)
Interval for calibrating clock rate was 25.996957 seconds and 67592121252 cycles
Measured cycle count update rate to be 261271 per second < using this value
OS reported cycle count update rate as 259000 per second
Trace milestones:
kernel trace enabled Fri Nov 13 08:20:01.800558000 2020 (TOD 1605273601.800558, cycles 20807897445779444)
daemon trace enabled Fri Nov 13 08:20:01.910017000 2020 (TOD 1605273601.910017, cycles 20807897730372442)
all streams included Fri Nov 13 08:20:26.423085049 2020 (TOD 1605273626.423085, cycles 20807961464381068) < useful part of trace extends from here
trace quiesced Fri Nov 13 08:20:27.797515000 2020 (TOD 1605273627.000797, cycles 20807965037900696) < to here
Approximate number of times the trace buffer was filled: 14.631

Still a very small capture (1.3s), but the trsum.awk output was not filled with lookup commands / large lookup times. Can you help debug what those long lookup operations mean?

Unfinished operations:
27967 * pagein ** 1.362382116
27967 * readpage ** 1.362381516
139130 1.362448448 * Unfinished IO: buffer/disk 3002F67 20:107498951168^\archive_data_16
104686 1.362022068 * Unfinished IO: buffer/disk 50011878000 1:47169618944^\archive_data_1
0 0.0 * Unfinished IO: buffer/disk 5003CEB8000 4:23073390592^\FFFE
0 0.0 * Unfinished IO: buffer/disk 5003CEB8000 4:23073390592^\
0 0.0 * Unfinished IO: buffer/disk 2000EE78000 5:47631127040^\FFFE
341710 1.362423815 * Unfinished IO: buffer/disk 20022218000 19:107498951680^\archive_data_15
0 0.0 * Unfinished IO: buffer/disk 2000EE78000 5:47631127040^\
0 0.0 * Unfinished IO: buffer/disk 206B000 18:3452986648^\FFFE
0 0.0 * Unfinished IO: buffer/disk 206B000 18:3452986648^\
0 0.0 * Unfinished IO: buffer/disk 206B000 18:3452986648^\
139150 1.361122006 * Unfinished IO: buffer/disk 50012018000 2:47169622016^\archive_data_2
0 0.0 * Unfinished IO: buffer/disk 5003CEB8000 4:23073390592^\
95782 1.361112791 * Unfinished IO: buffer/disk 4001630 20:107498950656^\archive_data_16
0 0.0 * Unfinished IO: buffer/disk 2000EE78000 5:47631127040^\
271076 1.361579585 * Unfinished IO: buffer/disk 20023DB8000 4:47169606656^\archive_data_4
341676 1.362018599 * Unfinished IO: buffer/disk 4003814 5:47169614336^\archive_data_5
139150 1.361131599 MSG FSnd: nsdMsgReadExt msg_id 2930654492 Sduration 13292.382 + us
341676 1.362027104 MSG FSnd: nsdMsgReadExt msg_id 2930654495 Sduration 12396.877 + us
95782 1.361124739 MSG FSnd: nsdMsgReadExt msg_id 2930654491 Sduration 13299.242 + us
271076 1.361587653 MSG FSnd: nsdMsgReadExt msg_id 2930654493 Sduration 12836.328 + us
92182 0.0 MSG FSnd: msg_id 0 Sduration 0.000 + us
341710 1.362429643 MSG FSnd: nsdMsgReadExt msg_id 2930654497 Sduration 11994.338 + us
341662 0.0 MSG FSnd: msg_id 0 Sduration 0.000 + us
139130 1.362458376 MSG FSnd: nsdMsgReadExt msg_id 2930654498 Sduration 11965.605 + us
104686 1.362028772 MSG FSnd: nsdMsgReadExt msg_id 2930654496 Sduration 12395.209 + us
412373 0.775676657 MSG FRep: nsdMsgReadExt msg_id 304915249 Rduration 598747.324 us Rlen 262144 Hduration 598752.112 + us
341770 0.589739579 MSG FRep: nsdMsgReadExt msg_id 338079050 Rduration 784684.402 us Rlen 4 Hduration 784692.651 + us
143315 0.536252844 MSG FRep: nsdMsgReadExt msg_id 631945522 Rduration 838171.137 us Rlen 233472 Hduration 838174.299 + us
341878 0.134331812 MSG FRep: nsdMsgReadExt msg_id 338079023 Rduration 1240092.169 us Rlen 262144 Hduration 1240094.403 + us
175478 0.587353287 MSG FRep:

Re: [gpfsug-discuss] tsgskkm stuck

2020-08-28 Thread Frederick Stock
Not sure that Spectrum Scale has stated it supports the AMD epyc (Rome?) processors.  You may want to open a help case to determine the cause of this problem.
 
Note that Spectrum Scale 4.2.x goes out of service on September 30, 2020 so you may want to consider upgrading your cluster.  And should Scale officially support the AMD epyc processor it would not be on Scale 4.2.x.
Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821 | sto...@us.ibm.com
 
 
- Original message -
From: Philipp Helo Rehs
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug main discussion list
Subject: [EXTERNAL] [gpfsug-discuss] tsgskkm stuck
Date: Fri, Aug 28, 2020 5:52 AM
Hello,

we have a GPFS v4 cluster running with 4 NSDs and I am trying to add some clients:

mmaddnode -N hpc-storage-1-ib:client:hpc-storage-1

This command hangs and does not finish. When I look into the server, I can see the following processes, which never finish:

root 38138  0.0  0.0 123048 10376 ?  Ss  11:32  0:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmremote checkNewClusterNode3lc/setupClient%%%%:00_VERSION_LINE::1709:3:1::lc:gpfs3.hilbert.hpc.uni-duesseldorf.de::0:/bin/ssh:/bin/scp:5362040003754711198:lc2:1597757602::HPCStorage.hilbert.hpc.uni-duesseldorf.de:2:1:1:2:A:::central:0.0:%%home%%:20_MEMBER_NODE::5:20:hpc-storage-1
root 38169  0.0  0.0 123564 10892 ?  S   11:32  0:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmremote ccrctl setupClient 2214791=gpfs3-ib.hilbert.hpc.uni-duesseldorf.de:1191,2=gpfs4-ib.hilbert.hpc.uni-duesseldorf.de:1191,4=gpfs6-ib.hilbert.hpc.uni-duesseldorf.de:1191,3=gpfs5-ib.hilbert.hpc.uni-duesseldorf.de:11910 1191
root 38212  100  0.0  35544  5752 ?  R   11:32  9:40 /usr/lpp/mmfs/bin/tsgskkm store --cert /var/mmfs/ssl/stage/tmpKeyData.mmremote.38169.cert --priv /var/mmfs/ssl/stage/tmpKeyData.mmremote.38169.priv --out /var/mmfs/ssl/stage/tmpKeyData.mmremote.38169.keystore --fips off

The node is an AMD EPYC.

Any idea what could cause the issue? SSH is possible in both directions and the firewall is disabled.

Kind regards
Philipp Rehs
 
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Spectrum Scale pagepool size with RDMA

2020-07-23 Thread Frederick Stock
And what version of ESS/Scale are you running on your systems (mmdiag --version)?
Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821 | sto...@us.ibm.com
 
 
- Original message -
From: "Yaron Daniel"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug main discussion list
Subject: [EXTERNAL] Re: [gpfsug-discuss] Spectrum Scale pagepool size with RDMA
Date: Thu, Jul 23, 2020 3:09 AM

Hi,

What is the output for:

#mmlsconfig | grep -i verbs
#ibstat

Regards
Yaron Daniel
Storage Architect – IL Lab Services (Storage)
IBM Global Markets, Systems HW Sales
94 Em Ha'Moshavot Rd, Petach Tiqva, 49527, Israel
Phone: +972-3-916-5672
Fax: +972-3-916-5672
Mobile: +972-52-8395593
e-mail: y...@il.ibm.com
Webex: https://ibm.webex.com/meet/yard
IBM Israel
From: Prasad Surampudi
To: "gpfsug-discuss@spectrumscale.org"
Date: 07/23/2020 03:34 AM
Subject: [EXTERNAL] [gpfsug-discuss] Spectrum Scale pagepool size with RDMA
Sent by: gpfsug-discuss-boun...@spectrumscale.org
Hi,
We have an ESS cluster with two CES nodes. The pagepool is set to 128 GB (real memory is 256 GB) on both the ESS NSD servers and the CES nodes. Occasionally we see the mmfsd process memory usage reach 90% on the NSD servers and CES nodes and stay there until GPFS is recycled. I have a couple of questions in this scenario:
1. What are the general recommendations for pagepool size on nodes with RDMA enabled? The IBM Knowledge Center page on RDMA tuning says "If the GPFS pagepool is set to 32 GB, then the mapping of the RDMA for this pagepool must be at least 64 GB." So, does this mean that the pagepool can't be more than half of real memory with RDMA enabled? Also, is this the reason why mmfsd memory usage exceeds the pagepool size and spikes to almost 90%?
2. If we don't want to see high mmfsd process memory usage on NSD/CES nodes, should we decrease the pagepool size?
3. Can we tune the log_num_mtt parameter to limit the memory usage? Currently it is set to 0 for both NSD (ppc64_le) and CES (x86_64) nodes.
4. We also see messages like "Verbs RDMA disabled for xx.xx.xx.xx due to no matching port found". Any idea what this message indicates? I don't see any "Verbs RDMA enabled" message after these warnings. Does it get enabled automatically?
Prasad Surampudi|Sr. Systems Engineer|ATS Group, LLC
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] dependent versus independent filesets

2020-07-07 Thread Frederick Stock
One comment about inode preallocation.  There was a time when inode creation was performance challenged but in my opinion that is no longer the case, unless you have need for file creates to complete at extreme speed.  In my experience it is the rare customer that requires extremely fast file create times so pre-allocation is not truly necessary.  As was noted once an inode is allocated it cannot be deallocated.  The more important item is the maximum inodes defined for a fileset or file system.  Yes, those do need to be monitored so they can be increased if necessary to avoid out of space errors.
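A hedged sketch of that monitoring and the follow-up action (file system, fileset and limit are placeholders):

mmlsfileset gpfs01 -L -i                           # per-fileset inode usage; the -i scan can take a while
mmdf gpfs01 -F                                     # file-system-wide inode totals
mmchfileset gpfs01 projects --inode-limit 4000000  # raise the ceiling before the fileset runs out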
Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821 | sto...@us.ibm.com
 
 
- Original message -
From: "Wahl, Edward"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug main discussion list
Subject: [EXTERNAL] Re: [gpfsug-discuss] dependent versus independent filesets
Date: Tue, Jul 7, 2020 11:59 AM
We also went with independent filesets for both backup (and quota) reasons for several years now, and have stuck with this across to 5.x. However we still maintain a minor number of dependent filesets for administrative use. Being able to mmbackup many filesets at once can increase your parallelization _quite_ nicely! We create and delete the individual snapshots before and after each backup, as you may expect. Just be aware that if you do massive numbers of fast snapshot deletes and creates you WILL reach a point where you run into issues due to quiescing compute clients, and that certain types of workloads have issues with snapshotting in general.

You have to more closely watch what you pre-allocate, and what you have left in the common metadata/inode pool. Once allocated, even if not being used, you cannot reduce the inode allocation without removing the fileset and re-creating it (say a fileset user had 5 million inodes and now only needs 500,000). Growth can also be an issue if you do NOT fully pre-allocate each space. This can be scary if you are not used to over-subscription in general. But I imagine that most sites have some decent percentage of oversubscription if they use filesets and quotas.

Ed
OSC

-----Original Message-----
From: gpfsug-discuss-boun...@spectrumscale.org On Behalf Of Skylar Thompson
Sent: Tuesday, July 7, 2020 10:00 AM
To: gpfsug-discuss@spectrumscale.org
Subject: Re: [gpfsug-discuss] dependent versus independent filesets

We wanted to be able to snapshot and backup filesets separately with mmbackup, so went with independent filesets.

On Tue, Jul 07, 2020 at 08:37:46AM -0500, Damir Krstic wrote:
> We are deploying our new ESS and are considering moving to independent filesets. The snapshot per fileset feature appeals to us.
>
> Has anyone considered independent vs. dependent filesets and what was your reasoning to go with one as opposed to the other? Or perhaps you opted to have both on your filesystem, and if so, what was the reasoning for it?
>
> Thank you.
> Damir

--
Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department (UW Medicine), System Administrator
-- Foege Building S046, (206)-685-7354
-- Pronouns: He/Him/His
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Dedicated filesystem for cesSharedRoot

2020-06-25 Thread Frederick Stock
Generally these file systems are configured with a block size of 256KB.  As for inodes I would not pre-allocate any and set the initial maximum size to value such as 5000 since it can be increased if necessary.
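A hedged sketch of such a file system (the NSD stanza file, device name and mount point are placeholders; CES services must be down while cesSharedRoot is changed):

mmcrfs cesroot -F cesroot_disks.stanza -B 256K -T /gpfs/cesroot --inode-limit 5000
mmmount cesroot -a
mmchconfig cesSharedRoot=/gpfs/cesroot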
Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821 | sto...@us.ibm.com
 
 
- Original message -
From: "Caubet Serrabou Marc (PSI)"
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug main discussion list
Subject: [EXTERNAL] [gpfsug-discuss] Dedicated filesystem for cesSharedRoot
Date: Thu, Jun 25, 2020 6:47 AM
Hi all,
 
I would like to use CES for exporting Samba and NFS. However, the documentation (https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.4/com.ibm.spectrum.scale.v5r04.doc/bl1adm_setcessharedroot.htm) recommends (but does not enforce) the use of a dedicated file system of at least 4 GB.
 
Is there any best practice or recommendation for configuring this filesystem? This is: inode / block sizes, number of expected files in the filesystem, ideal size for this filesystem (from my understanding, 4GB should be enough, but I am not sure if there are conditions that would require a bigger one).
 
Thanks a lot and best regards,
Marc  
_Paul Scherrer InstitutHigh Performance Computing & Emerging TechnologiesMarc Caubet SerrabouBuilding/Room: OHSA/014
Forschungsstrasse, 111
5232 Villigen PSISwitzerlandTelephone: +41 56 310 46 67E-Mail: marc.cau...@psi.ch
___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss 
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Client Latency and High NSD Server Load Average

2020-06-04 Thread Frederick Stock
From the waiters you provided I would guess there is something amiss with some of your storage systems.  Since those waiters are on NSD servers they are waiting for IO requests to the kernel to complete.  Generally IOs are expected to complete in milliseconds, not seconds.  You could look at the output of "mmfsadm dump nsd" to see how the GPFS IO queues are working but that would be secondary to checking your storage systems.
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
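A quick diagnostic sketch based on the commands mentioned above, run on each suspect NSD server (the iostat interval is only an example):

mmdiag --waiters          # long "NSDThread: for I/O completion" waiters implicate the disk subsystem
mmfsadm dump nsd | less   # inspect the GPFS NSD IO queues (secondary to checking the storage itself)
iostat -xm 5              # OS-level view: sustained high await/%util on the backing LUNs confirms slow IO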
 
 
- Original message -From: "Saula, Oluwasijibomi" Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: "gpfsug-discuss@spectrumscale.org" Cc:Subject: [EXTERNAL] Re: [gpfsug-discuss] Client Latency and High NSD Server Load AverageDate: Wed, Jun 3, 2020 6:24 PM 
Frederick,
 
Yes on both counts! -  mmdf is showing pretty uniform (ie 5 NSDs out of 30 report 65% free; All others are uniform at 58% free)...
 
NSD servers per disks are called in round-robin fashion as well, for example:
 
 gpfs1         tier2_001    nsd02-ib,nsd03-ib,nsd04-ib,tsm01-ib,nsd01-ib 
 gpfs1         tier2_002    nsd03-ib,nsd04-ib,tsm01-ib,nsd01-ib,nsd02-ib 
 gpfs1         tier2_003    nsd04-ib,tsm01-ib,nsd01-ib,nsd02-ib,nsd03-ib 
 gpfs1         tier2_004    tsm01-ib,nsd01-ib,nsd02-ib,nsd03-ib,nsd04-ib 
Any other potential culprits to investigate?
 
I do notice nsd03/nsd04 have long waiters, but nsd01 doesn't (nsd02-ib is offline for now): 
[nsd03-ib ~]# mmdiag --waiters 
=== mmdiag: waiters === 
Waiting 6.5113 sec since 17:17:33, monitored, thread 4175 NSDThread: for I/O completion 
Waiting 6.3810 sec since 17:17:33, monitored, thread 4127 NSDThread: for I/O completion 
Waiting 6.1959 sec since 17:17:34, monitored, thread 4144 NSDThread: for I/O completion 
  
nsd04-ib: 
  
Waiting 13.1386 sec since 17:19:09, monitored, thread 9971 NSDThread: for I/O completion 
Waiting 10.3562 sec since 17:19:12, monitored, thread 9958 NSDThread: for I/O completion 
Waiting 10.0338 sec since 17:19:12, monitored, thread 9951 NSDThread: for I/O completion 
  
tsm01-ib: 
  
Waiting 8.1211 sec since 17:20:24, monitored, thread 3644 NSDThread: for I/O completion 
Waiting 7.6690 sec since 17:20:24, monitored, thread 3641 NSDThread: for I/O completion 
Waiting 7.4969 sec since 17:20:24, monitored, thread 3658 NSDThread: for I/O completion 
Waiting 7.3573 sec since 17:20:24, monitored, thread 3642 NSDThread: for I/O completion 
  
nsd01-ib: 
  
Waiting 0.2548 sec since 17:21:47, monitored, thread 30513 NSDThread: for I/O completion 
Waiting 0.1502 sec since 17:21:47, monitored, thread 30529 NSDThread: for I/O completion 
  
 
 
 
Thanks,
 
Oluwasijibomi (Siji) Saula
HPC Systems Administrator  /  Information Technology
 
Research 2 Building 220B / Fargo ND 58108-6050
p: 701.231.7749 / www.ndsu.edu
 

  

 
 
 
From: gpfsug-discuss-boun...@spectrumscale.org  on behalf of gpfsug-discuss-requ...@spectrumscale.org Sent: Wednesday, June 3, 2020 4:56 PMTo: gpfsug-discuss@spectrumscale.org Subject: gpfsug-discuss Digest, Vol 101, Issue 6
 
Send gpfsug-discuss mailing list submissions to
    gpfsug-discuss@spectrumscale.org

To subscribe or unsubscribe via the World Wide Web, visit
    http://gpfsug.org/mailman/listinfo/gpfsug-discuss
or, via email, send a message with subject or body 'help' to
    gpfsug-discuss-requ...@spectrumscale.org

You can reach the person managing the list at
    gpfsug-discuss-ow...@spectrumscale.org

When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..."

Today's Topics:

   1. Introducing SSUG::Digital (Simon Thompson (Spectrum Scale User Group Chair))
   2. Client Latency and High NSD Server Load Average (Saula, Oluwasijibomi)
   3. Re: Client Latency and High NSD Server Load Average (Frederick Stock)

----------------------------------------------------------------------

Message: 1
Date: Wed, 03 Jun 2020 20:11:17 +0100
From: "Simon Thompson (Spectrum Scale User Group Chair)"
To: "gpfsug-discuss@spectrumscale.org"
Subject: [gpfsug-discuss] Introducing SSUG::Digital
Message-ID:
Content-Type: text/plain; charset="utf-8"

Hi All,

I am happy that we can finally announce SSUG:Digital, which will be a series of online sessions based on the types of topic we present at our in-person events. I know it's taken us a while to get this up and running, but we've been working on trying to get the format right. So save the date for the first SSUG:Digital event, which will take place on Thursday 18th June 2020 at 4pm BST. That's:

San Francisco, USA at 08:00 PDT
New York, USA at 11:00 EDT
London, United Kingdom at 16:00 BST
Frankfurt, Germany at 17:00 CEST
Pune, India at 20:30 IST

We estimate about 90 minutes for the first session, and please forgive any teething troubles as we get this going! (I know the times don't work for everyone in the global community!) Ea

Re: [gpfsug-discuss] Client Latency and High NSD Server Load Average

2020-06-03 Thread Frederick Stock
Does the output of mmdf show that data is evenly distributed across your NSDs?  If not that could be contributing to your problem.  Also, are your NSDs evenly distributed across your NSD servers, and the NSD configured so the first NSD server for each is not the same one?
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
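Two quick ways to check both points, using the file system name from this thread:

mmdf gpfs1        # free space per NSD; uneven "free KB" values indicate unbalanced data
mmlsnsd -f gpfs1  # NSD-to-server lists; the first server listed for each NSD serves its IO by default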
 
 
- Original message -From: "Saula, Oluwasijibomi" Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: "gpfsug-discuss@spectrumscale.org" Cc:Subject: [EXTERNAL] [gpfsug-discuss] Client Latency and High NSD Server Load AverageDate: Wed, Jun 3, 2020 5:45 PM 
 
Hello,
 
Anyone faced a situation where a majority of NSDs have a high load average and a minority don't?
 
Also, is NSD server latency being 10x higher for write operations than for read operations expected under any circumstance? 
 
We are seeing client latency between 6 and 9 seconds and are wondering if some GPFS configuration or NSD server condition may be triggering this poor performance.
 
 
 
 
Thanks,
 
Oluwasijibomi (Siji) Saula
HPC Systems Administrator  /  Information Technology
 
Research 2 Building 220B / Fargo ND 58108-6050
p: 701.231.7749 / www.ndsu.edu
 

  

 
___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss 
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Immutible attribute

2020-06-03 Thread Frederick Stock
Could you please provide the exact Scale version, or was it really 4.2.3.0?
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
 
 
- Original message -From: Jonathan Buzzard Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug-discuss@spectrumscale.orgCc:Subject: [EXTERNAL] [gpfsug-discuss] Immutible attributeDate: Wed, Jun 3, 2020 11:16 AM 
Hum, on a "normal" Linux file system only the root user can change the immutable attribute on a file.

Running on 4.2.3 I have just removed the immutable attribute as an ordinary user because I am the owner of the file.

I would suggest that this is a bug, as the manual page for mmchattr does not mention this.

JAB.

--
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
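A reproduction sketch of the behaviour described above (the path is a placeholder):

# as root
mmchattr -i yes /gpfs/fs1/jab/testfile
# as the ordinary user who owns the file
mmchattr -i no /gpfs/fs1/jab/testfile    # reportedly succeeds on 4.2.3, whereas chattr -i on ext4/xfs requires root
mmlsattr -L /gpfs/fs1/jab/testfile       # confirm whether the immutable flag is still set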
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Importing a Spectrum Scale a filesystem from 4.2.3 cluster to 5.0.4.3 cluster

2020-06-01 Thread Frederick Stock
Chris, it was not clear to me if the file system you imported had files migrated to Spectrum Protect, that is stub files in GPFS.  If the file system does contain files migrated to Spectrum Protect with just a stub file in the file system, have you tried to recall any of them to see if that still works?
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
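One way to test this, assuming the Spectrum Protect for Space Management (HSM) client is installed on a node mounting the file system (the path is a placeholder):

dsmls /gpfs/fs1/path/to/stub-file        # should show whether the file is migrated, premigrated or resident
dsmrecall /gpfs/fs1/path/to/stub-file    # explicit recall; simply reading the file should also trigger a DMAPI recall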
 
 
- Original message -From: Chris Scott Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug main discussion list Cc:Subject: [EXTERNAL] Re: [gpfsug-discuss] Importing a Spectrum Scale a filesystem from 4.2.3 cluster to 5.0.4.3 clusterDate: Mon, Jun 1, 2020 9:14 AM 
Sounds like it would work fine.
 
I recently exported a 3.5 version filesystem from a GPFS 3.5 cluster to a 'Scale cluster at 5.0.2.3 software and 5.0.2.0 cluster version. I concurrently mapped the NSDs to new NSD servers in the 'Scale cluster, mmexported the filesystem and changed the NSD servers configuration of the NSDs using the mmimportfs ChangeSpecFile. The original (creation) filesystem version of this filesystem is 3.2.1.5.
 
To my pleasant surprise the filesystem mounted and worked fine while still at 3.5 filesystem version. Plan B would have been to "mmchfs <fs> -V full" and then mmmount, but I was able to update the filesystem to 5.0.2.0 version while it was already mounted.
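A rough outline of the procedure Chris describes, with <fs> and the two control files as placeholders:

mmexportfs <fs> -o exportfile                  # on the old cluster, with <fs> unmounted everywhere
# re-map/re-zone the NSD LUNs so the new cluster's NSD servers can see them
mmimportfs <fs> -i exportfile -S changespec    # changespec is a stanza file with the new NSD server lists
mmmount <fs> -a
mmchfs <fs> -V full                            # only once the whole cluster (and any remote mounters) run the new level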
 
This was further pleasantly successful as the filesystem in question is DMAPI-enabled, with the majority of the data on tape using Spectrum Protect for Space Management than the volume resident/pre-migrated on disk.
 
The complexity is further compounded by this filesystem being associated to a different Spectrum Protect server than an existing DMAPI-enabled filesystem in the 'Scale cluster. Preparation of configs and subsequent commands to enable and use Spectrum Protect for Space Management multiserver for migration and backup all worked smoothly as per the docs.
 
I was thus able to get rid of the GPFS 3.5 cluster on legacy hardware, OS, GPFS and homebrew CTDB SMB and NFS and retain the filesystem with its majority of tape-stored data on current hardware, OS and 'Scale/'Protect with CES SMB and NFS.
 
The future objective remains to move all the data from this historical filesystem to a newer one to get the benefits of larger block and inode sizes, etc, although since the data is mostly dormant and kept for compliance/best-practice purposes, the main goal will be to head off original file system version 3.2 era going end of support.
 
Cheers
Chris 

On Thu, 28 May 2020 at 23:31, Prasad Surampudi  wrote:
We have two scale clusters, cluster-A running version Scale 4.2.3 and RHEL6/7 and Cluster-B running Spectrum Scale  5.0.4 and RHEL 8.1. All the nodes in both Cluster-A and Cluster-B are direct attached and no NSD servers. We have our current filesystem gpfs_4 in Cluster-A  and new filesystem gpfs_5 in Cluster-B. We want to copy all our data from gpfs_4 filesystem into gpfs_5 which has variable block size.  So, can we map NSDs of gpfs_4 to Cluster-B nodes and do a mmexportfs of gpfs_4 from Cluster-A and mmimportfs into Cluster-B so that we have both filesystems available on same node in Cluster-B for copying data across fiber channel? If mmexportfs/mmimportfs works, can we delete nodes from Cluster-A and add them to Cluster-B without upgrading RHEL or GPFS versions for now and  plan upgrading them at a later time?___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss
___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss 
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Scale 4.2.3.22 with support for RHEL 7.8 is now on Fix Central

2020-05-14 Thread Frederick Stock
Regarding RH 7.8 support in Scale 5.0.x, we expect it will be supported in the 5.0.5 release, due out very soon, but it may slip to the first PTF on 5.0.5.
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
 
 
- Original message -From: Jonathan Buzzard Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug-discuss@spectrumscale.orgCc:Subject: [EXTERNAL] Re: [gpfsug-discuss] Scale 4.2.3.22 with support for RHEL 7.8 is now on Fix CentralDate: Thu, May 14, 2020 10:24 AM 
On 14/05/2020 13:31, Flanders, Dean wrote:
> Hello, It is great that RHEL 7.8 is supported on SS 4.2.3.22, when will
> RHEL 8.x be supported on GPFS SS 4.2.3.X?? Thanks, Dean

That would be never; 4.2.3 goes out of support in September.

Is 5.x supported on 7.8 yet? Some urgent upgrading of systems is being looked into right now and I would prefer to jump to 5.x rather than 4.2.3.22 and have to upgrade again. However, needs must.

Oh, and if you haven't yet, I would recommend any HPC site revoke all your users' SSH keys immediately. Too many users have been creating SSH keys without passphrases and jumping from system to system. Several sites in the UK are currently down and I understand it has affected Germany too :-(

There are zero guesses to the origin of the hacks.

JAB.

--
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] wait for mount during gpfs startup

2020-04-28 Thread Frederick Stock
Have you looked a the mmaddcallback command and specifically the file system mount callbacks?
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
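A minimal sketch of such a callback; the identifier, script path and script contents are assumptions, not taken from this thread:

mmaddcallback startAfterMount --command /usr/local/sbin/start-after-mount.sh --event mount --parms "%fsName"
mmlscallback                                   # verify the callback is registered

The script would then start whatever service depends on that file system when its mount event fires.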
 
 
- Original message -From: Ulrich Sibiller Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug-disc...@gpfsug.orgCc:Subject: [EXTERNAL] [gpfsug-discuss] wait for mount during gpfs startupDate: Tue, Apr 28, 2020 7:05 AM 
Hi,

when the gpfs systemd service returns from startup the filesystems are usually not mounted. So having another service depending on gpfs is not feasible if you require the filesystem(s). Therefore we have added a script to the systemd gpfs service that waits for all local gpfs filesystems being mounted. We have added that script via ExecStartPost:

# cat /etc/systemd/system/gpfs.service.d/waitmount.conf
[Service]
ExecStartPost=/usr/local/sc-gpfs/sbin/wait-for-all_local-mounts.sh
TimeoutStartSec=200

The script itself is not doing much:

#!/bin/bash
#
# wait until all _local_ gpfs filesystems are mounted. It ignores
# filesystems where mmlsfs -A does not report "yes".
#
# returns 0 if all fs are mounted (or none are found in gpfs configuration)
# returns non-0 otherwise
# wait for max. TIMEOUT seconds
TIMEOUT=180
# leading space is required!
FS=" $(/usr/lpp/mmfs/bin/mmlsfs all_local -Y 2>/dev/null | grep :automaticMountOption:yes: | cut -d: -f7 | xargs; exit ${PIPESTATUS[0]})"
# RC=1 and no output means there are no such filesystems configured in GPFS
[ $? -eq 1 ] && [ "$FS" = " " ] && exit 0
# uncomment this line for testing
#FS="$FS gpfsdummy"
while [ $TIMEOUT -gt 0 ]; do
     for fs in ${FS}; do
         if findmnt $fs -n &>/dev/null; then
             FS=${FS/ $fs/}
             continue 2;
         fi
     done
     [ -z "${FS// /}" ] && break
     (( TIMEOUT -= 5 ))
     sleep 5
done
if [ -z "${FS// /}" ]; then
     exit 0
else
     echo >&2 "ERROR: filesystem(s) not found in time:${FS}"
     exit 2
fi

This works without problems on _most_ of our clusters. However, not on all. Some of them show what I believe is a race condition and fail to start up after a reboot:

# journalctl -u gpfs
-- Logs begin at Fri 2020-04-24 17:11:26 CEST, end at Tue 2020-04-28 12:47:34 CEST. --
Apr 24 17:12:13 myhost systemd[1]: Starting General Parallel File System...
Apr 24 17:12:17 myhost mmfs[5720]: [X] Cannot open configuration file /var/mmfs/gen/mmfs.cfg.
Apr 24 17:13:44 myhost systemd[1]: gpfs.service start-post operation timed out. Stopping.
Apr 24 17:13:44 myhost mmremote[8966]: Shutting down!
Apr 24 17:13:48 myhost mmremote[8966]: Unloading modules from /lib/modules/3.10.0-1062.18.1.el7.x86_64/extra
Apr 24 17:13:48 myhost mmremote[8966]: Unloading module mmfs26
Apr 24 17:13:48 myhost mmremote[8966]: Unloading module mmfslinux
Apr 24 17:13:48 myhost systemd[1]: Failed to start General Parallel File System.
Apr 24 17:13:48 myhost systemd[1]: Unit gpfs.service entered failed state.
Apr 24 17:13:48 myhost systemd[1]: gpfs.service failed.

The mmfs.log shows a bit more:

# less /var/adm/ras/mmfs.log.previous
2020-04-24_17:12:14.609+0200: runmmfs starting (4254)
2020-04-24_17:12:14.622+0200: [I] Removing old /var/adm/ras/mmfs.log.* files:
2020-04-24_17:12:14.658+0200: runmmfs: [I] Unloading modules from /lib/modules/3.10.0-1062.18.1.el7.x86_64/extra
2020-04-24_17:12:14.692+0200: runmmfs: [I] Unloading module mmfs26
2020-04-24_17:12:14.901+0200: runmmfs: [I] Unloading module mmfslinux
2020-04-24_17:12:15.018+0200: runmmfs: [I] Unloading module tracedev
2020-04-24_17:12:15.057+0200: runmmfs: [I] Loading modules from /lib/modules/3.10.0-1062.18.1.el7.x86_64/extra
Module                  Size  Used by
mmfs26               2657452  0
mmfslinux             809734  1 mmfs26
tracedev               48618  2 mmfs26,mmfslinux
2020-04-24_17:12:16.720+0200: Node rebooted.  Starting mmautoload...
2020-04-24_17:12:17.011+0200: [I] This node has a valid standard license
2020-04-24_17:12:17.011+0200: [I] Initializing the fast condition variables at 0x5561DFC365C0 ...
2020-04-24_17:12:17.011+0200: [I] mmfsd initializing. {Version: 5.0.4.2   Built: Jan 27 2020 12:13:06} ...
2020-04-24_17:12:17.011+0200: [I] Cleaning old shared memory ...
2020-04-24_17:12:17.012+0200: [I] First pass parsing mmfs.cfg ...
2020-04-24_17:12:17.013+0200: [X] Cannot open configuration file /var/mmfs/gen/mmfs.cfg.
2020-04-24_17:12:20.667+0200: mmautoload: Starting GPFS ...
2020-04-24_17:13:44.846+0200: mmremote: Initiating GPFS shutdown ...
2020-04-24_17:13:47.861+0200: mmremote: Starting the mmsdrserv daemon ...
2020-04-24_17:13:47.955+0200: mmremote: Unloading GPFS 

Re: [gpfsug-discuss] gpfs filesets question

2020-04-16 Thread Frederick Stock
The information you provided is useful but we would need more information to understand what is happening.  Specifically the mapping of filesets to GPFS storage pools, and the source and destination of the data that is to be moved.  If the data to be moved is in storage pool A and is being moved to storage pool B then there is copying that must be done, and that would explain the additional IO.  You can determine the storage pool of a file by using the mmlsattr command.
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
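For example, to see which pool currently holds a given file's data and how full that pool is (the path is a placeholder; the mmdf form matches the one used later in this thread):

mmlsattr -L /gpfs/home/path/to/file | grep -i "storage pool"
mmdf home -P fc_8T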
 
 
- Original message -From: "J. Eric Wonderley" Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug main discussion list Cc:Subject: [EXTERNAL] Re: [gpfsug-discuss] gpfs filesets questionDate: Thu, Apr 16, 2020 1:37 PM 
Hi Fred:
 
I do.  I have 3 pools.  system, ssd data pool(fc_ssd400G) and a spinning disk pool(fc_8T).
 
I want to think the ssd_data_pool is empty at the moment and the system pool is ssd and only contains metadata.
[root@cl005 ~]# mmdf home -P fc_ssd400G
disk                disk size  failure holds    holds              free KB             free KB
name                    in KB    group metadata data        in full blocks        in fragments
--------------- ------------- -------- -------- ---- -------------------- -------------------
Disks in storage pool: fc_ssd400G (Maximum disk size allowed is 97 TB)
r10f1e8            1924720640     1001 No       Yes      1924644864 (100%)          9728 ( 0%)
r10f1e7            1924720640     1001 No       Yes      1924636672 (100%)         17408 ( 0%)
r10f1e6            1924720640     1001 No       Yes      1924636672 (100%)         17664 ( 0%)
r10f1e5            1924720640     1001 No       Yes      1924644864 (100%)          9728 ( 0%)
r10f6e8            1924720640     1001 No       Yes      1924644864 (100%)          9728 ( 0%)
r10f1e9            1924720640     1001 No       Yes      1924644864 (100%)          9728 ( 0%)
r10f6e9            1924720640     1001 No       Yes      1924644864 (100%)          9728 ( 0%)
                  -------------                        -------------------- -------------------
(pool total)        13473044480                          13472497664 (100%)         83712 ( 0%)
 
More or less empty.
 
Interesting...
  

On Thu, Apr 16, 2020 at 1:11 PM Frederick Stock <sto...@us.ibm.com> wrote:
Do you have more than one GPFS storage pool in the system?  If you do and they align with the filesets then that might explain why moving data from one fileset to another is causing increased IO operations.
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
 
 
- Original message -From: "J. Eric Wonderley" <eric.wonder...@vt.edu>Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>Cc:Subject: [EXTERNAL] [gpfsug-discuss] gpfs filesets questionDate: Thu, Apr 16, 2020 12:32 PM 
I have filesets setup in a filesystem...looks like:
[root@cl005 ~]# mmlsfileset home -L
Filesets in file system 'home':
Name                 Id   RootInode  ParentId  Created                    InodeSpace   MaxInodes  AllocInodes  Comment
root                  0           3        --  Tue Jun 30 07:54:09 2015            0   402653184    320946176  root fileset
hess                  1   543733376         0  Tue Jun 13 14:56:13 2017            0           0            0
predictHPC            2     1171116         0  Thu Jan  5 15:16:56 2017            0           0            0
HYCCSIM               3   544258049         0  Wed Jun 14 10:00:41 2017            0           0            0
socialdet             4   544258050         0  Wed Jun 14 10:01:02 2017            0           0            0
arc                   5     1171073         0  Thu Jan  5 15:07:09 2017            0           0            0
arcadm                6     1171074         0  Thu Jan  5 15:07:10 2017            0           0            0
 
I believe these are dependent filesets, dependent on the root fileset. Anyhow, a user wants to move a large amount of data from one fileset to another. Would this be a metadata-only operation? He has attempted to move a small amount of data and has noticed some thrashing.
___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss 
 ___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss
___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfs

Re: [gpfsug-discuss] gpfs filesets question

2020-04-16 Thread Frederick Stock
Do you have more than one GPFS storage pool in the system?  If you do and they align with the filesets then that might explain why moving data from one fileset to another is causing increased IO operations.
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
 
 
- Original message -From: "J. Eric Wonderley" Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug main discussion list Cc:Subject: [EXTERNAL] [gpfsug-discuss] gpfs filesets questionDate: Thu, Apr 16, 2020 12:32 PM 
I have filesets setup in a filesystem...looks like:
[root@cl005 ~]# mmlsfileset home -L
Filesets in file system 'home':
Name                 Id   RootInode  ParentId  Created                    InodeSpace   MaxInodes  AllocInodes  Comment
root                  0           3        --  Tue Jun 30 07:54:09 2015            0   402653184    320946176  root fileset
hess                  1   543733376         0  Tue Jun 13 14:56:13 2017            0           0            0
predictHPC            2     1171116         0  Thu Jan  5 15:16:56 2017            0           0            0
HYCCSIM               3   544258049         0  Wed Jun 14 10:00:41 2017            0           0            0
socialdet             4   544258050         0  Wed Jun 14 10:01:02 2017            0           0            0
arc                   5     1171073         0  Thu Jan  5 15:07:09 2017            0           0            0
arcadm                6     1171074         0  Thu Jan  5 15:07:10 2017            0           0            0
 
I believe these are dependent filesets, dependent on the root fileset. Anyhow, a user wants to move a large amount of data from one fileset to another. Would this be a metadata-only operation? He has attempted to move a small amount of data and has noticed some thrashing.
___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss 
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] maxStatCache and maxFilesToCache: Tip"gpfs_maxstatcache_low".

2020-03-13 Thread Frederick Stock
As you have learned there is no simple formula for setting the maxStatToCache, or for that matter the maxFilesToCache, configuration values.  Memory is certainly one consideration but another is directory listing operations.  The information kept in the stat cache is sufficient for fulfilling directory listings.  If your users are doing directory listings regularly then a larger stat cache could be helpful. 
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
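A sketch of inspecting and raising the values; the number and node class are purely illustrative, not a recommendation:

mmdiag --config | grep -iE "maxFilesToCache|maxStatCache"
mmchconfig maxStatCache=300000 -N <nodeclass>
# cache-size changes generally take effect only after the GPFS daemon is restarted on the affected nodes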
 
 
- Original message -From: Philipp Grau Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug main discussion list Cc:Subject: [EXTERNAL] [gpfsug-discuss] maxStatCache and maxFilesToCache: Tip "gpfs_maxstatcache_low".Date: Fri, Mar 13, 2020 8:49 AM 
Hello,

we have a two node NSD cluster based on a DDN system. Currently we run Spectrum Scale 5.0.4.1 in an HPC environment.

Mmhealth shows a tip stating "gpfs_maxstatcache_low". Our current settings are:

# mmdiag --config | grep -i cache
 ! maxFilesToCache 300
   maxStatCache 1

maxFilesToCache was tuned during installation and maxStatCache is the corresponding default value.

After discussing this issue at the German Spectrum Scale meeting, I understand that it is difficult to give a formula on how to calculate these values.

But I learnt that a FilesToCache entry costs about 10 kbytes of memory and a StatCache entry about 500 bytes. And typically maxStatCache should (obviously) be greater than maxFilesToCache. There is an average 100 GB memory usage on our systems (with a total of 265 GB RAM).

So setting maxStatCache to at least 300 should be no problem. But is that correct, or too high/low?

Has anyone some hints or thoughts on this topic? Help is welcome.

Regards,
Philipp

--
Philipp Grau               | Freie Universitaet Berlin
phg...@zedat.fu-berlin.de  | Zentraleinrichtung fuer Datenverarbeitung
Tel: +49 (30) 838 56583    | Fabeckstr. 32
Fax: +49 (30) 838 56721    | 14195 Berlin
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] AFM Alternative?

2020-02-26 Thread Frederick Stock
What sources are you using to help you with configuring AFM?
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
 
 
- Original message -From: Andi Christiansen To: Frederick Stock , gpfsug-discuss@spectrumscale.orgCc:Subject: [EXTERNAL] RE: [gpfsug-discuss] AFM Alternative?Date: Wed, Feb 26, 2020 8:39 AM 
5.0.4-2.1 (home and cache)
On February 26, 2020 2:33 PM Frederick Stock  wrote:
 
 
Andi, what version of Spectrum Scale do you have installed?
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
 
 
- Original message -From: "Olaf Weiser" Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: a...@christiansen.xxx, gpfsug-discuss@spectrumscale.orgCc: gpfsug-discuss@spectrumscale.orgSubject: [EXTERNAL] Re: [gpfsug-discuss] AFM Alternative?Date: Wed, Feb 26, 2020 8:27 AM 
you may consider WatchFolder  ... (cluster wider inotify --> kafka) .. and then you go from there
 
 
- Original message -From: Andi Christiansen Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: "gpfsug-discuss@spectrumscale.org" Cc:Subject: [EXTERNAL] [gpfsug-discuss] AFM Alternative?Date: Wed, Feb 26, 2020 1:59 PM 
Hi all,
 
Does anyone know of an alternative to AFM ?
 
We have been working on tuning AFM for a few weeks now and see little to no improvement.. And now we are searching for an alternative.. So if anyone knows of a product that can implement with Spectrum Scale i am open to any suggestions :)
 
We have a good mix of files but primarily billions of very small files which AFM does not handle well on long distances.
 
 
Best Regards
A. Christiansen
___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss 
  

___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss 
 
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] AFM Alternative?

2020-02-26 Thread Frederick Stock
Andi, what version of Spectrum Scale do you have installed?
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
 
 
- Original message -From: "Olaf Weiser" Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: a...@christiansen.xxx, gpfsug-discuss@spectrumscale.orgCc: gpfsug-discuss@spectrumscale.orgSubject: [EXTERNAL] Re: [gpfsug-discuss] AFM Alternative?Date: Wed, Feb 26, 2020 8:27 AM 
you may consider WatchFolder  ... (cluster wider inotify --> kafka) .. and then you go from there
 
 
- Original message -From: Andi Christiansen Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: "gpfsug-discuss@spectrumscale.org" Cc:Subject: [EXTERNAL] [gpfsug-discuss] AFM Alternative?Date: Wed, Feb 26, 2020 1:59 PM 
Hi all,
 
Does anyone know of an alternative to AFM ?
 
We have been working on tuning AFM for a few weeks now and see little to no improvement.. And now we are searching for an alternative.. So if anyone knows of a product that can implement with Spectrum Scale i am open to any suggestions :)
 
We have a good mix of files but primarily billions of very small files which AFM does not handle well on long distances.
 
 
Best Regards
A. Christiansen
___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss 
  

___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss 
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Thousands of CLOSE_WAIT IPV6 connections on CES

2020-02-25 Thread Frederick Stock
Colleagues of mine have communicated that this has been seen in the past due to interaction between the Spectrum Scale performance monitor (zimon) and Grafana.  Are you using Grafana?  Normally zimon is configured to use local port 9094 so if that is the port which the CLOSE_WAIT is attached then it would seem to be an instance of this problem.  You can use the following to check for this condition.
 
netstat -ntp | grep "\:9094 .*CLOSE_WAIT" | wc -l
 
 
Fred
__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
 
 
- Original message -From: Leonardo Sala Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: "gpfsug-discuss@spectrumscale.org" Cc:Subject: [EXTERNAL] [gpfsug-discuss] Thousands of CLOSE_WAIT IPV6 connections on CESDate: Fri, Feb 21, 2020 9:30 AM 
Dear all,
I was wondering if anybody recently encountered a similar issue (I found a related thread from 2018, but it was inconclusive). I just found that one of our production CES nodes have 28k CLOSE_WAIT tcp6 connections, I do not understand why... the second node in the same cluster does not have this issue. Both are:
- GPFS 5.0.4.2
- RHEL 7.4
has anybody else encountered anything similar? In the last few days it seems it happened once on one node, and twice on the other, but never on both... 
Thanks for any feedback!
cheers
leo
--Paul Scherrer InstitutDr. Leonardo SalaGroup Leader High Performance ComputingDeputy Section Head Science ITScience ITWHGA/036Forschungstrasse 1115232 Villigen PSISwitzerlandPhone: +41 56 310 3369leonardo.s...@psi.chwww.psi.ch
___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss 
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Upgrade GPFS 3.5 to Spectrum Scale 5.0.3

2020-02-21 Thread Frederick Stock
Assuming you want to maintain your cluster and the file systems you have created, you would need to upgrade to Spectrum Scale 4.2.3.x (4.1.x is no longer supported).  I think an upgrade from 3.5 to 4.2 is supported.  Once you have upgraded to 4.2.3.x (the latest PTF is 20, 4.2.3.20) you can then upgrade to Scale 5.0.x.  You show Scale 5.0.3, but 5.0.4 is available and its latest version is 5.0.4.2.  As you do the upgrades you should also execute the commands "mmchconfig release=LATEST" and "mmchfs <fsname> -V full", where <fsname> is a file system name.  You should only execute those commands once all nodes in the cluster are upgraded.  If you have clusters remotely mounting file systems from this cluster then those clusters need to be upgraded before you execute the mmchfs command.
 
Hopefully you are aware that GPFS 3.5 went out of service in April 2018.
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
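The finishing steps from the note above, in order, once every node (and any remote-mounting cluster) is on the new code level (<fsname> is a placeholder):

mmchconfig release=LATEST     # commit the cluster to the new function level
mmchfs <fsname> -V full       # enable new on-disk features; repeat for each file system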
 
 
- Original message -From: George Terry Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug-discuss@spectrumscale.orgCc:Subject: [EXTERNAL] [gpfsug-discuss] Upgrade GPFS 3.5 to Spectrum Scale 5.0.3Date: Fri, Feb 21, 2020 12:18 PM 
Hello,

I have a question about upgrading GPFS 3.5. We have an infrastructure with GPFS 3.5.0.33 and we need to upgrade to Spectrum Scale 5.0.3. Can we upgrade from 3.5 to 4.1, 4.2 and then 5.0.3, or do we need to do something additional, such as uninstalling GPFS 3.5 and installing Spectrum Scale 5.0.3?

Thank you
 
George
___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss 
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] GPFS 5 and supported rhel OS

2020-02-20 Thread Frederick Stock
This is a bit off the point of this discussion but it seemed like an appropriate context for me to post this question.  IMHO the state of software is such that it is expected to change rather frequently, for example the OS on your laptop/tablet/smartphone and your web browser.  It is correct to say those devices are not running an HPC or enterprise environment but I mention them because I expect none of us would think of running those devices on software that is a version far from the latest available.

With that as background I am curious to understand why folks would continue to run systems on software like RHEL 6.x, which is now two major releases (and many years) behind the current version of that product?  Is it simply the effort required to upgrade 100s/1000s of nodes and the disruption that causes, or are there other factors that make keeping current with OS releases problematic?

I do understand it is not just a matter of upgrading the OS but all the software, like Spectrum Scale, that runs atop that OS in your environment.  While they all do not remain in lock step I would think that in some window of time, say 12-18 months after an OS release, all software in your environment would support a new/recent OS release that would technically permit the system to be upgraded.
 
I should add that I think you want to be on or near the latest release of any software with the presumption that newer versions should be an improvement over older versions, albeit with the usual caveats of new defects.
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
 
 
- Original message -From: Jonathan Buzzard Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: "gpfsug-discuss@spectrumscale.org" Cc:Subject: [EXTERNAL] Re: [gpfsug-discuss] GPFS 5 and supported rhel OSDate: Thu, Feb 20, 2020 6:24 AM 
On 20/02/2020 10:41, Simon Thompson wrote:
> Well, if you were buying some form of extended Life Support for
> Scale, then you might also be expecting to buy extended life for
> RedHat. RHEL6 has extended life support until June 2024. Sure its an
> add on subscription cost, but some people might be prepared to do
> that over OS upgrades.

I would recommend anyone going down that route to take a *very* close look at what you get for the extended support. Not all of the OS is supported, with large chunks being moved to unsupported even if you pay for the extended support.

Consequently extended support is not suitable for HPC usage in my view, so start planning the upgrade now. It's not like you haven't had 10 years' notice.

If your GPFS is just a storage thing serving out on protocol nodes, upgrade one node at a time to RHEL7 and then repeat, upgrading to GPFS 5. It's a relatively easy, invisible-to-the-users upgrade.

JAB.

--
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] How to upgrade GPFS 4.2.3.2 version?

2020-01-27 Thread Frederick Stock
Note that Spectrum Scale 4.2.x will be end of service in September 2020.  I strongly suggest you consider upgrading to Spectrum Scale 5.0.3 or later.
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
 
 
- Original message -From: Simon Thompson Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug main discussion list Cc:Subject: [EXTERNAL] Re: [gpfsug-discuss] How to upgrade GPFS 4.2.3.2 version?Date: Mon, Jan 27, 2020 5:30 AM 
4.2.3 on Fix Central is called IBM Spectrum Scale, not GPFS.

Try:
https://www-945.ibm.com/support/fixcentral/swg/selectFixes?parent=Software%20defined%20storage=ibm/StorageSoftware/IBM+Spectrum+Scale=4.2.3=Linux+64-bit,x86_64=all

Simon

On 27/01/2020, 10:27, "gpfsug-discuss-boun...@spectrumscale.org on behalf of agostino.fu...@enea.it" wrote:

    Hi,

    I was trying to upgrade our IBM Spectrum Scale (General Parallel File System Standard Edition) version "4.2.3.2" for Linux_x86 systems, but from the Passport Advantage download site the only available versions are 5.*

    Moreover, from the Fix Central repository the only available patches are for the 4.1.0 version.

    What should I do?

    Thank you in advance.

    Best regards,

    Agostino Funel

    --
    Agostino Funel
    DTE-ICT-HPC
    ENEA
    P.le E. Fermi 1
    80055 Portici (Napoli) Italy
    Phone: (+39) 081-7723575
    Fax: (+39) 081-7723344
    E-mail: agostino.fu...@enea.it
    WWW: http://www.afs.enea.it/funel

    ___
    gpfsug-discuss mailing list
    gpfsug-discuss at spectrumscale.org
    http://gpfsug.org/mailman/listinfo/gpfsug-discuss

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] mmapplypolicy - listing policy gets occasionally get stuck.

2020-01-14 Thread Frederick Stock
When this occurs have you run the command, mmlsnode -N waiters -L, to see the list of waiters in the cluster?  That may provide clues as to why the policy seems stuck.
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
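A small sketch of gathering that information while the listing job appears hung:

mmlsnode -N waiters -L    # cluster-wide waiters, as suggested above
mmdiag --waiters          # run locally on the node(s) executing mmapplypolicy for more detail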
 
 
- Original message -From: "Wilson, Neil" Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug main discussion list Cc:Subject: [EXTERNAL] [gpfsug-discuss] mmapplypolicy - listing policy gets occasionally get stuck.Date: Tue, Jan 14, 2020 10:37 AM 
Hi All,

We are occasionally seeing an issue where an mmapplypolicy list job gets stuck; all it is doing is generating a listing from a fileset. The problem occurs intermittently and doesn't seem to show any particular pattern (i.e. not always on the same fileset). The policy job shows the usual output but then outputs the following until the process is killed.

[I] 2020-01-08@03:05:30.471 Directory entries scanned: 0.
[I] 2020-01-08@03:05:45.471 Directory entries scanned: 0.
[I] 2020-01-08@03:06:00.472 Directory entries scanned: 0.
[I] 2020-01-08@03:06:15.472 Directory entries scanned: 0.
[I] 2020-01-08@03:06:30.473 Directory entries scanned: 0.
[I] 2020-01-08@03:06:45.473 Directory entries scanned: 0.
[I] 2020-01-08@03:07:00.473 Directory entries scanned: 0.
[I] 2020-01-08@03:07:15.473 Directory entries scanned: 0.
[I] 2020-01-08@03:07:30.475 Directory entries scanned: 0.
[I] 2020-01-08@03:07:45.475 Directory entries scanned: 0.
[I] 2020-01-08@03:08:00.475 Directory entries scanned: 0.
[I] 2020-01-08@03:08:15.475 Directory entries scanned: 0.
[I] 2020-01-08@03:08:30.476 Directory entries scanned: 0.
[I] 2020-01-08@03:08:45.476 Directory entries scanned: 0.
[I] 2020-01-08@03:09:00.477 Directory entries scanned: 0.
[I] 2020-01-08@03:09:15.477 Directory entries scanned: 0.
[I] 2020-01-08@03:09:30.478 Directory entries scanned: 0.
[I] 2020-01-08@03:09:45.478 Directory entries scanned: 0.
[I] 2020-01-08@03:10:00.478 Directory entries scanned: 0.
[I] 2020-01-08@03:10:15.478 Directory entries scanned: 0.
[I] 2020-01-08@03:10:30.479 Directory entries scanned: 0.
[I] 2020-01-08@03:10:45.480 Directory entries scanned: 0.
[I] 2020-01-08@03:11:00.481 Directory entries scanned: 0.

Have any of you come across an issue like this before?

Kind regards
Neil

Neil Wilson  Senior IT Practitioner
Storage, Virtualisation and Mainframe Team   IT Services
Met Office FitzRoy Road Exeter Devon EX1 3PB United Kingdom
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] How to join GNR nodes to a non-GNR cluster

2019-12-05 Thread Frederick Stock
If you plan to replace all the storage then why did you choose to integrate a ESS GL2 rather than use another storage option?  Perhaps you had already purchased the ESS system?
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
 
 
- Original message -From: "Dorigo Alvise (PSI)" Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug main discussion list Cc:Subject: [EXTERNAL] Re: [gpfsug-discuss] How to join GNR nodes to a non-GNR clusterDate: Thu, Dec 5, 2019 2:57 PM 
This is a quite critical storage for data taking. It is not easy to update to GPFS5 because in that facility we have very short shutdown period. Thank you for pointing out that 4.2.3. But the entire storage will be replaced in the future; at the moment we just need to expand it to survive for a while.
 
This merge seems quite tricky to implement and I haven't seen consistent opinions among the people who kindly answered. According to Jan Frode, Kaplan and T. Perry it should be possible, in principle, to do the merge... Other people suggest a remote mount, which is not a solution for my use case. Others suggest not to do it at all...
 
   A
 
 
From: gpfsug-discuss-boun...@spectrumscale.org  on behalf of Daniel Kidger Sent: Thursday, December 5, 2019 11:24:08 AMTo: gpfsug main discussion listSubject: Re: [gpfsug-discuss] How to join GNR nodes to a non-GNR cluster
 
One additional question to ask is : what are your long term plans for the 4.2.3 Spectrum Scake cluster?  Do you expect to upgrade it to version 5.x (hopefully before 4.2.3 goes out of support)?
 
Also I assume your Netapp hardware is the standard Netapp block storage, perhaps based on their standard 4U60 shelves daisy-chained together? 
Daniel
 
_Daniel Kidger
IBM Technical Sales Specialist
Spectrum Scale, Spectrum Discover and IBM Cloud Object Store+44-(0)7818 522 266 daniel.kid...@uk.ibm.com

 
 
 
On 5 Dec 2019, at 09:29, Dorigo Alvise (PSI)  wrote: 

Thank Anderson for the material. In principle our idea was to scratch the filesystem in the GL2, put its NSD on a dedicated pool and then merge it into the Filesystem which would remain on V4. I do not want to create a FS in the GL2 but use its space to expand the space of the other cluster.
 
   A
From: gpfsug-discuss-boun...@spectrumscale.org  on behalf of Anderson Ferreira Nobre Sent: Wednesday, December 4, 2019 3:07:18 PMTo: gpfsug-discuss@spectrumscale.orgSubject: Re: [gpfsug-discuss] How to join GNR nodes to a non-GNR cluster
 
Hi Dorigo,
 
From the point of view of cluster administration I don't think it's a good idea to have a heterogeneous cluster. There are too many differences between V4 and V5, and most probably there are many enhancements of V5 you won't be able to take advantage of. One example is the new filesystem layout in V5; at this moment the way to migrate the filesystem is to create a new filesystem in V5 with the new layout and migrate the data. That is inevitable. I have seen clients saying that they don't need all those enhancements, but the truth is that when you face a performance issue that is only addressable with the new features, someone will raise the question of why we didn't consider that in the beginning.
 
Use this time to review if it would be better to change the block size of your fileystem. There's a script called filehist in /usr/lpp/mmfs/samples/debugtools to create a histogram of files in your current filesystem. Here's the link with additional information:
https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/Data%20and%20Metadata
 
Different RAID configurations also brings unexpected performance behaviors. Unless you are planning create different pools and use ILM to manage the files in different pools.
 
One last thing, it's a good idea to follow the recommended levels for Spectrum Scale:
https://www.ibm.com/support/pages/ibm-spectrum-scale-software-version-recommendation-preventive-service-planning
 
Anyway, you are the system administrator, you know better than anyone how complex is to manage this cluster.
 
Abraços / Regards / Saludos,
 
AndersonNobrePower and Storage ConsultantIBM Systems Hardware Client Technical Team – IBM Systems Lab Services 
Phone:55-19-2132-4317E-mail:  ano...@br.ibm.com
 
 
- Original message -From: 

Re: [gpfsug-discuss] GPFS on RHEL 8.1

2019-11-11 Thread Frederick Stock
I was reminded by a colleague that RHEL 8.1 support is expected in the first quarter of 2020.
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
 
 
- Original message -From: "Frederick Stock" Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug-discuss@spectrumscale.orgCc: gpfsug-discuss@spectrumscale.orgSubject: [EXTERNAL] Re: [gpfsug-discuss] GPFS on RHEL 8.1Date: Mon, Nov 11, 2019 7:47 AM 
RHEL 8.1 is not yet supported so the mmbuildgpl error is not unexpected.  I do not recall when RHEL 8.1 will be supported.
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
 
 
- Original message -From: "Jakobs, Julian" Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: "gpfsug-discuss@spectrumscale.org" Cc:Subject: [EXTERNAL] [gpfsug-discuss] GPFS on RHEL 8.1Date: Mon, Nov 11, 2019 6:56 AM 
Hello,
 
has anyone already tried Spectrum Scale on RHEL 8.1? I can see from the GPFS FAQ that “RHEL 8” (no minor version indicated) is supported as of the 5.0.4 release. However latest kernel level fully tested is 4.18.0-80, indicating that only RHEL 8.0 was tested. I tested an installation with 8.1 (kernel version 4.18.0-147) and mmbuildgpl failed due to errors while compiling the gpl (incompatible pointer type).
Is this expected behaviour or is there maybe something else wrong with the installation?
 
If this needs a new GPFS release, is there an estimated time? I would much prefer to install it with RHEL 8.1 due to 8.0 not being a EUS release.
 
Best,
Julian Jakobs
 
___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss
  

___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss 
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] GPFS on RHEL 8.1

2019-11-11 Thread Frederick Stock
RHEL 8.1 is not yet supported so the mmbuildgpl error is not unexpected.  I do not recall when RHEL 8.1 will be supported.
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
 
 
- Original message -From: "Jakobs, Julian" Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: "gpfsug-discuss@spectrumscale.org" Cc:Subject: [EXTERNAL] [gpfsug-discuss] GPFS on RHEL 8.1Date: Mon, Nov 11, 2019 6:56 AM 
Hello,
 
has anyone already tried Spectrum Scale on RHEL 8.1? I can see from the GPFS FAQ that “RHEL 8” (no minor version indicated) is supported as of the 5.0.4 release. However latest kernel level fully tested is 4.18.0-80, indicating that only RHEL 8.0 was tested. I tested an installation with 8.1 (kernel version 4.18.0-147) and mmbuildgpl failed due to errors while compiling the gpl (incompatible pointer type).
Is this expected behaviour or is there maybe something else wrong with the installation?
 
If this needs a new GPFS release, is there an estimated time? I would much prefer to install it with RHEL 8.1 due to 8.0 not being a EUS release.
 
Best,
Julian Jakobs
 
___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] ESS - Considerations when adding NSD space?

2019-10-24 Thread Frederick Stock
Bob, as I understand it, having different size NSDs is still not considered ideal, even for ESS.  I had another customer recently add storage to an ESS system and they were advised to first check the size of their current vdisks and size the new vdisks to be the same. 
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
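A possible way to check the existing sizes before defining the new vdisk set (option syntax from memory, so please verify against the mmvdisk documentation for your release):

mmvdisk vdiskset list --vdisk-set all    # shows the size, RAID code and block size of the current vdisk sets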
 
 
- Original message -From: "Oesterlin, Robert" Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug main discussion list Cc:Subject: [EXTERNAL] [gpfsug-discuss] ESS - Considerations when adding NSD space?Date: Thu, Oct 24, 2019 11:34 AM 
We recently upgraded our GL4 to a GL6 (trouble free process for those considering FYI). I now have 615T free (raw) in each of my recovery groups.  I’d like to increase the size of one of the file systems (currently at 660T, I’d like to add 100T).
 
My first thought was going to be:
 
mmvdisk vdiskset define --vdisk-set fsdata1 --recovery-group rg_gssio1-hs,rg_gssio2-hs --set-size 50T --code 8+2p --block-size 4m --nsd-usage dataOnly --storage-pool data
mmvdisk vdiskset create --vdisk-set fs1data1 
mmvdisk filesystem add --filesystem fs1 --vdisk-set fs1data1 
 
I know in the past use of mixed size NSDs was frowned upon, not sure on the ESS. 
 
The other approach would be add two larger NSDs (current ones are 330T) of 380T, migrate the data to the new ones using mmrestripe, then delete the old ones. The other benefit of this process would be to have the file system data better balanced across all the storage enclosures.
 
Any considerations before I do this?  Thoughts?
 
Bob Oesterlin
Sr Principal Storage Engineer, Nuance
 
___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss 
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] mmbackup questions

2019-10-17 Thread Frederick Stock
Jonathan the "objects inspected" refers to the number of file system objects that matched the policy rules used for the backup.  These rules are influenced by TSM server and client settings, e.g. the dsm.sys file.  So not all objects in the file system are actually inspected.
 
As for tuning I think the mmbackup man page is the place to start, and I think it is thorough in its description of the tuning options.  You may also want to look at the mmapplypolicy man page since mmbackup invokes it to scan the file system for files that need to be backed up.
 
To my knowledge there are no options to place the shadow database file in another location than the GPFS file system.  If the file system has fast storage I see no reason why you could not use a placement policy rule to place the shadow database on that fast storage.  However, I think using more than one node for your backups, and adjusting the various threads used by mmbackup will provide you with sufficient performance improvements.
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
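A hedged example of spreading the work over more nodes; the node names are placeholders, and the per-thread tuning options mentioned above are documented on the mmbackup man page rather than shown here:

mmbackup /gpfs/fs1 -t incremental -N backup01,backup02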
 
 
- Original message -From: Jonathan Buzzard Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: "gpfsug-discuss@spectrumscale.org" Cc:Subject: [EXTERNAL] [gpfsug-discuss] mmbackup questionsDate: Thu, Oct 17, 2019 8:00 AM 
I have been looking to give mmbackup another go (a very long history with it being a pile of steaming dinosaur droppings last time I tried, but that was seven years ago).

Anyway, having done a backup last night I am curious about something that does not appear to be explained in the documentation. Basically the output has a line like the following

        Total number of objects inspected:      474630

What is this number? Is it the number of files that have changed since the last backup or something else? It is not the number of files on the file system by any stretch of the imagination. One would hope that it inspected everything on the file system...

Also it appears that the shadow database is held on the GPFS file system that is being backed up. Is there any way to change the location of that? I am only using one node for backup (because I am cheap and don't like paying for more PVUs than I need to) and would like to hold it on the node doing the backup where I can put it on SSD. That does two things: firstly it hopefully goes a lot faster, and secondly it reduces the impact of the backup on the file system.

Anyway, a significant speed up (assuming it worked) was achieved, but I note even the ancient Xeon E3113 (dual core 3GHz) was never taxed (load average never went above one) and we didn't touch the swap despite only having 24GB of RAM. Though the 10GbE networking did get busy during the transfer of data to the TSM server bit of the backup, during the "assembly stage" it was all a bit quiet, and the DSS-G server nodes were not busy either. What options are there for tuning things, because I feel it should be able to go a lot faster.

JAB.

--
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] default owner and group for POSIX ACLs

2019-10-16 Thread Frederick Stock
Paul in regards to your question I would think you want to use NFSv4 ACLs and set the chmodAndUpdateAcl option on the fileset (see mmcrfileset/mmchfileset).
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
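A sketch of setting that fileset option (the file system and fileset names are placeholders):

mmchfileset gpfs0 smbshare1 --allow-permission-change chmodAndUpdateAcl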
 
 
- Original message -From: Paul Ward Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug main discussion list Cc:Subject: [EXTERNAL] Re: [gpfsug-discuss] default owner and group for POSIX ACLsDate: Wed, Oct 16, 2019 7:00 AM 
We are running GPFS 4.2.3 with Arcpix build 3.5.10 or 3.5.12. We don't have Ganesha in the build. I'm not sure about the NFS service.

Thanks for the responses, it's interesting how the discussion has branched into Ganesha and what ACL changes are picked up by Spectrum Protect and mmbackup (my next major change).

Any more responses on what is the best practice for the default POSIX owner and group of files and folders, when NFSv4 ACLs are used for SMB shares?

Kindest regards,
Paul

Paul Ward
TS Infrastructure Architect
Natural History Museum
T: 02079426450
E: p.w...@nhm.ac.uk

-Original Message-
From: gpfsug-discuss-boun...@spectrumscale.org On Behalf Of Jonathan Buzzard
Sent: 16 October 2019 10:36
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] default owner and group for POSIX ACLs

On Wed, 2019-10-16 at 08:21 +, Malahal R Naineni wrote:
>> Ganesha shows functions for converting between GPFS ACL's and the ACL format as used by Ganesha.
> Ganesha only supports NFSv4 ACLs, so the conversion is a quick one. kernel NFS server converts NFSv4 ACLs to POSIX ACLs (the mapping isn't perfect) as many of the Linux file systems only support POSIX ACLs (at least this was the behavior).

Yes but the point is you don't need POSIX ACL's on your file system if you are doing NFS exports if you use Ganesha as your NFS server and only do NFSv4 exports. It is then down to the client to deal with the ACL's which the Linux client does. In fact it has for as long as I can remember. There are even tools to manipulate the NFSv4 ACL's (see nfs4-acl-tools on RHEL and derivatives).

What's missing is "rich ACL" support in the Linux kernel (the project page seems to be down at the moment), though there has been activity on the user space utilities.

Is it possible to get IBM to devote some resources to moving this along. It would make using GPFS on Linux with ACL's a more pleasant experience.

JAB.

--
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] default owner and group for POSIX ACLs

2019-10-15 Thread Frederick Stock
Thanks Paul.  Could you please clarify which ACL you changed, the GPFS NFSv4 ACL or the POSIX ACL?
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
 
 
- Original message -From: Paul Ward Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug main discussion list Cc:Subject: [EXTERNAL] Re: [gpfsug-discuss] default owner and group for POSIX ACLsDate: Tue, Oct 15, 2019 12:18 PM  
Hi Fred,
 
From the tests I have done changing the ACL results in just an ‘update’ to when using Spectrum Protect, even on migrated files.
 
Kindest regards,
Paul
 
Paul Ward
TS Infrastructure Architect
Natural History Museum
T: 02079426450
E: p.w...@nhm.ac.uk
 
From: gpfsug-discuss-boun...@spectrumscale.org  On Behalf Of Frederick StockSent: 15 October 2019 17:09To: gpfsug-discuss@spectrumscale.orgCc: gpfsug-discuss@spectrumscale.orgSubject: Re: [gpfsug-discuss] default owner and group for POSIX ACLs
 
As I understand if you change only the POSIX attributes on a file then you are correct that TSM will only backup the file metadata, actually just the POSIX relevant metadata.  However, if you change ACLs or other GPFS specific metadata then TSM will backup the entire file, TSM does not keep all file metadata separate from the actual file data.
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
 
 
- Original message -From: Simon Thompson Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug main discussion list Cc:Subject: [EXTERNAL] Re: [gpfsug-discuss] default owner and group for POSIX ACLsDate: Tue, Oct 15, 2019 11:41 AM  
I thought Spectrum Protect didn't actually backup again on a file owner change. Sure mmbackup considers it, but I think Protect just updates the metadata. There are also some other options for dsmc that can stop other similar issues if you change ctime maybe.(Other backup tools are available)SimonOn 15/10/2019, 15:31, "gpfsug-discuss-boun...@spectrumscale.org on behalf of Jonathan Buzzard"  wrote:    On Tue, 2019-10-15 at 12:34 +, Paul Ward wrote:    > We are in the process of changing the way GPFS assigns UID/GIDs from    > internal tdb to using AD RIDs with an offset that matches our linux    > systems. We, therefore, need to change the ACLs for all the files in    > GPFS (up to 80 million).        You do realize that will mean backing everything up again        > We are running in mixed ACL mode, with some POSIX and some NFSv4 ACLs    > being applied. (This system was set up 14 years ago and has changed    > roles over time) We are running on linux, so need to have POSIX    > permissions enabled.        We run on Linux and only have NFSv4 ACL's applied. I am not sure why    you need POSIX ACL's if you are running Linux. Very very few    applications will actually check ACL's or even for that matter    permissions. They just do an fopen call or similar and the OS either    goes yeah or neah, and the app needs to do something in the case of    neah.        >      > What I want to know for those in a similar environment, what do you    > have as the POSIX owner and group, when NFSv4 ACLs are in use?    > root:root    >      > or do you have all files owned by a filesystem administrator account    > and group:    > :    >      > on our samba shares we have :    > admin users = @                      > So don’t actually need the group defined in POSIX.    >        Samba works much better with NFSv4 ACL's.        JAB.        --    Jonathan A. Buzzard                         Tel: +44141-5483420    HPC System Administrator, ARCHIE-WeSt.    University of Strathclyde, John Anderson Building, Glasgow. G4 0NG                ___    gpfsug-discuss mailing list    gpfsug-discuss at spectrumscale.org    gpfsug.org     ___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orggpfsug.org  
 
 
___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss 
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] default owner and group for POSIX ACLs

2019-10-15 Thread Frederick Stock
As I understand if you change only the POSIX attributes on a file then you are correct that TSM will only backup the file metadata, actually just the POSIX relevant metadata.  However, if you change ACLs or other GPFS specific metadata then TSM will backup the entire file, TSM does not keep all file metadata separate from the actual file data.
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
 
 
- Original message -From: Simon Thompson Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug main discussion list Cc:Subject: [EXTERNAL] Re: [gpfsug-discuss] default owner and group for POSIX ACLsDate: Tue, Oct 15, 2019 11:41 AM 
I thought Spectrum Protect didn't actually backup again on a file owner change. Sure mmbackup considers it, but I think Protect just updates the metadata. There are also some other options for dsmc that can stop other similar issues if you change ctime maybe.(Other backup tools are available)SimonOn 15/10/2019, 15:31, "gpfsug-discuss-boun...@spectrumscale.org on behalf of Jonathan Buzzard"  wrote:    On Tue, 2019-10-15 at 12:34 +, Paul Ward wrote:    > We are in the process of changing the way GPFS assigns UID/GIDs from    > internal tdb to using AD RIDs with an offset that matches our linux    > systems. We, therefore, need to change the ACLs for all the files in    > GPFS (up to 80 million).        You do realize that will mean backing everything up again...        > We are running in mixed ACL mode, with some POSIX and some NFSv4 ACLs    > being applied. (This system was set up 14 years ago and has changed    > roles over time) We are running on linux, so need to have POSIX    > permissions enabled.        We run on Linux and only have NFSv4 ACL's applied. I am not sure why    you need POSIX ACL's if you are running Linux. Very very few    applications will actually check ACL's or even for that matter    permissions. They just do an fopen call or similar and the OS either    goes yeah or neah, and the app needs to do something in the case of    neah.        >      > What I want to know for those in a similar environment, what do you    > have as the POSIX owner and group, when NFSv4 ACLs are in use?    > root:root    >      > or do you have all files owned by a filesystem administrator account    > and group:    > :    >      > on our samba shares we have :    > admin users = @                      > So don’t actually need the group defined in POSIX.    >        Samba works much better with NFSv4 ACL's.        JAB.        --    Jonathan A. Buzzard                         Tel: +44141-5483420    HPC System Administrator, ARCHIE-WeSt.    University of Strathclyde, John Anderson Building, Glasgow. G4 0NG                ___    gpfsug-discuss mailing list    gpfsug-discuss at spectrumscale.org    http://gpfsug.org/mailman/listinfo/gpfsug-discuss     ___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss  
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Backup question

2019-08-29 Thread Frederick Stock
The only integrated solution, that is using mmbackup, is with Spectrum Protect.  However, you can use other products to backup GPFS file systems assuming they use POSIX commands/interfaces to do the backup.  I think CommVault has a backup client that makes use of the GPFS policy engine so it runs similar to how mmbackup works for Spectrum Protect.  If you do research third party backup options you should note if they make use of the GPFS policy engine.  You can do backups without it but using the policy engine does improve performance.
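As an illustration only (not any particular vendor's integration), a policy-engine-driven backup typically builds its work list with an external list rule rather than walking the directory tree; the one-day modification window and the paths here are placeholders:

    /* Hypothetical rules: write candidate files to a list instead of executing anything. */
    RULE EXTERNAL LIST 'tobackup' EXEC ''
    RULE 'changedFiles' LIST 'tobackup'
        WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(MODIFICATION_TIME)) <= 1

    # With -I defer this produces /tmp/backup.list.tobackup, which a backup
    # tool can consume as its input list.
    mmapplypolicy /gpfs/gpfs0 -P /tmp/backup.pol -f /tmp/backup -I defer -N nsd01,nsd02 -g /gpfs/gpfs0/tmp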
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
 
 
- Original message -From: Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: Cc:Subject: [EXTERNAL] [gpfsug-discuss] Backup questionDate: Thu, Aug 29, 2019 10:08 AM  
 
 
Are there any other options to back up GPFS other than Spectrum Protect?
  

___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss 
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Checking for Stale File Handles

2019-08-09 Thread Frederick Stock
Are you able to explain why you want to check for stale file handles?  Are you attempting to detect failures of some sort, and why do the existing mechanisms in GPFS not provide the functionality you require?
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
 
 
- Original message -From: Alexander John Mamach Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: "gpfsug-discuss@spectrumscale.org" Cc:Subject: [EXTERNAL] [gpfsug-discuss] Checking for Stale File HandlesDate: Fri, Aug 9, 2019 1:46 PM 
Hi folks,
 
We’re currently investigating a way to check for stale file handles on the nodes across our cluster in a way that minimizes impact to the filesystem and performance.
 
Has anyone found a direct way of doing so? We considered a few methods, including simply attempting to ls a GPFS filesystem from each node, but that might have false positives, (detecting slowdowns as stale file handles), and could negatively impact performance with hundreds of nodes doing this simultaneously.
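A minimal sketch of such a per-node probe (the mount point, the timeout value and the wording of the error check are assumptions); 'timeout' keeps a hung mount from stalling the check, and the grep distinguishes ESTALE from an ordinary slowdown:

    # Probe the mount point; distinguish "stale" from "slow" and "other error".
    if timeout 10 stat -f /gpfs/gpfs0 >/dev/null 2>/tmp/gpfs-probe.err; then
        echo "$(hostname): OK"
    elif grep -q "Stale file handle" /tmp/gpfs-probe.err; then
        echo "$(hostname): STALE"
    else
        echo "$(hostname): SUSPECT ($(tr -d '\n' < /tmp/gpfs-probe.err))"
    fi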
 
Thanks,
 
Alex
 
Senior Systems AdministratorResearch Computing InfrastructureNorthwestern University Information Technology (NUIT)2020 Ridge AveEvanston, IL 60208-4311O: (847) 491-2219M: (312) 887-1881www.it.northwestern.edu
 
___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss 
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Intro, and Spectrum Archive self-service recall interface question

2019-05-20 Thread Frederick Stock
Todd I am not aware of any tool that provides the out of band recall that you propose, though it would be quite useful.  However, I wanted to note that, as I understand it, the reason the Mac client initiates the file recalls is that the Mac SMB client ignores the archive bit in the SMB protocol, which indicates that a file does not reside in online storage.  To date, efforts to have Apple change their SMB client to respect the archive bit have not been successful, but if you feel so inclined we would be grateful if you would submit a request to Apple for them to change their SMB client to honor the archive bit and thus avoid file recalls.
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
 
 
- Original message -From: Todd Ruston Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug-discuss@spectrumscale.orgCc:Subject: [EXTERNAL] [gpfsug-discuss] Intro, and Spectrum Archive self-service recall interface questionDate: Mon, May 20, 2019 4:12 PM 
Greetings all,First post here, so by way of introduction we are a fairly new Spectrum Scale and Archive customer (installed last year and live in production Q1 this year). We have a four node (plus EMS) ESS system with ~520TB of mixed spinning disk and SSD. Client access to the system is via CES (NFS and SMB, running on two protocol nodes), integrated with Active Directory, for a mixed population of Windows, Mac, and Linux clients. A separate pair of nodes run Spectrum Archive, with a TS4500 LTO-8 library behind them.We use the system for general institute data, with the largest data types being HD video, multibeam sonar, and hydrophone data. Video is the currently active data type in production; we will be migrating the rest over time. So far things are running pretty well.Our archive approach is to premigrate data, particularly the large, unchanging data like the above mentioned data types, almost immediately upon landing in the system. Then we migrate those that have not been accessed in a period of time (or manually if space demands require it). We do wish to allow users to recall archived data on demand as needed.Because we have a large contingent of Mac clients (accessing the system via SMB), one issue we want to get ahead of is inadvertent recalls triggered by Mac preview generation, Quick Look, Cover Flow/Gallery view, and the like. Going in we knew this was going to be something we'd need to address, and we anticipated being able to configure Finder to disable preview generation and train users to avoid Quick Look unless they intended to trigger a recall. In our testing however, even with those features disabled/avoided, we have seen Mac clients trigger inadvertent recalls just from CLI 'ls -lshrt' interactions with the system.While brainstorming ways to prevent these inadvertent recalls while still allowing users to initiate recalls on their own when needed, one thought that came to us is we might be able to turn off recalls via SMB (setgpfs:recalls = no via mmsmb), and create a simple self-service web portal that would allow users to browse the Scale file system with a web browser, select files for recall, and initiate the recall from there. The web interface could run on one of the Archive nodes, and the back end of it would simply send a list of selected file paths to ltfsee recall.Before possibly reinventing the wheel, I thought I'd check to see if something like this may already exist, either from IBM, the Scale user community, or a third-party/open source tool that could be leveraged for the purpose. I searched the list archive and didn't find anything, but please let me know if I missed something. And please let me know if you know of something that would fit this need, or other ideas as well.Cheers,--Todd E. RustonInformation Systems ManagerMonterey Bay Aquarium Research Institute (MBARI)7700 Sandholdt Road, Moss Landing, CA, 95039Phone 831-775-1997      Fax 831-775-1652      http://www.mbari.org___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss 
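Purely as a sketch of the idea described above (not an existing IBM tool): the export name and list path are placeholders, and in particular the exact Spectrum Archive recall invocation is an assumption that differs between ltfsee/eeadm releases, so check your own documentation before relying on it:

    # 1) Stop SMB from triggering transparent recalls on the share.
    mmsmb export change projects --option "gpfs:recalls=no"

    # 2) Web portal back end: the UI writes the user's selected paths, one per
    #    line, into a list file and hands it to Spectrum Archive for recall.
    ltfsee recall /var/tmp/recall-request-42.list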
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] GPFS v5: Blocksizes and subblocks

2019-03-27 Thread Frederick Stock
Kevin you are correct, it is one "system" storage pool per file system not cluster.
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
 
 
- Original message -From: "Buterbaugh, Kevin L" Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug main discussion list Cc:Subject: Re: [gpfsug-discuss] GPFS v5: Blocksizes and subblocksDate: Wed, Mar 27, 2019 10:33 AM  Hi All,
 
So I was looking at the presentation referenced below and it states - on multiple slides - that there is one system storage pool per cluster.  Really?  Shouldn’t that be one system storage pool per filesystem?!?  If not, please explain how in my GPFS cluster with two (local) filesystems I see two different system pools with two different sets of NSDs, two different capacities, and two different percentages full???
 
Thanks…
 
Kevin
 
—
Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and Education
kevin.buterba...@vanderbilt.edu - (615)875-9633
 
On Mar 26, 2019, at 11:27 AM, Dorigo Alvise (PSI)  wrote: 

Hi Marc,
"Indirect block size" is well explained in this presentation: 
 
http://files.gpfsug.org/presentations/2016/south-bank/D2_P2_A_spectrum_scale_metadata_dark_V2a.pdf
 
pages 37-41
 
Cheers,
 
   Alvise
 
From: gpfsug-discuss-boun...@spectrumscale.org [gpfsug-discuss-boun...@spectrumscale.org] on behalf of Caubet Serrabou Marc (PSI) [marc.cau...@psi.ch]Sent: Tuesday, March 26, 2019 4:39 PMTo: gpfsug main discussion listSubject: [gpfsug-discuss] GPFS v5: Blocksizes and subblocks 
 
Hi all,
 
according to several GPFS presentations as well as according to the man pages:
 
Table 1. Block sizes and subblock sizes

+----------------------------------------+---------------+
| Block size                             | Subblock size |
+----------------------------------------+---------------+
| 64 KiB                                 | 2 KiB         |
| 128 KiB                                | 4 KiB         |
| 256 KiB, 512 KiB, 1 MiB, 2 MiB, 4 MiB  | 8 KiB         |
| 8 MiB, 16 MiB                          | 16 KiB        |
+----------------------------------------+---------------+
A block size of 8MiB or 16MiB should contain subblocks of 16KiB.
 
However, when creating a new filesystem with 16MiB blocksize, looks like is using 128KiB subblocks:
 
[root@merlindssio01 ~]# mmlsfs merlin
flag                value                    description
------------------- ------------------------ -----------------------------------
 -f                 8192                     Minimum fragment (subblock) size in bytes (system pool)
                    131072                   Minimum fragment (subblock) size in bytes (other pools)
 -i                 4096                     Inode size in bytes
 -I                 32768                    Indirect block size in bytes
.
.
.
 -n                 128                      Estimated number of nodes that will mount file system
 -B                 1048576                  Block size (system pool)
                    16777216                 Block size (other pools)
.
.
.
 
What am I missing? According to documentation, I expect this to be a fixed value, or it isn't at all?
 
On the other hand, I don't really understand the concept 'Indirect block size in bytes', can somebody clarify or provide some details about this setting?
 
Thanks a lot and best regards,
Marc   
_Paul Scherrer Institut High Performance ComputingMarc Caubet SerrabouBuilding/Room: WHGA/019A
Forschungsstrasse, 111
5232 Villigen PSISwitzerlandTelephone: +41 56 310 46 67E-Mail: marc.cau...@psi.ch___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttps://nam04.safelinks.protection.outlook.com/?url="">
___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] mmlsquota output

2019-03-25 Thread Frederick Stock
It seems like a defect so I suggest you submit a help case for it.  If you are parsing the output you should consider using the -Y option since that should simplify any parsing.  I do not know if the mmrepquota command would be helpful to you but it is worth taking a look.
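A minimal parsing sketch (the user, file system and printed field names are assumptions; the column positions are taken from the HEADER record at run time rather than hard-coded, since they can change between releases):

    # Print name, block usage and block quota from the colon-delimited output.
    mmlsquota -u someuser -Y fs0 | awk -F: '
        /:HEADER:/ { for (i = 1; i <= NF; i++) col[$i] = i; next }
        col["name"] { print $col["name"], $col["blockUsage"], $col["blockQuota"] }'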
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
 
 
- Original message -From: Robert Horton Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: "gpfsug-discuss@spectrumscale.org" Cc:Subject: Re: [gpfsug-discuss] mmlsquota outputDate: Mon, Mar 25, 2019 6:06 AM 
I don't know the answer to your actual question, but have you thoughtabout using the REST-API rather than parsing the command outputs? I cansend over the Python stuff we're using if you mail me off list.RobOn Mon, 2019-03-25 at 09:38 +, Peter Childs wrote:> Can someone tell me I'm not reading this wrong.>> This is using Spectrum Scale 5.0.2-1>> It looks like the output from mmlsquota is not what it says>> In the man page it says,>> mmlsquota [-u User | -g Group] [-v | -q] [-e] [-C ClusterName]>           [-Y] [--block-size {BlockSize | auto}] [Device[:Fileset]> ...]>> however>> mmlsquota -u username fs:fileset>> Return the output for every fileset, not just the "fileset" I've> asked> for, this is same output as>> mmlsquota -u username fs>> Where I've not said the fileset.>> I can work around this, but I'm just checking this is not actually a> bug, that ought to be fixed.>> Long story is that I'm working on rewriting our quota report util> that> used be a long bash/awk script into a more easy to understand python> script, and I want to get the user quota info for just one fileset.>> Thanks in advance.>>--Robert Horton | Research Data Storage LeadThe Institute of Cancer Research | 237 Fulham Road | London | SW3 6JBT +44 (0)20 7153 5350 | E robert.hor...@icr.ac.uk | W www.icr.ac.uk |Twitter @ICR_LondonFacebook: www.facebook.com/theinstituteofcancerresearchThe Institute of Cancer Research: Royal Cancer Hospital, a charitable Company Limited by Guarantee, Registered in England under Company No. 534147 with its Registered Office at 123 Old Brompton Road, London SW7 3RP.This e-mail message is confidential and for use by the addressee only.  If the message is received by anyone other than the addressee, please return the message to the sender by replying to it and then delete the message from your computer and network.___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss 
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Calculate evicted space with a policy

2019-03-19 Thread Frederick Stock
You can scan for files using the MISC_ATTRIBUTES and look for those that are not cached, that is without the 'u' setting, and track their file size.  I think that should work.
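A minimal sketch of such a scan (paths are placeholders; the 'u' test follows the flag described above, and the 'F' test is assumed to restrict the rule to regular files as documented for MISC_ATTRIBUTES):

    /* Hypothetical rules: list uncached (evicted) regular files with their size. */
    RULE EXTERNAL LIST 'evicted' EXEC ''
    RULE 'findEvicted' LIST 'evicted' SHOW(VARCHAR(FILE_SIZE))
        WHERE MISC_ATTRIBUTES LIKE '%F%' AND MISC_ATTRIBUTES NOT LIKE '%u%'

    # Writes the list to /tmp/evict.list.evicted; the SHOW column carries the size.
    mmapplypolicy /gpfs/cache/fileset1 -P /tmp/evicted.pol -f /tmp/evict -I defer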
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
 
 
- Original message -From: "Dorigo Alvise (PSI)" Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: "gpfsug-discuss@spectrumscale.org" Cc:Subject: [gpfsug-discuss] Calculate evicted space with a policyDate: Tue, Mar 19, 2019 5:27 AM 
Dear users,
is there a way (through a policy) to list the files (and their size) that are actually completely evicted by AFM from the cache filesystem ?
 
I used a policy with the clause KB_ALLOCATED=0, but it is clearly not precise, because it also includes files that are not evicted, but are so small that they fit into their inodes (I'm assuming that GPFS inode structure has this feature similar to some regular filesystems, like ext4... otherwise I could not explain some non empty file with 0 allocated KB that have been fetched, i.e. non-evicted).
 
Many thanks,
 
   Alvise
___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystem

2019-03-14 Thread Frederick Stock
But if all you are waiting for is the mount to occur, the invocation of the callback informs you the file system has been mounted.  You would be free to start a command in the background, with appropriate protection, and exit the callback script.  Also, making the callback script run asynchronously means GPFS will not wait for it to complete, and that greatly mitigates any potential problems with GPFS commands, if you need to run them from the script.
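A minimal sketch of that registration (the callback name, script path and use of %fsName are illustrative):

    # Register an asynchronous callback that fires when a file system is mounted;
    # the script can check %fsName and then start the application in the background.
    mmaddcallback startAppOnMount \
        --command /usr/local/sbin/start-app-on-mount.sh \
        --event mount --async --parms "%fsName"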
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
 
 
- Original message -From: "Stephen R Buchanan" Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug-discuss@spectrumscale.orgCc:Subject: Re: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystemDate: Thu, Mar 14, 2019 4:52 PM 
The man page for mmaddcallback specifically cautions against running "commands that involve GPFS files" because it "may cause unexpected and undesired results, including loss of file system availability." While I can imagine some kind of Rube Goldberg-esque chain of commands that I could run locally that would trigger the GPFS-filesystem-based commands I really want, I don't think mmaddcallback is the droid I'm looking for.
 
Stephen R. Wall BuchananSr. IT SpecialistIBM Data & AI North America Government Expert Labs+1 (571) 299-4601stephen.bucha...@us.ibm.com
 
 
- Original message -From: "Frederick Stock" Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug-discuss@spectrumscale.orgCc: gpfsug-discuss@spectrumscale.orgSubject: Re: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystemDate: Thu, Mar 14, 2019 4:17 PM 
It is not systemd based but you might want to look at the user callback feature in GPFS (mmaddcallback).  There is a file system mount callback you could register.
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
 
 
- Original message -From: "Stephen R Buchanan" Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug-discuss@spectrumscale.orgCc:Subject: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystemDate: Thu, Mar 14, 2019 3:58 PM 
I searched the list archives with no obvious results.
 
I have an application that runs completely from a Spectrum Scale filesystem that I would like to start automatically on boot, obviously after the SS filesystem mounts, on multiple nodes. There are groups of nodes for dev, test, and production, (separate clusters) and the target filesystems are different between them (and are named differently, so the paths are different), but all nodes have an identical soft link from root (/) that points to the environment-specific path. (see below for details)
 
My first effort before I did any research was to try to simply use a directive of After=gpfs.service which anyone who has tried it will know that the gpfs.service returns as "started" far in advance (and independently of) when filesystems are actually mounted.
 
What I want is to be able to deploy a systemd service-unit and path-unit pair of files (that are as close to identical as possible across the environments) that wait for /appbin/builds/ to be available (/[dev|tst|prd]01/ to be mounted) and then starts the application. The problem is that systemd.path units, specifically the 'PathExists=' directive, don't follow symbolic links, so I would need to customize the path unit file for each environment with the full (real) path. There are other differences between the environments that I believe I can handle by specifying an EnvironmentFile directive -- but that would come from the SS filesystem so as to be a single reference point, so it can't help with the path unit.
 
Any suggestions are welcome and appreciated.
 
dev:(path names have been slightly generalized, but the structure is identical)
SS filesystem: /dev01
full path: /dev01/app-bin/user-tree/builds/
soft link: /appbin/ -> /dev01/app-bin/user-tree/
 
test:
SS filesystem: /tst01
full path: /tst01/app-bin/user-tree/builds/
soft link: /appbin/ -> /tst01/app-bin/user-tree/
 
prod:
SS filesystem: /prd01
full path: /prd01/app-bin/user-tree/builds/
soft link: /appbin/ -> /prd01/app-bin/user-tree/
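One possible workaround, offered only as a sketch: a plain shell test (unlike the PathExists= directive of systemd.path units) does follow symbolic links, so a single service unit can poll the /appbin link until the real directory appears, identically in all three environments. The unit name, app launcher and poll interval are placeholders:

    # /etc/systemd/system/myapp.service (hypothetical)
    [Unit]
    Description=Application living on a Spectrum Scale file system
    After=gpfs.service

    [Service]
    Type=simple
    # 'test -d' follows the /appbin symlink, so the same unit works in dev,
    # test and prod; wait until the builds directory is really reachable.
    ExecStartPre=/bin/bash -c 'until [ -d /appbin/builds ]; do sleep 5; done'
    TimeoutStartSec=infinity
    ExecStart=/appbin/builds/start-app.sh

    [Install]
    WantedBy=multi-user.target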
 
 
Stephen R. Wall BuchananSr. IT SpecialistIBM Data & AI North America Government Expert Labs+1 (571) 299-4601stephen.bucha...@us.ibm.com 

___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss
  

___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss
  

___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss
 

_

Re: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystem

2019-03-14 Thread Frederick Stock
It is not systemd based but you might want to look at the user callback feature in GPFS (mmaddcallback).  There is a file system mount callback you could register.
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
 
 
- Original message -From: "Stephen R Buchanan" Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug-discuss@spectrumscale.orgCc:Subject: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystemDate: Thu, Mar 14, 2019 3:58 PM 
I searched the list archives with no obvious results.
 
I have an application that runs completely from a Spectrum Scale filesystem that I would like to start automatically on boot, obviously after the SS filesystem mounts, on multiple nodes. There are groups of nodes for dev, test, and production, (separate clusters) and the target filesystems are different between them (and are named differently, so the paths are different), but all nodes have an identical soft link from root (/) that points to the environment-specific path. (see below for details)
 
My first effort before I did any research was to try to simply use a directive of After=gpfs.service which anyone who has tried it will know that the gpfs.service returns as "started" far in advance (and independently of) when filesystems are actually mounted.
 
What I want is to be able to deploy a systemd service-unit and path-unit pair of files (that are as close to identical as possible across the environments) that wait for /appbin/builds/ to be available (/[dev|tst|prd]01/ to be mounted) and then starts the application. The problem is that systemd.path units, specifically the 'PathExists=' directive, don't follow symbolic links, so I would need to customize the path unit file for each environment with the full (real) path. There are other differences between the environments that I believe I can handle by specifying an EnvironmentFile directive -- but that would come from the SS filesystem so as to be a single reference point, so it can't help with the path unit.
 
Any suggestions are welcome and appreciated.
 
dev:(path names have been slightly generalized, but the structure is identical)
SS filesystem: /dev01
full path: /dev01/app-bin/user-tree/builds/
soft link: /appbin/ -> /dev01/app-bin/user-tree/
 
test:
SS filesystem: /tst01
full path: /tst01/app-bin/user-tree/builds/
soft link: /appbin/ -> /tst01/app-bin/user-tree/
 
prod:
SS filesystem: /prd01
full path: /prd01/app-bin/user-tree/builds/
soft link: /appbin/ -> /prd01/app-bin/user-tree/
 
 
Stephen R. Wall BuchananSr. IT SpecialistIBM Data & AI North America Government Expert Labs+1 (571) 299-4601stephen.bucha...@us.ibm.com 

___gpfsug-discuss mailing listgpfsug-discuss at spectrumscale.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] mmbackup: how to keep list(expiredFiles, updatedFiles) files

2019-03-12 Thread Frederick Stock
In the mmbackup man page look at the settings for the DEBUGmmbackup variable.  There is a value that will keep the temporary files.
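A minimal example of how that is typically used, assuming (as the man page should confirm for your level) that the value 2 is the one that preserves the temporary files:

    # Keep the .mmbackupCfg working files (expiredFiles, updatedFiles, ...) for
    # post-analysis instead of letting mmbackup clean them up.
    DEBUGmmbackup=2 mmbackup /gpfs/fileset1 -t incremental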
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
 
 
- Original message -From: "Jaime Pinto" Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: "gpfsug main discussion list" Cc:Subject: [gpfsug-discuss] mmbackup: how to keep list(expiredFiles, updatedFiles) filesDate: Tue, Mar 12, 2019 10:28 AM 
How can I instruct mmbackup to *NOT* delete the temporary directories and files created inside the FILESET/.mmbackupCfg folder? I can see that during the process the folders expiredFiles & updatedFiles are there, and contain the lists I'm interested in for post-analysis.

Thanks
Jaime

---
Jaime Pinto - Storage Analyst
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477

This message was sent using IMP at SciNet Consortium, University of Toronto.
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Migrating billions of files?

2019-03-06 Thread Frederick Stock
Does Aspera require a license? 
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
 
 
- Original message -From: "Yaron Daniel" Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug main discussion list Cc:Subject: Re: [gpfsug-discuss] Migrating billions of files?Date: Wed, Mar 6, 2019 4:18 AM HiU can also use today Aspera - which will replicate gpfs extended attr.Integration of IBM Aspera Sync with IBM Spectrum Scale: Protecting and Sharing Files Globallyhttp://www.redbooks.ibm.com/redpieces/abstracts/redp5527.html?Open Regards 
  Yaron Daniel 94 Em Ha'Moshavot RdStorage Architect – IL Lab Services (Storage) Petach Tiqva, 49527IBM Global Markets, Systems HW Sales Israel   Phone:+972-3-916-5672  Fax:+972-3-916-5672   Mobile:+972-52-8395593   e-mail:y...@il.ibm.com   IBM Israel     
   From:        Simon Thompson To:        gpfsug main discussion list Date:        03/06/2019 11:08 AMSubject:        Re: [gpfsug-discuss] Migrating billions of files?Sent by:        gpfsug-discuss-boun...@spectrumscale.org
 
AFM doesn’t work well if you have dependent filesets though .. which we did for quota purposes.
 
Simon
 
From:  on behalf of "y...@il.ibm.com" Reply-To: "gpfsug-discuss@spectrumscale.org" Date: Wednesday, 6 March 2019 at 09:01To: "gpfsug-discuss@spectrumscale.org" Subject: Re: [gpfsug-discuss] Migrating billions of files?
 
Hi
What permissions do you have? Do you have only POSIX, or also SMB attributes? If only POSIX attributes you can do the following:
- rsync (which will work on different filesets/directories in parallel)
- AFM (but if you need rollback it will be problematic)
Regards


  Yaron Daniel 94 Em Ha'Moshavot RdStorage Architect – IL Lab Services (Storage) Petach Tiqva, 49527IBM Global Markets, Systems HW Sales Israel   Phone:+972-3-916-5672  Fax:+972-3-916-5672   Mobile:+972-52-8395593   e-mail:y...@il.ibm.com   IBM Israel     
 From:        "Oesterlin, Robert" To:        gpfsug main discussion list Date:        03/05/2019 11:57 PMSubject:        [gpfsug-discuss] Migrating billions of files?Sent by:        gpfsug-discuss-boun...@spectrumscale.org

 
I’m looking at migration 3-4 Billion files, maybe 3PB of data between GPFS 

Re: [gpfsug-discuss] Clarification of mmdiag --iohist output

2019-02-21 Thread Frederick Stock
Kevin I'm assuming you have seen the article on IBM developerWorks about the GPFS NSD queues.  It provides useful background for analyzing the dump nsd information.  Here I'll list some thoughts for items that you can investigate/consider.
 
If your NSD servers are doing both large (greater than 64K) and small (64K or less) IOs then you want to have the nsdSmallThreadRatio set to 1 as it seems you do for the NSD servers.  This provides an equal number of SMALL and LARGE NSD queues.  You can also increase the total number of queues (currently 256) but I cannot determine if that is necessary from the data you provided.  Only on rare occasions have I seen a need to increase the number of queues.
 
The fact that you have 71 highest pending on your LARGE queues and 73 highest pending on your SMALL queues would imply your IOs are queueing for a good while either waiting for resources in GPFS or waiting for IOs to complete.  Your maximum buffer size is 16M which is defined to be the largest IO that can be requested by GPFS.  This is the buffer size that GPFS will use for LARGE IOs.  You indicated you had sufficient memory on the NSD servers but what is the value for the pagepool on those servers, and what is the value of the nsdBufSpace parameter?   If the NSD server is just that then usually nsdBufSpace is set to 70.  The IO buffers used by the NSD server come from the pagepool so you need sufficient space there for the maximum number of LARGE IO buffers that would be used concurrently by GPFS or threads will need to wait for those buffers to become available.  Essentially you want to ensure you have sufficient memory for the maximum number of IOs all doing a large IO and that value being less than 70% of the pagepool size.
 
You could look at the settings for the FC cards to ensure they are configured to do the largest IOs possible.  I forget the actual values (I have not done this for a while) but there are settings for the adapters that control the maximum IO size that will be sent.  I think you want this to be as large as the adapter can handle to reduce the number of messages needed to complete the large IOs done by GPFS.
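A few hedged commands that go with the above (the node names and the 16G value are only examples; pagepool changes normally take effect at the next mmfsd restart unless applied immediately where your level supports it):

    # Current memory/thread settings relevant to the NSD queues.
    mmdiag --config | grep -Ei "pagepool|nsdBufSpace|nsdMaxWorkerThreads|nsdSmallThreadRatio"

    # Per-queue statistics; non-zero "highest pending" means requests had to wait.
    mmfsadm saferdump nsd | grep -E "NsdQueue|requests pending"

    # If the pagepool cannot hold enough 16M buffers, grow it on the NSD servers.
    mmchconfig pagepool=16G -N nsd01,nsd02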
 
 
Fred__Fred Stock | IBM Pittsburgh Lab | 720-430-8821sto...@us.ibm.com
 
 
- Original message -From: "Buterbaugh, Kevin L" Sent by: gpfsug-discuss-boun...@spectrumscale.orgTo: gpfsug main discussion list Cc:Subject: Re: [gpfsug-discuss] Clarification of mmdiag --iohist outputDate: Thu, Feb 21, 2019 6:39 AM  Hi All,
 
My thanks to Aaron, Sven, Steve, and whoever responded for the GPFS team.  You confirmed what I suspected … my example 10 second I/O was _from an NSD server_ … and since we’re in a 8 Gb FC SAN environment, it therefore means - correct me if I’m wrong about this someone - that I’ve got a problem somewhere in one (or more) of the following 3 components:
 
1) the NSD servers
2) the SAN fabric
3) the storage arrays
 
I’ve been looking at all of the above and none of them are showing any obvious problems.  I’ve actually got a techie from the storage array vendor stopping by on Thursday, so I’ll see if he can spot anything there.  Our FC switches are QLogic’s, so I’m kinda screwed there in terms of getting any help.  But I don’t see any errors in the switch logs and “show perf” on the switches is showing I/O rates of 50-100 MB/sec on the in use ports, so I don’t _think_ that’s the issue.
 
And this is the GPFS mailing list, after all … so let’s talk about the NSD servers.  Neither memory (64 GB) nor CPU (2 x quad-core Intel Xeon E5620’s) appear to be an issue.  But I have been looking at the output of “mmfsadm saferdump nsd” based on what Aaron and then Steve said.  Here’s some fairly typical output from one of the SMALL queues (I’ve checked several of my 8 NSD servers and they’re all showing similar output):
 
    Queue NSD type NsdQueueTraditional [244]: SMALL, threads started 12, active 3, highest 12, deferred 0, chgSize 0, draining 0, is_chg 0
     requests pending 0, highest pending 73, total processed 4859732
     mutex 0x7F3E449B8F10, reqCond 0x7F3E449B8F58, thCond 0x7F3E449B8F98, queue 0x7F3E449B8EF0, nFreeNsdRequests 29
 
And for a LARGE queue:
 
    Queue NSD type NsdQueueTraditional [8]: LARGE, threads started 12, active 1, highest 12, deferred 0, chgSize 0, draining 0, is_chg 0
     requests pending 0, highest pending 71, total processed 2332966
     mutex 0x7F3E441F3890, reqCond 0x7F3E441F38D8, thCond 0x7F3E441F3918, queue 0x7F3E441F3870, nFreeNsdRequests 31
 
So my large queues seem to be slightly less utilized than my small queues overall … i.e. I see more inactive large queues and they generally have a smaller “highest pending” value.
 
Question:  are those non-zero “highest pending” values something to be concerned about?
 
I have the following thread-related parameters set:
 
[common]
maxReceiverThreads 12
nsdMaxWorkerThreads 640
nsdThreadsPerQueue 4
nsdSmallThreadRatio 3
workerThreads 128
 
[serverLicense]

Re: [gpfsug-discuss] Filesystem automount issues

2019-01-16 Thread Frederick Stock
What does the output of "mmlsmount all -L" show?

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   KG 
To: gpfsug main discussion list 
Date:   01/16/2019 11:19 AM
Subject:[gpfsug-discuss] Filesystem automount issues
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Hi 

IHAC running Scale 5.x on RHEL 7.5

One out of two filesystems (/home) does not get mounted automatically at 
boot. (/home is scale filesystem)

The scale log does mention that the filesystem is mounted but mount output 
says otherwise.

There are no entries for /home in fstab since we let scale mount it. 
Automount on scale and filesystem both have been set to yes.

Any pointers to troubleshoot would be appreciated.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss





___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] GPFS nodes crashing during policy scan

2018-12-06 Thread Frederick Stock
Hopefully you are aware that GPFS 3.5 has been out of service since April 
2017 unless you are on extended service.  Might be a good time to consider 
upgrading.

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   "Ratliff, John" 
To: "gpfsug-discuss@spectrumscale.org" 

Date:   12/06/2018 11:53 AM
Subject:[gpfsug-discuss] GPFS nodes crashing during policy scan
Sent by:gpfsug-discuss-boun...@spectrumscale.org



We’re trying to run a policy scan to get a list of all the files in one of 
our filesets. There are approximately 600 million inodes in this space. 
We’re running GPFS 3.5. Every time we run the policy scan, the node that 
is running it ends up crashing. It makes it through a quarter of the 
inodes before crashing (i.e. kernel panic and system reboot). Nothing in 
the GPFS logs shows anything. It just notes that the node rebooted.
 
In the crash logs of all the systems we’ve tried this on, we see the same 
line.
 
<1>BUG: unable to handle kernel NULL pointer dereference at 
00d8
<1>IP: [] 
_ZN6Direct5dreadEP15KernelOperationRK7FileUIDxiiiPvPFiS5_PKcixyS5_EPx+0xf2/0x590
 
[mmfs26]
 
Our policy scan rule is pretty simple:
 
RULE 'list-homedirs'
LIST 'list-homedirs'
 
mmapplypolicy /gs/home -A 607 -g /gpfs/tmp -f /gpfs/policy/output -N 
gpfs1,gpfs2,gpfs3,gpfs4 -P /tmp/homedirs.policy -I defer -L 1
 
Has anyone experienced something like this or have any suggestions on what 
to do to avoid it?
 
Thanks.
 
John Ratliff | Pervasive Technology Institute | UITS | Research Storage – 
Indiana University | http://pti.iu.edu
 [attachment "smime.p7s" deleted by Frederick Stock/Pittsburgh/IBM] 
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss




___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Long I/O's on client but not on NSD server(s)

2018-10-04 Thread Frederick Stock
My first guess would be the network between the NSD client and NSD server. 
 netstat and ethtool may help to determine where the cause may lie, if it 
is on the NSD client.  Obviously a switch on the network could be another 
source of the problem.
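A few first-pass checks on the client side, as a sketch (the interface name is a placeholder):

    ethtool eth0 | grep -i speed           # negotiated link speed and duplex
    ethtool -S eth0 | grep -iE "err|drop"  # NIC-level errors and drops
    netstat -s | grep -i retrans           # TCP retransmissions
    mmdiag --network                       # GPFS's own view of its node connections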

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   "Buterbaugh, Kevin L" 
To: gpfsug main discussion list 
Date:   10/04/2018 03:55 PM
Subject:[gpfsug-discuss] Long I/O's on client but not on NSD 
server(s)
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Hi All, 

What does it mean if I have a few dozen very long I/O’s (50 - 75 seconds) 
on a gateway as reported by “mmdiag —iohist” and they all reference two of 
my eight NSD servers…

… but then I go to those 2 NSD servers and I don’t see any long I/O’s at 
all?

In other words, if the problem (this time) were the backend storage, I 
should see long I/O’s on the NSD servers, right?

I’m thinking this indicates that there is some sort of problem with either 
the client gateway itself or the network in between the gateway and the 
NSD server(s) … thoughts???

Thanks in advance…

—
Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and 
Education
kevin.buterba...@vanderbilt.edu - (615)875-9633


___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss





___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Optimal range on inode count for a single folder

2018-09-11 Thread Frederick Stock
I am not sure I can provide you an optimal range but I can list some 
factors to consider.  In general the guideline is to keep directories to 
500K files or so.  Keeping your metadata on separate NSDs, and preferably 
fast NSDs, helps especially with directory listings.  And running the 
latest version of Scale also helps.

It is unclear to me why the number of files in a directory would impact 
remount unless these are exported directories and the remount is occurring 
on a user node that also attempts to scan through the directory.

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   "Michael Dutchak" 
To: gpfsug-discuss@spectrumscale.org
Date:   09/11/2018 09:21 AM
Subject:[gpfsug-discuss] Optimal range on inode count for a single 
folder
Sent by:gpfsug-discuss-boun...@spectrumscale.org



I would like to find out what the limitation, or optimal range on inode 
count for a single folder is in GPFS.  We have several users that have 
caused issues with our current files system by adding up to a million 
small files (1 ~ 40k) to a single directory.  This causes issues during 
system remount where restarting the system can take excessive amounts of 
time.
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss





___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] RAID type for system pool

2018-09-10 Thread Frederick Stock
My guess is that the "metadata" IO is either for directory data, since directories are considered metadata, or for fileset metadata.

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   "Buterbaugh, Kevin L" 
To: gpfsug main discussion list 
Date:   09/10/2018 02:27 PM
Subject:[gpfsug-discuss] RAID type for system pool
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Hi All, 

So while I’m waiting for the purchase of new hardware to go thru, I’m 
trying to gather more data about the current workload.  One of the things 
I’m trying to do is get a handle on the ratio of reads versus writes for 
my metadata.

I’m using “mmdiag —iohist” … in this case “dm-12” is one of my 
metadataOnly disks and I’m running this on the primary NSD server for that 
NSD.  I’m seeing output like:

11:22:13.931117  W   inode4:29984416310.448  srv dm-12 

11:22:13.932344  Rmetadata4:36659676 40.307  srv dm-12 

11:22:13.932005  W logData4:49676176 10.726  srv dm-12 
 

And I’m confused as to the difference between “inode” and “metadata” (I at 
least _think_ I understand “logData”)?!?  The man page for mmdiag doesn’t 
help and I’ve not found anything useful yet in my Googling.

This is on a filesystem that currently uses 512 byte inodes, if that 
matters.  Thanks…

Kevin

—
Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and 
Education
kevin.buterba...@vanderbilt.edu - (615)875-9633


___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss





___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Problem with mmlscluster and callback scripts

2018-09-07 Thread Frederick Stock
Are you really running version 5.0.2?  If so then I presume you have a 
beta version since it has not yet been released.  For beta problems there 
is a specific feedback mechanism that should be used to report problems.

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   Matthias Knigge 
To: "gpfsug-discuss@spectrumscale.org" 

Date:   09/07/2018 08:08 AM
Subject:[gpfsug-discuss] Problem with mmlscluster and callback 
scripts
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Hello together,
 
I am using the version 5.0.2.0 of GPFS and have problems with the command 
mmlscluster and callback-scripts. It is a small cluster of two nodes only. 
If I shutdown one of the nodes sometimes mmlscluster reports the following 
output:
[root@gpfs-tier1 gpfs5.2]# mmgetstate
 
Node number  Node nameGPFS state
---
   1  gpfs-tier1   arbitrating
[root@gpfs-tier1 gpfs5.2]# mmlscluster
ssh: connect to host gpfs-tier2 port 22: No route to host
mmlscluster: Unable to retrieve GPFS cluster files from node gpfs-tier2
mmlscluster: Command failed. Examine previous error messages to determine 
cause.
 
Normally the output is like this:
 
[root@gpfs-tier1 gpfs5.2]# mmlscluster
 
GPFS cluster information

  GPFS cluster name: TIERCLUSTER.gpfs-tier1
  GPFS cluster id:   12458173498278694815
  GPFS UID domain:   TIERCLUSTER.gpfs-tier1
  Remote shell command:  /usr/bin/ssh
  Remote file copy command:  /usr/bin/scp
  Repository type:   server-based
 
GPFS cluster configuration servers:
---
  Primary server:gpfs-tier2
  Secondary server:  gpfs-tier1
 
Node  Daemon node name  IP address  Admin node name  Designation
--
   1   gpfs-tier1192.168.178.10  gpfs-tier1   quorum-manager
   2   gpfs-tier2192.168.178.11  gpfs-tier2   quorum-manager
 
[root@gpfs-tier1 gpfs5.2]# mmlscallback
NodeDownCallback
command   = /var/mmfs/rs/nodedown.ksh
priority  = 1
event = quorumNodeLeave
parms = %eventNode %quorumNodes
 
NodeUpCallback
command   = /var/mmfs/rs/nodeup.ksh
priority  = 1
event = quorumNodeJoin
parms = %eventNode %quorumNodes
 
If I shutdown the filesystem via mmshutdown the callback script works but 
if I shutdown the whole node the scripts does not run.
The latest log-entry in mmfs.log.latest shows only this information:
 
2018-09-07_13:12:36.724+0200: [I] Cluster Manager connection broke. 
Probing cluster TIERCLUSTER.gpfs-tier1
2018-09-07_13:12:37.226+0200: [E] Unable to contact enough other quorum 
nodes during cluster probe.
2018-09-07_13:12:37.226+0200: [E] Lost membership in cluster 
TIERCLUSTER.gpfs-tier1. Unmounting file systems.
2018-09-07_13:12:38.448+0200: [N] Connecting to 192.168.178.11 gpfs-tier2 

 
Could anybody help me in this case? I want to try to start a script if one 
node goes down or up to change the roles for starting the filesystem. The 
callback event NodeLeave and NodeJoin do not run too.
Any more information required? If yes, please let me know!
 
Many thanks in advance and a nice weekend!
Matthias
 
Best Regards

Matthias Knigge
R File Based Media Solutions

Rohde & Schwarz 
GmbH & Co. KG
Hanomaghof 1
30449 Hannover
Telefon +49 511 67 80 7 213
Fax +49 511 37 19 74
Internet: matthias.kni...@rohde-schwarz.com

Geschäftsführung / Executive Board: Christian Leicher (Vorsitzender / 
Chairman), Peter Riedel, Sitz der Gesellschaft / Company's Place of 
Business: München, Registereintrag / Commercial Register No.: HRA 16 270, 
Persönlich haftender Gesellschafter / Personally Liable Partner: RUSEG 
Verwaltungs-GmbH, Sitz der Gesellschaft / Company's Place of Business: 
München, Registereintrag / Commercial Register No.: HRB 7 534, 
Umsatzsteuer-Identifikationsnummer (USt-IdNr.) / VAT Identification No.: 
DE 130 256 683, Elektro-Altgeräte Register (EAR) / WEEE Register No.: DE 
240 437 86
 ___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss





___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] mmbackup failed

2018-09-05 Thread Frederick Stock
There are options in the mmbackup command to rebuild the shadowDB file 
from data kept in TSM.  Be aware that using this option will take time to 
rebuild the shadowDB file, i.e. it is not a fast procedure.
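A hedged example of the kind of invocation meant here; whether -q or --rebuild is the right choice for your situation (and how long it runs) depends on your mmbackup level, so check the man page first:

    # Ask mmbackup to query the Spectrum Protect (TSM) server and rebuild the
    # shadow database before performing the incremental backup.
    mmbackup /dados -t incremental -q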

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   "Rafael Cezario" 
To: gpfsug-discuss@spectrumscale.org
Date:   09/05/2018 04:04 PM
Subject:[gpfsug-discuss] mmbackup failed
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Hi All,
 
I have a filesystem “/dados” with 900TB of data.
I have a backup routine with mmbackup and I receive several errors because of incorrect values in the file .mmbackupShadow.1.
 
The problem was resolved after I removed the lines of the file. 
 
Does anyone have a tool or utility to help me check the .mmbackupShadow file for incorrect rows?
 
Rafael Cezario
IBM Power
rafael.ceza...@ibm.com
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss





___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] RAID type for system pool

2018-09-05 Thread Frederick Stock
Another option for saving space is to not keep 2 copies of the metadata 
within GPFS.  The SSDs are mirrored so you have two copies though very 
likely they share a possible single point of failure and that could be a 
deal breaker.  I have my doubts that RAID5 will perform well for the 
reasons Marc described but worth testing to see how it does perform.  If 
you do test I presume you would also run equivalent tests with a RAID1 
(mirrored) configuration.

Regarding your point about making multiple volumes that would become GPFS 
NSDs for metadata.  It has been my experience that for traditional RAID 
systems it is better to have many small metadata LUNs (more IO paths) then 
a few large metadata LUNs.  This becomes less of an issue with ESS, i.e. 
there you can have a few metadata NSDs yet still get very good 
performance.

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   "Marc A Kaplan" 
To: gpfsug main discussion list 
Date:   09/05/2018 01:22 PM
Subject:Re: [gpfsug-discuss] RAID type for system pool
Sent by:gpfsug-discuss-boun...@spectrumscale.org



It's good to try to reason and think this out... But there's a good 
likelihood that we don't understand ALL the details, some of which may 
negatively impact performance - so no matter what scheme you come up with 
- test, test, and re-test before deploying and depending on it in 
production.

Having said that, I'm pretty sure that old "spinning" RAID 5 
implementations had horrible performance for GPFS metadata/system pool.
Why? Among other things, the large stripe size vs the almost random small 
writes directed to system pool.

That random-small-writes pattern won't change when we go to SSD RAID 5 - 
so you'd have to see if the SSD implementation is somehow smarter than an 
old fashioned RAID 5 implementation which I believe requires several 
physical reads and writes, for each "small" logical write.
(Top decent Google result I found quickly: 
http://rickardnobel.se/raid-5-write-penalty/ - but you will probably want to 
do more research!)
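
As a rough worked example of that penalty (the classic read-modify-write 
behaviour; numbers are illustrative only):

  RAID 1 small write: 2 back-end IOs (write the data to both mirror members)
  RAID 5 small write: 4 back-end IOs (read old data + read old parity,
                      then write new data + write new parity)

So for the same random-small-write workload, a 3+1P RAID 5 LUN needs roughly 
twice the back-end IOPS of a RAID 1 pair, unless the controller cache can 
absorb or coalesce the writes.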

Consider GPFS small write performance for:  inode updates, log writes, 
small files (possibly in inode), directory updates, allocation map 
updates, index of indirect blocks.



From:"Buterbaugh, Kevin L" 
To:gpfsug main discussion list 
Date:09/05/2018 11:36 AM
Subject:[gpfsug-discuss] RAID type for system pool
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Hi All, 

We are in the process of finalizing the purchase of some new storage 
arrays (so no sales people who might be monitoring this list need contact 
me) to life-cycle some older hardware.  One of the things we are 
considering is the purchase of some new SSD’s for our “/home” filesystem 
and I have a question or two related to that.

Currently, the existing home filesystem has its metadata on SSD's … two 
RAID 1 mirrors and metadata replication set to two.  However, the 
filesystem itself is old enough that it uses 512 byte inodes.  We have 
analyzed our users files and know that if we create a new filesystem with 
4K inodes that a very significant portion of the files would now have 
their _data_ stored in the inode as well due to the files being 3.5K or 
smaller (currently all data is on spinning HD RAID 1 mirrors).

Of course, if we increase the size of the inodes by a factor of 8 then we 
also need 8 times as much space to store those inodes.  Given that 
Enterprise class SSDs are still very expensive and our budget is not 
unlimited, we’re trying to get the best bang for the buck.

We have always - even back in the day when our metadata was on spinning 
disk and not SSD - used RAID 1 mirrors and metadata replication of two. 
However, we are wondering if it might be possible to switch to RAID 5? 
Specifically, what we are considering doing is buying 8 new SSDs and 
creating two 3+1P RAID 5 LUNs (metadata replication would stay at two). 
That would give us 50% more usable space than if we configured those same 
8 drives as four RAID 1 mirrors.

Unfortunately, unless I'm misunderstanding something, that would mean that 
the RAID stripe size and the GPFS block size could not match.  Therefore, 
even though we don't need the space, would we be much better off buying 10 
SSDs and creating two 4+1P RAID 5 LUNs?

I’ve searched the mailing list archives and scanned the DeveloperWorks 
wiki and even glanced at the GPFS documentation and haven’t found anything 
that says “bad idea, Kevin”… ;-)

Expanding on this further … if we just present those two RAID 5 LUNs to 
GPFS as NSDs then we can only have two NSD servers as primary for them. So 
another thing we’re considering is to take those RAID 5 LUNs and further 
sub-divide them into a total of 8 logical volumes, each of which could be 
a GPFS NSD and therefore would allow us to have each of our 8 NSD servers 
be primary for one of them.  Even worse idea?!?  Good idea?

Anybody 

Re: [gpfsug-discuss] Rebalancing with mmrestripefs -P

2018-08-20 Thread Frederick Stock
That should do what you want.  Be aware that mmrestripefs generates 
significant IO load so you should either use the QoS feature to mitigate 
its impact or run the command when the system is not very busy.
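
A sketch of the QoS route (verify the syntax against the mmchqos and 
mmrestripefs man pages for your level; the IOPS cap is made up and the 
device name is assumed, while the pool name is from your output below):

  # cap maintenance-class work (restripe, rebalance, ...) in that pool
  /usr/lpp/mmfs/bin/mmchqos gpfs --enable pool=cit_10tb,maintenance=1000iops,other=unlimited
  # then run the rebalance in the maintenance class
  /usr/lpp/mmfs/bin/mmrestripefs gpfs -b -P cit_10tb --qos maintenance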

Note that you have two more NSDs in the 33 failure group than in the 23 
failure group.  You may want to change one of those NSDs in failure 
group 33 to be in failure group 23 so that you have equal storage space in 
both failure groups.

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   David Johnson 
To: gpfsug main discussion list 
Date:   08/20/2018 12:55 PM
Subject:[gpfsug-discuss] Rebalancing with mmrestripefs -P
Sent by:gpfsug-discuss-boun...@spectrumscale.org



I have one storage pool that was recently doubled, and another pool 
migrated there using mmapplypolicy.
The new half is only 50% full, and the old half is 94% full. 

Disks in storage pool: cit_10tb (Maximum disk size allowed is 516 TB)
d05_george_23      50.49T   23   No   Yes      25.91T ( 51%)     18.93G ( 0%)
d04_george_23      50.49T   23   No   Yes      25.91T ( 51%)      18.9G ( 0%)
d03_george_23      50.49T   23   No   Yes       25.9T ( 51%)     19.12G ( 0%)
d02_george_23      50.49T   23   No   Yes       25.9T ( 51%)     19.03G ( 0%)
d01_george_23      50.49T   23   No   Yes       25.9T ( 51%)     18.92G ( 0%)
d00_george_23      50.49T   23   No   Yes      25.91T ( 51%)     19.05G ( 0%)
d06_cit_33         50.49T   33   No   Yes      3.084T (  6%)     70.35G ( 0%)
d07_cit_33         50.49T   33   No   Yes      3.084T (  6%)      70.2G ( 0%)
d05_cit_33         50.49T   33   No   Yes      3.084T (  6%)     69.93G ( 0%)
d04_cit_33         50.49T   33   No   Yes      3.085T (  6%)     70.11G ( 0%)
d03_cit_33         50.49T   33   No   Yes      3.084T (  6%)     70.08G ( 0%)
d02_cit_33         50.49T   33   No   Yes      3.083T (  6%)      70.3G ( 0%)
d01_cit_33         50.49T   33   No   Yes      3.085T (  6%)     70.25G ( 0%)
d00_cit_33         50.49T   33   No   Yes      3.083T (  6%)     70.28G ( 0%)
                -------------                ------------------ --------------
(pool total)       706.9T                     180.1T ( 25%)      675.5G ( 0%)

 Will the command "mmrestripefs /gpfs -b -P cit_10tb" move the data blocks 
from the _cit_ NSDs to the _george_ NSDs,
so that they all end up around 75% full?

Thanks,
 — ddj
Dave Johnson
Brown University CCV/CIS___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss







Re: [gpfsug-discuss] Same file opened by many nodes / processes

2018-07-23 Thread Frederick Stock
Have you considered keeping the 1G network for daemon traffic and moving 
the data traffic to another network?
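
One sketch of that split (the subnet shown is hypothetical, and the change 
typically only takes effect after the daemon is restarted, so check the 
mmchconfig documentation for your level):

  # keep the 1G addresses as the daemon/admin addresses, but tell GPFS to
  # prefer the faster network for node-to-node traffic on that subnet
  /usr/lpp/mmfs/bin/mmchconfig subnets="10.10.200.0"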

Given the description of your configuration, with only two manager nodes 
handling mmbackup and other tasks, my guess is that this is where the 
performance problem lies when mmbackup is running while many nodes are 
accessing a single file.  You said the fs managers were on hardware; does 
that mean the other nodes in this cluster are VMs of some kind?

You stated that your NSD servers were under powered.  Did you address that 
problem in any way, that is adding memory/CPUs, or did you just move other 
GPFS activity off of those nodes?

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   Peter Childs 
To: "gpfsug-discuss@spectrumscale.org" 

Date:   07/23/2018 07:06 AM
Subject:Re: [gpfsug-discuss] Same file opened by many nodes / 
processes
Sent by:gpfsug-discuss-boun...@spectrumscale.org



On Mon, 2018-07-23 at 22:13 +1200, José Filipe Higino wrote:
I think the network problems need to be cleared first. Then I would 
investigate further. 

But if that is not a trivial path... 
Are you able to understand from the mmfslog what happens when the tipping 
point occurs?

mmfslog - that's not a term I've come across before. If you mean 
/var/adm/ras/mmfs.log.latest then I'm already there; there is not a lot there. In 
other words, no expulsions or errors, just a very slow filesystem. We've not 
seen any significantly long waiters either (mmdiag --waiters), so as far as 
I can see it's just behaving like a very, very busy filesystem.

We've already had IBM looking at the snaps due to the rather slow mmbackup 
process; all I've had back is to try increasing -a, i.e. the number of sort 
threads, which has sped it up to a certain extent. But once again I think 
we're looking at the results of the issue, not the cause.


In my view, when troubleshooting is not easy, the usual methods work/help 
to find the next step:
- Narrow the window of troubleshooting (by discarding "for now" events 
that did not happen within the same timeframe)
- Use "as precise" as possible, timebased events to read the reaction of 
the cluster (via log or others)  and make assumptions about other observed 
situations.
- If possible and when the problem is happening, run some traces, 
gpfs.snap and ask for support via PMR.

Also,

What is version of GPFS?

4.2.3-8 

How many quorum nodes?

4 quorum nodes with tiebreaker disks; however, these are not the file 
system manager nodes. To fix a previous problem (our NSD servers 
not being powerful enough), our fs manager nodes are on physical hardware. We have 
two file system manager nodes (which do token management, quota management, 
etc.); they also run the mmbackup.

How many filesystems?

1, although we do have a second that is accessed via multi-cluster from 
our older GPFS setup (that's running 4.2.3-6 currently).

Is the management network the same as the daemon network?

Yes. the management network and the daemon network are the same network. 

Thanks in advance

Peter Childs



On Mon, 23 Jul 2018 at 20:37, Peter Childs  wrote:
On Mon, 2018-07-23 at 00:51 +1200, José Filipe Higino wrote:

Hi there, 

Have you been able to create a test case (replicate the problem)? Can you 
tell us a bit more about the setup?

Not really. It feels like a perfect storm: any one of the tasks running on 
its own would be fine; it's the sheer load. Our mmpmon data says the 
storage has been flat-lining when it occurs.

It's a reasonably standard (small) HPC cluster with a very mixed workload, 
so while we can usually find "bad" jobs from an IO point of view, on this 
occasion we can see a few large array jobs all accessing the 
same file. The cluster runs fine until we get to a certain point, and one 
more job will tip the balance. We've been attempting to limit the problem by 
capping the number of jobs in an array that can run at once, but 
that feels like fire fighting. 


Are you using GPFS API over any administrative commands? Any problems with 
the network (being that Ethernet or IB)?

We're not using the GPFS API; we never got it working, which is a shame - 
I've never managed to figure out the setup, although it is on my to-do 
list.

Network-wise, we've just removed a great deal of noise from ARP requests 
by increasing the ARP cache size on the nodes. It's a mixed 1Gbit/10Gbit 
network currently; we're looking at removing all the 1Gbit nodes 
within the next few months and adding some new, faster kit. The storage is 
attached at 40Gbit but it does not seem to want to run much above 5Gbit, I 
suspect due to Ethernet back-off caused by the mixed speeds. 

While we do have some IB we don't currently run our storage over it.

Thanks in advance

Peter Childs





Sorry for turning up unannounced here for the first time, but I would like to 
help if I can.

Jose Higino,
from NIWA
New Zealand

Cheers

On Sun, 22 Jul 

Re: [gpfsug-discuss] preventing HSM tape recall storms

2018-07-09 Thread Frederick Stock
Another option is to request Apple to support the OFFLINE flag in the SMB 
protocol.  The more Mac customers making such a request (I have asked 
others to do likewise) might convince Apple to add this checking to their 
SMB client.

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   "Christof Schmitt" 
To: gpfsug-discuss@spectrumscale.org
Date:   07/09/2018 02:53 PM
Subject:Re: [gpfsug-discuss] preventing HSM tape recall storms
Sent by:gpfsug-discuss-boun...@spectrumscale.org



> we had left out "gpfs" from the
> vfs objects =
> line in smb.conf
>
> so setting
> vfs objects = gpfs (etc)
> gpfs:hsm = yes
> gpfs:recalls = yes  (not "no" as I had originally, and is implied by the 
manual)
 
Thank you for the update. gpfs:recalls=yes is the default, allowing
recalls of files. If you set that to 'no', Samba will deny access to
"OFFLINE" files in GPFS through SMB.
 
> and setting the offline flag on the file by migrating it, so that
> # mmlsattr -L  filename.jpg
> ...
> Misc attributes:  ARCHIVE OFFLINE
>
> now Explorer on Windows 7 and 10 do not recall the file while viewing 
the folder with "Large icons"
>
> and a standard icon with an X is displayed.
>
> But after the file is then opened and recalled, the icon displays the 
thumbnail image and the OFFLINE flag is lost.
 
Yes, that is working as intended. While the file is only in the
"external pool" (e.g. HSM tape), the OFFLINE flag is reported. Once
you read/write data, that triggers a recall to the disk pool and the
flag is cleared.
 
> Also as you observed, Finder on  MacOSX 10.13 ignores the file's offline 
flag,
>
> so we still risk a recall storm caused by them.
 
The question here would be how to handle the Mac clients. You could
configured two SMB shares on the same path: One with gpfs:recalls=yes
and tell the Windows users to access that share; the other one with
gpfs:recalls=no and tell the Mac users to use that share. That would
avoid the recall storms, but runs the risk of Mac users connecting to
the wrong share and avoiding this workaround...
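 
A sketch of that two-share layout (share names and path are made up; on a 
CES cluster you would normally apply these settings through the Scale SMB 
tooling rather than editing smb.conf by hand):
 
  [projects-win]
      # Windows clients: recalls allowed; Explorer honours the OFFLINE flag
      path = /gpfs/fs0/projects
      vfs objects = gpfs
      gpfs:hsm = yes
      gpfs:recalls = yes
 
  [projects-mac]
      # Mac clients: offline files are denied instead of being recalled
      path = /gpfs/fs0/projects
      vfs objects = gpfs
      gpfs:hsm = yes
      gpfs:recalls = no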
 
Regards,

Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ
christof.schm...@us.ibm.com  ||  +1-520-799-2469(T/L: 321-2469)
 
 
- Original message -
From: Cameron Dunn 
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: "gpfsug-discuss@spectrumscale.org" 
Cc:
Subject: Re: [gpfsug-discuss] preventing HSM tape recall storms
Date: Sat, Jul 7, 2018 2:30 PM
 
Thanks Christof,
 
we had left out "gpfs" from the
vfs objects = 
line in smb.conf
 
so setting
vfs objects = gpfs (etc)
gpfs:hsm = yes
gpfs:recalls = yes  (not "no" as I had originally, and is implied by the 
manual)
 
and setting the offline flag on the file by migrating it, so that
# mmlsattr -L  filename.jpg
...
 
Misc attributes:  ARCHIVE OFFLINE
 
now Explorer on Windows 7 and 10 do not recall the file while viewing the 
folder with "Large icons"
and a standard icon with an X is displayed.
But after the file is then opened and recalled, the icon displays the 
thumbnail image and the OFFLINE flag is lost.
 
Also as you observed, Finder on  MacOSX 10.13 ignores the file's offline 
flag,
so we still risk a recall storm caused by them.
 
All the best,
Cameron
 

From: gpfsug-discuss-boun...@spectrumscale.org 
 on behalf of Christof Schmitt 

Sent: 03 July 2018 20:37:08
To: gpfsug-discuss@spectrumscale.org
Cc: gpfsug-discuss@spectrumscale.org
Subject: Re: [gpfsug-discuss] preventing HSM tape recall storms 
 
> HSM over LTFS-EE runs the risk of a recall storm if files which have 
been migrated to tape
> are then shared by Samba to Macs and PCs.
> MacOS Finder and Windows Explorer will want to display all the thumbnail 
images of a
> folder's contents, which will recall lots of files from tape.
 
SMB clients can query file information, including the OFFLINE
flag. With Spectrum Scale and the "gpfs" module loaded in Samba that
is mapped from the the OFFLINE flag that is visible in "mmlsattr
-L". In those systems, the SMB client can determine that a file is
offline.
 
In our experience this is handled correctly in Windows Explorer; when
an "offline" file is encountered, no preview is generated from the
file data. The Finder on Mac clients does not seem to honor the
OFFLINE flag, thus the main problems are typically recall storms
caused by Mac clients.
 
> According to the Samba documentation this is preventable by setting the 
following
> --
> https://www.samba.org/samba/docs/current/man-html/vfs_gpfs.8.html
>
> gpfs:recalls = [ yes | no ]
> When this option is set to no, an attempt to open an offline file
> will be rejected with access denied.
> This helps preventing recall storms triggered by careless applications 
like Finder and Explorer.
>
> yes(default) - Open files that are offline. This will recall the files 
from HSM.
> no - Reject access to offline files with 

Re: [gpfsug-discuss] High I/O wait times

2018-07-03 Thread Frederick Stock
How many NSDs are served by the NSD servers and what is your maximum file 
system block size?  Have you confirmed that you have sufficient NSD worker 
threads to handle the maximum number of IOs you are configured to have 
active?  That would be the number of NSDs served times 12 (you have 12 
threads per queue).
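
As a rough worked example (everything except the 12 threads per queue is 
made up, and the node class name is hypothetical): a server exporting 40 
NSDs would need 40 x 12 = 480 worker threads before every queue could be 
fully busy, which has to fit inside nsdMaxWorkerThreads.  Raising the 
values would look something like:

  /usr/lpp/mmfs/bin/mmchconfig nsdMaxWorkerThreads=1024,nsdThreadsPerQueue=12 -N nsdServers
  # these NSD server parameters take effect after GPFS is restarted on those nodes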

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   "Buterbaugh, Kevin L" 
To: gpfsug main discussion list 
Date:   07/03/2018 05:41 PM
Subject:Re: [gpfsug-discuss] High I/O wait times
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Hi Fred, 

Thanks for the response.  I have been looking at the “mmfsadm dump nsd” 
data from the two NSD servers that serve up the two NSDs that most 
commonly experience high wait times (although, again, this varies from 
time to time).  In addition, I have been reading:

https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/NSD%20Server%20Design%20and%20Tuning

And:

https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/NSD%20Server%20Tuning

Which seem to be the most relevant documents on the Wiki.

I would like to do a more detailed analysis of the “mmfsadm dump nsd” 
output, but my preliminary look at it seems to indicate that I/Os are 
queueing in the 50 - 100 range for the small queues and the 60 - 200 range 
on the large queues.

In addition, I am regularly seeing all 12 threads on the LARGE queues 
active, while it is much more rare that I see all - or even close to all - 
the threads on the SMALL queues active.

As far as the parameters Scott and Yuri mention, on our cluster they are 
set thusly:

[common]
nsdMaxWorkerThreads 640
[]
nsdMaxWorkerThreads 1024
[common]
nsdThreadsPerQueue 4
[]
nsdThreadsPerQueue 12
[common]
nsdSmallThreadRatio 3
[]
nsdSmallThreadRatio 1

So to me it sounds like I need more resources on the LARGE queue side of 
things … i.e. it sure doesn’t sound like I want to change my small thread 
ratio.  If I increase the number of threads it sounds like that might 
help, but that also takes more pagepool, and I’ve got limited RAM in these 
(old) NSD servers.  I do have nsdbufspace set to 70, but I’ve only got 
16-24 GB RAM each in these NSD servers.  And a while back I did try 
increasing the page pool on them (very slightly) and ended up causing 
problems because then they ran out of physical RAM.

Thoughts?  Followup questions?  Thanks!

Kevin

On Jul 3, 2018, at 3:11 PM, Frederick Stock  wrote:

Are you seeing similar values for all the nodes or just some of them?  One 
possible issue is how the NSD queues are configured on the NSD servers. 
You can see this with the output of "mmfsadm dump nsd".  There are queues 
for LARGE IOs (greater than 64K) and queues for SMALL IOs (64K or less). 
Check the highest pending values to see if many IOs are queueing.  There 
are a couple of options to fix this but rather than explain them I suggest 
you look for information about NSD queueing on the developerWorks site. 
There has been information posted there that should prove helpful.

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:"Buterbaugh, Kevin L" 
To:gpfsug main discussion list 
Date:07/03/2018 03:49 PM
Subject:[gpfsug-discuss] High I/O wait times
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Hi all, 

We are experiencing some high I/O wait times (5 - 20 seconds!) on some of 
our NSDs as reported by “mmdiag —iohist" and are struggling to understand 
why.  One of the confusing things is that, while certain NSDs tend to show 
the problem more than others, the problem is not consistent … i.e. the 
problem tends to move around from NSD to NSD (and storage array to storage 
array) whenever we check … which is sometimes just a few minutes apart.

In the past when I have seen “mmdiag —iohist” report high wait times like 
this it has *always* been hardware related.  In our environment, the most 
common cause has been a battery backup unit on a storage array controller 
going bad and the storage array switching to write straight to disk.  But 
that’s *not* happening this time. 

Is there anything within GPFS / outside of a hardware issue that I should 
be looking for??  Thanks!

—
Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and 
Education
kevin.buterba...@vanderbilt.edu- (615)875-9633


___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss



___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
https://na01.safelinks.pro



Re: [gpfsug-discuss] Thousands of CLOSE_WAIT connections

2018-06-15 Thread Frederick Stock
Assuming CentOS 7.5 parallels RHEL 7.5, you would need Spectrum Scale 
4.2.3.9, because that is the release level (along with 5.0.1 PTF1) that 
supports RHEL 7.5.

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   Iban Cabrillo 
To: gpfsug-discuss 
Date:   06/15/2018 11:16 AM
Subject:Re: [gpfsug-discuss] Thousands of CLOSE_WAIT connections
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Hi Anderson,

Comments are  in line


From: "Anderson Ferreira Nobre" 
To: "gpfsug-discuss" 
Cc: "gpfsug-discuss" 
Sent: Friday, 15 June, 2018 16:49:14
Subject: Re: [gpfsug-discuss] Thousands of CLOSE_WAIT connections

Hi Iban,
 
I think it's necessary more information to be able to help you. Here they 
are:
- Redhat version: Which is 7.2, 7.3 or 7.4?
   CentOS Linux release 7.5.1804 (Core) 

- Redhat kernel version: In the FAQ of GPFS has the recommended kernel 
levels
- Platform: Is it x86_64?
  Yes it is
- Is there a reason for you to stay on 4.2.3-6? Could you update to 4.2.3-9 
or 5.0.1?
   No, that was the default version we got from our customer; we could 
upgrade to 4.2.3-9 in time...

- How is the name resolution? Can you do a test ping from one node to 
another and its reverse?

   Yes, resolution works fine in both directions (there is no firewall or 
ICMP filter), using the private Ethernet network (not IB)

- TCP/IP tuning: What is the TCP/IP parameters you are using? I have used 
for 7.4 the following:
[root@ sysctl.d]# cat 99-ibmscale.conf
net.core.somaxconn = 1
net.core.netdev_max_backlog = 25
net.ipv4.ip_local_port_range = 2000 65535
net.ipv4.tcp_rfc1337 = 1
net.ipv4.tcp_max_tw_buckets = 144
net.ipv4.tcp_mtu_probing = 1
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_low_latency = 1
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.tcp_fin_timeout = 10
net.core.rmem_default = 4194304
net.core.rmem_max = 4194304
net.core.wmem_default = 4194304
net.core.wmem_max = 4194304
net.core.optmem_max = 4194304
net.ipv4.tcp_rmem=4096 87380 16777216
net.ipv4.tcp_wmem=4096 65536 16777216
vm.min_free_kbytes = 512000
kernel.panic_on_oops = 0
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
vm.swappiness = 0
vm.dirty_ratio = 10
 

That is mine:
net.ipv4.conf.default.accept_source_route = 0
net.core.somaxconn = 8192
net.ipv4.tcp_fin_timeout = 30
kernel.sysrq = 1
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.shmmax = 13491064832
kernel.shmall = 4294967296
net.ipv4.neigh.default.gc_stale_time = 120
net.ipv4.tcp_synack_retries = 10
net.ipv4.tcp_sack = 0
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
net.core.netdev_max_backlog = 25
net.core.rmem_default = 16777216
net.core.wmem_default = 16777216
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_mem = 16777216 16777216 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 87380 16777216
net.ipv4.tcp_adv_win_scale = 2
net.ipv4.tcp_low_latency = 1
net.ipv4.tcp_reordering = 3
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_max_syn_backlog = 8192
net.ipv4.neigh.default.gc_thresh1 = 3
net.ipv4.neigh.default.gc_thresh2 = 32000
net.ipv4.neigh.default.gc_thresh3 = 32768
net.ipv4.conf.all.arp_filter = 1
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.neigh.enp3s0.mcast_solicit = 9
net.ipv4.neigh.enp3s0.ucast_solicit = 9
net.ipv6.neigh.enp3s0.ucast_solicit = 9
net.ipv6.neigh.enp3s0.mcast_solicit = 9
net.ipv4.neigh.ib0.mcast_solicit = 18
vm.oom_dump_tasks = 1
vm.min_free_kbytes = 524288

Since we disabled ipv6, we had to rebuild the kernel image with the 
following command:
[root@ ~]# dracut -f -v
 
  I did that on the WNs but not on the GPFS servers...
- GPFS tuning parameters: Can you list them?
- Spectrum Scale status: Can you send the following outputs:
  mmgetstate -a -L
  mmlscluster

[root@gpfs01 ~]# mmlscluster 

GPFS cluster information

GPFS cluster name: gpfsgui.ifca.es
GPFS cluster id: 8574383285738337182
GPFS UID domain: gpfsgui.ifca.es
Remote shell command: /usr/bin/ssh
Remote file copy command: /usr/bin/scp
Repository type: CCR

Node Daemon node name IP address Admin node name Designation

1 gpfs01.ifca.es 10.10.0.111 gpfs01.ifca.es quorum-manager-perfmon
2 gpfs02.ifca.es 10.10.0.112 gpfs02.ifca.es quorum-manager-perfmon
3 gpfsgui.ifca.es 10.10.0.60 gpfsgui.ifca.es quorum-perfmon
9 cloudprv-02-9.ifca.es 10.10.140.26 cloudprv-02-9.ifca.es 
10 cloudprv-02-8.ifca.es 10.10.140.25 cloudprv-02-8.ifca.es 
13 node1.ifca.es 10.10.151.3 node3.ifca.es 
..
44 node24.ifca.es 10.10.151.24 node24.ifca.es 
.
  mmhealth cluster show (it was shut down by hand)

[root@gpfs01 ~]# mmhealth cluster show --verbose

Error: The 

Re: [gpfsug-discuss] RHEL updated to 7.5 instead of 7.4

2018-06-11 Thread Frederick Stock
Spectrum Scale 4.2.3.9 does support RHEL 7.5.

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   "Sobey, Richard A" 
To: gpfsug main discussion list 
Date:   06/11/2018 06:59 AM
Subject:Re: [gpfsug-discuss] RHEL updated to 7.5 instead of 7.4
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Thanks Simon. Do you mean you pinned the minor release to 7.X but yum 
upgraded you to 7.Y? This has just happened to me:
 
[root@ ~]# subscription-manager release
Release: 7.4
[root@ ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.5 (Maipo)
 
Granted I didn’t issue a yum clean all after changing the release however 
I’ve never seen this happen before. 
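 
For reference, a minimal re-pin sequence (assuming subscription-manager as 
above, run before the next yum transaction) would be something like:
 
  subscription-manager release --set=7.4
  yum clean all      # drop the cached 7.5 repository metadata
  yum repolist       # confirm the repos now resolve to the 7.4 content set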
 
Anyway, I need to either downgrade back to 7.4 or upgrade GPFS, whichever 
will be the best supported. I need to learn to pay attention to what 
kernel version I’m being updated to in future!
 
Cheers
Richard
 
From: gpfsug-discuss-boun...@spectrumscale.org 
 On Behalf Of Simon Thompson (IT 
Research Support)
Sent: 11 June 2018 11:50
To: gpfsug main discussion list 
Subject: Re: [gpfsug-discuss] RHEL updated to 7.5 instead of 7.4
 
We have on our DSS-G …
 
Have you looked at:
https://access.redhat.com/solutions/238533
 
?
 
Simon
 
From:  on behalf of "Sobey, 
Richard A" 
Reply-To: "gpfsug-discuss@spectrumscale.org" <
gpfsug-discuss@spectrumscale.org>
Date: Monday, 11 June 2018 at 11:46
To: "gpfsug-discuss@spectrumscale.org" 
Subject: [gpfsug-discuss] RHEL updated to 7.5 instead of 7.4
 
Has anyone ever used subscription-manager to set a release to 7.4 only for 
the system to upgrade to 7.5 anyway?
 
Also is 7.5 now supported with the 4.2.3.9 PTF or should I concentrate on 
downgrading back to 7.4?
 
Richard___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss







Re: [gpfsug-discuss] pool-metadata_high_error

2018-05-14 Thread Frederick Stock
The difference in your inode information is presumably because the fileset 
you reference is an independent fileset and it has its own inode space, 
distinct from the inode space used for the "root" fileset (file system).
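
To see the two inode spaces side by side, something like the following 
should work (the file system name is hypothetical, and -i can take a while 
on a large file system, so check the mmlsfileset man page for your level):

  /usr/lpp/mmfs/bin/mmlsfileset fs0 -L -i
  # -L shows each fileset's inode space with its maximum and allocated inodes;
  # -i adds the number of inodes actually in use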

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   "Markus Rohwedder" 
To: gpfsug main discussion list 
Date:   05/14/2018 07:19 AM
Subject:Re: [gpfsug-discuss] pool-metadata_high_error
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Hello, 

the pool metadata high error reports issues with the free blocks in the 
metadataOnly and/or dataAndMetadata NSDs in the system pool.

mmlspool, and subsequently the GPFSPool sensor, is the source of the 
information that is used by the threshold that reports this error.

So please compare with 

mmlspool 
and 
mmperfmon query gpfs_pool_disksize, gpfs_pool_free_fullkb -b 86400 -n 1

Once inodes are allocated I am not aware of a method to de-allocate them. 
This is what the Knowledge Center says:

"Inodes are allocated when they are used. When a file is deleted, the 
inode is reused, but inodes are never deallocated. When setting the 
maximum number of inodes in a file system, there is the option to 
preallocate inodes. However, in most cases there is no need to preallocate 
inodes because, by default, inodes are allocated in sets as needed. If you 
do decide to preallocate inodes, be careful not to preallocate more inodes 
than will be used; otherwise, the allocated inodes will unnecessarily 
consume metadata space that cannot be reclaimed. "


Mit freundlichen Grüßen / Kind regards

Dr. Markus Rohwedder

Spectrum Scale GUI Development


Phone: +49 7034 6430190
E-Mail: rohwed...@de.ibm.com
IBM Deutschland Research & Development
Am Weiher 24
65451 Kelsterbach
Germany



From: KG 
To: gpfsug main discussion list 
Date: 14.05.2018 12:57
Subject: [gpfsug-discuss] pool-metadata_high_error
Sent by: gpfsug-discuss-boun...@spectrumscale.org



Hi Folks

IHAC who is reporting pool-metadata_high_error on GUI.

The inode utilisation on filesystem is as below
Used inodes - 92922895
free inodes - 1684812529
allocated - 135424
max inodes - 1911363520

the inode utilization on one fileset (it is only one being used) is below
Used inodes - 93252664
allocated - 1776624128
max inodes 1876624064

Is this because the difference between allocated and max inodes is very small?

The customer tried reducing the allocated inodes on the fileset (to a value 
between max and used inodes) and the GUI complains that it is out of range.
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss




Re: [gpfsug-discuss] mmapplypolicy on nested filesets ...

2018-04-18 Thread Frederick Stock
Would the PATH_NAME LIKE option work?
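
A sketch of what that could look like (the path, age threshold and rule name 
are made up; run mmapplypolicy with -I test first to see what would be 
selected). Because the selection is on PATH_NAME rather than FOR FILESET(), 
files in the dependent filesets nested under scratch match as well:

  RULE 'scratch_purge' DELETE
    WHERE PATH_NAME LIKE '/gpfs/fs0/scratch/%'
      AND (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) > 90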

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   "Jaime Pinto" 
To: "gpfsug main discussion list" 
Date:   04/18/2018 12:55 PM
Subject:[gpfsug-discuss] mmapplypolicy on nested filesets ...
Sent by:gpfsug-discuss-boun...@spectrumscale.org



A few months ago I asked on this forum about the limits and dynamics of 
traversing dependent vs. independent filesets. I used the 
information provided to make decisions and set up our new DSS-based 
GPFS storage system. Now I have a problem I couldn't yet figure out 
how to make work:

'project' and 'scratch' are top *independent* filesets of the same 
file system.

'proj1', 'proj2' are dependent filesets nested under 'project'
'scra1', 'scra2' are dependent filesets nested under 'scratch'

I would like to run a purging policy on all contents under 'scratch' 
(which includes 'scra1', 'scra2'), and TSM backup policies on all 
contents under 'project' (which includes 'proj1', 'proj2').

HOWEVER:
When I run the purging policy on the whole gpfs device (with both 
'project' and 'scratch' filesets)

* if I use FOR FILESET('scratch') on the list rules, the 'scra1' and 
'scra2' filesets under scratch are excluded (totally unexpected)

* if I use FOR FILESET('scra1') I get error that scra1 is dependent 
fileset (Ok, that is expected)

* if I use /*FOR FILESET('scratch')*/, all contents under 'project', 
'proj1', 'proj2' are traversed as well, and I don't want that (it 
takes too much time)

* if I use /*FOR FILESET('scratch')*/, and instead of the whole device 
I apply the policy to the /scratch mount point only, the policy still 
traverses all the content of 'project', 'proj1', 'proj2', which I 
don't want. (again, totally unexpected)

QUESTION:

How can I craft the syntax of the mmapplypolicy in combination with 
the RULE filters, so that I can traverse all the contents under the 
'scratch' independent fileset, including the nested dependent filesets 
'scra1','scra2', and NOT traverse the other independent filesets at 
all (since this takes too much time)?

Thanks
Jaime


PS: FOR FILESET('scra*') does not work.




  
   TELL US ABOUT YOUR SUCCESS STORIES
  
https://urldefense.proofpoint.com/v2/url?u=http-3A__www.scinethpc.ca_testimonials=DwICAg=jf_iaSHvJObTbx-siA1ZOg=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw=csxqKhhBsww-1H4lJlra9UtcoY0yG6PcOeV5jYf5pYo=tM9JZXsRNu6EEhoFlUuWvTLwMsqbDjfDj3NDZ6elACA=

  
---
Jaime Pinto - Storage Analyst
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477


This message was sent using IMP at SciNet Consortium, University of 
Toronto.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss=DwICAg=jf_iaSHvJObTbx-siA1ZOg=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw=csxqKhhBsww-1H4lJlra9UtcoY0yG6PcOeV5jYf5pYo=V6u0XsNxHj4Mp-mu7hCZKv1AD3_GYqU-4KZzvMSQ_MQ=






___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] GPFS autoload - wait for IB ports tobecomeactive

2018-03-15 Thread Frederick Stock
The callback is the only way I know to use the "--onerror shutdown" 
option.
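
For reference, registering such a callback might look like the sketch below 
(the script path is hypothetical - it would contain a polling loop much like 
the ibready script further down - and the timeout is arbitrary):

  /usr/lpp/mmfs/bin/mmaddcallback ibReady \
      --command /usr/local/sbin/wait-for-ib.sh \
      --event preStartup --sync --timeout 300 --onerror shutdown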

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   Jan-Frode Myklebust <janfr...@tanso.net>
To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Date:   03/15/2018 01:14 PM
Subject:Re: [gpfsug-discuss] GPFS autoload - wait for IB ports to 
becomeactive
Sent by:gpfsug-discuss-boun...@spectrumscale.org



I found some discussion on this at 
https://www.ibm.com/developerworks/community/forums/html/threadTopic?id=----14471957=25
 
and there it is claimed that none of the callback events are early enough 
to resolve this, i.e. that we would need a pre-preStartup trigger. Any idea 
if this has changed -- or is the callback option then only useful to do a 
"--onerror shutdown" if it has failed to bring up IB?

On Thu, Mar 8, 2018 at 1:42 PM, Frederick Stock <sto...@us.ibm.com> wrote:
You could also use the GPFS prestartup callback (mmaddcallback) to execute 
a script synchronously that waits for the IB ports to become available 
before returning and allowing GPFS to continue.  Not systemd integrated 
but it should work.

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:david_john...@brown.edu
To:gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Date:03/08/2018 07:34 AM
Subject:Re: [gpfsug-discuss] GPFS autoload - wait for IB ports to 
becomeactive
Sent by:gpfsug-discuss-boun...@spectrumscale.org




Until IBM provides a solution, here is my workaround. Add it so it runs 
before the gpfs script, I call it from our custom xcat diskless boot 
scripts. Based on rhel7, not fully systemd integrated. YMMV!

Regards, 
 — ddj
——-
[ddj@storage041 ~]$ cat /etc/init.d/ibready 
#! /bin/bash
#
# chkconfig: 2345 06 94
# /etc/rc.d/init.d/ibready
# written in 2016 David D Johnson (ddj  brown.edu)
#
### BEGIN INIT INFO
# Provides: ibready
# Required-Start:
# Required-Stop:
# Default-Stop:
# Description: Block until infiniband is ready
# Short-Description: Block until infiniband is ready
### END INIT INFO

RETVAL=0
if [[ -d /sys/class/infiniband ]]
then
    IBDEVICE=$(dirname $(grep -il infiniband /sys/class/infiniband/*/ports/1/link* | head -n 1))
fi
# See how we were called.
case "$1" in
  start)
        if [[ -n $IBDEVICE && -f $IBDEVICE/state ]]
        then
                echo -n "Polling for InfiniBand link up: "
                for (( count = 60; count > 0; count-- ))
                do
                        if grep -q ACTIVE $IBDEVICE/state
                        then
                                echo ACTIVE
                                break
                        fi
                        echo -n "."
                        sleep 5
                done
                if (( count <= 0 ))
                then
                        echo DOWN - $0 timed out
                fi
        fi
        ;;
  stop|restart|reload|force-reload|condrestart|try-restart)
        ;;
  status)
        if [[ -n $IBDEVICE && -f $IBDEVICE/state ]]
        then
                echo "$IBDEVICE is $(< $IBDEVICE/state) $(< $IBDEVICE/rate)"
        else
                echo "No IBDEVICE found"
        fi
        ;;
  *)
        echo "Usage: ibready {start|stop|status|restart|reload|force-reload|condrestart|try-restart}"
        exit 2
esac
exit ${RETVAL}


  -- ddj
Dave Johnson

On Mar 8, 2018, at 6:10 AM, Caubet Serrabou Marc (PSI) <marc.cau...@psi.ch
> wrote:

Hi all,

with autoload = yes we do not ensure that GPFS will be started after the 
IB link becomes up. Is there a way to force GPFS waiting to start until IB 
ports are up? This can be probably done by adding something like 
After=network-online.target and Wants=network-online.target in the systemd 
file but I would like to know if this is natively possible from the GPFS 
configuration.

Thanks a lot,
Marc
_
Paul Scherrer Institut 
High Performance Computing
Marc Caubet Serrabou
WHGA/036
5232 Villigen PSI
Switzerland

Telephone: +41 56 310 46 67
E-Mail: marc.cau...@psi.ch
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss=DwICAg=jf_iaSHvJObTbx-siA1ZOg=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw=u-EMob09-dkE6jZbD3dTjBi3vWhmDXtxiOK3nqFyIgY=JCfJgq6pZnKUI6d-rIgJXVcdZh7vmA5ypB1_goP_FFA=





___



Re: [gpfsug-discuss] Inode scan optimization

2018-02-08 Thread Frederick Stock
You mention that all the NSDs are metadata and data but you do not say how 
many NSDs are defined or the type of storage used, that is are these on 
SAS or NL-SAS storage?  I'm assuming they are not on SSDs/flash storage.

Have you considered moving the metadata to separate NSDs, preferably 
SSD/flash storage?  This is likely to give you a significant performance 
boost.

You state that using  the inode scan API you reduced the time to 40 days. 
Did you analyze your backup application to determine where the time was 
being spent for the backup?  If the inode scan is a small percentage of 
your backup time then optimizing it will not provide much benefit.

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   "tomasz.wol...@ts.fujitsu.com" 
To: "gpfsug-discuss@spectrumscale.org" 

Date:   02/08/2018 05:50 AM
Subject:[gpfsug-discuss] Inode scan optimization
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Hello All,
 
A full backup of a 2-billion-inode Spectrum Scale file system on 
V4.1.1.16 takes 60 days.
 
We are trying to optimize this, and using inode scans seems to help, even 
though we are using a directory scan and the inode scan only to get better 
stat performance (using gpfs_stat_inode_with_xattrs64). With 20 
processes in parallel doing directory scans (plus inode scans for the stat 
info) we have decreased the time to 40 days.
All NSDs are of type dataAndMetadata.
 
I have the following questions:
· Is there a way to increase the inode scan cache (we may use 32 
GByte)? 
o   Can we us the “hidden” config parameters
§ iscanPrefetchAggressiveness 2
§ iscanPrefetchDepth 0
§ iscanPrefetchThreadsPerNode 0
· Is there a documentation concerning cache behavior?
o   if no, is the  inode scan cache process or node specific?
o   Is there a suggestion to optimize the termIno parameter in the 
gpfs_stat_inode_with_xattrs64() in such a use case?
 
Thanks! 
 
Best regards,
Tomasz Wolski___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss=DwICAg=jf_iaSHvJObTbx-siA1ZOg=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw=y2y22xZuqjpkKfO2WSdcJsBXMaM8hOedaB_AlgFlIb0=DL0ZnBuH9KpvKN6XQNvoYmvwfZDbbwMlM-4rCbsAgWo=





___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Metadata only system pool

2018-01-23 Thread Frederick Stock
You are correct about mmchfs: you can increase the inode maximum, but once 
an inode is allocated it cannot be de-allocated in the sense that the 
space can be recovered.  You can of course decrease the inode maximum to 
a value equal to the used and allocated inodes, but that would not help you 
here.  Providing more metadata space via additional NSDs seems your most 
expedient option to address the issue.
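
A sketch of adding metadata-only NSDs to the system pool (the device, NSD 
and server names, failure group and file system are all hypothetical):

  # stanza file, e.g. /tmp/meta.stanza
  %nsd: device=/dev/mapper/ssd_meta_01
    nsd=ssd_meta_01
    servers=nsd01,nsd02
    usage=metadataOnly
    failureGroup=1
    pool=system

  # create the NSD and add it to the file system
  /usr/lpp/mmfs/bin/mmcrnsd -F /tmp/meta.stanza
  /usr/lpp/mmfs/bin/mmadddisk gpfs0 -F /tmp/meta.stanza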

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   "Buterbaugh, Kevin L" <kevin.buterba...@vanderbilt.edu>
To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Date:   01/23/2018 01:10 PM
Subject:Re: [gpfsug-discuss] Metadata only system pool
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Hi All, 

I do have metadata replication set to two, so Alex, does that make more 
sense?

And I had forgotten about indirect blocks for large files, which actually 
makes sense with the user in question … my apologies for that … between a 
very gravely ill pet and a family member recovering at home from pneumonia, 
I’m way more sleep deprived right now than I’d like.  :-(

Fred - I think you’ve already answered this … but mmchfs can only create / 
allocate more inodes … it cannot be used to shrink the number of inodes? 
That would make sense, and if that’s the case then I can allocate more 
NSDs to the system pool.

Thanks…

Kevin

On Jan 23, 2018, at 11:27 AM, Alex Chekholko <a...@calicolabs.com> wrote:

2.8TB seems quite high for only 350M inodes.  Are you sure you only have 
metadata in there?

On Tue, Jan 23, 2018 at 9:25 AM, Frederick Stock <sto...@us.ibm.com> 
wrote:
One possibility is the creation/expansion of directories or allocation of 
indirect blocks for large files.

Not sure if this is the issue here but at one time inode allocation was 
considered slow and so folks may have pre-allocated inodes to avoid that 
overhead during file creation.  To my understanding inode creation time is 
not so slow that users need to pre-allocate inodes.  Yes, there are likely 
some applications where pre-allocating may be necessary but I expect they 
would be the exception.  I mention this because you have a lot of free 
inodes and of course once they are allocated they cannot be de-allocated. 

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:"Buterbaugh, Kevin L" <kevin.buterba...@vanderbilt.edu>
To:gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Date:01/23/2018 12:17 PM
Subject:[gpfsug-discuss] Metadata only system pool
Sent by:gpfsug-discuss-boun...@spectrumscale.org




Hi All, 

I was under the (possibly false) impression that if you have a filesystem 
where the system pool contains metadata only then the only thing that 
would cause the amount of free space in that pool to change is the 
creation of more inodes … is that correct?  In other words, given that I 
have a filesystem with 130 million free (but allocated) inodes:

Inode Information
-
Number of used inodes:   218635454
Number of free inodes:   131364674
Number of allocated inodes:  35128
Maximum number of inodes:35128

I would not expect that a user creating a few hundred or thousands of 
files could cause a “no space left on device” error (which I’ve got one 
user getting).  There’s plenty of free data space, BTW.

Now my system pool is almost “full”:

(pool total)   2.878T   34M (  0%) 
   140.9M ( 0%)

But again, what - outside of me creating more inodes - would cause that to 
change??

Thanks…

Kevin

—
Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and 
Education
kevin.buterba...@vanderbilt.edu- (615)875-9633


___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss=DwICAg=jf_iaSHvJObTbx-siA1ZOg=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw=gou0xYZwz8M-5i8mT6Tthafi8JW2aMrzQGMK1hUEUls=jcHOB_vmJjE8PnrpfHqzMkm1nk6QWwkn2npTEP6kcKs=





___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7C1607a3fe872e4241587b08d56286a746%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C1%7C636523252830007825=rIFx3lzbAIH5SZtFxJsVqWMMSo%2F0LssNc4K4tZH3uQc%3D=0
__

Re: [gpfsug-discuss] gpfs 4.2.3.5 and RHEL 7.4...

2017-12-18 Thread Frederick Stock
Yes the integrated protocols are the Samba and Ganesha that are bundled 
with Spectrum Scale.  These require the use of the CES component for 
monitoring the protocols.  If you do use them then you need to wait for a 
release of Spectrum Scale in which the integrated protocols are also 
supported on RHEL 7.4.

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   valdis.kletni...@vt.edu
To: gpfsug-discuss@spectrumscale.org
Date:   12/18/2017 03:09 PM
Subject:[gpfsug-discuss] gpfs 4.2.3.5 and RHEL 7.4...
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Currently, the IBM support matrix says:

https://www.ibm.com/support/knowledgecenter/STXKQY/gpfsclustersfaq.html#linux


that 4.2.3.5 is supported on RHEL 7.4, but with a footnote:

"AFM, Integrated Protocols, and Installation Toolkit are not supported on 
RHEL 7.4."

We don't use AFM or the install toolkit.  But we *do* make fairly heavy 
use
of mmces and nfs-ganesha - is that what they mean by "Integrated 
Protocols"?

(We're looking at doing upgrades next month while our HPC clusters are 
doing
their upgrades - and going to 7.4 would be nice.  If there's a mine field 
there, I need to
make sure we stay at 7.3 - plus applicable non-7.4 updates)
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss=DwICAg=jf_iaSHvJObTbx-siA1ZOg=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw=3Z9HrSAviMivcR98fNZ28F-RQq7ZPp-1UZtazzLnaUU=HlT2amKtCbngYmKNb3_I4NKvn8aFGXCqcJARCbu4AOE=






___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Specifying nodes in commands

2017-11-10 Thread Frederick Stock
How do you determine if mmapplypolicy is running on a node?  Normally 
mmapplypolicy as a process runs on a single node but its helper processes, 
policy-help or something similar, run on all the nodes which are 
referenced by the -N option.

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   "Chase, Peter" 
To: "'gpfsug-discuss@spectrumscale.org'" 

Date:   11/10/2017 11:18 AM
Subject:[gpfsug-discuss] Specifying nodes in commands
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Hello all,

I'm running a script triggered from an ILM external list rule.

The script contains the following command, and it isn't working as I'd expect:
/usr/lpp/mmfs/bin/mmapplypolicy /gpfs1/aws -N cloudNode -P 
/gpfs1/s3upload/policies/migration.policy --scope fileset

I'd expect the mmapplypolicy command to run the policy on all the nodes in 
the cloudNode class, but it doesn't; it runs on the node that triggered 
the script.

However, the following command does work as I'd expect:
/usr/lpp/mmfs/bin/mmdsh -N cloudNode /usr/lpp/mmfs/bin/mmapplypolicy 
/gpfs1/aws -P /gpfs1/s3upload/policies/migration.policy --scope fileset

Can anyone shed any light on this? Have I just misconstrued how 
mmapplypolicy works?

Regards,

Peter Chase
GPCS Team
Met Office  FitzRoy Road  Exeter  Devon  EX1 3PB  United Kingdom
Tel: +44 (0)1392 886921
Email: peter.ch...@metoffice.gov.uk Website: www.metoffice.gov.uk 

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss=DwIFAw=jf_iaSHvJObTbx-siA1ZOg=p_1XEUyoJ7-VJxF_w8h9gJh8_Wj0Pey73LCLLoxodpw=spXDnba2A_tVauiszV7sXhSkn6GeEljABN4lUEB4f8s=1Hd1SNkXtfLRcirmeRfg1JuAERuhbyiVqsLEdYlhFsM=






___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] mmrestripefs "No space left on device"

2017-11-02 Thread Frederick Stock
Did you run the tsfindinode command to see where that file is located? 
Also, what does mmdf show for your other pools, notably the sas0 
storage pool?

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   John Hanks <griz...@gmail.com>
To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Date:   11/02/2017 01:17 PM
Subject:Re: [gpfsug-discuss] mmrestripefs "No space left on 
device"
Sent by:gpfsug-discuss-boun...@spectrumscale.org



We do have different amounts of space in the system pool which had the 
changes applied:

[root@scg4-hn01 ~]# mmdf gsfs0 -P system
disk                disk size  failure holds    holds          free KB         free KB
name                    in KB    group metadata data    in full blocks    in fragments
--------------- ------------- -------- -------- ----- ------------------ ---------------
Disks in storage pool: system (Maximum disk size allowed is 3.6 TB)
VD000               377487360      100 Yes      No     143109120 ( 38%)   35708688 ( 9%)
DMD_NSD_804         377487360      100 Yes      No      79526144 ( 21%)    2924584 ( 1%)
VD002               377487360      100 Yes      No     143067136 ( 38%)   35713888 ( 9%)
DMD_NSD_802         377487360      100 Yes      No      79570432 ( 21%)    2926672 ( 1%)
VD004               377487360      100 Yes      No     143107584 ( 38%)   35727776 ( 9%)
DMD_NSD_805         377487360      200 Yes      No          7984 ( 21%)    2940040 ( 1%)
VD001               377487360      200 Yes      No     142964992 ( 38%)   35805384 ( 9%)
DMD_NSD_803         377487360      200 Yes      No      79580160 ( 21%)    2919560 ( 1%)
VD003               377487360      200 Yes      No     143132672 ( 38%)   35764200 ( 9%)
DMD_NSD_801         377487360      200 Yes      No      79550208 ( 21%)    2915232 ( 1%)
                -------------                         ------------------ ---------------
(pool total)       3774873600                         1113164032 ( 29%)  193346024 ( 5%)


and mmldisk shows that there is a problem with replication:

...
Number of quorum disks: 5 
Read quorum value:  3
Write quorum value: 3
Attention: Due to an earlier configuration change the file system
is no longer properly replicated.


I thought an 'mmrestripefs -r' would fix this, not that I would have to fix 
it first before restriping?

jbh


On Thu, Nov 2, 2017 at 9:45 AM, Frederick Stock <sto...@us.ibm.com> wrote:
Assuming you are replicating data and metadata have you confirmed that all 
failure groups have the same free space?  That is could it be that one of 
your failure groups has less space than the others?  You can verify this 
with the output of mmdf and look at the NSD sizes and space available.

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:John Hanks <griz...@gmail.com>
To:gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Date:11/02/2017 12:20 PM
Subject:Re: [gpfsug-discuss] mmrestripefs "No space left on 
device"
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Addendum to last message:

We haven't upgraded recently as far as I know (I just inherited this a 
couple of months ago.) but am planning an outage soon to upgrade from 
4.2.0-4 to 4.2.3-5. 

My growing collection of output files generally contain something like

This inode list was generated in the Parallel Inode Traverse on Thu Nov  2 
08:34:22 2017
INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID 
MEMO(INODE_FLAGS FILE_TYPE [ERROR])
 53506 0:0 0 1 0  illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device

With that inode varying slightly.

jbh

On Thu, Nov 2, 2017 at 8:55 AM, Scott Fadden <sfad...@us.ibm.com> wrote:
Sorry, just reread as I hit send and saw this was mmrestripefs; in my case it 
was mmdeldisk.
 
Did you try running the command on just one pool? Or using -B instead?
 
What is the file it is complaining about in 
"/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711" ?
 
Looks like it could be related to the maxfeaturelevel of the cluster. Have 
you recently upgraded? Is everything up to the same level? 
 
Scott Fadden
Spectrum Scale - Technical Marketing
Phone: (503) 880-5833
sfad...@us.ibm.com
http://www.ibm.com/systems/storage/spectrum/scale
 
 
- Original message -
From: Scott Fadden/Portland/IBM
To: gpfsug-discuss@spectrumscale.org
Cc: gpfsug-discuss@spectrumscale.org
Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device"
Date: Thu, Nov 2, 2017 8:44 AM
  
I opened a defect on this the other day, in my case it was an incorrect 
error message. What it meant to say was, "The pool is not empty."

Re: [gpfsug-discuss] mmrestripefs "No space left on device"

2017-11-02 Thread Frederick Stock
Assuming you are replicating data and metadata, have you confirmed that all 
failure groups have the same free space?  That is, could it be that one of 
your failure groups has less space than the others?  You can verify this 
with the output of mmdf and look at the NSD sizes and space available.

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   John Hanks 
To: gpfsug main discussion list 
Date:   11/02/2017 12:20 PM
Subject:Re: [gpfsug-discuss] mmrestripefs "No space left on 
device"
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Addendum to last message:

We haven't upgraded recently as far as I know (I just inherited this a 
couple of months ago.) but am planning an outage soon to upgrade from 
4.2.0-4 to 4.2.3-5. 

My growing collection of output files generally contain something like

This inode list was generated in the Parallel Inode Traverse on Thu Nov  2 
08:34:22 2017
INODE_NUMBER DUMMY_INFO SNAPSHOT_ID ISGLOBAL_SNAPSHOT INDEPENDENT_FSETID 
MEMO(INODE_FLAGS FILE_TYPE [ERROR])
 53506 0:0 0 1 0  illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device

With that inode varying slightly.

jbh

On Thu, Nov 2, 2017 at 8:55 AM, Scott Fadden  wrote:
Sorry, just reread as I hit send and saw this was mmrestripefs; in my case it 
was mmdeldisk.
 
Did you try running the command on just one pool? Or using -B instead?
 
What is the file it is complaining about in 
"/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711" ?
 
Looks like it could be related to the maxfeaturelevel of the cluster. Have 
you recently upgraded? Is everything up to the same level? 
 
Scott Fadden
Spectrum Scale - Technical Marketing
Phone: (503) 880-5833
sfad...@us.ibm.com
http://www.ibm.com/systems/storage/spectrum/scale
 
 
- Original message -
From: Scott Fadden/Portland/IBM
To: gpfsug-discuss@spectrumscale.org
Cc: gpfsug-discuss@spectrumscale.org
Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device"
Date: Thu, Nov 2, 2017 8:44 AM
  
I opened a defect on this the other day, in my case it was an incorrect 
error message. What it meant to say was, "The pool is not empty." Are you 
trying to remove the last disk in a pool? If so, did you empty the pool 
with a MIGRATE policy first? 
 
 
Scott Fadden
Spectrum Scale - Technical Marketing
Phone: (503) 880-5833
sfad...@us.ibm.com
http://www.ibm.com/systems/storage/spectrum/scale
 
 
- Original message -
From: John Hanks 
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug main discussion list 
Cc:
Subject: Re: [gpfsug-discuss] mmrestripefs "No space left on device"
Date: Thu, Nov 2, 2017 8:34 AM
  
We have no snapshots ( they were the first to go when we initially hit the 
full metadata NSDs).  
 
I've increased quotas so that no filesets have hit a space quota. 
 
Verified that there are no inode quotas anywhere.
 
mmdf shows the least amount of free space on any nsd to be 9% free.
 
Still getting this error:
 
[root@scg-gs0 ~]# mmrestripefs gsfs0 -r -N scg-gs0,scg-gs1,scg-gs2,scg-gs3
Scanning file system metadata, phase 1 ... 
Scan completed successfully.
Scanning file system metadata, phase 2 ... 
Scanning file system metadata for sas0 storage pool
Scanning file system metadata for sata0 storage pool
Scan completed successfully.
Scanning file system metadata, phase 3 ... 
Scan completed successfully.
Scanning file system metadata, phase 4 ... 
Scan completed successfully.
Scanning user file metadata ...
Error processing user file metadata.
No space left on device
Check file '/var/mmfs/tmp/gsfs0.pit.interestingInodes.12888779711' on 
scg-gs0 for inodes with broken disk addresses or failures.
mmrestripefs: Command failed. Examine previous error messages to determine 
cause.
 
I should note too that this fails almost immediately, far to quickly to 
fill up any location it could be trying to write to.
 
jbh
  
On Thu, Nov 2, 2017 at 7:57 AM, David Johnson  
wrote: 
One thing that may be relevant is if you have snapshots, depending on your 
release level, 
inodes in the snapshot may be considered immutable, and will not be 
migrated.  Once the snapshots
have been deleted, the inodes are freed up and you won’t see the (somewhat 
misleading) message
about no space.
 
 — ddj
Dave Johnson
Brown University
  
On Nov 2, 2017, at 10:43 AM, John Hanks  wrote:
Thanks all for the suggestions.  
 
Having our metadata NSDs fill up was what prompted this exercise, but 
space was previously freed up on those by switching them from metadata+data 
to metadataOnly and using a policy to migrate files out of that pool. So 
these now have about 30% free space (more if you include fragmented 
space). The restripe attempt is just to make a 

Re: [gpfsug-discuss] Checking a file-system for errors

2017-10-11 Thread Frederick Stock
Generally you should not run mmfsck unless you see MMFS_FSSTRUCT errors in 
your system logs.  To my knowledge online mmfsck only checks for a subset 
of problems, notably lost blocks, but that situation does not indicate any 
problems with the file system.
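
(A minimal sketch of that check; the log locations below are assumptions and 
vary by distribution and release.)

grep -i MMFS_FSSTRUCT /var/log/messages /var/adm/ras/mmfs.log.latest 2>/dev/null

If that turns up nothing, there is usually no reason to schedule an offline 
mmfsck.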

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   "Simon Thompson (IT Research Support)" 
To: gpfsug main discussion list 
Date:   10/11/2017 06:32 AM
Subject:Re: [gpfsug-discuss] Checking a file-system for errors
Sent by:gpfsug-discuss-boun...@spectrumscale.org



OK thanks,

So if I run mmfsck in online mode and it says:
"File system is clean.
Exit status 0:10:0."

Then I can assume there is no benefit to running in offline mode?

But it would also be prudent to run "mmrestripefs -c" to be sure my
filesystem is happy?

Thanks

Simon

On 11/10/2017, 11:19, "gpfsug-discuss-boun...@spectrumscale.org on behalf
of uwefa...@de.ibm.com"  wrote:

>Hm, mmfsck will return not very reliable results in online mode;
>in particular it will report many issues which are just due to the transient
>states in a file system in operation.
>It should, however, not find fewer issues than in off-line mode.
>
>mmrestripefs -c does not do any logical checks; it just checks for
>differences between multiple replicas of the same data/metadata.
>File system errors can be caused by such discrepancies (if an odd/corrupt
>replica is used by GPFS), but can also be caused (probably more
>likely) by logical errors / bugs when metadata were modified in the file
>system. In those cases, all the replicas are identical but nevertheless
>corrupt (and cannot be found by mmrestripefs).
> 
>So, mmrestripefs -c is like scrubbing for silent data corruption (on its
>own, it cannot decide which is the correct replica!), while mmfsck checks
>the filesystem structure for logical consistency.
>If the contents of the replicas of a data block differ, mmfsck won't see
>any problem (as long as the fs metadata are consistent), but mmrestripefs
>-c will. 
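
(To make the distinction concrete, a sketch against the device name used earlier 
in this thread; the mmfsck flags are assumptions to be checked against the man 
page for your release: -o for online mode, -n for report-only.)

/usr/lpp/mmfs/bin/mmfsck gsfs0 -o -n      # online, read-only structural check
/usr/lpp/mmfs/bin/mmrestripefs gsfs0 -c   # compare (and fix) replica copies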
>
> 
>Mit freundlichen Grüßen / Kind regards
>
> 
>Dr. Uwe Falke
> 
>IT Specialist
>High Performance Computing Services / Integrated Technology Services /
>Data Center Services
>--
>-
>IBM Deutschland
>Rathausstr. 7
>09111 Chemnitz
>Phone: +49 371 6978 2165
>Mobile: +49 175 575 2877
>E-Mail: uwefa...@de.ibm.com
>--
>-
>IBM Deutschland Business & Technology Services GmbH / Geschäftsführung:
>Thomas Wolter, Sven Schooß
>Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart,
>HRB 17122 
>
>
>
>
>From:   "Simon Thompson (IT Research Support)" 
>To: "gpfsug-discuss@spectrumscale.org"
>
>Date:   10/11/2017 10:47 AM
>Subject:[gpfsug-discuss] Checking a file-system for errors
>Sent by:gpfsug-discuss-boun...@spectrumscale.org
>
>
>
>I'm just wondering if anyone could share any views on checking a
>file-system for errors.
>
>For example, we could use mmfsck in online and offline mode. Does online
>mode detect errors (but not fix) things that would be found in offline
>mode?
>
>And then where does mmrestripefs -c fit into this?
>
>"-c
>  Scans the file system and compares replicas of
>  metadata and data for conflicts. When conflicts
>  are found, the -c option attempts to fix
>  the replicas.
>"
>
>Which sorta sounds like fix things in the file-system, so how does that
>intersect (if at all) with mmfsck?
>
>Thanks
>
>Simon
>
>___
>gpfsug-discuss mailing list
>gpfsug-discuss at spectrumscale.org
>
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

>
>
>
>
>
>___
>gpfsug-discuss mailing list
>gpfsug-discuss at spectrumscale.org
>
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org

Re: [gpfsug-discuss] GPFS 4.2.3.4 question

2017-08-26 Thread Frederick Stock
The only change missing is the change delivered  in 4.2.3 PTF3 efix3 which 
was provided on August 22.  The problem had to do with NSD deletion and 
creation.

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   "Buterbaugh, Kevin L" 
To: gpfsug main discussion list 
Date:   08/26/2017 03:40 PM
Subject:[gpfsug-discuss] GPFS 4.2.3.4 question
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Hi All, 

Does anybody know if GPFS 4.2.3.4, which came out today, contains all the 
patches that are in GPFS 4.2.3.3 efix3?

If anybody does, and can respond, I’d greatly appreciate it.  Our cluster 
is in a very, very bad state right now and we may need to just take it 
down and bring it back up.  I was already planning on rolling out GPFS 
4.2.3.3 efix 3 over the next few weeks anyway, so if I can just go to 
4.2.3.4 that would be great…

Thanks!

—
Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and 
Education
kevin.buterba...@vanderbilt.edu - (615)875-9633


___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
 





___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Question regarding "scanning file system metadata" bug

2017-08-22 Thread Frederick Stock
My understanding is that the problem is not with the policy engine 
scanning but with the commands that move data, for example mmrestripefs. 
So if you are using the policy engine for other purposes you are not 
impacted by the problem.

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   Kristy Kallback-Rose 
To: gpfsug main discussion list 
Date:   08/22/2017 12:53 AM
Subject:[gpfsug-discuss] Question regarding "scanning file system 
metadata"   bug
Sent by:gpfsug-discuss-boun...@spectrumscale.org



Can someone comment as to whether the bug below could also be tickled by 
ILM policy engine scans of metadata? We are wanting to know if we should 
disable ILM scans until we have the patch.

Thanks,
Kristy

https://www-01.ibm.com/support/docview.wss?uid=ssg1S1010487
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
 





___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] gpfs waiters debugging

2017-06-06 Thread Frederick Stock
On recent releases you can accomplish the same with the command, "mmlsnode 
-N waiters -L".
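
For example (mmdiag availability depends on the release you are running):

/usr/lpp/mmfs/bin/mmlsnode -N waiters -L   # cluster-wide list of waiters
/usr/lpp/mmfs/bin/mmdiag --waiters         # per-node detail on a suspect node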

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   valdis.kletni...@vt.edu
To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Date:   06/06/2017 12:46 PM
Subject:Re: [gpfsug-discuss] gpfs waiters debugging
Sent by:gpfsug-discuss-boun...@spectrumscale.org



On Tue, 06 Jun 2017 15:06:57 +0200, Stijn De Weirdt said:
> oh sure, i meant waiters that last > 300 seconds or so (something that
> could trigger deadlock). obviously we're not interested in debugging the
> short ones, it's not that gpfs doesn't work or anything ;)

At least at one time, a lot of the mm(whatever) administrative commands
would leave one dangling waiter for the duration of the command - which
could be a while if the command was mmdeldisk or mmrestripefs. I admit
not having specifically checked for gpfs 4.2, but it was true for 3.2
through 4.1.

And my addition to the collective debugging knowledge:  A bash one-liner 
to
dump all the waiters across a cluster, sorted by wait time.  Note that
our clusters tend to be 5-8 servers, this may be painful for those of you
who have 400+ node clusters. :)

#!/bin/bash
# Dump all waiters across the cluster, prefix each line with the node name,
# and sort by wait time (column 3), longest first.
for i in $(mmlsnode | tail -1 | sed 's/^[ ]*[^ ]*[ ]*//'); do
  ssh "$i" /usr/lpp/mmfs/bin/mmfsadm dump waiters | sed "s/^/$i /"
done | sort -n -r -k 3 -t' '

We've found it useful - if you have 1 waiter on one node that's 1278 seconds
old, and 3 other nodes have waiters that are 1275 seconds old, there's a good
chance the other 3 nodes' waiters are waiting on the first node's waiter to
resolve itself.
[attachment "attltepl.dat" deleted by Frederick Stock/Pittsburgh/IBM] 
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss




___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] gpfs waiters debugging

2017-06-06 Thread Frederick Stock
Realize that generally any waiter under 1 second should be ignored.  In an 
active GPFS system there are always waiters, and the greater the use of the 
system, the more waiters you are likely to see.  The point is that waiters 
themselves are not an indication your system is having problems.

As for creating them, any steady level of activity against the file system 
should cause waiters to appear, though most should be of a short duration.
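
(A rough sketch of one way to do that, on a test cluster only; the path is made 
up for this example and the waiters produced will normally be short-lived ones.)

# generate steady parallel write activity
for i in 1 2 3 4; do
  dd if=/dev/zero of=/gpfs/test/waiter-demo.$i bs=1M count=10240 oflag=direct &
done
# watch waiters come and go while the writes run
watch -n 1 /usr/lpp/mmfs/bin/mmdiag --waiters
wait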


Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   Stijn De Weirdt 
To: gpfsug-discuss@spectrumscale.org
Date:   06/06/2017 08:31 AM
Subject:Re: [gpfsug-discuss] gpfs waiters debugging
Sent by:gpfsug-discuss-boun...@spectrumscale.org



hi bob,

waiters from RPC replies and/or threads waiting on mutex are most 
"popular".

but my question is not how to resolve them, the question is how to
create such a waiter so we can train ourself in grep and mmfsadm etc etc

we want to recreate the waiters a few times, try out some things and
either script it or at least put instructions on our internal wiki on what
to do.

the instructions in the slides are clear enough, but there are a lot of
slides, and typically when this occurs offshift, you don't want to start
with rereading the slides and wondering what to do next; let alone debug
scripts ;)

thanks,

stijn

On 06/06/2017 01:44 PM, Oesterlin, Robert wrote:
> Hi Stijn
> 
> You need to provide some more details on the type and duration of the 
waiters before the group can offer some advice.
> 
> Bob Oesterlin
> Sr Principal Storage Engineer, Nuance
> 
> 
> 
> On 6/6/17, 2:05 AM, "gpfsug-discuss-boun...@spectrumscale.org on behalf 
of Stijn De Weirdt"  wrote:
> 
> 
> but we are wondering if and how we can cause those waiters ourself, 
so
> we can train ourself in debugging and resolving them (either on test
> system or in controlled environment on the production clusters).
> 
> all hints welcome.
> 
> stijn
> ___
> 
> 
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
> 
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss





___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Policy scan against billion files for ILM/HSM

2017-04-11 Thread Frederick Stock
As Zachary noted the location of your metadata is the key and for the 
scanning you have planned flash is necessary.  If you have the resources 
you may consider setting up your flash in a mirrored RAID configuration 
(RAID1/RAID10) and have GPFS only keep one copy of metadata since the 
underlying storage is replicating it via the RAID.  This should improve 
metadata write performance but likely has little impact on your scanning, 
assuming you are just reading through the metadata.
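
(A sketch of that layout; the device names, NSD names, servers and failure 
groups below are invented for illustration.)

# NSD stanza file: mirrored-flash LUNs holding metadata only
%nsd: device=/dev/mapper/flash0 nsd=md_nsd01 servers=nsd01,nsd02 usage=metadataOnly failureGroup=1 pool=system
%nsd: device=/dev/mapper/flash1 nsd=md_nsd02 servers=nsd02,nsd01 usage=metadataOnly failureGroup=2 pool=system

# keep a single GPFS copy of metadata, since the RAID1 layer already mirrors it
mmchfs fs1 -m 1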

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   Zachary Giles 
To: gpfsug main discussion list 
Date:   04/11/2017 12:49 AM
Subject:Re: [gpfsug-discuss] Policy scan against billion files for 
ILM/HSM
Sent by:gpfsug-discuss-boun...@spectrumscale.org



It's definitely doable, and these days not too hard. Flash for
metadata is the key.
The basics of it are:
* Latest GPFS for performance benefits.
* A few 10's of TBs of flash ( or more ! ) setup in a good design..
lots of SAS, well balanced RAID that can consume the flash fully,
tuned for IOPs, and available in parallel from multiple servers.
* Tune up mmapplypolicy with -g somewhere-on-gpfs; --choice-algorithm
fast; -a, -m and -n to reasonable values ( number of cores on the
servers ); -A to ~1000
* Test first on a smaller fileset to confirm you like it. -I test
should work well and be around the same speed minus the migration
phase.
* Then throw ~8 well tuned Infiniband attached nodes at it using -N,
If they're the same as the NSD servers serving the flash, even better.

Should be able to do 1B in 5-30m depending on the idiosyncrasies of
the above choices. Even 60m isn't bad and quite respectable if less gear
is used or if the system is busy while the policy is running.
Parallel metadata, it's a beautiful thing.
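
(Pulling those knobs together into one hypothetical invocation; the file system, 
policy file, node names, thread counts and work directory are all placeholders, 
not recommendations.)

mmapplypolicy /gpfs/fs1 -P /gpfs/fs1/policies/list.pol -I test \
    -N nsd01,nsd02,nsd03,nsd04,nsd05,nsd06,nsd07,nsd08 \
    -g /gpfs/fs1/.policy-tmp --choice-algorithm fast \
    -a 8 -m 8 -n 24 -A 1000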



On Tue, Apr 11, 2017 at 12:29 AM, Masanori Mitsugi
 wrote:
> Hello,
>
> Does anyone have experience running mmapplypolicy against a billion files
> for ILM/HSM?
>
> Currently I'm planning/designing
>
> * 1 Scale filesystem (5-10 PB)
> * 10-20 filesets which include 1 billion files each
>
> And our biggest concern is "How long does it take for an mmapplypolicy
> policy scan against a billion files?"
>
> I know it depends on how the policy is written,
> but I have no experience with billion-file policy scans,
> so I'd like to know the order of time (min/hour/day...).
>
> It would be helpful if anyone who has experience with scans of such a
> large number of files could let me know any considerations or points for
> policy design.
>
> --
> Masanori Mitsugi
> mits...@linux.vnet.ibm.com
>
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss



-- 
Zach Giles
zgi...@gmail.com
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss





___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] fix mmrepquota report format during grace periods

2017-03-28 Thread Frederick Stock
My understanding is that with the upcoming 4.2.3 release the -Y option 
will be documented for many commands, but perhaps not all.
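
Where -Y is available, the output is colon-delimited with a HEADER record 
(there is a sample further down this thread), so scripts can parse it directly. 
A small sketch, with field positions taken from that sample and therefore worth 
re-checking on your own release:

mmrepquota -Y dns | awk -F: '$3 != "HEADER" && $8 == "FILESET" { print $10, $11, $12, $13 }'   # name, blockUsage, blockQuota, blockLimit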


Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   "Buterbaugh, Kevin L" 
To: gpfsug main discussion list 
Date:   03/28/2017 11:35 AM
Subject:Re: [gpfsug-discuss] fix mmrepquota report format during 
grace periods
Sent by:gpfsug-discuss-boun...@spectrumscale.org



All,

Could someone(s) from the GPFS team please:

1) see that the appropriate documentation gets updated (manuals, “-h” 
option to commands, the man pages for the commands, etc.)?
2) let us know what version of GPFS introduced the undocumented “-Y” 
option for mmrepquota and mmlsquota?

I’ve got numerous quota related scripts and I’m just curious to go back 
and figure out how much time I’ve wasted because I didn’t know about it.

These “unknown unknowns” bite us again…  ;-)

Kevin

> On Mar 28, 2017, at 10:11 AM, Simon Thompson (Research Computing - IT 
Services)  wrote:
> 
> I thought this was because the -Y flag was going into new commands and
> being added to older commands during later releases.
> 
> So it might be that it was added, but the docs not updated.
> 
> Simon
> 
> On 28/03/2017, 16:04, "gpfsug-discuss-boun...@spectrumscale.org on 
behalf
> of Buterbaugh, Kevin L"  behalf of kevin.buterba...@vanderbilt.edu> wrote:
> 
>> Ugh!  Of course, I'm wrong … mmlsquota does support the "-Y" option …
>> it's just not documented.  Why not?
>> 
>> Kevin
>> 
>>> On Mar 28, 2017, at 10:00 AM, Buterbaugh, Kevin L
>>>  wrote:
>>> 
>>> Hi Bob, Jaime, and GPFS team,
>>> 
>>> That's great for mmrepquota, but mmlsquota does not have a similar
>>> option AFAICT. 
>>> 
>>> That has really caused me grief … for example, I've got a Perl script
>>> that takes mmlsquota output for a user and does two things:  1) converts
>>> it into something easier for them to parse, and 2) doesn't display
>>> anything for the several dozen filesets they don't have access to.  That
>>> Perl script is ~300 lines and probably about a third of that is dealing
>>> with the grace period spacing issue…
>>> 
>>> Kevin
>>> 
 On Mar 28, 2017, at 9:54 AM, Oesterlin, Robert
  wrote:
 
 Try running it with the "-Y" option; it returns an easy to read
 output:
 mmrepquota -Y dns
 
 
mmrepquota::HEADER:version:reserved:reserved:filesystemName:quotaType:id:name:blockUsage:blockQuota:blockLimit:blockInDoubt:blockGrace:filesUsage:filesQuota:filesLimit:filesInDoubt:filesGrace:remarks:quota:defQuota:fid:filesetname:
mmrepquota::0:1:::dns:USR:0:root:0:0:0:0:none:1:0:0:0:none:i:on:off:0:root:
mmrepquota::0:1:::dns:USR:0:root:0:0:0:0:none:1:0:0:0:none:i:on:off:1:users:
mmrepquota::0:1:::dns:GRP:0:root:0:0:0:0:none:1:0:0:0:none:i:on:off:0:root:
mmrepquota::0:1:::dns:GRP:0:root:0:0:0:0:none:1:0:0:0:none:i:on:off:1:users:
mmrepquota::0:1:::dns:FILESET:0:root:0:0:0:0:none:1:0:0:0:none:i:on:off:::
mmrepquota::0:1:::dns:FILESET:1:users:0:4294967296:4294967296:0:none:1:0:0:0:none:e:on:off:::
 
 Bob Oesterlin
 Sr Principal Storage Engineer, Nuance
 
 
 
 On 3/28/17, 9:47 AM, "gpfsug-discuss-boun...@spectrumscale.org on
 behalf of Jaime Pinto"  wrote:
 
  Any chance you guys in the GPFS devel team could patch the mmrepquota
  code so that during grace periods the report column for "none" would
  still be replaced with >>>*ONE*<<< word? By that I mean, instead of "2
  days" for example, just print "2-days" or "2days" or "2_days", and so
  on.
 
  I have a number of scripts that fail for users when they are over
  their quotas under grace periods, because the report shifts the
  remaining information for that user 1 column to the right.
 
  Obviously it would cost me absolutely nothing to patch my scripts to
 
  deal with this, however the principle here is that the reports
  generated by GPFS should be the ones keeping consistency.
 
  Thanks
  Jaime
 
 
 
 

 TELL US ABOUT YOUR SUCCESS STORIES
 
 
http://www.scinethpc.ca/testimonials

Re: [gpfsug-discuss] Problem installin CES - 4.2.2-2

2017-03-08 Thread Frederick Stock
What version of python do you have installed?  I think you need at least 
version 2.7.
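
(A few quick checks before rerunning the installer; nothing Scale-specific 
here, just ordinary RPM/yum commands.)

python --version
rpm -q python-keystoneclient python2-keystoneclient
yum repolist enabled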

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com



From:   "Oesterlin, Robert" 
To: gpfsug main discussion list 
Date:   03/08/2017 04:55 PM
Subject:[gpfsug-discuss] Problem installin CES - 4.2.2-2
Sent by:gpfsug-discuss-boun...@spectrumscale.org



OK, I’ll admit I’m not a protocols install expert. Using the “spectrumscale” 
installer command, it’s failing to install the object packages due to some 
internal dependencies from the IBM supplied repo.
 
Who can help me?
 
Excerpt from the install log:
 
2017-03-08 16:47:34,319 [ TRACE ] arch-cnfs02.nrg1.us.grid.nuance.com   * 
log[IBM SPECTRUM SCALE: Installing Object packages (SS50).] action write
2017-03-08 16:47:34,319 [ TRACE ] arch-cnfs02.nrg1.us.grid.nuance.com
2017-03-08 16:47:59,554 [ TRACE ] arch-cnfs02.nrg1.us.grid.nuance.com   * 
yum_package[spectrum-scale-object] action install
2017-03-08 16:47:59,555 [ TRACE ] arch-cnfs02.nrg1.us.grid.nuance.com
2017-03-08 16:47:59,555 [ TRACE ] arch-cnfs02.nrg1.us.grid.nuance.com 

2017-03-08 16:47:59,556 [ TRACE ] arch-cnfs02.nrg1.us.grid.nuance.com 
Error executing action `install` on resource 
'yum_package[spectrum-scale-object]'
2017-03-08 16:47:59,556 [ TRACE ] arch-cnfs02.nrg1.us.grid.nuance.com 

2017-03-08 16:47:59,557 [ TRACE ] arch-cnfs02.nrg1.us.grid.nuance.com
2017-03-08 16:47:59,557 [ TRACE ] arch-cnfs02.nrg1.us.grid.nuance.com 
Chef::Exceptions::Exec
2017-03-08 16:47:59,558 [ TRACE ] arch-cnfs02.nrg1.us.grid.nuance.com 
--
2017-03-08 16:47:59,558 [ TRACE ] arch-cnfs02.nrg1.us.grid.nuance.com yum 
-d0 -e0 -y install spectrum-scale-object-4.2.2-2 returned 1:
2017-03-08 16:47:59,558 [ TRACE ] arch-cnfs02.nrg1.us.grid.nuance.com 
STDOUT: Package python-keystoneclient is obsoleted by 
python2-keystoneclient, but obsoleting package does not provide for 
requirements
2017-03-08 16:47:59,559 [ TRACE ] arch-cnfs02.nrg1.us.grid.nuance.com 
Package python-keystoneclient is obsoleted by python2-keystoneclient, but 
obsoleting package does not provide for requirements
2017-03-08 16:47:59,559 [ TRACE ] arch-cnfs02.nrg1.us.grid.nuance.com 
Package python-keystoneclient is obsoleted by python2-keystoneclient, but 
obsoleting package does not provide for requirements
2017-03-08 16:47:59,559 [ TRACE ] arch-cnfs02.nrg1.us.grid.nuance.com 
Package python-keystoneclient is obsoleted by python2-keystoneclient, but 
obsoleting package does not provide for requirements
2017-03-08 16:47:59,559 [ TRACE ] arch-cnfs02.nrg1.us.grid.nuance.com 
Package python-keystoneclient is obsoleted by python2-keystoneclient, but 
obsoleting package does not provide for requirements
2017-03-08 16:47:59,560 [ TRACE ] arch-cnfs02.nrg1.us.grid.nuance.com 
Package python-keystoneclient is obsoleted by python2-keystoneclient, but 
obsoleting package does not provide for requirements
2017-03-08 16:47:59,560 [ TRACE ] arch-cnfs02.nrg1.us.grid.nuance.com 
Package python-keystoneclient is obsoleted by python2-keystoneclient, but 
obsoleting package does not provide for requirements
2017-03-08 16:47:59,560 [ TRACE ] arch-cnfs02.nrg1.us.grid.nuance.com 
Package python-keystoneclient is obsoleted by python2-keystoneclient, but 
obsoleting package does not provide for requirements
2017-03-08 16:47:59,561 [ TRACE ] arch-cnfs02.nrg1.us.grid.nuance.com 
Package python-keystoneclient is obsoleted by python2-keystoneclient, but 
obsoleting package does not provide for requirements
2017-03-08 16:47:59,561 [ TRACE ] arch-cnfs02.nrg1.us.grid.nuance.com 
Package python-keystoneclient is obsoleted by python2-keystoneclient, but 
obsoleting package does not provide for requirements
2017-03-08 16:47:59,561 [ TRACE ] arch-cnfs02.nrg1.us.grid.nuance.com 
Package python-keystoneclient is obsoleted by python2-keystoneclient, but 
obsoleting package does not provide for requirements
2017-03-08 16:47:59,561 [ TRACE ] arch-cnfs02.nrg1.us.grid.nuance.com 
Package python-keystoneclient is obsoleted by python2-keystoneclient, but 
obsoleting package does not provide for requirements
2017-03-08 16:47:59,562 [ TRACE ] arch-cnfs02.nrg1.us.grid.nuance.com 
Package python-keystoneclient is obsoleted by python2-keystoneclient, but 
obsoleting package does not provide for requirements
2017-03-08 16:47:59,562 [ TRACE ] arch-cnfs02.nrg1.us.grid.nuance.com 
Package python-keystoneclient is obsoleted by python2-keystoneclient, but 
obsoleting package does not provide for requirements
2017-03-08 16:47:59,562 [ TRACE ] arch-cnfs02.nrg1.us.grid.nuance.com 
Package python-keystoneclient is obsoleted by python2-keystoneclient, but 
obsoleting package does not provide for