Re: [gpfsug-discuss] [External] 5.1.2.2 changes

2022-01-14 Thread Jonathan Buzzard
On 13/01/2022 17:39, mark.berg...@uphs.upenn.edu wrote: [SNIP] The change that I noticed most was: Repair functionality of mmfsck command in online mode is deprecated The repair functionality of mmfsck command in online mode is no longer available. The report-only

Re: [gpfsug-discuss] WAS: alternative path; Now: RDMA

2021-12-13 Thread Jonathan Buzzard
On 13/12/2021 00:03, Andrew Beattie wrote: What is the main outcome or business requirement of the teaching cluster (I notice you're specific in the use of defining it as a teaching cluster)? It is entirely possible that the use case for this cluster does not warrant the use of high speed low

Re: [gpfsug-discuss] WAS: alternative path; Now: RDMA

2021-12-12 Thread Jonathan Buzzard
On 12/12/2021 02:19, Alec wrote: I feel the need to respond here... I see many responses on this User Group forum that are dismissive of the fringe / extreme use cases and of the "what do you need that for" mindset. The thing is that Spectrum Scale is for the extreme, just take the word

Re: [gpfsug-discuss] alternate path between ESS Servers for Datamigration

2021-12-09 Thread Jonathan Buzzard
On 09/12/2021 16:04, Douglas O'flaherty wrote: Though not directly about your design, our work with NVIDIA on GPUdirect Storage and SuperPOD has shown how sensitive RDMA (IB & RoCE) can be to both MOFED and firmware version compatibility. I would suggest anyone debugging RDMA issues should

Re: [gpfsug-discuss] gpfsug-discuss Digest, Vol 119, Issue 7 - Adding a quorum node

2021-12-09 Thread Jonathan Buzzard
On 09/12/2021 16:43, Ralf Eberhard wrote: Jonathan, my suspicion is that the GPFS daemon on fqdn-new is not reachable via port 1191. You can double check that by sending a lightweight CCR RPC to this daemon from another quorum node by attempting: mmccr echo -n fqdn-new;echo $? If this echo
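
A minimal sketch of the check being suggested, run from an existing quorum node (fqdn-new is the placeholder name from the thread, and 1191 is the standard GPFS daemon port):

    # lightweight CCR RPC to the new node; an exit status of 0 means it answered
    mmccr echo -n fqdn-new; echo $?
    # independently confirm that TCP port 1191 on the new node is reachable
    timeout 5 bash -c '</dev/tcp/fqdn-new/1191' && echo "port 1191 reachable"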

[gpfsug-discuss] Adding a quorum node

2021-12-09 Thread Jonathan Buzzard
I am looking to replace the quorum node in our cluster. The RAID card in the server we are currently using is a casualty of the RHEL8 SAS card purge :-( I have a "new" dual core server that is fully supported by RHEL8. After some toing and froing with IBM they agreed a Pentium G6400 is

Re: [gpfsug-discuss] Question on changing mode on many files

2021-12-07 Thread Jonathan Buzzard
On 07/12/2021 14:55, Simon Thompson wrote: Or add: UPDATECTIME yes SKIPACLUPDATECHECK yes to your dsm.opt file to skip checking for those updates and don't back them up again. Yeah, but then a restore gives you potentially an unusable file system as the ownership
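
A short sketch of that suggestion as a shell snippet (the dsm.opt path is an assumption; adjust to wherever your Spectrum Protect client options file lives):

    # append the two options mentioned in the post to the client options file
    cat >> /opt/tivoli/tsm/client/ba/bin/dsm.opt <<'EOF'
    UPDATECTIME          yes
    SKIPACLUPDATECHECK   yes
    EOF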

Re: [gpfsug-discuss] Question on changing mode on many files

2021-12-07 Thread Jonathan Buzzard
On 07/12/2021 14:01, Frederick Stock wrote: If you are running on a more recent version of Scale you might want to look at the mmfind command.  It provides a find-like wrapper around the execution of policy rules. I am not sure that will be any faster than a "chmod -R" as it will exec
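
A hedged illustration of the mmfind approach (mmfind ships as a sample under /usr/lpp/mmfs/samples/ilm and may need building first; the predicates it accepts should be checked against the sample's README, so treat the invocation as illustrative rather than tested syntax):

    # use the find-like wrapper to drive a parallel policy scan, then chmod the results
    /usr/lpp/mmfs/samples/ilm/mmfind /gpfs/fs0/projectX -type f -print | \
        xargs -d '\n' chmod g+r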

Re: [gpfsug-discuss] /tmp/mmfs vanishes randomly?

2021-11-08 Thread Jonathan Buzzard
On 08/11/2021 09:20, Billich Heinrich Rainer (ID SD) wrote: Hello, We use /tmp/mmfs as dataStructureDump directory. For a while now I have noticed that this directory randomly vanishes. Mmhealth does not complain but just notes that it will no longer monitor the directory. Still I doubt that trace

Re: [gpfsug-discuss] [EXTERNAL] Re: Handling bad file names in policies?

2021-10-11 Thread Jonathan Buzzard
On 11/10/2021 09:55, Peter Childs wrote: We've had this same issue with characters that are fine in Scale but Protect can't handle. Normally it's because some script has embedded a newline in the middle of a file name, and normally we end up renaming that file by inode number find . -inum
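
A minimal sketch of the rename-by-inode trick mentioned above (the inode number and paths are hypothetical):

    # locate the file with the embedded newline by its inode number and give it a sane name
    find /gpfs/fs0/home/user -xdev -inum 123456 -exec mv {} /gpfs/fs0/home/user/renamed_file \;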

Re: [gpfsug-discuss] Handling bad file names in policies?

2021-10-09 Thread Jonathan Buzzard
On 08/10/2021 19:14, Wahl, Edward wrote: This goes back as far as I can recall to <=GPFS 3.5 days. And no, I cannot recall what version of TSM-EE that was. But newline has been the only stopping point, for what seems like forever. Having filed many an mmbackup bug, I don't recall ever

Re: [gpfsug-discuss] Serial number of [EG]SS nodes

2021-09-02 Thread Jonathan Buzzard
On 02/09/2021 17:14, Uwe Falke wrote: Hi, Jürgen, try the command dmidecode; it lists a bunch of information, and somewhere in there should be the serial of the system. You can cut the amount of information down by specifying the type of information you want, so usually dmidecode -t system will display
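
For reference, a short example of pulling just the system serial out of that output:

    # restrict dmidecode to the system information and pick out the serial number
    dmidecode -t system | grep -i 'serial number'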

Re: [gpfsug-discuss] Future of Spectrum Scale support for Centos

2021-08-18 Thread Jonathan Buzzard
On 18/08/2021 14:07, Christian Vieser wrote: Hi out there. Since CentOS 8 support ends at the end of this year (in favour of CentOS Stream 8), it's time to make decisions about where to move. There are two forks of CentOS, Alma and Rocky Linux, but nobody knows whether they will survive the next few years.

Re: [gpfsug-discuss] kernel 3.10.0-1160.36.2.el7.x86_64 (CVE-2021-33909) not compatible with DB2 (for TSM, HPSS, possibly other IBM apps)

2021-07-31 Thread Jonathan Buzzard
On 30/07/2021 15:11, Jaime Pinto wrote: Hey Jonathan 3.10.0-1160.31.1 seems to be one of the last kernel releases prior to the CVE-2021-33909 exploit. It is the release immediately prior to 3.10.0-1160.31.2. To be fair I didn't consider it important to install 3.10.0-1160.31.2 on our TSM

Re: [gpfsug-discuss] kernel 3.10.0-1160.36.2.el7.x86_64 (CVE-2021-33909) not compatible with DB2 (for TSM, HPSS, possibly other IBM apps)

2021-07-30 Thread Jonathan Buzzard
On 30/07/2021 05:16, Jaime Pinto wrote: Alert related to sysadmins managing TSM/DB2 servers and those responsible for applying security patches, in particular kernel 3.10.0-1160.36.2.el7.x86_64, despite security concerns raised by CVE-2021-33909: Please hold off on upgrading your RedHat

[gpfsug-discuss] CVE-2021-33909 and 3.10.0-1160.36.2.el7.x86_64

2021-07-23 Thread Jonathan Buzzard
Anyone know what GPFS versions will work with kernel version 3.10.0-1160.36.2 on RHEL7 rebuilds to patch for the above local privilege escalation bug? JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of

Re: [gpfsug-discuss] PVU question

2021-07-01 Thread Jonathan Buzzard
On 29/06/2021 15:41, IBM Spectrum Scale wrote: My suggestion for this question is that it should be directed to your IBM sales team and not the Spectrum Scale support team.  My reading of the information you provided is that your processor counts as 2 cores.  As for the PVU value my guess is

[gpfsug-discuss] PVU question

2021-06-29 Thread Jonathan Buzzard
Hum, it would appear there are gaps in IBM's PVU table. Specifically I am looking at using a Pentium G4620 in a server https://ark.intel.com/content/www/us/en/ark/products/97460/intel-pentium-processor-g4620-3m-cache-3-70-ghz.html It's dual core with ECC memory support all in a socket 1151.

Re: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster

2021-06-17 Thread Jonathan Buzzard
On 17/06/2021 09:29, Jan-Frode Myklebust wrote: *All* nodes need to be able to communicate on the daemon network. If they don't have access to this network, they can't join the cluster. Not strictly true. TL;DR if all your NSD/master nodes are both Ethernet and Infiniband connected then

Re: [gpfsug-discuss] GPFS systemd and gpfs.gplbin

2021-06-10 Thread Jonathan Buzzard
On 10/06/2021 15:00, Ryan Novosielski wrote: The problem with not version locking the kernel, however, is that you really need to know that the kernel you are moving to will support the GPFS version that you are running. Typically that only becomes a problem when you cross a

[gpfsug-discuss] GPFS systemd and gpfs.gplbin

2021-06-09 Thread Jonathan Buzzard
So you need to apply a kernel update and that means a new gpfs.gplbin :-( So after going around the houses with several different approaches on this I have finally settled on what IMHO is a most elegant method of ensuring the right gpfs.gplbin version is installed for the kernel that is
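
The post does not show the actual mechanism, so the following is only a generic pre-reboot sanity check (not the method described), assuming the portability layer is packaged as gpfs.gplbin-<kernel-version> RPMs:

    # newest installed kernel version (the one the next reboot will use)
    NEWKERNEL=$(rpm -q --last kernel | head -1 | sed 's/^kernel-//; s/ .*//')
    if rpm -q "gpfs.gplbin-${NEWKERNEL}" >/dev/null 2>&1; then
        echo "gpfs.gplbin present for ${NEWKERNEL}, safe to reboot"
    else
        echo "no gpfs.gplbin for ${NEWKERNEL} - build and install it first" >&2
        exit 1
    fi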

Re: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster

2021-06-07 Thread Jonathan Buzzard
On 07/06/2021 13:46, Leonardo Sala wrote: Hello, we do have multiple bare-metal GPFS clusters with Infiniband fabric, and I am actually considering adding some VMs to the mix, to perform admin tasks (so that the bare metal servers do not need passwordless ssh keys) and quorum nodes. Has

Re: [gpfsug-discuss] CVE-2021-29740

2021-06-04 Thread Jonathan Buzzard
On 01/06/2021 17:48, Damir Krstic wrote: IBM posted a security bulletin for the spectrum scale (CVE-2021-29740). Not a lot of detail provided in that bulletin. Has anyone installed this fix? Does anyone have more information about it? Anyone know how quickly Lenovo are at putting up

Re: [gpfsug-discuss] du --apparent-size and quota

2021-06-02 Thread Jonathan Buzzard
On 02/06/2021 11:16, Ulrich Sibiller wrote: [SNIP] My rsync is using -AHS, so this should not be relevant here. I wonder have you done more than one rsync? If so are you using --delete? If not and the source fileset has changed then you will be accumulating files at the destination and
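
For context, a typical invocation combining the flags being discussed (paths are placeholders; --delete makes repeated runs remove files that have since gone away at the source):

    rsync -aAHS --delete /mnt/source/fileset/ /gpfs/fs0/fileset/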

Re: [gpfsug-discuss] Ransom attacks

2021-05-28 Thread Jonathan Buzzard
On 28/05/2021 07:46, Henrik Morsing wrote: That might not make sense if GPFS is holding the SP backup data, but SP can do its own replication too - and could replicate using storage from a second GPFS file system off-site.  Take snapshots of this second storage, as well as SP database, and

Re: [gpfsug-discuss] Ransom attacks

2021-05-27 Thread Jonathan Buzzard
On 27/05/2021 16:23, Skylar Thompson wrote: [SNIP] at the end of the day, nothing beats the air-gap of tape backups, IMHO. Changing/deleting lots of data on tape takes time. So tape is a really good starting point even if you never take the tapes out the library except to dispose of them.

Re: [gpfsug-discuss] GPFS de duplication

2021-05-21 Thread Jonathan Buzzard
On 20/05/2021 13:58, Dave Bond wrote: As part of a project I am doing I am looking to see if there are any deduplication options for GPFS? I see there is no native dedupe for the filesystem. The scenario would be user A creates a file or folder and user B takes a copy within the same filesystem,

Re: [gpfsug-discuss] Quick delete of huge tree

2021-04-20 Thread Jonathan Buzzard
On 20/04/2021 13:09, Ulrich Sibiller wrote: Consider using mv to move it out of the way or hide it while the delete is in progress. If you do that think carefully about backups, you don't want to back it all up again while it is being deleted :-) ;-) Yeah, that's why I did not do the mv in

Re: [gpfsug-discuss] Quick delete of huge tree

2021-04-20 Thread Jonathan Buzzard
On 20/04/2021 12:18, Ulrich Sibiller wrote: Hello *, I have to delete a subtree of about ~50 million files in thousands of subdirs, ~14TB of data. Running a recursive rm is very slow so I set up a simple policy file: RULE 'delstuff' DELETE DIRECTORIES_PLUS WHERE PATH_NAME LIKE
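
A sketch of how such a rule is usually applied (the path and thread count are illustrative; running with -I test first is a sensible dry run):

    cat > /tmp/delstuff.pol <<'EOF'
    RULE 'delstuff' DELETE DIRECTORIES_PLUS
         WHERE PATH_NAME LIKE '/gpfs/fs0/scratch/olddata/%'
    EOF
    # dry run first, then the real thing
    mmapplypolicy /gpfs/fs0/scratch/olddata -P /tmp/delstuff.pol -I test
    mmapplypolicy /gpfs/fs0/scratch/olddata -P /tmp/delstuff.pol -I yes -m 8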

[gpfsug-discuss] Synchronization/Restore of file systems

2021-03-11 Thread Jonathan Buzzard
As promised last year I having just completed a storage upgrade, I have sanitized my scripts and put them up on Github for other people to have a look at the methodology I use in these sorts of scenarios. This time the upgrade involved pulling out all the existing disks and fitting large

Re: [gpfsug-discuss] Backing up GPFS with Rsync

2021-03-10 Thread Jonathan Buzzard
On 10/03/2021 02:59, Alec wrote: You would definitely be able to search by inode creation date and find the files you want... our 1.25m file filesystem takes about 47 seconds to query... One

Re: [gpfsug-discuss] Policy scan of symbolic links with contents?

2021-03-08 Thread Jonathan Buzzard
On 08/03/2021 20:45, Jonathan Buzzard wrote: [SNIP] So noting that you can write very SQL like statements something like the following should in theory do it RULE finddangling LIST dangle WHERE MISC_ATTRIBUTES='L' AND SUBSTR(PATH_NAME,0,4)='/fs1/' Note the above is not checked in any way
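
Expanding that idea into something runnable (still unchecked, per the caveat above): list every symlink with a deferred policy run, then test each target from the shell. The LIKE form for MISC_ATTRIBUTES, the paths, and the handling of the list-file record separator are assumptions.

    cat > /tmp/links.pol <<'EOF'
    RULE 'extlist' EXTERNAL LIST 'links' EXEC ''
    RULE 'findlinks' LIST 'links' WHERE MISC_ATTRIBUTES LIKE '%L%'
    EOF
    mmapplypolicy /fs1 -P /tmp/links.pol -I defer -f /tmp/scan
    # /tmp/scan.list.links should now hold one record per symlink, path after " -- "
    while read -r line; do
        path=${line#* -- }
        [ -e "$path" ] || echo "dangling: $path"
    done < /tmp/scan.list.links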

Re: [gpfsug-discuss] Policy scan of symbolic links with contents?

2021-03-08 Thread Jonathan Buzzard
On 08/03/2021 16:07, Frederick Stock wrote: Presumably the only feature that would help here is if policy could determine that the end location pointed to by a symbolic link is within the current

Re: [gpfsug-discuss] TSM errors restoring files with ACL's

2021-03-05 Thread Jonathan Buzzard
On 05/03/2021 19:12, Frederick Stock wrote: I was referring to this flash, https://www.ibm.com/support/pages/node/6381354?myns=swgtiv=OCSSEQVQ=E_sp=swgtiv-_-OCSSEQVQ-_-E

Re: [gpfsug-discuss] TSM errors restoring files with ACL's

2021-03-05 Thread Jonathan Buzzard
On 05/03/2021 12:15, Frederick Stock wrote: Have you checked to see if Spectrum Protect (TSM) has addressed this problem? There recently was an issue with Protect and how it used the GPFS API

[gpfsug-discuss] TSM errors restoring files with ACL's

2021-03-05 Thread Jonathan Buzzard
I am seeing that whenever I try and restore a file with an ACL I get an ANS1589W error in /var/log/dsmerror.log ANS1589W Unable to write extended attributes for ** due to errno: 13, reason: Permission denied But bizarrely the ACL is actually restored. At least as far as I can

Re: [gpfsug-discuss] Using setfacl vs. mmputacl

2021-03-01 Thread Jonathan Buzzard
On 01/03/2021 15:18, Olaf Weiser wrote: JAB, yes - this is an argument ;-) ... and personally I like the idea of having something like setfacl also for GPFS ... for years... *but* it would not take away

Re: [gpfsug-discuss] Using setfacl vs. mmputacl

2021-03-01 Thread Jonathan Buzzard
On 01/03/2021 12:45, Olaf Weiser wrote: Hello Stephen, behavior, or better to say predicted behavior, for chmod and ACLs is not an easy thing, or only if you stay in either POSIX

Re: [gpfsug-discuss] dssgmkfs.mmvdisk number of NSD's

2021-03-01 Thread Jonathan Buzzard
On 01/03/2021 09:08, Luis Bolinches wrote: Hi, there are other reasons to have more than 1. One is the management of those. When you have to add or remove NSDs of a FS, having more than 1 makes it possible to empty some space and manage those in and out. Manually, but possible. If you have one big NSD

Re: [gpfsug-discuss] dssgmkfs.mmvdisk number of NSD's

2021-02-28 Thread Jonathan Buzzard
On 28/02/2021 09:31, Jan-Frode Myklebust wrote: I’ve tried benchmarking many vs. few vdisks per RG, and never could see any performance difference. That's encouraging. Usually we create 1 vdisk per enclosure per RG,   thinking this will allow us to grow with same size vdisks when adding

[gpfsug-discuss] dssgmkfs.mmvdisk number of NSD's

2021-02-27 Thread Jonathan Buzzard
Doing an upgrade on our storage which involved replacing all the 4TB disks with 16TB disks. Some hiccups with five of the disks being dead when inserted but that is all sorted. So the system was originally installed with DSS-G 2.0a so with "legacy" commands for vdisks etc. We had 10

Re: [gpfsug-discuss] gpfsug-discuss Digest, Vol 108, Issue 18

2021-02-01 Thread Jonathan Buzzard
On 01/02/2021 21:09, Owen Morgan wrote: Jonathan, If I have a single policy file with all the related department rules and each time they want to add additional rules with different working day

Re: [gpfsug-discuss] gpfsug-discuss Digest, Vol 108, Issue 18

2021-02-01 Thread Jonathan Buzzard
On 01/02/2021 18:11, Jan-Frode Myklebust wrote: Agree.. Write a policy that takes a "mmapplypolicy -M var=val" argument, and figure out the workdays outside of the policy. Something like: # cat
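
A sketch of what that separation looks like (the file names and the DOW macro are made up for illustration; -M substitutes the value into the policy text before it is evaluated):

    # work out "which day of the week is it" outside the policy ...
    DOW=$(date +%u)          # 1 = Monday ... 7 = Sunday
    # ... and hand the result to the rules as a macro
    mmapplypolicy /gpfs/fs0/dept -P /path/to/dept.pol -M DOW=${DOW} -I yes

Inside dept.pol the rules can then reference DOW directly, for example WHERE DOW <= 5.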

Re: [gpfsug-discuss] gpfsug-discuss Digest, Vol 108, Issue 18

2021-01-30 Thread Jonathan Buzzard
On 30/01/2021 00:31, Owen Morgan wrote: [SNIP] I would prefer to stay in the bounds of the SQL policy rule setup as that is the framework I have created and started to implement.. In general SQL is Turing complete. Though I have not checked in detail I believe the SQL of the policy engine

Re: [gpfsug-discuss] Contents of gpfsug-discuss Digest, Vol 107, Issue 13

2020-12-10 Thread Jonathan Buzzard
On 10/12/2020 21:59, Andrew Beattie wrote: Thanks Ed, The UQ team are well aware of the current limits published in the FAQ. However the issue is not the number of physical nodes or the

Re: [gpfsug-discuss] Future of Spectrum Scale support for Centos

2020-12-09 Thread Jonathan Buzzard
On 09/12/2020 01:08, Carl wrote: Hi all, With the announcement of CentOS 8 moving to stream (https://blog.centos.org/2020/12/future-is-centos-stream) will CentOS still be considered a clone

Re: [gpfsug-discuss] Future of Spectrum Scale support for Centos

2020-12-09 Thread Jonathan Buzzard
On 09/12/2020 14:02, Carl Zetie - ca...@us.ibm.com wrote: We don't have an official statement yet, however I did want to give you all an indication of our early thinking on this. Er yes we do,

Re: [gpfsug-discuss] Spectrum Scale 5.1.0.1 Object install / Redhat repos.

2020-12-08 Thread Jonathan Buzzard
On 07/12/2020 22:37, Simon Thompson wrote: Codeready I think you can just enable with subscription-manager, but it is disabled by default. RHOSP is an additional license. But as it says

Re: [gpfsug-discuss] memory needed for gpfs clients

2020-12-01 Thread Jonathan Buzzard
On 01/12/2020 19:07, Christopher Black wrote: We tune vm-related sysctl values on our gpfs clients. These are values we use for 256GB+ mem hpc nodes: vm.min_free_kbytes=2097152 vm.dirty_bytes =
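
As a worked example, applying such settings persistently (only min_free_kbytes is quoted in full above; the dirty_* values are truncated, so none are invented here, and the file path is arbitrary):

    cat > /etc/sysctl.d/90-gpfs-client.conf <<'EOF'
    vm.min_free_kbytes = 2097152
    EOF
    sysctl -p /etc/sysctl.d/90-gpfs-client.conf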

Re: [gpfsug-discuss] Mounting filesystem on top of an existing filesystem

2020-11-19 Thread Jonathan Buzzard
On 19/11/2020 18:13, Caubet Serrabou Marc (PSI) wrote: Hi all, thanks a lot for your comments. Agreed, I better avoid it for now. I was concerned about how GPFS would behave in such case. For production I will take the safe route, but, just out of curiosity, I'll give it a try on a couple

Re: [gpfsug-discuss] Mounting filesystem on top of an existing filesystem

2020-11-19 Thread Jonathan Buzzard
On 19/11/2020 16:40, KG wrote: You can also set mount priority on filesystems so that gpfs can try to mount them in order...parent first One of the things that systemd brings to the table https://github.com/systemd/systemd/commit/3519d230c8bafe834b2dac26ace49fcfba139823 JAB. -- Jonathan
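
For the GPFS side of that ordering, a hedged example (the --mount-priority option and its ordering semantics should be confirmed against the mmchfs man page; filesystem names here are placeholders):

    # give the parent filesystem a mount priority so it comes up before the nested one
    mmchfs projectsfs --mount-priority 1
    mmchfs bigprojectfs --mount-priority 2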

Re: [gpfsug-discuss] Mounting filesystem on top of an existing filesystem

2020-11-19 Thread Jonathan Buzzard
On 19/11/2020 17:34, Jan-Frode Myklebust wrote: I would not mount a GPFS filesystem within a GPFS filesystem. Technically it should work, but I’d expect it to cause surprises if ever the lower filesystem experienced problems. Alone, a filesystem might recover automatically by remounting. But

Re: [gpfsug-discuss] Mounting filesystem on top of an existing filesystem

2020-11-19 Thread Jonathan Buzzard
On 19/11/2020 15:34, Caubet Serrabou Marc (PSI) wrote: Hi, I have a filesystem holding many projects (i.e., mounted under /projects), each project is managed with filesets. I have a new big project which should be placed on a separate filesystem (blocksize, replication policy, etc. will be

Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-11-18 Thread Jonathan Buzzard
On 17/11/2020 23:17, Chris Schlipalius wrote: So at my last job we used to rsync data between isilons across campus, and isilon to Windows File Cluster (and back). I recommend using a dry run to generate a list of files and then use this to run with rsync. This allows you also to be able to

Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-11-17 Thread Jonathan Buzzard
On 17/11/2020 15:55, Simon Thompson wrote: Fortunately, we seem committed to GPFS so it might be we never have to do another bulk transfer outside of the filesystem... Until you want to move a v3 or v4 created file-system to v5 block sizes. You forget the v2 to v3 for more than

Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-11-17 Thread Jonathan Buzzard
On 17/11/2020 11:51, Andi Christiansen wrote: Hi all, thanks for all the information, there were some interesting things amongst it. I kept on going with rsync and ended up making a file with all top level user directories and splitting them into chunks of 347 per rsync session (total 42000 ish
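
A sketch of that chunked approach (the chunk size of 347 mirrors the description above; file names and paths are otherwise illustrative, with the Isilon export assumed to be NFS-mounted locally):

    # userdirs.txt holds one top-level user directory per line
    split -l 347 userdirs.txt chunk_
    for c in chunk_*; do
        rsync -aAHS --files-from="$c" /mnt/isilon/ /gpfs/fs0/home/ &
    done
    wait    # let all the parallel rsync sessions finish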

Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-11-16 Thread Jonathan Buzzard
On 16/11/2020 21:58, Skylar Thompson wrote: When we did a similar (though larger, at ~2.5PB) migration, we used rsync as well, but ran one rsync process per Isilon node, and made sure the NFS clients were hitting separate Isilon nodes for their reads. We also didn't have more than one rsync

Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-11-16 Thread Jonathan Buzzard
On 16/11/2020 19:44, Andi Christiansen wrote: Hi all, i have got a case where a customer wants 700TB migrated from isilon to Scale and the only way for him is exporting the same directory on NFS from two different nodes... as of now we are using multiple rsync processes on different parts

Re: [gpfsug-discuss] Services on DSS/ESS nodes

2020-10-07 Thread Jonathan Buzzard
On 07/10/2020 11:28, Simon Thompson wrote: Agreed ... Report to me a pdisk is failing in my monitoring dashboard we use for *everything else*. Tell me that kswapd is having one of those days. Tell me rsyslogd has stopped sending for some reason. Tell me if there are long waiters on the hosts.

Re: [gpfsug-discuss] Services on DSS/ESS nodes

2020-10-05 Thread Jonathan Buzzard
On 05/10/2020 09:40, Simon Thompson wrote: I now need to check IBM are not going to throw a wobbler down the line if I need to get support before deploying it to the DSS-G nodes :-) I know there were a lot of other emails about this ... I think you maybe want to be careful doing this. Whilst

Re: [gpfsug-discuss] Services on DSS/ESS nodes

2020-10-05 Thread Jonathan Buzzard
On 05/10/2020 07:27, Jordi Caubet Serrabou wrote: Coming to the routing point, is there any reason why you need it? I mean, is this because GPFS is trying to connect between compute nodes, or for a reason outside GPFS scope? If the reason is GPFS, imho the best approach - without knowledge of

Re: [gpfsug-discuss] Services on DSS/ESS nodes

2020-10-04 Thread Jonathan Buzzard
On 04/10/2020 10:29, Luis Bolinches wrote: Hi As stated on the same link you can do remote mounts from each other and be a supported setup. “ You can use the remote mount feature of IBM Spectrum Scale to share file system data across clusters.” You can, but imagine I have a DSS-G

Re: [gpfsug-discuss] Services on DSS/ESS nodes

2020-10-03 Thread Jonathan Buzzard
On 03/10/2020 12:19, Luis Bolinches wrote: Are you mixing those ESS and DSS in the same cluster? Or are you only running DSS? Only running DSS. We are too far down the rabbit hole to ever switch to ESS now. Mixing DSS and ESS in the same cluster is not a supported configuration. I know,

Re: [gpfsug-discuss] Services on DSS/ESS nodes

2020-10-03 Thread Jonathan Buzzard
On 03/10/2020 11:55, Andrew Beattie wrote: Why do you need to run any kind of monitoring client on an IO server the GUI / performance monitor already does all of that work for you and collects the data on the dedicated EMS server. Because any remotely sensible admin demands a single pane

Re: [gpfsug-discuss] Services on DSS/ESS nodes

2020-10-03 Thread Jonathan Buzzard
On 02/10/2020 23:19, Andrew Beattie wrote: Jonathan, I suggest you get a formal statement from Lenovo as the DSS-G Platform is no longer an IBM platform. But for ESS based platforms the answer would be, it is not supported to run anything on the IO Servers other than GNR and the relevant

[gpfsug-discuss] Services on DSS/ESS nodes

2020-10-02 Thread Jonathan Buzzard
What if any are the rules around running additional services on DSS/ESS nodes with regard to support? Let me outline our scenario Our main cluster uses 10Gbps ethernet for storage with the DSS-G nodes hooked up with redundant 40Gbps ethernet. However we have an older cluster that is used

Re: [gpfsug-discuss] Portability interface

2020-09-23 Thread Jonathan Buzzard
On 22/09/2020 16:47, Truong Vu wrote: You are correct, the "identical architecture" means the same machine hardware name as shown by the -m option of the uname command. Thanks for clearing that up. It just seemed something of a blindingly obvious statement; surely nobody would expect an RPM

[gpfsug-discuss] Portability interface

2020-09-22 Thread Jonathan Buzzard
I have a question about using RPM's for the portability interface on different CPU's. According to /usr/lpp/mmfs/src/README The generated RPM can ONLY be deployed to the machine with identical architecture, distribution level, Linux kernel version and GPFS version. So does this

Re: [gpfsug-discuss] Best of spectrum scale

2020-09-11 Thread Jonathan Buzzard
On 11/09/2020 15:25, Stephen Ulmer wrote: On Sep 9, 2020, at 10:04 AM, Skylar Thompson <skyl...@uw.edu> wrote: On Wed, Sep 09, 2020 at 12:02:53PM +0100, Jonathan Buzzard wrote: On 08/09/2020 18:37, IBM Spectrum Scale wrote: I think it is incorrect to assume that

Re: [gpfsug-discuss] Best of spectrum scale

2020-09-09 Thread Jonathan Buzzard
On 08/09/2020 18:37, IBM Spectrum Scale wrote: I think it is incorrect to assume that a command that continues after detecting the working directory has been removed is going to cause damage to the file system. No I am not assuming it will cause damage. I am making the fairly reasonable

Re: [gpfsug-discuss] Best of spectrum scale

2020-09-08 Thread Jonathan Buzzard
On 08/09/2020 14:04, IBM Spectrum Scale wrote: I think a better metaphor is that the bridge we just crossed has collapsed and as long as we do not need to cross it again our journey should reach its intended destination :-)  As I understand the intent of this message is to alert the user (and

Re: [gpfsug-discuss] tsgskkm stuck---> what about AMD epyc support in GPFS?

2020-09-04 Thread Jonathan Buzzard
On 02/09/2020 23:28, Andrew Beattie wrote: Giovanni, I have clients in Australia that are running AMD ROME processors in their Visualisation nodes connected to scale 5.0.4 clusters with no issues. Spectrum Scale doesn't differentiate between x86 processor technologies -- it only looks at x86_64

Re: [gpfsug-discuss] [External] Re: mmbackup

2020-08-17 Thread Jonathan Buzzard
On 17/08/2020 10:53, Jim Roche wrote: Simon is correct from that point of view. If you can see it on your commercial.lenovo.com site then you are able to use it within your licensing rules and are still compatible from a Spectrum Scale point of view. The DSS-G specific tarballs

Re: [gpfsug-discuss] mmbackup

2020-08-17 Thread Jonathan Buzzard
On 15/08/2020 19:24, Simon Thompson wrote: When you say "web portal" it's not clear if you refer to Fix Central or the commercial.lenovo.com site; the client binaries are a separate set of downloads from the DSS-G bundle for the servers, from where you should be able to download 5.0.5.1 (at least I

Re: [gpfsug-discuss] mmbackup

2020-08-14 Thread Jonathan Buzzard
On 06/08/2020 14:35, IBM Spectrum Scale wrote: This has been fixed in Spectrum Scale 4.2.3.20, 5.0.4.2, and 5.0.5.0. Regards, The Spectrum Scale (GPFS) team Thanks, that explains my issue. I am running DSS-G, and the latest DSS-G release only does the 5.0.4.3-2 version of GPFS. However I

[gpfsug-discuss] mmbackup

2020-08-06 Thread Jonathan Buzzard
I upgraded the TSM client for security reasons to 8.1.10 from 8.1.3 last week. It would now appear that my scheduled mmbackup is not running, and trying by hand gives [root@tsm ~]# mmbackup dssgfs -s /opt/mmbackup mmbackup: Backup of

Re: [gpfsug-discuss] DSS-G support period

2020-08-05 Thread Jonathan Buzzard
On 05/08/2020 11:32, Simon Thompson wrote: 3.0 isn't supported on first gen DSS-G servers (x3650m5) so if you have those, you'd hope that it would continue to be supported. Or it's now old hardware and out of support. Here I have some shiny new storage to sell you :-) We're just looking

[gpfsug-discuss] DSS-G support period

2020-08-03 Thread Jonathan Buzzard
I notice that there is now a 3.x version of the DSS-G software that is based on RHEL8, which took me a bit by surprise as the ESS still seems to be on RHEL7. I did however notice that 2.6b which is still based on RHEL7 was released after 3.0a. So this brings the question how much longer

[gpfsug-discuss] Mass UID/GID change program (uidremap)

2020-06-17 Thread Jonathan Buzzard
My university has been giving me Fridays off during lockdown so I have spent a bit of time and added modification of POSIX ACLs through the standard library and tidied up the code a bit. Much of it is based on preexisting code which did speed things up. The error checking is still

Re: [gpfsug-discuss] very low read performance in simple spectrum scale/gpfs cluster with a storage-server SAN

2020-06-11 Thread Jonathan Buzzard
On 11/06/2020 08:53, Giovanni Bracco wrote: [SNIP] not really: both clusters, the 400 OPA nodes and the 300 QDR nodes share the same data lake in Spectrum Scale/GPFS so the NSD servers support the flexibility of the setup. NSD servers make use of a IB SAN fabric (Mellanox FDR switch) where

Re: [gpfsug-discuss] Change uidNumber and gidNumber for billions of files

2020-06-10 Thread Jonathan Buzzard
On 10/06/2020 16:31, Lohit Valleru wrote: [SNIP] I might mostly start small with a single lab, and only change files without ACLs. May I know if anyone has a method/tool to find out which files/dirs have NFS4 ACLs set? As far as we know - it is just one fileset/lab, but it would be

[gpfsug-discuss] Infiniband/Ethernet gateway

2020-06-10 Thread Jonathan Buzzard
We have a mixture of 10Gb Ethernet and Infiniband connected (using IPoIB) nodes on our compute cluster using a DSS-G for storage. Each SR650 has a bonded pair of 40Gb Ethernet connections and a 40Gb Infiniband connection. Performance and stability are *way* better than the old Lustre

Re: [gpfsug-discuss] Change uidNumber and gidNumber for billions of files

2020-06-10 Thread Jonathan Buzzard
On 10/06/2020 02:15, Aaron Knister wrote: Lohit, I did this while working @ NASA. I had two tools I used, one affectionately known as "luke file walker" (to modify traditional unix permissions) and the other known as the "milleniumfacl" (to modify posix ACLs). Stupid jokes aside, there were

Re: [gpfsug-discuss] Change uidNumber and gidNumber for billions of files

2020-06-09 Thread Jonathan Buzzard
On 09/06/2020 14:57, Jonathan Buzzard wrote: [SNIP] Actually, thinking on it more, I think a generic C random UID/GID to UID/GID mapping program is a really simple piece of code and should be nearly as fast as chown -R. It will be very slightly slower as you have to look the mapping up

Re: [gpfsug-discuss] Change uidNumber and gidNumber for billions of files

2020-06-09 Thread Jonathan Buzzard
On 09/06/2020 14:07, Stephen Ulmer wrote: Jonathan brings up a good point that you’ll only get one shot at this — if you’re using the file system as your record of who owns what. Not strictly true if my existing UID's are in the range 1-1 and my target UID's are in the range

Re: [gpfsug-discuss] Change uidNumber and gidNumber for billions of files

2020-06-09 Thread Jonathan Buzzard
On 08/06/2020 18:44, Lohit Valleru wrote: Hello Everyone, We are planning to migrate from LDAP to AD, and one of the best solutions was to change the uidNumber and gidNumber to what SSSD or Centrify would resolve. May I know if anyone has come across a tool/tools that can change the

Re: [gpfsug-discuss] Immutible attribute

2020-06-03 Thread Jonathan Buzzard
On 03/06/2020 16:25, Frederick Stock wrote: Could you please provide the exact Scale version, or was it really 4.2.3.0? 4.2.3-7 with setuid taken off a bunch of the utilities per relevant CVE while I work on the upgrade to 5.0.5 JAB. -- Jonathan A. Buzzard Tel:

[gpfsug-discuss] Immutible attribute

2020-06-03 Thread Jonathan Buzzard
Hum, on a "normal" Linux file system only the root user can change the immutable attribute on a file. Running on 4.2.3 I have just removed the immutable attribute as an ordinary user if I am the owner of the file. I would suggest that this is a bug as the manual page for mmchattr does

Re: [gpfsug-discuss] Multi-cluster question (was Re: gpfsug-discuss Digest, Vol 100, Issue 32)

2020-06-01 Thread Jonathan Buzzard
On 31/05/2020 17:47, Jan-Frode Myklebust wrote: No, this is a common misconception.  You don’t need any NSD servers. NSD servers are only needed if you have nodes without direct block access. I see that has changed then. In the past mmcrnsd would simply fail without a server list passed

Re: [gpfsug-discuss] Multi-cluster question (was Re: gpfsug-discuss Digest, Vol 100, Issue 32)

2020-05-31 Thread Jonathan Buzzard
On 29/05/2020 20:55, Stephen Ulmer wrote: I have a question about multi-cluster, but it is related to this thread (it would be solving the same problem). Let’s say we have two clusters A and B, both clusters are normally shared-everything with no NSD servers defined. Er, even in a

Re: [gpfsug-discuss] gpfsug-discuss Digest, Vol 100, Issue 32

2020-05-29 Thread Jonathan Buzzard
On 29/05/2020 16:02, Prasad Surampudi wrote: Jim, The minimum release for 5.0.4 cluster is 5.0.4.3. We are aware that we can't merge both gpfs_4 and gpfs_5 filesystems. Our plan is to import the gpfs_4 filesystem into the Spectrum Scale 5.0 cluster (with its NSDs mapped to Spectrum Scale 5.0

Re: [gpfsug-discuss] Importing a Spectrum Scale a filesystem from 4.2.3 cluster to 5.0.4.3 cluster

2020-05-29 Thread Jonathan Buzzard
On 29/05/2020 02:35, Jim Doherty wrote: What is the minimum release level of the Spectrum Scale 5.0.4 cluster?    Is it 4.2.3.X? According to the email Cluster-B (the one with 5.0.4) has the variable block size thing, so that is going to be a no. Besides which even if you could mmimport

Re: [gpfsug-discuss] Spectrum Scale 5.0.5.0 is available on FixCentral (n/t)

2020-05-25 Thread Jonathan Buzzard
On 24/05/2020 16:58, Achim Rehor wrote: no, as this is a .0 package .. there are no 'bug-fixes' ;)  you will see this again, when PTF 1 is announced. So you are saying that no bugs that existed in 5.0.4.x have been fixed in 5.0.5.0? That has the credibility of Dominic Cummings driving to

Re: [gpfsug-discuss] Scale 4.2.3.22 with support for RHEL 7.8 is now on Fix Central

2020-05-14 Thread Jonathan Buzzard
On 14/05/2020 13:31, Flanders, Dean wrote: Hello, It is great, that RHEL 7.8 is supported on SS 4.2.3.22, when will RHEL 8.x be supported on GPFS SS 4.2.3.X?? Thanks, Dean That would be never, 4.2.3 goes out of support in September. Is 5.x supported in 7.8 yet? Some urgent upgrading of

Re: [gpfsug-discuss] Odd networking/name resolution issue

2020-05-11 Thread Jonathan Buzzard
On 10/05/2020 14:28, Jaime Pinto wrote: The rationale for my suggestion doesn't have much to do with the central DNS server, but everything to do with the DNS client side of the service. If you have a very busy cluster at times, and a number of nodes really busy with 1000+ IOPs for instance, so

Re: [gpfsug-discuss] Odd networking/name resolution issue

2020-05-09 Thread Jonathan Buzzard
On 09/05/2020 12:06, Jaime Pinto wrote: DNS shouldn't be relied upon on a GPFS cluster for internal communication/management or data. The 1980's have called and want their lack of IP resolution protocols back :-) I would kindly disagree. If your DNS is not working then your cluster is

Re: [gpfsug-discuss] support for 7.8 in 5.0.4.4?

2020-05-04 Thread Jonathan Buzzard
On 04/05/2020 10:11, Kenneth Waegeman wrote: Hi all, I didn't see any updates in the faq yet about the new 5.0.4.4 release. Does this release support rhel 7.8 ? Has the fix for 7.7 with a kernel >= 3.10.0-1062.18.1.el7 been released yet? If it hasn't then there is no chance of 7.8

Re: [gpfsug-discuss] wait for mount during gpfs startup

2020-04-28 Thread Jonathan Buzzard
On 28/04/2020 11:57, Ulrich Sibiller wrote: Hi, when the gpfs systemd service returns from startup the filesystems are usually not mounted. So having another service depending on gpfs is not feasible if you require the filesystem(s). Therefore we have added a script to the systemd gpfs
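
A minimal version of such a wait-for-mount helper (an assumption of the general shape, not the poster's actual script; the filesystem path and timeout are placeholders):

    #!/bin/bash
    # block until the GPFS filesystem is mounted, or give up after ~5 minutes
    FS=/gpfs/fs0
    for i in $(seq 1 60); do
        mountpoint -q "$FS" && exit 0
        sleep 5
    done
    echo "$FS still not mounted" >&2
    exit 1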

Re: [gpfsug-discuss] Spectrum Scale licensing

2020-04-17 Thread Jonathan Buzzard
On 16/04/2020 04:26, Flanders, Dean wrote: Hello All, As IBM has completely switched to capacity-based licensing in order to use SS v5, I was wondering how others are dealing with this? We do not find the capacity-based licensing sustainable. Our long term plan is to migrate away from SS v5

Re: [gpfsug-discuss] Spectrum Scale licensing - important correction

2020-04-17 Thread Jonathan Buzzard
On 17/04/2020 11:31, T.A. Yeep wrote: Hi Carl, I'm confused here, in the previous email it was said *And for ESS, it is licensed Per Drive with different prices for HDDs and SSDs.* But then you mentioned in below email that: But new customers and new OEM systems are *all licensed by
