Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?
On Wed, 18 Nov 2020 11:48:52 +, Jonathan Buzzard said:

> So what do I mean by "wacky" characters. Well remember a file name can
> have just about anything in it on Linux with the exception of '/', and

You want to see some fireworks? At least at one time, it was possible to use
a file system debugger that's all too trusting of hexadecimal input and
create a directory entry of '../'. Let's just say that fs/namei.c was also
far too trusting, and fsck was more than happy to make *different* errors
than the kernel was.

> The obvious ones are spaces, but it's not just ASCII 0x20, but tabs too.
> Then there is the use of the wildcard characters, especially '?' but
> also '*'.

Don't forget ESC, CR, LF, backticks, forward ticks, semicolons, and pretty
much anything else that will give a shell indigestion. SQL isn't the only
thing prone to injection attacks. :)

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
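[Editor's sketch] The point above can be made concrete: on Linux only '/' and NUL are forbidden in a file name, so a sync tool has to survive everything else. The helper below is my own illustration (function name and character set are mine, not from any GPFS tool) of which legal filename characters a naive unquoted shell pipeline would choke on.

```python
# Hypothetical helper: which legal Linux filename characters would break
# a naive, unquoted shell command line? (Character set chosen by me.)

def shell_hazards(name: str) -> set:
    """Return the characters in a filename that commonly break unquoted
    shell usage: whitespace, globbing, quoting, and control characters."""
    dangerous = set(' \t\n\r\x1b*?[]`\'";$&|<>(){}')
    return {c for c in name if c in dangerous or ord(c) < 0x20}

# On Linux, only '/' and NUL are forbidden in a file name; newlines,
# escapes, and wildcards are all perfectly legal -- and all hazardous.
assert shell_hazards("plain_name.txt") == set()
assert shell_hazards("evil\nname*?.dat") == {"\n", "*", "?"}
```

This is why tools that pass file lists through a shell (or through newline-delimited text) get bitten, and why `find ... -print0 | xargs -0` style null-termination exists.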
Re: [gpfsug-discuss] Services on DSS/ESS nodes
On Sat, 03 Oct 2020 10:55:05 -, "Andrew Beattie" said:

> Why do you need to run any kind of monitoring client on an IO server the
> GUI / performance monitor already does all of that work for you and
> collects the data on the dedicated EMS server.

Does *ALL* that work for me? Will it toss you an alert if your sshd goes
away, or if somebody's tossing packets that iptables is blocking for good
reasons, or any of the many other things that a competent sysadmin wants to
be alerted on that aren't GPFS, but which are things that Nagios and Zabbix
and similar tools were invented to track?
Re: [gpfsug-discuss] Client Latency and High NSD Server Load Average
On Fri, 05 Jun 2020 14:24:27 -, "Saula, Oluwasijibomi" said:

> But with the RAID 6 writing costs Vladis explained, it now makes sense why
> the write IO was badly affected...
> Action [1,2,3,4,A] : The only valid responses are characters from this set:
> [1, 2, 3, 4, A]
> Action [1,2,3,4,A] : The only valid responses are characters from this set:
> [1, 2, 3, 4, A]
> Action [1,2,3,4,A] : The only valid responses are characters from this set:
> [1, 2, 3, 4, A]
> Action [1,2,3,4,A] : The only valid responses are characters from this set:
> [1, 2, 3, 4, A]

And a read-modify-write on each one. Ouch. Stuff like that is why making
sure program output goes to /var or another local file system is usually a
good thing.

I seem to remember us getting bit by a similar misbehavior in TSM, but I
don't know the details because I was busier with GPFS and LTFS/EE than TSM.
Though I have to wonder how TSM could be a decades-old product and still
have misbehaviors in basic things like failed reads on input prompts...
Re: [gpfsug-discuss] Client Latency and High NSD Server Load Average
On Thu, 04 Jun 2020 15:33:18 -, "Saula, Oluwasijibomi" said:

> However, I still can't understand why write IO operations are 5x more latent
> than ready operations to the same class of disks.

Two things that may be biting you:

First, on a RAID 5 or 6 LUN, a read usually needs only one physical I/O, for
the data block itself. To do a small write, you have to read the old data
and parity blocks, compute the new parity, and then write back both the new
data block and the new parity block. This is often called the "RAID write
penalty".

Second, if a read size is smaller than the physical block size, the storage
array can read a block and return only the fragment needed. But on a write,
it has to read the whole block, splice in the new data, and write back the
block - a RMW (read-modify-write) cycle.
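[Editor's sketch] The write penalty above can be put into a back-of-the-envelope model. The counts below are the textbook values for small (sub-stripe) I/Os; real arrays with write-back caches or full-stripe writes will do better, so treat this as illustrative, not as a description of any particular array.

```python
# Textbook physical-I/O counts for one small logical I/O on parity RAID.
# Assumes no cache hits and no full-stripe optimization (my simplification).

def physical_ios(op: str, parity_disks: int) -> int:
    """parity_disks: 1 for RAID 5, 2 for RAID 6."""
    if op == "read":
        return 1                      # just the data block
    # Small write: read old data + old parity block(s),
    # then write new data + new parity block(s).
    return 2 * (1 + parity_disks)

assert physical_ios("read", 2) == 1       # RAID 6 small read
assert physical_ios("write", 1) == 4      # classic RAID 5 write penalty
assert physical_ios("write", 2) == 6      # RAID 6: 3 reads + 3 writes
```

Under this crude model a RAID 6 small write costs six physical I/Os where a read costs one, which lines up with the roughly 5x latency gap being asked about.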
Re: [gpfsug-discuss] gpfsug-discuss Digest, Vol 100, Issue 32
On Fri, 29 May 2020 22:30:08 +0100, Jonathan Buzzard said:

> Ethernet goes *very* fast these days you know :-) In fact *much* faster
> than fibre channel.

Yes, but the justification, purchase, and installation of 40G or 100G
Ethernet interfaces in the machines involved, plus the routers/switches
along the way, can go very slowly indeed. So finding a way to replace 10G
Ether with 16G FC can be a win.
Re: [gpfsug-discuss] selinux context
On Fri, 22 May 2020 07:47:45 -, "Talamo Ivano Giuseppe (PSI)" said:

> After having done this on one node, the context on the directory is the
> expected one (system_u:object_r:home_root_t:s0). And everything works as
> expected (a new user logs in and his directory is created).
> But on all the other nodes of the cluster still the old context is shown
> (system_u:object_r:unlabeled_t:s0). Unless I run the restorecon on them too.
> Furthermore, since the filesystem is a remote-cluster mount, on all the nodes
> on the central (storage) cluster, the correct (home_root_t) context is shown.
> I was expecting the SElinux context to be stored in the inodes, but now the
> situation looks mixed and I'm puzzled.

I suspect the issue is that the other nodes have that inode cached already,
and they don't find out that the SELinux context has been changed. I can't
tell from here whether GPFS is failing to realize that a context change
means the old inode is stale just like any other inode change, or if there's
something else that has gone astray.
Re: [gpfsug-discuss] GPFS 5 and supported rhel OS
On Sun, 23 Feb 2020 12:20:48 +, Jonathan Buzzard said:

> > That's not *quite* so bad. As long as you trust *all* your vendors to
> > notify you when they release a patch for an issue you hadn't heard about.
>
> Er, what do you think I am paid for? Specifically it is IMHO the job of
> any systems administrator to know when any critical patch becomes
> available for any software/hardware that they are using.

You missed the point. Unless you spend your time constantly e-mailing *all*
of your vendors "Are there new patches I don't know about?", you're relying
on them to notify you when there's a known issue, and when a patch comes
out.

Redhat is good about notification. IBM is. But how about things like your
Infiniband stack? OFED? The firmware in all your devices? The BIOS/UEFI on
the servers? If you're an Intel shop, how do you get notified about security
issues in the Management Engine stuff (and there's been plenty of them)?

Do *all* of those vendors have security lists? Are you subscribed to *all*
of them? Do *all* of them actually post to those lists?
Re: [gpfsug-discuss] GPFS 5 and supported rhel OS
On Fri, 21 Feb 2020 11:04:32 +, Jonathan Buzzard said:

> > Is that 10 days from vuln disclosure, or from patch availability?
>
> Patch availability. Basically it's a response to the issue a couple of

That's not *quite* so bad. As long as you trust *all* your vendors to notify
you when they release a patch for an issue you hadn't heard about. (And that
no e-mail server along the way files it under 'spam'.)
Re: [gpfsug-discuss] GPFS 5 and supported rhel OS
On Thu, 20 Feb 2020 23:38:15 +, Jonathan Buzzard said:

> For us, it is a Scottish government mandate that all public funded
> bodies in Scotland are Cyber Essentials Plus compliant. That's 10 days
> from a critical vulnerability till your patched. No if's no buts, just
> do it.

Is that 10 days from vuln disclosure, or from patch availability?

The latter can be a headache, especially if 24-48 hours pass between when
the patch actually hits the streets and you get the e-mail, or if you have
other legal mandates that patches be tested before production deployment.

The former is simply unworkable - you *might* be able to deploy mitigations
or other work-arounds, but if it's something complicated that requires a lot
of re-work of code, you may be waiting a lot more than 10 days for a patch.
Re: [gpfsug-discuss] Encryption - checking key server health (SKLM)
On Wed, 19 Feb 2020 22:07:50 +, "Felipe Knop" said:

> Having a tool that can retrieve keys independently from mmfsd would be useful
> capability to have. Could you submit an RFE to request such function?

Note that care needs to be taken to do this in a secure manner.
Re: [gpfsug-discuss] mmbackup [--tsm-servers TSMServer[, TSMServer...]]
On Tue, 11 Feb 2020 16:44:07 -0500, Jaime Pinto said:

> # /usr/lpp/mmfs/bin/mmbackup /gpfs/fs1/home -N tapenode3-ib ‐‐tsm‐servers
> TAPENODE3,TAPENODE4 -s /dev/shm --tsm-errorlog $tmpDir/home-tsm-errorlog

I got bit by this when cut-n-pasting from IBM documentation - the problem is
that the web version has characters that *look* like the command-line hyphen
character but are actually something different.

It's the same problem as cut-n-pasting a command line where the command
*should* have the standard ascii double-quote, but the webpage has "smart
quotes" where there's different open and close quote characters. Just even
less visually obvious...
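[Editor's sketch] A pasted command can be screened for these lookalikes before it is run; anything outside printable ASCII in a command line deserves a second look. The helper below is my own illustration, using U+2010 (the Unicode HYPHEN that renders almost identically to the ASCII hyphen-minus parsers expect).

```python
# Flag non-ASCII characters hiding in a pasted command line.
import unicodedata

def suspicious_chars(cmd: str):
    """Return (position, char, Unicode name) for each non-ASCII character."""
    return [(i, c, unicodedata.name(c, "UNKNOWN"))
            for i, c in enumerate(cmd) if ord(c) > 0x7e]

# U+2010 HYPHEN looks like '-' but mmbackup's option parser won't take it.
# (Paths and server names below are placeholders, not from the thread.)
pasted = "mmbackup /gpfs/fs1 \u2010\u2010tsm\u2010servers TAPENODE3"
bad = suspicious_chars(pasted)
assert [c for _, c, _ in bad] == ["\u2010"] * 3
assert bad[0][2] == "HYPHEN"
```

Piping a suspect command through something like this (or simply `LC_ALL=C grep -P '[^\x20-\x7e]'`) catches smart quotes, non-breaking spaces, and lookalike hyphens before they cost you an afternoon.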
Re: [gpfsug-discuss] gpfs client and turning off swap
On Fri, 08 Nov 2019 10:25:24 -0600, Damir Krstic said:

> I was wondering if it's safe to turn off swap on gpfs client machines? we
> have a case where checkpointing is swapping and we would like to prevent it
> from doing so by disabling swap. However, the gpfs manual admin.

GPFS will work just fine without a swap space if the system has sufficient
RAM. However, if checkpointing is swapping, you don't have enough RAM (at
least with your current configuration), and turning off swap will result in
processes being killed due to lack of memory.

You *might* get it to work by tuning system config variables to reduce RAM
consumption. But that's better tested while you still have swap defined, and
not removing swap until you have a system not needing it.
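[Editor's sketch] Before pulling swap, it's worth checking how much of it the workload already uses. The field names below are the real /proc/meminfo keys; the parsing helper and the sample numbers are mine, for illustration only.

```python
# Parse /proc/meminfo-style "Key:   value kB" lines into integer kB values.

def parse_meminfo(text: str) -> dict:
    out = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        out[key.strip()] = int(rest.split()[0])
    return out

# Sample figures (made up) for a node that is actively swapping:
sample = """MemTotal:       131072000 kB
MemAvailable:     8388608 kB
SwapTotal:       16777216 kB
SwapFree:         4194304 kB"""

mi = parse_meminfo(sample)
swap_in_use_kb = mi["SwapTotal"] - mi["SwapFree"]
# Nonzero swap-in-use means disabling swap just shifts the pressure
# onto the OOM killer instead of fixing the shortage.
assert swap_in_use_kb == 12582912
```

On a live node the same check is just `parse_meminfo(open("/proc/meminfo").read())`.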
Re: [gpfsug-discuss] Question on CES Authentication - LDAP
On Mon, 28 Oct 2019 14:02:57 -, "Oesterlin, Robert" said:

> Any by the way, stores a plain text password in the sssd.conf file just for
> good measure!

Note that if you want the system to come up without intervention, at best
you can only store an obfuscated password, not a securely encrypted one.
Re: [gpfsug-discuss] Ganesha daemon has 400'000 open files - is this unusual?
On Tue, 24 Sep 2019 08:52:34 -, "Billich Heinrich Rainer (ID SD)" said:

> Just some addition, maybe its of interest to someone: The number of max open
> files for Ganesha is based on maxFilesToCache. Its 80% of maxFilesToCache up
> to an upper and lower limits of 2000/1M. The active setting is visible in
> /etc/sysconfig/ganesha.

Note that strictly speaking, the values in /etc/sysconfig are in general the
values that will be used at next restart - it's totally possible for the
system to boot, the then-current values be picked up from /etc/sysconfig,
and then any number of things, from configuration automation tools like
Ansible, to a cow-orker sysadmin armed with nothing but /usr/bin/vi, to have
changed the values without you knowing about it and the daemons not be
restarted yet...

(Let's just say that in 4 decades of doing this stuff, I've been surprised
by that sort of thing a few times. :)
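[Editor's sketch] The rule quoted above (80% of maxFilesToCache, clamped between a 2000 floor and a 1M ceiling) works out as below. The exact rule CES applies may vary between Spectrum Scale releases, so treat this as an illustration of the quoted behavior, not a specification.

```python
# 80% of maxFilesToCache, clamped to [2000, 1_000_000], per the quoted rule.

def ganesha_max_open_files(max_files_to_cache: int) -> int:
    return min(max(int(max_files_to_cache * 0.8), 2000), 1_000_000)

assert ganesha_max_open_files(1_000) == 2_000          # floor applies
assert ganesha_max_open_files(500_000) == 400_000      # plain 80%
assert ganesha_max_open_files(10_000_000) == 1_000_000 # ceiling applies
```

So the ~400'000 open files in the subject line corresponds to a maxFilesToCache around 500'000.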
Re: [gpfsug-discuss] slow filesystem
On Wed, 10 Jul 2019 07:22:37 -0500, Damir Krstic said:

> mmlspdisk all --not-ok does not indicate any failed hard drives.

Not a GPFS drive - if a write to /var is taking 10 seconds, that indicates
that there is likely a problem with the disk(s) that the system lives on,
and you're hitting recoverable write errors:

Jul 10 07:05:31 gssio4 mmfs: [N] Writing into file /var/mmfs/gen/LastLeaseRequestSent took 10.5 seconds
Re: [gpfsug-discuss] Renaming Linux device used by a NSD
On Tue, 18 Jun 2019 23:23:37 -, "Rob Logie" said:

> We are doing a underlying hardware change that will result in the Linux
> device file names changing for attached storage.
> Hence I need to reconfigure the NSDs to use the new Linux device names.

The only time GPFS cares about the Linux device names is when you go to
actually create an NSD. After that, it just romps through /dev, finds
anything that looks like a disk, and if it has an NSD on it at the
appropriate offset, claims it as a GPFS device.

(Protip: Since in a cluster the same disk may not have enumerated to the
same name on all NSD servers that have visibility to it, you're almost
always better off initially doing an mmcrnsd specifying only one server,
and then using mmchnsd to add the other servers to the server list for it.)

Heck, even without hardware changes, there's no guarantee that the disks
enumerate in the same order across reboots (especially if you have a
petabyte of LUNs and 8 or 16 paths to each LUN, though it's possible to tell
the multipath daemon to have stable names for the multipath devices).
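[Editor's sketch] The stable multipath names mentioned above are configured with per-WWID aliases in /etc/multipath.conf; the WWID and alias below are placeholders, not values from this thread.

```
# /etc/multipath.conf fragment -- WWID and alias are made-up examples.
# An aliased device appears as /dev/mapper/nsd_data01 on every reboot,
# regardless of the order the underlying paths enumerate in.
multipaths {
    multipath {
        wwid   3600a098038303053453f463045727631
        alias  nsd_data01
    }
}
```

After editing, reload with `multipathd reconfigure` (or restart multipathd) for the aliases to take effect.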
Re: [gpfsug-discuss] WG: Spectrum Scale with RHEL7.6 kernel 3.10.0-957.21.2
On Thu, 13 Jun 2019 15:25:16 -0400, "Felipe Knop" said:

> If SELinux is disabled (SELinux mode set to 'disabled') then the crash
> should not happen, and it should be OK to upgrade to (say) 3.10.0-957.21.2
> or stay at that level.

Note that if you have any plans to re-enable SELinux in the future, you'll
have to do a relabel, which could take a while if you have large filesystems
with tens or hundreds of millions of inodes.
Re: [gpfsug-discuss] SSDs for data - DWPD?
On Tue, 19 Mar 2019 12:10:08 -, Jonathan Buzzard said:

> I would be weary of write amplification in RAID coming to bite you in
> the ass. Just because you write 1TB of data to the file system does not
> mean the drives write 1TB of data, it could be 2TB of data.

Right, but that 2T would be across multiple drives. That's part of why write
amplification can cause problems - many RAID subsystems are unable to do the
writes in true parallel across the drives involved.