Re: [gpfsug-discuss] Migrate/syncronize data from Isilon to Scale over NFS?

2020-11-18 Thread Valdis Klētnieks
On Wed, 18 Nov 2020 11:48:52 +, Jonathan Buzzard said:

> So what do I mean by "wacky" characters. Well remember a file name can
> have just about anything in it on Linux with the exception of '/', and

You want to see some fireworks?  At least at one time, it was possible to use
a file system debugger that's all too trusting of hexadecimal input and create
a directory entry of '../'. Let's just say that fs/namei.c was also far too
trusting, and fsck was more than happy to make *different* errors than the
kernel was...

> The obvious ones are spaces, but it's not just ASCII 0x20, but tabs too.
> Then there is the use of the wildcard characters, especially '?' but
> also '*'.

Don't forget ESC, CR, LF, backticks, forward ticks, semicolons, and pretty much
anything else that will give a shell indigestion. SQL isn't the only thing
prone to injection attacks... :)
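
For anyone scripting the actual migration, here's a minimal sketch of the
defensive habit this implies (nothing Isilon- or Scale-specific; the file name
is made up): never splice untrusted names into a shell string - quote them,
or better yet skip the shell entirely.

    import pathlib
    import shlex
    import subprocess

    # A file name chosen to upset a naive shell pipeline (made-up example).
    nasty = "movie night?; echo oops `date`.mkv"
    pathlib.Path(nasty).touch()

    # Bad: interpolating the name into a shell command - the '?', ';' and
    # backticks all get interpreted by /bin/sh:
    #     subprocess.run(f"ls -l {nasty}", shell=True)

    # Better: quote it if a shell is unavoidable.
    subprocess.run("ls -l -- " + shlex.quote(nasty), shell=True, check=True)

    # Best: pass an argument vector and bypass the shell entirely.
    subprocess.run(["ls", "-l", "--", nasty], check=True)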





Re: [gpfsug-discuss] Services on DSS/ESS nodes

2020-10-06 Thread Valdis Klētnieks
On Sat, 03 Oct 2020 10:55:05 -, "Andrew Beattie" said:

> Why do you need to run any kind of monitoring client on an IO server the
> GUI / performance monitor already does all of that work for you and
> collects the data on the dedicated EMS server.

Does *ALL* that work for me?

Will it toss you an alert if your sshd goes away, or if somebody's tossing
packets that iptables is blocking for good reasons, or any of the many other
things that a competent sysadmin wants to be alerted on that aren't GPFS, but
which are things that Nagios and Zabbix and similar tools were invented
to track?
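
For what it's worth, that kind of check is cheap to bolt on next to whatever
the EMS collects. A minimal, hypothetical Nagios/Zabbix-style probe (plain
Python, nothing ESS-specific; host/port/timeout are assumptions to adjust)
that goes critical if sshd stops answering:

    #!/usr/bin/env python3
    """Minimal Nagios-style check: is sshd still answering on port 22?"""
    import socket
    import sys

    HOST, PORT, TIMEOUT = "127.0.0.1", 22, 5   # adjust for your environment

    try:
        with socket.create_connection((HOST, PORT), timeout=TIMEOUT) as s:
            banner = s.recv(64).decode(errors="replace").strip()
    except OSError as err:
        print(f"CRITICAL: no answer on {HOST}:{PORT} ({err})")
        sys.exit(2)                            # Nagios exit code for CRITICAL

    if banner.startswith("SSH-"):
        print(f"OK: {banner}")
        sys.exit(0)
    print(f"WARNING: port open but unexpected banner: {banner!r}")
    sys.exit(1)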






Re: [gpfsug-discuss] Client Latency and High NSD Server Load Average

2020-06-05 Thread Valdis Klētnieks
On Fri, 05 Jun 2020 14:24:27 -, "Saula, Oluwasijibomi" said:

> But with the RAID 6 writing costs Valdis explained, it now makes sense why
> the write IO was badly affected...

> Action [1,2,3,4,A] : The only valid responses are characters from this set: 
> [1, 2, 3, 4, A]
> Action [1,2,3,4,A] : The only valid responses are characters from this set: 
> [1, 2, 3, 4, A]
> Action [1,2,3,4,A] : The only valid responses are characters from this set: 
> [1, 2, 3, 4, A]
> Action [1,2,3,4,A] : The only valid responses are characters from this set: 
> [1, 2, 3, 4, A]

And a read-modify-write on each one... Ouch.

Stuff like that is why making sure program output goes to /var or another
local file system is usually a good thing.

I seem to remember us getting bit by a similar misbehavior in TSM, but I don't
know the details because I was busier with GPFS and LTFS/EE than TSM. Though I
have to wonder how TSM could be a decades-old product and still have
misbehaviors in basic things like failed reads on input prompts...





Re: [gpfsug-discuss] Client Latency and High NSD Server Load Average

2020-06-04 Thread Valdis Klētnieks
On Thu, 04 Jun 2020 15:33:18 -, "Saula, Oluwasijibomi" said:

> However, I still can't understand why write IO operations are 5x more latent
> than read operations to the same class of disks.

Two things that may be biting you:

First, the "RAID write penalty".  On a RAID 5 or 6 LUN, a small read usually
costs a single physical I/O.  A small write, on the other hand, has to read the
old data and old parity block(s), compute the new parity, and then write back
both the new data and the new parity block(s) - roughly four physical I/Os on
RAID 5, six on RAID 6.

Second, if a read is smaller than the physical block size, the storage array
can read the block and return only the fragment needed.  But on a write, it
has to read the whole block, splice in the new data, and write the block back
- an RMW (read-modify-write) cycle.
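
To put rough numbers on it (a back-of-the-envelope sketch of the classic
small-write path, not a model of any particular array):

    # Back-of-the-envelope physical I/O counts for small (sub-stripe) I/O.
    # Assumes the classic read-modify-write path; arrays with write-back
    # caches, full-stripe writes, etc. will do better than this.
    def small_io_cost(parity_blocks: int) -> dict:
        reads_per_read = 1                       # just the data block
        reads_per_write = 1 + parity_blocks      # old data + old parity
        writes_per_write = 1 + parity_blocks     # new data + new parity
        return {"read": reads_per_read,
                "write": reads_per_write + writes_per_write}

    for level, parity in (("RAID 5", 1), ("RAID 6", 2)):
        cost = small_io_cost(parity)
        print(f"{level}: {cost['read']} physical I/O per small read, "
              f"{cost['write']} per small write "
              f"({cost['write'] // cost['read']}x)")
    # RAID 5: 1 vs 4 (4x); RAID 6: 1 vs 6 (6x) - the right ballpark for
    # writes showing several times the latency of reads on the same disks.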




Re: [gpfsug-discuss] gpfsug-discuss Digest, Vol 100, Issue 32

2020-05-31 Thread Valdis Klētnieks
On Fri, 29 May 2020 22:30:08 +0100, Jonathan Buzzard said:
> Ethernet goes *very* fast these days you know :-) In fact *much* faster
> than fibre channel.

Yes, but the justification, purchase, and installation of 40G or 100G Ethernet
interfaces in the machines involved, plus the routers/switches along the way,
can go very slowly indeed.

So finding a way to replace 10G Ether with 16G FC can be a win.





Re: [gpfsug-discuss] selinux context

2020-05-24 Thread Valdis Klētnieks
On Fri, 22 May 2020 07:47:45 -, "Talamo Ivano Giuseppe (PSI)" said:
> After having done this on one node, the context on the directory is the
> expected one (system_u:object_r:home_root_t:s0). And everything works as
> expected (a new user logs in and his directory is created).
> But on all the other nodes of the cluster the old context is still shown
> (system_u:object_r:unlabeled_t:s0), unless I run restorecon on them too.

> Furthermore, since the filesystem is a remote-cluster mount, on all the nodes
> of the central (storage) cluster the correct (home_root_t) context is shown.

> I was expecting the SELinux context to be stored in the inodes, but now the
> situation looks mixed and I'm puzzled.

I suspect the issue is that the other nodes have that inode cached already, and
they don't find out that the SELinux context has been changed.  I can't tell
from here whether GPFS is failing to realize that a context change means the
old inode is stale just like any other inode change, or if there's something
else that has gone astray.
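
One way to see what each node actually reports (a minimal sketch;
"security.selinux" is the standard Linux xattr that holds the label, nothing
GPFS-specific, and the path is just a hypothetical example):

    import os

    path = "/gpfs/home"   # hypothetical directory from the discussion

    # The SELinux label lives in the inode's "security.selinux" xattr; this
    # prints whatever the local node currently returns for it.
    ctx = os.getxattr(path, "security.selinux").rstrip(b"\x00").decode()
    print(f"{path}: {ctx}")

    # Running this on each client node and on the owning (storage) cluster
    # shows whether a stale value is coming from a cached inode rather than
    # from what is actually stored on disk.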





Re: [gpfsug-discuss] GPFS 5 and supported rhel OS

2020-02-23 Thread Valdis Klētnieks
On Sun, 23 Feb 2020 12:20:48 +, Jonathan Buzzard said:

> > That's not *quite* so bad.  As long as you trust *all* your vendors to
> > notify you when they release a patch for an issue you hadn't heard about.

> Er, what do you think I am paid for? Specifically it is IMHO the job of
> any systems administrator to know when any critical patch becomes
> available for any software/hardware that they are using.

You missed the point.

Unless you spend your time constantly e-mailing *all* of your vendors
"Are there new patches I don't know about?", you're relying on them to
notify you when there's a known issue, and when a patch comes out.

Red Hat is good about notification.  So is IBM.

But how about things like your Infiniband stack?  OFED? The firmware in all
your devices? The BIOS/UEFI on the servers? If you're an Intel shop, how do you
get notified about security issues in the Management Engine stuff (and there's
been plenty of them). Do *all* of those vendors have security lists? Are you
subscribed to *all* of them? Do *all* of them actually post to those lists?






Re: [gpfsug-discuss] GPFS 5 and supported rhel OS

2020-02-22 Thread Valdis Klētnieks
On Fri, 21 Feb 2020 11:04:32 +, Jonathan Buzzard said:

> > Is that 10 days from vuln disclosure, or from patch availability?
> >
>
> Patch availability. Basically it's a response to the issue a couple of

That's not *quite* so bad.  As long as you trust *all* your vendors to notify
you when they release a patch for an issue you hadn't heard about.

(And that no e-mail server along the way files it under 'spam'.)




Re: [gpfsug-discuss] GPFS 5 and supported rhel OS

2020-02-20 Thread Valdis Klētnieks
On Thu, 20 Feb 2020 23:38:15 +, Jonathan Buzzard said:
> For us, it is a Scottish government mandate that all publicly funded
> bodies in Scotland are Cyber Essentials Plus compliant. That's 10 days
> from a critical vulnerability till you're patched. No ifs, no buts, just
> do it.

Is that 10 days from vuln disclosure, or from patch availability?

The latter can be a headache, especially if 24-48 hours pass between when the
patch actually hits the streets and you get the e-mail, or if you have other
legal mandates that patches be tested before production deployment.

The former is simply unworkable - you *might* be able to deploy mitigations
or other work-arounds, but if it's something complicated that requires a lot
of re-work of code, you may be waiting a lot more than 10 days for a patch...






Re: [gpfsug-discuss] Encryption - checking key server health (SKLM)

2020-02-20 Thread Valdis Klētnieks
On Wed, 19 Feb 2020 22:07:50 +, "Felipe Knop" said:

> Having a tool that can retrieve keys independently from mmfsd would be a
> useful capability to have. Could you submit an RFE to request such a
> function?

Note that care needs to be taken to do this in a secure manner.




Re: [gpfsug-discuss] mmbackup [--tsm-servers TSMServer[, TSMServer...]]

2020-02-14 Thread Valdis Klētnieks
On Tue, 11 Feb 2020 16:44:07 -0500, Jaime Pinto said:

> # /usr/lpp/mmfs/bin/mmbackup /gpfs/fs1/home -N tapenode3-ib ‐‐tsm‐servers 
> TAPENODE3,TAPENODE4 -s /dev/shm --tsm-errorlog $tmpDir/home-tsm-errorlog 

I got bit by this when cut-n-pasting from IBM documentation - the problem is
that the web version has characters that *look* like the command-line hyphen
character but are actually something different.

It's the same problem as cut-n-pasting a command line where the command
*should* have the standard ASCII double-quote, but the webpage has "smart
quotes" with different open and close quote characters.  Just even less
visually obvious...
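
A quick way to catch this before a pasted command ever runs - a small sketch,
and the character list is just the usual suspects, not exhaustive:

    import unicodedata

    SUSPECTS = "\u2010\u2011\u2012\u2013\u2014\u2212"   # hyphen/dash lookalikes
    SUSPECTS += "\u2018\u2019\u201c\u201d"              # "smart" quotes

    def flag_lookalikes(cmdline: str) -> None:
        """Report characters that look like '-' or quotes but aren't ASCII."""
        for col, ch in enumerate(cmdline, start=1):
            if ch in SUSPECTS or (not ch.isascii() and not ch.isspace()):
                print(f"col {col}: {ch!r} = {unicodedata.name(ch, 'UNKNOWN')}")

    # A pasted command with U+2010 HYPHEN instead of ASCII '-' (made-up text):
    flag_lookalikes("mmbackup /gpfs/fs1/home \u2010\u2010tsm\u2010servers TAPENODE3")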




Re: [gpfsug-discuss] gpfs client and turning off swap

2019-11-08 Thread Valdis Klētnieks
On Fri, 08 Nov 2019 10:25:24 -0600, Damir Krstic said:

> I was wondering if it's safe to turn off swap on gpfs client machines? we
> have a case where checkpointing is swapping and we would like to prevent it
> from doing so by disabling swap. However, the gpfs manual admin.

GPFS will work just fine without a swap space if the system has sufficient RAM.
However, if checkpointing is swapping, you don't have enough RAM (at least with
your current configuration), and turning off swap will result in processes being
killed due to lack of memory.

You *might* get it to work by tuning system config variables to reduce RAM
consumption.  But that's better tested while you still have swap defined;
don't remove swap until you have a system that doesn't need it.
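
Before pulling swap, it's also worth confirming how much of it is actually in
use - a minimal sketch reading /proc/meminfo (plain Linux, nothing
GPFS-specific):

    def meminfo() -> dict:
        """Parse /proc/meminfo into a dict of kB values."""
        values = {}
        with open("/proc/meminfo") as fh:
            for line in fh:
                key, rest = line.split(":", 1)
                values[key] = int(rest.split()[0])     # value in kB
        return values

    m = meminfo()
    swap_used_kb = m["SwapTotal"] - m["SwapFree"]
    print(f"MemTotal:     {m['MemTotal'] / 1048576:.1f} GiB")
    print(f"MemAvailable: {m['MemAvailable'] / 1048576:.1f} GiB")
    print(f"Swap in use:  {swap_used_kb / 1048576:.1f} GiB")
    if swap_used_kb:
        print("Swap is being used - removing it now invites the OOM killer.")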




Re: [gpfsug-discuss] Question on CES Authentication - LDAP

2019-10-28 Thread Valdis Klētnieks
On Mon, 28 Oct 2019 14:02:57 -, "Oesterlin, Robert" said:
> And by the way, it stores a plain-text password in the sssd.conf file, just
> for good measure!

Note that if you want the system to come up without intervention, at best
you can only store an obfuscated password, not a securely encrypted one.
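
Purely to illustrate the distinction - base64 here is just a stand-in for
whatever scheme a given tool actually uses (sssd's is different), and the
credential is made up - anything the machine can decode unattended at boot,
anyone who can read the same files can decode too:

    import base64

    secret = "ldap-bind-password"                  # made-up credential

    # "Obfuscation": the config file no longer contains the literal string...
    stored = base64.b64encode(secret.encode()).decode()
    print("stored in config:", stored)

    # ...but since decoding it needs no key the machine doesn't also hold,
    # anyone who can read the file can recover it in one line.
    print("recovered:       ", base64.b64decode(stored).decode())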







Re: [gpfsug-discuss] Ganesha daemon has 400'000 open files - is this unusual?

2019-09-24 Thread Valdis Klētnieks
On Tue, 24 Sep 2019 08:52:34 -, "Billich Heinrich Rainer (ID SD)" said:
> Just some addition, maybe it's of interest to someone:  The number of max
> open files for Ganesha is based on maxFilesToCache. It's 80% of
> maxFilesToCache, subject to lower and upper limits of 2000/1M. The active
> setting is visible in /etc/sysconfig/ganesha.

Note that strictly speaking, the values in /etc/sysconfig are the values that
will be used at the next restart - it's entirely possible for the system to
boot and pick up the then-current values from /etc/sysconfig, and then for any
number of things, from configuration automation tools like Ansible to a
cow-orker sysadmin armed with nothing but /usr/bin/vi, to have changed the
values without you knowing about it, with the daemons not yet restarted...

(Let's just say that in 4 decades of doing this stuff, I've been surprised by
that sort of thing a few times.  :)
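
Working from the rule quoted above (a small sketch of the stated 80% / 2000 /
1M clamp, not lifted from the Ganesha or Scale source), the effective limit
comes out as:

    def ganesha_max_open_files(max_files_to_cache: int) -> int:
        """80% of maxFilesToCache, clamped to the stated 2000 / 1M bounds."""
        return min(max(int(max_files_to_cache * 0.8), 2_000), 1_000_000)

    for mftc in (1_000, 100_000, 500_000, 2_000_000):
        print(f"maxFilesToCache={mftc:>9,} -> "
              f"max open files {ganesha_max_open_files(mftc):>9,}")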




Re: [gpfsug-discuss] slow filesystem

2019-07-10 Thread Valdis Klētnieks
On Wed, 10 Jul 2019 07:22:37 -0500, Damir Krstic said:

> mmlspdisk all --not-ok does not indicate any failed hard drives.

Not a GPFS drive, then - if a write to /var is taking 10 seconds, that indicates
there is likely a problem with the disk(s) that the system itself lives on, and
you're hitting recoverable write errors:

Jul 10 07:05:31 gssio4 mmfs: [N] Writing into file /var/mmfs/gen/LastLeaseRequestSent took 10.5 seconds





Re: [gpfsug-discuss] Renaming Linux device used by a NSD

2019-06-18 Thread Valdis Klētnieks
On Tue, 18 Jun 2019 23:23:37 -, "Rob Logie" said:

> We are doing a underlying hardware change that will result in the Linux
> device file names changing for attached storage.

> Hence I need to reconfigure the NSDs to use the new Linux device names.

The only time GPFS cares about the Linux device names is when you go to
actually create an NSD.  After that, it just romps through /dev, finds anything
that looks like a disk, and if it has an NSD on it at the appropriate offset,
claims it as a GPFS device.

(Protip: since in a cluster the same disk may not have enumerated to the same
name on all NSD servers that have visibility to it, you're almost always better
off initially doing an mmcrnsd specifying only one server, and then using
mmchnsd to add the other servers to the server list for it.)

Heck, even without hardware changes, there's no guarantee that the disks
enumerate in the same order across reboots (especially if you have a petabyte
of LUNs and 8 or 16 paths to each LUN, though it's possible to tell the
multipath daemon to have stable names for the multipath devices).




Re: [gpfsug-discuss] WG: Spectrum Scale with RHEL7.6 kernel 3.10.0-957.21.2

2019-06-13 Thread Valdis Klētnieks
On Thu, 13 Jun 2019 15:25:16 -0400, "Felipe Knop" said:

> If SELinux is disabled (SELinux mode set to  'disabled') then the crash
> should not happen, and it should be OK to upgrade to (say) 3.10.0-957.21.2
> or stay at that level.

Note that if you have any plans to re-enable SELinux in the future, you'll have
to do a relabel, which could take a while if you have large filesystems with
tens or hundreds of millions of inodes...





Re: [gpfsug-discuss] SSDs for data - DWPD?

2019-03-19 Thread Valdis Klētnieks
On Tue, 19 Mar 2019 12:10:08 -, Jonathan Buzzard said:

> I would be wary of write amplification in RAID coming to bite you in
> the ass. Just because you write 1TB of data to the file system does not
> mean the drives write 1TB of data; it could be 2TB of data.

Right, but that 2T would be across multiple drives.  That's part of
why write amplification can cause problems - many RAID subsystems
are unable to do the writes in true parallel across the drives involved.