On Fri, 08 Feb 2019 17:42:13 -0800, Imam Toufique said:
> Is there a way to set up an independent fileset so that its dependent
> filesets cannot exceed its quota limit? In other words, if my independent
> fileset quota is 2GB, I should not be allowed to set quotas for its
> dependent filesets
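For reference, the limit on the independent fileset itself would be set along
these lines (a sketch only; filesystem and fileset names are hypothetical):

  # cap the independent fileset 'proj1' on filesystem 'gpfs0' at 2 GB
  mmsetquota gpfs0:proj1 --block 2G:2G

As far as I know, GPFS does not check that quotas set on the dependent
filesets stay within the parent's limit, so that would have to be policed
administratively.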
On Sat, 12 Jan 2019 03:07:29 +, "Buterbaugh, Kevin L" said:
> But from there I need to then be able to find out where that fileset is
> mounted in the directory tree so that I can see who the owner and group of
> that
> directory are.
You're not able to leverage a local naming scheme?
On Tue, 20 Nov 2018 15:01:36 +, Andreas Mattsson said:
> On one of our clusters, from time to time if users try to access files or
> folders via the direct full path over NFS, the NFS-client gets invalid
> information from the server.
>
> For instance, if I run "ls
On Tue, 30 Oct 2018 22:52:35 -, Bryan Banister said:
> Valdis will also recall how much "fun" we had with network related corruption
> due to what we surmised was a TCP offload engine FW defect in a certain 10GbE
> HCA. Only happened sporadically every few weeks... what a nightmare that
>
On Mon, 15 Oct 2018 18:34:50 -0400, "Kumaran Rajaram" said:
> 1. >>When writing to GPFS directly I'm able to write ~1800 files / second in
> a test setup.
> >>This is roughly the same on the protocol nodes (NSD client), as well as
> on the ESS IO nodes (NSD server).
>
> 2. >> When writing to
On Wed, 10 Oct 2018 10:24:58 +0200, "Markus Rohwedder" said:
> Hello Simon,
>
> not sure if the answer solved your question from the response,
>
> Even if nodes can be externally resolved by unique hostnames, applications
> that run on the host use the /bin/hostname binary or the hostname()
On Sat, 29 Sep 2018 11:23:08 -0400, "Marc A Kaplan" said:
> This may be a bug and/or a peculiarity of the SQL type system. A proper
> investigation and full explanation will take more time than I have right
> now.
>
> In the meanwhile please try forcing the computation/arithmetic to use
>
OK, so we're running LTFS/EE for archiving to tape, and currently we migrate
purely on "largest file first". I'm trying to cook up something that does
"largest LRU first" (basically, large unaccessed files get moved earlier than
large recently-used files).
So I need to do some testing for what
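The sort of rule I have in mind is roughly this (an untested sketch; the pool
names are placeholders, and the EXTERNAL POOL definition for LTFS/EE lives
elsewhere in the policy file):

  RULE 'lru_migrate' MIGRATE FROM POOL 'system'
       WEIGHT(FILE_SIZE * (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)))
       TO POOL 'ltfsee'
       WHERE FILE_SIZE > 1048576

i.e. weight by size times days-since-last-access, so a big stale file sorts
ahead of an equally big file somebody touched yesterday.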
On Mon, 10 Sep 2018 15:49:36 -0400, "Frederick Stock" said:
> My guess is that the "metadata" IO is either for directory data, since
> directories are considered metadata, or fileset metadata.
Plus things like free block lists, etc...
On Wed, 22 Aug 2018 17:12:24 -, "Oesterlin, Robert" said:
> Sometimes, I look at the data that's being stored in my file systems and just
> shake my head:
>
> /gpfs//Restricted/EventChangeLogs/deduped/working contains
> 17,967,350 files (in ONE directory)
I've got 114,029 files of the
On Mon, 20 Aug 2018 14:02:05 -0400, "Frederick Stock" said:
> Note you have two additional NSDs in the 33 failure group than you do in
> the 23 failure group. You may want to change one of those NSDs in failure
> group 33 to be in failure group 23 so you have equal storage space in both
>
On Thu, 09 Aug 2018 15:11:27 -0400, Aaron Knister said:
> We recently had a node running 4.2.3.6 (efix 9billion, sorry can't
> remember the exact efix) go wonky with a logAssertFailed error that
> looked similar to the description of this APAR fixed in 4.2.3.8:
>
> - Fix an assert in
On Thu, 19 Jul 2018 22:23:06 -, "Buterbaugh, Kevin L" said:
> Is this what you're looking for (from an IBMer in response to another
> question a few weeks back)?
>
> assuming 4.2.3 code level this can be done by deleting and recreating the
> rule with changed settings:
Nope, that bring
So I'm trying to tidy up things like 'mmhealth' etc. Got most of it fixed, but
stuck on one thing...
Note: I already did a 'mmhealth node eventlog --clear -N all' yesterday, which
cleaned out a bunch of other long-past events that were "stuck" as failed /
degraded even though they were corrected
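For reference, the cleanup pass was just this (the follow-up check is
illustrative):

  mmhealth node eventlog --clear -N all    # flush the stale eventlog entries
  mmhealth node show --verbose -N all      # see what is still reported unhealthy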
On Fri, 22 Jun 2018 13:28:02 -, "Oesterlin, Robert" said:
> [root@nrg1-gpfs01 ~]# mmchmgr dataeng nrg1-gpfs05
> Sending migrate request to current manager node 10.30.43.136 (nrg1-gpfs13).
> Node 10.30.43.136 (nrg1-gpfs13) resigned as manager for dataeng.
> Node 10.30.43.136 (nrg1-gpfs13)
On Wed, 20 Jun 2018 14:08:09 -, "Grunenberg, Renar" said:
> After each test (a change of the content), the file gets a new inode number
> every time. This behavior is the reason why the shadow file (or the policy
> engine) thinks the old file no longer exists
That's because as far
On Tue, 15 May 2018 11:28:00 +0100, Jonathan Buzzard said:
> One wonders what the mmfs26/mmfslinux does that you can't achieve with
> fuse these days?
Handling each disk I/O request without several transitions to/from userspace
comes to mind...
On Fri, 11 May 2018 19:02:30 -, "Daniel Kidger" said:
> Remember too that ESS uses powerful processors in order to do the erasure
> coding and hence has performance to do checksums too. Traditionally ordinary
> NSD servers are merely "routers" and as such are often using low spec cpus
>
On Wed, 09 May 2018 15:01:55 -0400, "Marc A Kaplan" said:
> I see there are also low-power / zero-power disk archive/arrays available.
> Any experience with those?
The last time I looked at those (which was a few years ago) they were
competitive with tape for power consumption, but not on cost
On Tue, 08 May 2018 14:59:37 -, "Lloyd Dean" said:
> First it must be understood the snap is either at the filesystems or fileset,
> and more importantly is not an application level backup. This is a huge
> difference to say Protects many application integrations like exchange,
> databases,
On Thu, 03 May 2018 16:52:44 +0100, Jonathan Buzzard said:
> The test that I have used in the past for if a file is migrated with a
> high degree of accuracy is
>
> if the space allocated on the file system is less than the
> file size, and equal to the stub size then presume the file
>
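A rough shell version of that test (the path and the 32K stub size are
assumptions; use whatever stub size the migration setup actually defines):

  f=/gpfs/archive/some/file                              # hypothetical path
  size=$(stat -c %s "$f")                                # logical size in bytes
  alloc=$(( $(stat -c %b "$f") * $(stat -c %B "$f") ))   # bytes actually allocated
  stub=32768                                             # assumed stub size
  if [ "$alloc" -lt "$size" ] && [ "$alloc" -eq "$stub" ]; then
      echo "$f: presumably migrated"
  fi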
We're running GPFS 4.2.3.7 with encryption on disk, LTFS/EE 1.2.6.2 with
encryption on tape, and ISKLM 2.6.0.2 to manage the keys.
I'm in the middle of researching RHEL patches on the key servers.
Do I want to stay at 2.6.0.2, or go to a later 2.6, or jump to 2.7 or 3.0?
Not seeing a lot of
On Wed, 25 Apr 2018 01:10:52 +0200, "Uwe Falke" said:
> Instead, one of the internal pools (pool0) is used to receive files
> written in very small records, the other (pool1) is the "normal" pool and
> receives all other files.
How do you arrange that to happen? As we found out on one of our
So of course, the day after I upgrade our GPFS/LTFS cluster to the latest
releases of everything, RedHat drops about 300 new updates, including a kernel
update, and I find out that GPFS 4.2.3.8 has also escaped. :)
Any word if 4.2.3.7 or 4.2.3.8 play nice with the 3.10.0-862.el7 kernel?
On Tue, 17 Apr 2018 08:31:35 -, atmane khiredine said:
> but no location of pdisk
> state = "missing/noPath/systemDrain/noRGD/noVCD/noData"
That can't be good. That's just screaming "dead, uncabled, or removed".
> WWN = "naa.5000C50056717727"
Useful hint where to start if all else fails
On Wed, 04 Apr 2018 10:02:09 -, John Hearns said:
> Has anyone done a procedure like this?
We recently got to rename all 10 nodes in a GPFS cluster to make the
unqualified name unique (turned out that having 2 nodes called 'arnsd1.isb.mgt'
and 'arnsd1.vtc.mgt' causes all sorts of confusion).
On Wed, 14 Mar 2018 15:36:32 -, Mark Bush said:
> Is it possible (albeit not advisable) to mirror LUNs that are NSD's to
> another storage array in another site basically for DR purposes? Once it's
> mirrored to a new cluster elsewhere what would be the step to get the
> filesystem back up
On Mon, 12 Mar 2018 15:51:05 +0100, Lukas Hejtmanek said:
> I don't think like 5 or more data/metadata replicas are practical here. On the
> other hand, multiple node failures is something really expected.
Umm.. do I want to ask *why*, out of only 60 nodes, multiple node
failures are an expected
Has anybody out there done a Wireshark protocol filter for GPFS? Or know where
to find enough documentation of the on-the-wire data formats to write even a
basic one?
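Lacking a dissector, the closest I've gotten is filtering on the mmfsd daemon
port, which is 1191 unless the cluster has moved it:

  tcp.port == 1191       (Wireshark display filter)
  tcp port 1191          (equivalent capture filter)

but that only isolates the traffic, it doesn't decode it.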
On Wed, 14 Feb 2018 06:20:32 -0800, John Hanks said:
> # ls -aln /srv/gsfs0/projects/pipetest.tmp.txt $HOME/pipetest.tmp.txt
> -rw-r--r-- 1 39073 3953 530721 Feb 14 06:10 /home/griznog/pipetest.tmp.txt
> -rw-r--r-- 1 39073 3001 530721 Feb 14 06:10
> /srv/gsfs0/projects/pipetest.tmp.txt
>
> We
On Thu, 08 Feb 2018 16:25:33 +, "Oesterlin, Robert" said:
> unmountOnDiskFail
> The unmountOnDiskFail specifies how the GPFS daemon responds when a disk
> failure is detected. The valid values of this parameter are yes, no, and meta.
> The default value is no.
I suspect that the only
On Thu, 08 Feb 2018 10:33:13 -0500, "Marc A Kaplan" said:
> Please clarify and elaborate. When you write "a full backup ... takes
> 60 days" - that seems very poor indeed.
> BUT you haven't stated how much data is being copied to what kind of
> backup media nor how much equipment or what
On Wed, 24 Jan 2018 11:36:55 -0500, "Frederick Stock" said:
> Thankfully the issue of Ganesha being restarted across the cluster when an
> export is changed has been addressed in the 5.0 release of Spectrum Scale.
Thanks for the info. Now all I need is a version of LTFS/EE that supports 5.0.
On Tue, 23 Jan 2018 20:39:52 -0500, Harold Morales said:
> - site A simple three node cluster configured (AIX nodes). two failure
> groups. two filesystems. No quota, no ILM.
> - Site B the same cluster config as in site A configured (AIX nodes) same
> hdisk device names used as in site A for
On Wed, 24 Jan 2018 16:13:01 +, "Sobey, Richard A" said:
> Gosh.. Seriously? I need downtime to enable NFS?
Wait till you get to the part where 'mmnfs export change /foo/whatever --nfsadd
(whatever)'
bounces your nfs-ganesha services on all protocol nodes - at the same time.
On Wed, 17 Jan 2018 13:48:14 +, Jonathan Buzzard said:
> The mind boggles how you manage to get a backspace character into a
> file name.
You can get yourself into all sorts of trouble with stty - in particular, if
you're
ssh'ed from one system to another, and they disagree on whether the
On Tue, 16 Jan 2018 17:25:47 +, Jonathan Buzzard said:
> User comes with problem, you investigate find problem is due to "wacky"
> characters point them to the mandatory training documentation, tell
> them they need to rename their files to something sane and take no
> further action. Sure
On Thu, 21 Dec 2017 16:38:27 +, Sven Oehme said:
> size. so if you now go to a 16MB blocksize and you have just 50 iops @ 2MB
> each you can write ~800 MB/sec with the exact same setup and same size
> small writes, that's a factor of 8 .
That's assuming your metadata storage is able to
Currently, the IBM support matrix says:
https://www.ibm.com/support/knowledgecenter/STXKQY/gpfsclustersfaq.html#linux
that 4.2.3.5 is supported on RHEL 7.4, but with a footnote:
"AFM, Integrated Protocols, and Installation Toolkit are not supported on RHEL
7.4."
We don't use AFM or the
On Mon, 04 Dec 2017 17:08:19 +, "Simon Thompson (IT Research Support)" said:
> Have you looked at using filesets instead an using fileset quotas to achieve
> this?
Note that fileset quotas aren't able to represent "No Storage Allowed"
either
On Mon, 04 Dec 2017 08:46:31 -0500, Stephen Ulmer said:
> As described, your case is about not wanting userA to be able to write to a
> fileset if userA isn't in some groups. Don't put them in those groups. That's
> not even Spectrum Scale specific, it's about generic *nix permissions.
On Fri, 17 Nov 2017 14:39:47 +0100, matthias.kni...@rohde-schwarz.com said:
> anyone know in which package I can find the gpfs vfs module? Currently I
> am working with gpfs 4.2.3.0 and Samba 4.4.4. Normally the samba package
> provides the vfs module. I updated Samba to 4.6.2 but the
On Wed, 01 Nov 2017 15:54:04 -0700, John Hanks said:
> illreplicated REGULAR_FILE RESERVED Error: 28 No space left on device
Check 'df -i' to make sure no file systems are out of inodes. That's
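e.g. something like this on the affected filesystem (name hypothetical):

  df -i /srv/gsfs0     # POSIX view of inode usage
  mmdf gsfs0 -F        # GPFS view: inodes allocated vs. maximum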
On Tue, 24 Oct 2017 13:27:29 +0530, "Malahal R Naineni" said:
> If you want to change multiple existing exports, you could use
> undocumented option "--nfsnorestart" to mmnfs. This should add export
> changes to NFS configuration but it won't restart nfs-ganesha service, so
> you will not see
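If that pans out, the invocation would presumably look something like this
(the path and client spec are made up, and since the flag is undocumented it's
worth verifying against your code level):

  mmnfs export change /gpfs/fs0/export1 --nfschange "10.0.0.0/24(Access_Type=RW)" --nfsnorestart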
On Mon, 23 Oct 2017 17:26:07 +0530, "Chetan R Kulkarni" said:
> tests:
> 1. created 1st nfs export - ganesha service was restarted
> 2. created 4 more nfs exports (mmnfs export add path)
> 3. changed 2 nfs exports (mmnfs export change path --nfschange);
> 4. removed all 5 exports one by one
On Thu, 14 Sep 2017 14:55:39 -0400, "Marc A Kaplan" said:
> Read the doc again. Specify both -g and -N options on the command line to
> get fully parallel directory and inode/policy scanning.
Yeah, figured that out, with help from somebody. :)
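So the fully-parallel form ends up being something like this (the node class
and work directory are illustrative):

  mmapplypolicy /gpfs/archive -P migrate.pol -I defer \
      -N nsdnodes -g /gpfs/archive/.policytmp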
> I'm curious as to what you're trying to do with
So we have a number of very similar policy files that get applied for file
migration etc. And they vary drastically in the runtime to process, apparently
due to different selections on whether to do the work in parallel.
Running a set of rules with 'mmapplypolicy -I defer' that look like this:
So for a variety of reasons, we had accumulated some 45 tapes that
had found ways to get out of Valid status. I've cleaned up most of
them, but I'm stuck on a few corner cases.
Case 1:
% ltfsee info tapes | sort | grep -C 1 'Not Sup'
AV0186JD Valid TS1150(J5) 9022 0
The LTFSEE docs say:
https://www.ibm.com/support/knowledgecenter/en/ST9MBR_1.2.3/ltfs_ee_ltfsee_info_tapes.html
"Unusable The Unusable status indicates that the tape can't be used.
To change the status, remove the tape from the pool by using the ltfsee pool
remove command with the -r
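So presumably something along these lines (the pool name is hypothetical, and
the -p/-t flags are from memory, so check them against the ltfsee help at this
code level):

  ltfsee pool remove -p mypool -t AV0186JD -r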
On Tue, 25 Jul 2017 15:46:45 -0500, "Scott C Batchelder" said:
> - Should the number of threads equal the number of NSDs for the file
> system? or equal to the number of nodes?
Depends on what definition of "throughput" you are interested in. If your
configuration has 50 clients banging on 5 NSD
On Tue, 25 Jul 2017 10:02:14 +0100, Jonathan Buzzard said:
> I would be tempted to zip up the directories and move them ziped ;-)
Not an option, unless you want to come here and re-write the researcher's
tracking systems that knows where they archived a given run, and teach it
"Except now it's
On Mon, 24 Jul 2017 13:36:41 +0300, Ilan Schwarts said:
> Hi,
> I have gpfs with 2 Nodes (redhat).
> I am trying to create NFS share - So I would be able to mount and
> access it from another linux machine.
> While trying to create NFS (I execute the following):
> [root@LH20-GPFS1 ~]# mmnfs
On Mon, 24 Jul 2017 12:43:10 +0100, Jonathan Buzzard said:
> For an archive service how about only accepting files in actual
> "archive" formats and then severely restricting the number of files a
> user can have?
>
> By archive files I am thinking like a .zip, tar.gz, tar.bz or similar.
After
On Fri, 21 Jul 2017 22:04:32 -, Sven Oehme said:
> i talked with a few others to confirm this, but unfortunate this is a
> limitation of the code today (maybe not well documented which we will look
> into). Encryption only encrypts data blocks, it doesn't encrypt metadata.
> Hence, if
So we're running GPFS 4.2.2.3 and LTFS/EE 1.2.3 to use as an archive service.
Inode size is 4K, and we had a requirement to encrypt-at-rest, so encryption
is in play as well. Data is replicated 2x and fragment size is 32K.
I was investigating how much data-in-inode would help deal with users who
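As I understand it, a file whose data fits in the inode shows no separately
allocated blocks, so a quick spot-check (path hypothetical) is:

  ls -ls /gpfs/archive/path/to/smallfile   # blocks column reads 0 for in-inode files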
On Mon, 12 Jun 2017 20:06:09 -, "Simon Thompson (IT Research Support)" said:
> mmces node suspend -N
>
> Is what you want. This will move the address and stop it being assigned one,
> otherwise the rebalance will occur.
Yeah, I figured that part out. What I couldn't wrap my brain around was
On Tue, 06 Jun 2017 15:06:57 +0200, Stijn De Weirdt said:
> oh sure, i meant waiters that last > 300 seconds or so (something that
> could trigger deadlock). obviously we're not interested in debugging the
> short ones, it's not that gpfs doesn't work or anything ;)
At least at one time, a lot of
On Mon, 08 May 2017 12:06:22 -0400, "Jaime Pinto" said:
> Another piece og information is that as far as GPFS goes all clusters
> are configured to communicate exclusively over Infiniband, each on a
> different 10.20.x.x network, but broadcast 10.20.255.255. As far as
Have you verified that
On Wed, 26 Apr 2017 14:20:30 -, "Simon Thompson (IT Research Support)" said:
> We can't see in any of the logs WHY ganesha is going into grace. Any
> suggestions on how to debug this further? (I.e. If we can stop the grace
> issues, we can solve the problem mostly).
After over 3 decades of
On Mon, 24 Apr 2017 17:24:29 +0100, Jonathan Buzzard said:
> Hate to say but the 822 will happily keep trucking when the CPU
> (assuming it has more than one) fails and similar with the DIMM's. In
How about when you go to replace the DIMM? You able to hot-swap the memory
without anything losing
On Mon, 24 Apr 2017 14:21:09 +0200, "serv...@metamodul.com" said:
> today's hardware is so powerful that IMHO it might make sense to split a CEC
> into more "pieces". For example the IBM S822L has up to 2x12 cores and 9
> PCIe3 slots (4×16 lanes & 5×8 lanes).
We look at it the other way around:
On Thu, 20 Apr 2017 12:27:13 -, Frank Tower said:
> - some will do large I/O (e.g: store 1TB files)
> - some will read/write more than 10k files in a raw
> - other will do only sequential read
> But I wondering if some people have recommendations regarding hardware sizing
> and software
On Wed, 05 Apr 2017 16:40:30 -, "Buterbaugh, Kevin L" said:
> So, I have gone to all of the 4 clients and none of them say they have it
> mounted according to either "df" or "mount". I've gone ahead and run both
> "mmunmount" and "umount -l" on the filesystem anyway, but
On Mon, 13 Mar 2017 20:06:29 -0400, Aaron Knister said:
> After setting the sync=always parameter to not lose data in the event of
> a crash or power outage the write performance became unbearably slow
> (under 100MB/s of writes for an 8+2 RAIDZ2 if I recall correctly). I
Not at all
On Fri, 10 Mar 2017 20:43:37 +, "Buterbaugh, Kevin L" said:
> So I tried to create a filesystem:
>
> root@nec:~/gpfs# mmcrfs gpfs0 -F ~/gpfs/flash.stanza -A yes -B 1M -j scatter
> -k all -m 1 -M 3 -Q no -r 1 -R 3 -T /gpfs0
What was in flash.stanza?
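For comparison, a bare-bones NSD stanza file looks something like this (device,
NSD, and server names are made up):

  %nsd:
    device=/dev/sdb
    nsd=flash01
    servers=nec
    usage=dataAndMetadata
    failureGroup=1
    pool=system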
On Tue, 07 Mar 2017 21:17:35 +, Bryan Banister said:
> Just depends on how your problem is detected... is it in a log? Is it found
> by running a command (e.g. mm*)? Is it discovered in `ps` output? Is your
> scheduler failing jobs?
I think the problem here is that if you have a sudden
On Thu, 02 Feb 2017 18:28:22 +0100, "Olaf Weiser" said:
> but the /var/mmfs DIR is obviously damaged/empty .. whatever.. that's why you
> see a message like this..
> have you reinstalled that node / any backup/restore thing ?
The internal RAID controller died a horrid death and basically took
On Sun, 22 Jan 2017 20:10:14 -0500, Aaron Knister said:
> This is going to sound like a ridiculous request, but, is there a way to
> cause a filesystem to panic everywhere in one "swell foop"?
(...)
> I can seem to do it on a per-node basis with "mmfsadm test panic
> " but if I do that over all
On Thu, 05 Jan 2017 22:18:08 +, "Rob Basham" said:
> By way of introduction, I am TCT architect across all of IBM's storage
> products, including Spectrum Scale. There have been queries as to whether or
> not CentOS is supported with TCT Server on Spectrum Scale. It is not
> currently
>
On Wed, 04 Jan 2017 00:14:21 +0100, Jan-Frode Myklebust said:
> This looks like Spectrum Archive v1.2.1.0 (Build 10230). Newest version
> available on fixcentral is v1.2.2.0, but it doesn't support GPFS v4.2.2.x
> yet.
That's what I was afraid of. OK, shelve that option, and call IBM for the
So we have GPFS Advanced 4.2.1 installed, and the following RPMs:
% rpm -qa 'ltfs*' | sort
ltfsle-2.1.6.0-9706.x86_64
ltfsle-library-2.1.6.0-9706.x86_64
ltfsle-library-plus-2.1.6.0-9706.x86_64
ltfs-license-2.1.0-20130412_2702.x86_64
ltfs-mig-1.2.1.1-10232.x86_64
What release of "Spectrum
On Tue, 03 Jan 2017 14:27:17 -0600, Matt Weil said:
> this follows the IP to whatever node the IP lands on. The ganesha.nfsd
> process seems to stop working. Any ideas? There is nothing helpful in
> the logs.
Does it in fact "stop working", or are you just having a mount issue? Do
already
On Fri, 16 Dec 2016 23:24:34 -0500, Aaron Knister said:
> that I can then parse and map the nsd id to the nsd name. I hesitate
> calling ts* commands directly and I admit it's perhaps an irrational
> fear, but I associate the -D flag with "delete" in my head and am afraid
> that some day -D may
Is it possible to use 'ltfsee fsopt' to set stub and preview sizes
on a per-fileset basis, or is it fixed across an entire filesystem?
So as a basis for our archive solution, we're using a GPFS cluster
in a stretch configuration, with 2 sites separated by about 20ms worth
of 10G link. Each end has 2 protocol servers doing NFS and 3 NSD servers.
Identical disk arrays and LTFS/EE at both ends, and all metadata and
userdata are
On Fri, 11 Nov 2016 08:50:00 +, "Sobey, Richard A" said:
> Question: when I upgrade to the new PTF when it's available, can I install
> it
> first on just the GUI node (which happens to be the Quorum server for the
> cluster)
*the* quorum server, not "one of the quorum nodes"?
Best
On Tue, 13 Sep 2016 00:30:19 +0200, Lukas Hejtmanek said:
> I guess we could reach snapid 100,000.
It probably stores the snap ID as a 32 or 64 bit int, so 100K is peanuts.
What you *do* want to do is make the snap *name* meaningful, using
a timestamp or something to keep your sanity.
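Something as simple as this keeps the names self-describing (filesystem name
hypothetical):

  mmcrsnapshot gpfs0 $(date +%Y%m%d-%H%M%S)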
On Wed, 07 Sep 2016 17:34:07 -0400, Stephen Ulmer said:
> Hostnames can have many A records.
And quad-A records. :)
(Despite our best efforts, we're still one of the 100 biggest IPv6
deployments according to http://www.worldipv6launch.org/measurements/ -
we're sitting at 84th in traffic volume
On Wed, 07 Sep 2016 13:40:13 -0700, "Michael L Taylor" said:
> Can't be for certain this is what you're hitting but reverse DNS lookup is
> documented the KC:
> Note: All CES IPs must have an associated hostname and reverse DNS lookup
> must be configured for each. For more information, see
We're in the middle of deploying Spectrum Archive, and I've hit a
snag. We assigned some floating IP addresses, which now need to
be changed. So I look at the mmces manpage, and it looks like I need
to add the new addresses, and delete the old ones.
We're on GPFS 4.2.1.0, if that matters...
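So the plan looks something like this (the addresses below are placeholders):

  mmces address add --ces-ip 10.30.22.10,10.30.22.11
  mmces address remove --ces-ip 10.30.22.2,10.30.22.3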