Re: [gpfsug-discuss] snapshots causing filesystem quiesce

2022-02-02 Thread Jan-Frode Myklebust
Also, if snapshotting multiple filesets, it's important to group these into a single mmcrsnapshot command. Then you get a single quiesce, instead of one per fileset. i.e. do: snapname=$(date --utc +@GMT-%Y.%m.%d-%H.%M.%S) mmcrsnapshot gpfs0
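A minimal sketch of the grouped form, assuming two filesets named fset1 and fset2 in filesystem gpfs0 (the fileset names are placeholders):

    snapname=$(date --utc +@GMT-%Y.%m.%d-%H.%M.%S)
    # one command, one quiesce, one snapshot per fileset
    mmcrsnapshot gpfs0 fset1:$snapname,fset2:$snapname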

Re: [gpfsug-discuss] ESS 6.1.2.1 changes

2021-12-20 Thread Jan-Frode Myklebust
Just ran an upgrade on an EMS, and the only changes I see are these updated packages on the ems: +gpfs.docs-5.1.2-0.9.noarchMon 20 Dec 2021 11:56:43 AM CET +gpfs.ess.firmware-6.0.0-15.ppc64leMon 20 Dec 2021 11:56:42 AM CET +gpfs.msg.en_US-5.1.2-0.9.noarch

Re: [gpfsug-discuss] alternate path between ESS Servers for Datamigration

2021-12-09 Thread Jan-Frode Myklebust
I believe this should be a fully working solution. I see no problem enabling RDMA between a subset of nodes -- just disable verbsRdma on the nodes you want to use plain IP. -jf On Thu, Dec 9, 2021 at 11:04 AM Walter Sklenka wrote: > Dear spectrum scale users! > > May I ask you a design

Re: [gpfsug-discuss] mmapplypolicy slow

2021-08-03 Thread Jan-Frode Myklebust
I have also played with the -A and -a parameters with no combination that I > can find making it any better. > > > > Thanks for the feedback. > > > > *From:* gpfsug-discuss-boun...@spectrumscale.org < > gpfsug-discuss-boun...@spectrumscale.org> *On Behalf Of *Jan-Frode

Re: [gpfsug-discuss] mmapplypolicy slow

2021-08-03 Thread Jan-Frode Myklebust
So…. the advertisement says we should be able to do 1M files/s… http://files.gpfsug.org/presentations/2018/USA/SpectrumScalePolicyBP.pdf The first thing I would try is limiting which nodes are used for the processing. Maybe limit to the NSD servers (-N nodenames)? Also,

Re: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster

2021-06-17 Thread Jan-Frode Myklebust
titut > Dr. Leonardo Sala > Group Leader High Performance Computing > Deputy Section Head Science IT > Science IT > WHGA/036 > Forschungstrasse 111 > 5232 Villigen PSI > Switzerland > > Phone: +41 56 310 3369leonardo.s...@psi.chwww.psi.ch > > On 07.06.21 21:49, J

Re: [gpfsug-discuss] Using VMs as quorum / admin nodes in a GPFS infiniband cluster

2021-06-07 Thread Jan-Frode Myklebust
I’ve done this a few times. Once with IPoIB as daemon network, and then created a separate routed network on the hypervisor to bridge (?) between VM and IPoIB network. Example RHEL config where bond0 is an IP-over-IB bond on the hypervisor: To give the VMs access to the daemon network,

Re: [gpfsug-discuss] Long IO waiters and IBM Storwize V5030

2021-05-28 Thread Jan-Frode Myklebust
One thing to check: Storwize/SVC code will *always* guess wrong on prefetching for GPFS. You can see this as a much higher read data throughput on mdisks vs. on vdisks in the webui. To fix it, disable cache_prefetch with "chsystem -cache_prefetch off". This being a global setting, you
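For reference, a hedged sketch of checking and changing this from the Storwize/SVC CLI (the field name in lssystem output may vary slightly between code levels):

    # show the current setting
    lssystem | grep -i cache_prefetch
    # turn prefetching off for the whole system
    chsystem -cache_prefetch off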

Re: [gpfsug-discuss] Spectrum Scale & S3

2021-05-21 Thread Jan-Frode Myklebust
It has features for both being an Object store for other applications (running openstack swift/S3), and for migrating/tiering filesystem data to an object store like Amazon S3, IBM COS, etc... -jf fre. 21. mai 2021 kl. 10:42 skrev David Reynolds : > When we talk about supported protocols on

Re: [gpfsug-discuss] Quick delete of huge tree

2021-04-20 Thread Jan-Frode Myklebust
A couple of ideas. The KC recommends adding WEIGHT(DIRECTORY_HASH) to group deletions within a directory. Then maybe also do it as a 2-step process in the same policy run, where you delete all non-directories first, and then delete the directories in a depth-first order using
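A rough sketch of such a two-pass policy, assuming DIRECTORIES_PLUS is used so the second rule also matches directories (rule names and the depth-first weight are illustrative, not the KC's exact text):

    /* pass 1: delete everything that is not a directory, grouped per directory */
    RULE 'delfiles' DELETE
         WEIGHT(DIRECTORY_HASH)
         WHERE NOT (MISC_ATTRIBUTES LIKE '%D%')

    /* pass 2: delete the now-empty directories, deepest paths first */
    RULE 'deldirs' DELETE DIRECTORIES_PLUS
         WEIGHT(LENGTH(PATH_NAME))
         WHERE MISC_ATTRIBUTES LIKE '%D%'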

Re: [gpfsug-discuss] NFSIO metrics absent in pmcollector

2021-04-20 Thread Jan-Frode Myklebust
Have you installed the gpfs.pm-ganesha package, and do you have any active NFS exports/clients ? -jf On Tue, Apr 20, 2021 at 12:19 PM Dorigo Alvise (PSI) wrote: > Dear Community, > > > > I’ve activated CES-related metrics by simply doing: > > [root@xbl-ces-91 ~]# mmperfmon config show

Re: [gpfsug-discuss] Move data to fileset seamlessly

2021-03-22 Thread Jan-Frode Myklebust
No — all copying between filesets requires a full data copy. No simple rename. This might be worthy of an RFE, as it's a bit unexpected, and could potentially work more efficiently.. -jf man. 22. mar. 2021 kl. 10:39 skrev Ulrich Sibiller < u.sibil...@science-computing.de>: > Hello, > > we

Re: [gpfsug-discuss] dssgmkfs.mmvdisk number of NSD's

2021-02-28 Thread Jan-Frode Myklebust
I’ve tried benchmarking many vs. few vdisks per RG, and never could see any performance difference. Usually we create 1 vdisk per enclosure per RG, thinking this will allow us to grow with same size vdisks when adding additional enclosures in the future. Don’t think mmvdisk can be told to

Re: [gpfsug-discuss] policy ilm features?

2021-02-19 Thread Jan-Frode Myklebust
We just discussed this a bit internally, and I found *something* that might help... There's an mmrestripefs --inode-criteria option that can be used to identify files with these unknown-to-ILM flags set. Something like: # echo illreplicated > criteria # mmrestripefs gpfs01 -p --inode-criteria
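A sketch of the complete sequence, assuming the matching inodes should be written to /tmp/flagged_inodes (the path is a placeholder; -o names the result file for inodes whose flags match the criteria file):

    echo illreplicated > criteria
    mmrestripefs gpfs01 -p --inode-criteria criteria -o /tmp/flagged_inodes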

Re: [gpfsug-discuss] gpfsug-discuss Digest, Vol 108, Issue 18

2021-02-01 Thread Jan-Frode Myklebust
Agree.. Write a policy that takes a "mmapplypolicy -M var=val" argument, and figure out the workdays outside of the policy. Something like: # cat test.poilcy define( access_age, (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME))) /* list migrated files */ RULE EXTERNAL LIST 'oldFiles' EXEC '' RULE
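A hedged sketch of how such a policy can be driven from the shell, assuming the (truncated) rules reference a macro named AGE_LIMIT; the macro name and paths are illustrative:

    # in test.policy, e.g.:  RULE 'old' LIST 'oldFiles' WHERE access_age > AGE_LIMIT
    # compute the workday-dependent value outside the policy, then substitute it with -M
    mmapplypolicy /gpfs/fs1 -P test.policy -M AGE_LIMIT=30 -I defer -f /tmp/oldfiles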

Re: [gpfsug-discuss] Spectrum Scale 5 and Reading Compressed Data

2021-01-20 Thread Jan-Frode Myklebust
This sounds like a bug to me... (I wouldn't expect mmchattr to behave differently when run on a different node than the other file access). I would check "mmdiag --iohist verbose" during these slow reads, to see if it gives a hint at what it's doing, versus what it shows during "mmchattr". Maybe one is triggering prefetch,

Re: [gpfsug-discuss] Protocol limits

2020-12-09 Thread Jan-Frode Myklebust
My understanding of these limits is that they exist to keep the configuration files from becoming too large, which would make changing/processing them somewhat slow. For SMB shares, you might be able to limit the number of configured shares by using wildcards in the config (%U). These wildcarded

Re: [gpfsug-discuss] Mounting filesystem on top of an existing filesystem

2020-11-19 Thread Jan-Frode Myklebust
I would not mount a GPFS filesystem within a GPFS filesystem. Technically it should work, but I’d expect it to cause surprises if ever the lower filesystem experienced problems. Alone, a filesystem might recover automatically by remounting. But if there’s another filesystem mounted within, I

Re: [gpfsug-discuss] Tiny cluster quorum problem

2020-08-18 Thread Jan-Frode Myklebust
I would expect you should be able to get it back up using the routine at https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.5/com.ibm.spectrum.scale.v5r05.doc/bl1adv_failsynch.htm Maybe you just need to force remove quorum-role from the dead node ? -jf On Tue, Aug 18, 2020 at 2:16 PM

Re: [gpfsug-discuss] rsync NFS4 ACLs

2020-07-17 Thread Jan-Frode Myklebust
your local IBM Service Center in > other countries. > > The forum is informally monitored as time permits and should not be used > for priority messages to the Spectrum Scale (GPFS) team. > > [image: Inactive hide details for Jan-Frode Myklebust ---15-07-2020 > 08.44.49 PM---It

[gpfsug-discuss] rsync NFS4 ACLs

2020-07-15 Thread Jan-Frode Myklebust
It looks like the old NFS4 ACL patch for rsync is no longer needed. Starting with rsync-3.2.0 (and backported to rsync-3.1.2-9 in RHEL7), it will now copy NFS4 ACLs if we tell it to ignore the posix ACLs: rsync -X --filter '-x system.posix_acl' file-with-acl copy-with-acl

Re: [gpfsug-discuss] very low read performance in simple spectrum scale/gpfs cluster with a storage-server SAN: effect of ignorePrefetchLUNCount

2020-06-16 Thread Jan-Frode Myklebust
tir. 16. jun. 2020 kl. 15:32 skrev Giovanni Bracco : > > > I would correct MaxMBpS -- put it at something reasonable, enable > > verbsRdmaSend=yes and > > ignorePrefetchLUNCount=yes. > > Now we have set: > verbsRdmaSend yes > ignorePrefetchLUNCount yes > maxMBpS 8000 > > but the only parameter

Re: [gpfsug-discuss] very low read performance in simple spectrum scale/gpfs cluster with a storage-server SAN

2020-06-11 Thread Jan-Frode Myklebust
On Thu, Jun 11, 2020 at 9:53 AM Giovanni Bracco wrote: > > > > > You could potentially still do SRP from QDR nodes, and via NSD for your > > omnipath nodes. Going via NSD seems like a bit pointless indirection. > > not really: both clusters, the 400 OPA nodes and the 300 QDR nodes share > the

Re: [gpfsug-discuss] very low read performance in simple spectrum scale/gpfs cluster with a storage-server SAN

2020-06-05 Thread Jan-Frode Myklebust
fre. 5. jun. 2020 kl. 15:53 skrev Giovanni Bracco : > answer in the text > > On 05/06/20 14:58, Jan-Frode Myklebust wrote: > > > > Could maybe be interesting to drop the NSD servers, and let all nodes > > access the storage via srp ? > > no we can not: the pr

Re: [gpfsug-discuss] very low read performance in simple spectrum scale/gpfs cluster with a storage-server SAN

2020-06-05 Thread Jan-Frode Myklebust
Could maybe be interesting to drop the NSD servers, and let all nodes access the storage via srp ? Maybe turn off readahead, since it can cause performance degradation when GPFS reads 1 MB blocks scattered on the NSDs, so that read-ahead always reads too much. This might be the cause of the slow

Re: [gpfsug-discuss] Multi-cluster question (was Re: gpfsug-discuss Digest, Vol 100, Issue 32)

2020-05-31 Thread Jan-Frode Myklebust
No, this is a common misconception. You don’t need any NSD servers. NSD servers are only needed if you have nodes without direct block access. Remote cluster or not, disk access will be over local block device (without involving NSD servers in any way), or NSD server if local access isn’t

Re: [gpfsug-discuss] Spectrum Scale 5.0.5.0 is available on FixCentral

2020-05-26 Thread Jan-Frode Myklebust
Seeing that as a %changelog in the RPMs would be fantastic.. :-) -jf tir. 26. mai 2020 kl. 15:44 skrev Carl Zetie - ca...@us.ibm.com < ca...@us.ibm.com>: > Achim, I think the request here (lost in translation?) is for a list of > the bugs that 5.0.5.0 addresses. And we're currently looking

Re: [gpfsug-discuss] Enabling SSL/HTTPS/ on Object S3.

2020-05-07 Thread Jan-Frode Myklebust
(almost verbatim copy of my previous email — in case anybody else needs it, or has ideas for improvements :-) The way I would do this is to install "haproxy" on all these nodes, and have haproxy terminate SSL and balance incoming requests over the 3 CES-addresses. For S3 -- we only need to
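A minimal haproxy sketch of that setup, assuming the three CES addresses are 10.0.0.1-3 and that the S3/object service listens on port 8080 behind them (addresses, port, and certificate path are placeholders):

    frontend s3_ssl
        mode http
        bind *:443 ssl crt /etc/haproxy/certs/s3.pem
        default_backend ces_s3

    backend ces_s3
        mode http
        balance roundrobin
        server ces1 10.0.0.1:8080 check
        server ces2 10.0.0.2:8080 check
        server ces3 10.0.0.3:8080 check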

Re: [gpfsug-discuss] Read-only mount option for GPFS version 4.2.3.19

2020-03-04 Thread Jan-Frode Myklebust
I don’t know the answer — but as an alternative solution, have you considered splitting the read only clients out into a separate cluster. Then you could enforce the read-only setting using «mmauth grant ... -a ro». That should be supported. -jf ons. 4. mar. 2020 kl. 12:05 skrev Agostino

Re: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes

2020-02-03 Thread Jan-Frode Myklebust
> ppc64le. The Readme for 5.3.5 lists FW860.60 again, same as 5.3.4? > > > > Cheers, > > > > Heiner > > *From: * on behalf of Jan-Frode > Myklebust > *Reply to: *gpfsug main discussion list > *Date: *Thursday, 30 January 2020 at 18:00 > *To: *gpfsug

Re: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes

2020-01-30 Thread Jan-Frode Myklebust
I *think* this was a known bug in the Power firmware included with 5.3.4, and that it was fixed in the FW860.70. Something hanging/crashing in IPMI. -jf tor. 30. jan. 2020 kl. 17:10 skrev Wahl, Edward : > Interesting. We just deployed an ESS here and are running into a very > similar

Re: [gpfsug-discuss] How to join GNR nodes to a non-GNR cluster

2019-12-05 Thread Jan-Frode Myklebust
The ESS v5.2 release stream, with GPFS v4.2.3.x, is still being maintained for customers that are stuck on v4. You should probably install that on your ESS if you want to add it to your existing cluster. BTW: I think Tomer misunderstood the task a bit. It sounded like you needed to keep the

Re: [gpfsug-discuss] How to join GNR nodes to a non-GNR cluster

2019-12-04 Thread Jan-Frode Myklebust
Adding the GL2 into your existing cluster shouldn’t be any problem. You would just delete the existing cluster on the GL2, then on the EMS run something like: gssaddnode -N gssio1-hs --cluster-node netapp-node --nodetype gss --accept-license gssaddnode -N gssio2-hs --cluster-node

Re: [gpfsug-discuss] Fileheat - does work! Complete test/example provided here.

2019-09-03 Thread Jan-Frode Myklebust
me') ) > > > [root@/main/gpfs-git]$mmapplypolicy /c23 --maxdepth 1 -P > /gh/policies/fileheat.policy -I test -L 3 > ... > <1> /c23/10g RULE 'fh2' LIST 'fh' WEIGHT(0.022363) SHOW( 238 17060 1024 > +2.2363281250E-002 000000EE42A40400 60 10 makaplan.sl.cloud9.ibm.com) &g

Re: [gpfsug-discuss] Fileheat

2019-08-13 Thread Jan-Frode Myklebust
What about filesystem atime updates. We recently changed the default to «relatime». Could that maybe influence heat tracking? -jf tir. 13. aug. 2019 kl. 11:29 skrev Ulrich Sibiller < u.sibil...@science-computing.de>: > On 12.08.19 15:38, Marc A Kaplan wrote: > > My Admin guide says: > > >

Re: [gpfsug-discuss] Any guidelines for choosing vdisk size?

2019-07-01 Thread Jan-Frode Myklebust
I would mainly consider future upgrades. F.ex. do one vdisk per disk shelf per rg. F.ex. for a GL6S you would have 12 vdisks, and if you add a GL4S you would add 8 more vdisks, then each spindle of both systems should get approximately the same number of IOs. Another thing to consider is

Re: [gpfsug-discuss] rescan-scsi-bus.sh and "Local access to NSD failed with EIO, switching to access the disk remotely."

2019-06-25 Thread Jan-Frode Myklebust
I’ve had a situation recently where mmnsddiscover didn’t help, but mmshutdown/mmstartup on that node did fix it. This was with v5.0.2-3 on ppc64le. -jf tir. 25. jun. 2019 kl. 17:02 skrev Son Truong : > > Hello Renar, > > Thanks for that command, very useful and I can now see the problematic

Re: [gpfsug-discuss] [EXTERNAL] Intro, and Spectrum Archive self-service recall interface question

2019-05-21 Thread Jan-Frode Myklebust
It’s a multiple of full blocks. -jf tir. 21. mai 2019 kl. 20:06 skrev Todd Ruston : > Hi Indulis, > > Yes, thanks for the reminder. I'd come across that, and our system is > currently set to a stub size of zero (the default, I presume). I'd intended > to ask in my original query whether

Re: [gpfsug-discuss] gpfsug-discuss Digest, Vol 87, Issue 4

2019-04-03 Thread Jan-Frode Myklebust
ing, please edit your Subject line so it is more specific > than "Re: Contents of gpfsug-discuss digest..." > > > Today's Topics: > >1. Re: Adding ESS to existing Scale Cluster (Sanchez, Paul) >2. New ESS install - Network adapter down level (Oesterlin

Re: [gpfsug-discuss] New ESS install - Network adapter down level

2019-04-03 Thread Jan-Frode Myklebust
Have you tried: updatenode nodename -P gss_ofed But, is this the known issue listed in the qdg? https://www.ibm.com/support/knowledgecenter/SSYSP8_5.3.2/ess_qdg.pdf -jf ons. 3. apr. 2019 kl. 19:26 skrev Oesterlin, Robert < robert.oester...@nuance.com>: > Any insight on what command I

Re: [gpfsug-discuss] Filesystem descriptor discs for GNR

2019-03-28 Thread Jan-Frode Myklebust
There seems to be some change or a bug here.. But try usage=dataOnly pool=neverused failureGroup=xx.. and it should have the same function as long as you never place anything in this pool. -jf tor. 28. mar. 2019 kl. 18:43 skrev Luke Sudbery : > We have a 2 site Lenovo DSS-G based filesystem,
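A hedged stanza sketch of that workaround for an mmcrnsd/mmadddisk stanza file (device, server, pool name, and failure group are placeholders):

    %nsd: device=/dev/sdx
      nsd=descdisk_site3
      servers=quorumnode3
      usage=dataOnly
      pool=neverused
      failureGroup=30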

Re: [gpfsug-discuss] Getting which files are store fully in inodes

2019-03-28 Thread Jan-Frode Myklebust
I've been looking for a good way of listing this as well. Could you please share your policy ? -jf On Thu, Mar 28, 2019 at 1:52 PM Dorigo Alvise (PSI) wrote: > Hello, > to get the list (and size) of files that fit into inodes what I do, using > a policy, is listing "online" (not evicted)

Re: [gpfsug-discuss] Querying size of snapshots

2019-01-29 Thread Jan-Frode Myklebust
You could put snapshot data in a separate storage pool. Then it should be visible how much space it occupies, but it’s a bit hard to see how this will be usable/manageable.. -jf tir. 29. jan. 2019 kl. 20:08 skrev Christopher Black : > Thanks for the quick and detailed reply! I had read the

Re: [gpfsug-discuss] Anybody running GPFS over iSCSI?

2018-12-16 Thread Jan-Frode Myklebust
I’d be curious to hear if all these arguments against iSCSI shouldn’t also apply to NSD protocol over TCP/IP? -jf man. 17. des. 2018 kl. 01:22 skrev Jonathan Buzzard < jonathan.buzz...@strath.ac.uk>: > On 13/12/2018 20:54, Buterbaugh, Kevin L wrote: > > [SNIP] > > > > > Two things that I am

Re: [gpfsug-discuss] Anybody running GPFS over iSCSI? -

2018-12-16 Thread Jan-Frode Myklebust
I have been running GPFS over iSCSI, and know of customers who are also. Probably not in the most demanding environments, but from my experience iSCSI works perfectly fine as long as you have a stable network. Having a dedicated (simple) storage network for iSCSI is probably a good idea (just like

Re: [gpfsug-discuss] gpfs mount point not visible in snmp hrStorageTable

2018-11-07 Thread Jan-Frode Myklebust
/mibgroup/hardware/fsys/mnttypes.h @@ -121,6 +121,9 @@ #ifndef MNTTYPE_GFS2 #define MNTTYPE_GFS2 "gfs2" #endif +#ifndef MNTTYPE_GPFS +#define MNTTYPE_GPFS "gpfs" +#endif #ifndef MNTTYPE_XFS #define MNTTYPE_XFS "xfs" #endif On Wed, Nov 7, 2018 at 12:20

Re: [gpfsug-discuss] gpfs mount point not visible in snmp hrStorageTable

2018-11-07 Thread Jan-Frode Myklebust
Looking at the CHANGELOG for net-snmp, it seems it needs to know about each filesystem it's going to support, and I see no GPFS/mmfs. It has entries like: - Added simfs (OpenVZ filesystem) to hrStorageTable and hrFSTable. - Added CVFS (CentraVision File System) to hrStorageTable and

Re: [gpfsug-discuss] Preliminary conclusion: single client, single thread, small files - native Scale vs NFS

2018-10-17 Thread Jan-Frode Myklebust
Also beware there are 2 different linux NFS "async" settings. A client side setting (mount -o async), which still causes a sync on file close() -- and a server (knfs) side setting (/etc/exports) that violates NFS protocol and returns requests before data has hit stable storage. -jf On Wed, Oct
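A small illustration of the two knobs (paths and export options are examples only):

    # client side: async mount option; dirty data is still flushed on close()
    mount -t nfs -o async,vers=3 nfsserver:/export /mnt/data

    # server side, knfsd /etc/exports: replies before data is on stable storage,
    # which breaks NFS semantics but is much faster for small-file workloads
    /export  *(rw,async,no_root_squash)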

Re: [gpfsug-discuss] Preliminary conclusion: single client, single thread, small files - native Scale vs NFS

2018-10-17 Thread Jan-Frode Myklebust
nc on each file.. > > you'll never outperform e.g. 128 (maybe slower), but, parallel threads > (running write-behind) <---> with one single but fast threads, > > so as Alex suggest.. if possible.. take gpfs client of kNFS for those > types of workloads.. > > >

Re: [gpfsug-discuss] Preliminary conclusion: single client, single thread, small files - native Scale vs NFS

2018-10-17 Thread Jan-Frode Myklebust
Do you know if the slow throughput is caused by the network/nfs-protocol layer, or does it help to use faster storage (ssd)? If on storage, have you considered if HAWC can help? I’m thinking about adding an SSD pool as a first tier to hold the active dataset for a similar setup, but that’s mainly

Re: [gpfsug-discuss] replicating ACLs across GPFS's?

2018-09-25 Thread Jan-Frode Myklebust
Not sure if it's a better or worse idea, but I believe robocopy supports syncing just the ACLs, so if you do SMB mounts from both sides, that might be an option. -jf tir. 25. sep. 2018 kl. 20:05 skrev Bryan Banister : > I have to correct myself, looks like using nfs4_getacl, nfs4_setfacl, >

Re: [gpfsug-discuss] Metadata with GNR code

2018-09-21 Thread Jan-Frode Myklebust
That reminds me of a point Sven made when I was trying to optimize mdtest results with metadata on FlashSystem... He sent me the following: -- started at 11/15/2015 15:20:39 -- mdtest-1.9.3 was launched with 138 total task(s) on 23 node(s) Command line used:

[gpfsug-discuss] mmapplypolicy --choice-algorithm fast

2018-05-28 Thread Jan-Frode Myklebust
Just found the Spectrum Scale policy "best practices" presentation from the latest UG: http://files.gpfsug.org/presentations/2018/USA/SpectrumScalePolicyBP.pdf which mentions: "mmapplypolicy … --choice-algorithm fast && ... WEIGHT(0) … (avoids final sort of all selected files by weight)" and
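A sketch of how those two pieces fit together in practice (filesystem, policy file, and list names are placeholders):

    # list.policy: WEIGHT(0) avoids the final sort of all selected files
    RULE EXTERNAL LIST 'all' EXEC ''
    RULE 'listall' LIST 'all' WEIGHT(0)

    mmapplypolicy /gpfs/fs1 -P list.policy --choice-algorithm fast -I defer -f /tmp/filelist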

Re: [gpfsug-discuss] Recharging where HSM is used

2018-05-03 Thread Jan-Frode Myklebust
Since I'm pretty proud of my awk one-liner, and maybe it's useful for this kind of charging, here's how to sum up how much data each user has in the filesystem (without regard to whether the data blocks are offline, online, replicated or compressed): # cat full-file-list.policy RULE EXTERNAL LIST
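A hedged reconstruction of the idea; the SHOW fields and the awk column numbers depend on the exact policy, so treat this as a sketch rather than the original one-liner:

    # full-file-list.policy
    RULE EXTERNAL LIST 'allfiles' EXEC ''
    RULE 'all' LIST 'allfiles' SHOW(VARCHAR(USER_ID) || ' ' || VARCHAR(FILE_SIZE))

    mmapplypolicy /gpfs/fs1 -P full-file-list.policy -I defer -f /tmp/flist
    # list records look roughly like: inode gen snapid <SHOW fields> -- path
    awk '{sum[$4] += $5} END {for (u in sum) printf "%s %.1f GiB\n", u, sum[u]/2^30}' \
        /tmp/flist.list.allfiles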

Re: [gpfsug-discuss] GPFS autoload - wait for IB portstobecomeactive

2018-04-27 Thread Jan-Frode Myklebust
___ > Fred Stock | IBM Pittsburgh Lab | 720-430-8821 > sto...@us.ibm.com > > > > From:Jan-Frode Myklebust <janfr...@tanso.net> > To:gpfsug main discussion list <gpfsug-discuss@spectrumscale.org> > Date:03/16/2018 04:30 AM > >

Re: [gpfsug-discuss] gpfsug-discuss Digest, Vol 75, Issue 34

2018-04-22 Thread Jan-Frode Myklebust
regards > > Ray Coetzee > Mob: +44 759 704 7060 > > Skype: ray.coetzee > > Email: coetzee@gmail.com > > > On Mon, Apr 23, 2018 at 12:02 AM, Jan-Frode Myklebust <janfr...@tanso.net> > wrote: > >> >> Yes, I've been struggelig with something s

Re: [gpfsug-discuss] gpfsug-discuss Digest, Vol 75, Issue 34

2018-04-22 Thread Jan-Frode Myklebust
Yes, I've been struggling with something similar this week. Ganesha dying with SIGABRT -- nothing else logged. After catching a few coredumps, it has been identified as a problem with some UDP communication during mounts from Solaris clients. Disabling UDP as transport on the shares server-side

Re: [gpfsug-discuss] GPFS autoload - wait for IB ports tobecomeactive

2018-03-16 Thread Jan-Frode Myklebust
ser.target.wants/NetworkManager-wait-online. > service' > > in many cases .. it helps .. > > > > > > From:Jan-Frode Myklebust <janfr...@tanso.net> > To:gpfsug main discussion list <gpfsug-discuss@spectrumscale.org> > Date:03/15/2018

Re: [gpfsug-discuss] GPFS autoload - wait for IB ports to becomeactive

2018-03-15 Thread Jan-Frode Myklebust
I found some discussion on this at https://www.ibm.com/developerworks/community/forums/html/threadTopic?id=----14471957=25 and there it's claimed that none of the callback events are early enough to resolve this. That we need a pre-preStartup trigger. Any idea if this has

[gpfsug-discuss] AFM and RHEL 7.4

2018-01-25 Thread Jan-Frode Myklebust
The FAQ has a note stating: 1. AFM, Asynch Disaster Recovery with AFM, Integrated Protocols, and Installation Toolkit are not supported on RHEL 7.4. Could someone please clarify this sentence ? It can't be right that none of these features are supported with RHEL 7.4, or .. ? -jf

Re: [gpfsug-discuss] storage-based replication for Spectrum Scale

2018-01-23 Thread Jan-Frode Myklebust
Have you seen https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.0/com.ibm.spectrum.scale.v4r2.adv.doc/bl1adv_dr.htm ? Seems to cover what you’re looking for.. -jf ons. 24. jan. 2018 kl. 07:33 skrev Harold Morales : > Thanks for answering. > > Essentially, the

Re: [gpfsug-discuss] pool block allocation algorithm

2018-01-13 Thread Jan-Frode Myklebust
Don’t have documentation/whitepaper, but as I recall, it will first allocate round-robin over failureGroup, then round-robin over nsdServers, and then round-robin over volumes. So if these new NSDs are defined as different failureGroup from the old disks, that might explain it.. -jf lør. 13.

Re: [gpfsug-discuss] ESS bring up the GPFS in recovery group without takeover

2017-12-22 Thread Jan-Frode Myklebust
Can’t you just reverse the mmchrecoverygroup --servers order, before starting the io-server? -jf fre. 22. des. 2017 kl. 18:45 skrev Damir Krstic : > It's been a very frustrating couple of months with our 2 ESS systems. IBM > tells us we had blueflame bug and they came on
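For reference, a sketch of that reversal, assuming a recovery group named rgL whose current server order is io1,io2 (names are placeholders):

    # make io2 the primary server for rgL before bringing io1's daemon back up
    mmchrecoverygroup rgL --servers io2,io1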

Re: [gpfsug-discuss] Online data migration tool

2017-12-01 Thread Jan-Frode Myklebust
Bill, could you say something about what the metadata-storage here was? ESS/NL-SAS/3way replication? I just asked about this in the internal slack channel #scale-help today.. -jf fre. 1. des. 2017 kl. 13:44 skrev Bill Hartner : > > "It has a significant performance

Re: [gpfsug-discuss] Backing up GPFS config

2017-11-14 Thread Jan-Frode Myklebust
Please see https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/Back%20Up%20GPFS%20Configuration But also check «mmcesdr primary backup». I don't remember if it included all of mmbackupconfig/mmccr, but I think it did, and it also

Re: [gpfsug-discuss] Experience with CES NFS export management

2017-10-23 Thread Jan-Frode Myklebust
You can lower LEASE_LIFETIME and GRACE_PERIOD to shorten the time it's in grace, to make it more bearable. Making export changes dynamic is something that's fixed in newer versions of nfs-ganesha than what's shipped with Scale: https://github.com/nfs-ganesha/nfs-ganesha/releases/tag/V2.4.0:
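A hedged example of lowering those two settings with the CES NFS tooling (the values are illustrative, and the attribute names exposed by mmnfs can differ by Scale level; check the current values first):

    mmnfs config list | grep -Ei 'lease|grace'
    mmnfs config change "LEASE_LIFETIME=20"
    mmnfs config change "GRACE_PERIOD=20"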

Re: [gpfsug-discuss] how gpfs work when disk fail

2017-10-09 Thread Jan-Frode Myklebust
You don't have room to write 180GB of file data, only ~100GB. When you write f.ex. 90 GB of file data, each filesystem block will get one copy written to each of your disks, occupying 180 GB of total disk space. So you can always read it from the other disks if one should fail. This is

Re: [gpfsug-discuss] get free space in GSS

2017-07-09 Thread Jan-Frode Myklebust
You had it here: [root@server ~]# mmlsrecoverygroup BB1RGL -L declustered recovery group arrays vdisks pdisks format version - --- -- -- -- BB1RGL 3 18 119 4.2.0.1 declustered needs replace scrub background activity array service vdisks pdisks

Re: [gpfsug-discuss] 'mmces address move' weirdness?

2017-06-12 Thread Jan-Frode Myklebust
Switch to node affinity policy, and it will stick to where you move it. "mmces address policy node-affinity". -jf tir. 13. jun. 2017 kl. 06.21 skrev : > On Mon, 12 Jun 2017 20:06:09 -, "Simon Thompson (IT Research Support)" > said: > > > mmces node suspend -N > > >

Re: [gpfsug-discuss] connected v. datagram mode

2017-05-12 Thread Jan-Frode Myklebust
I also don't know much about this, but the ESS quick deployment guide is quite clear that we should use connected mode for IPoIB: -- Note: If using bonded IP over IB, do the following: Ensure that the CONNECTED_MODE=yes statement exists in the corresponding slave-bond interface
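A minimal RHEL ifcfg sketch for one of the bond slaves, per that note (interface and bond names are examples; the MTU shown is the value commonly used with connected mode):

    # /etc/sysconfig/network-scripts/ifcfg-ib0
    TYPE=InfiniBand
    DEVICE=ib0
    ONBOOT=yes
    MASTER=bond0
    SLAVE=yes
    CONNECTED_MODE=yes
    MTU=65520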

Re: [gpfsug-discuss] Tiebreaker disk question

2017-05-03 Thread Jan-Frode Myklebust
This doesn't sound like normal behaviour. It shouldn't matter which filesystem your tiebreaker disks belong to. I think the failure was caused by something else, but am not able to guess from the little information you posted.. The mmfs.log will probably tell you the reason. -jf ons. 3. mai

Re: [gpfsug-discuss] NFS issues

2017-04-26 Thread Jan-Frode Myklebust
workload on a client, but may we need either long IO blocked reads > >or writes (from the GPFS end). > > > >We also originally had soft as the default option, but saw issues then > >and the docs suggested hard, so we switched and also enabled sync (we > >figured maybe it w

Re: [gpfsug-discuss] NFS issues

2017-04-25 Thread Jan-Frode Myklebust
I *think* I've seen this, and that we then had open TCP connection from client to NFS server according to netstat, but these connections were not visible from netstat on NFS-server side. Unfortunately I don't remember what the fix was.. -jf tir. 25. apr. 2017 kl. 16.06 skrev Simon Thompson

Re: [gpfsug-discuss] Protocol node recommendations

2017-04-23 Thread Jan-Frode Myklebust
ory node could be a plus if we > mix both protocols for such case. > > > Is the spreadsheet publicly available or do we need to ask IBM ? > > > Thank for your help, > > Frank. > > > -- > *From:* Jan-Frode Myklebust <janfr...@ta

Re: [gpfsug-discuss] Protocol node recommendations

2017-04-22 Thread Jan-Frode Myklebust
That's a tiny maxFilesToCache... I would start by implementing the settings from /usr/lpp/mmfs/*/gpfsprotocolldefaul* plus a 64GB pagepool for your protocoll nodes, and leave further tuning to when you see you have issues. Regarding sizing, we have a spreadsheet somewhere where you can input

Re: [gpfsug-discuss] Can't delete filesystem

2017-04-05 Thread Jan-Frode Myklebust
Maybe try mmumount -f on the remaining 4 nodes? -jf ons. 5. apr. 2017 kl. 18.54 skrev Buterbaugh, Kevin L < kevin.buterba...@vanderbilt.edu>: > Hi Simon, > > No, I do not. > > Let me also add that this is a filesystem that I migrated users off of and > to another GPFS filesystem. I moved the

Re: [gpfsug-discuss] gpfsug-discuss Digest, Vol 62, Issue 33

2017-03-16 Thread Jan-Frode Myklebust
Why would you need a NSD protocol router when the NSD servers can have a mix of infiniband and ethernet adapters? F.ex. 4x EDR + 2x 100GbE per io-node in an ESS should give you lots of bandwidth for your common ethernet medium. -jf On Thu, Mar 16, 2017 at 1:52 AM, Aaron Knister

Re: [gpfsug-discuss] default inode size

2017-03-15 Thread Jan-Frode Myklebust
Another thing to consider is how many disk block pointers you have room for in the inode, and when you'll need to add additional indirect blocks. Ref: http://files.gpfsug.org/presentations/2016/south-bank/ D2_P2_A_spectrum_scale_metadata_dark_V2a.pdf If I understand that presentation correctly..

Re: [gpfsug-discuss] Running gpfs.snap outside of problems

2017-03-09 Thread Jan-Frode Myklebust
There's a manual for it now.. and it points out "The tool impacts performance.." Also it has caused mmfsd crashes for me earlier, so I've learned to be wary of running it.. The manual also says it collects data using mmfsadm, and the mmfsadm manual warns that it might cause GPFS to fail ("in

Re: [gpfsug-discuss] shutdown

2017-03-03 Thread Jan-Frode Myklebust
Unmount filesystems cleanly "mmumount all -a", stop gpfs "mmshutdown -N gss_ppc64" and poweroff "xdsh gss_ppc64 poweroff". -jf fre. 3. mar. 2017 kl. 20.55 skrev Joseph Grace : > Please excuse the newb question but we have a planned power outage coming > and I can't

Re: [gpfsug-discuss] Issues getting SMB shares working.

2017-03-02 Thread Jan-Frode Myklebust
try, I think my departed colleague was just > using that for testing, the NFS clients all access it though nfsv4 which > looks to be using kerberos from this. > > On 01/03/17 20:21, Jan-Frode Myklebust wrote: > > This looks to me like a quite plain SYS authorized NFS, maybe also verify >

Re: [gpfsug-discuss] Issues getting SMB shares working.

2017-03-01 Thread Jan-Frode Myklebust
guration > == > LOG_LEVEL: EVENT > == > > Idmapd Configuration > == > DOMAIN: DS.LEEDS.AC.UK > == > > On 01/03/17 14:12, Jan-Frode Myklebust wrote: > > Lets figure out how your NF

Re: [gpfsug-discuss] Tracking deleted files

2017-02-27 Thread Jan-Frode Myklebust
AFM apparently keeps track of this, so maybe it would be possible to run AFM-SW with a disconnected home and query the queue of changes? But it would require some way of clearing the queue as well.. -jf On Monday, February 27, 2017, Marc A Kaplan wrote: > Diffing file

Re: [gpfsug-discuss] bizarre performance behavior

2017-02-17 Thread Jan-Frode Myklebust
I just had a similar experience with a SanDisk InfiniFlash system SAS-attached to a single host. gpfsperf reported 3.2 GByte/s for writes, and 250-300 MByte/s on sequential reads!! Random reads were on the order of 2 GByte/s. After a bit of head scratching and fumbling around I found out that

Re: [gpfsug-discuss] Getting 'blk_cloned_rq_check_limits: over max size limit' errors after updating the systems to kernel 2.6.32-642.el6 or later

2017-02-12 Thread Jan-Frode Myklebust
The 4.2.2.2 readme says: * Fix a multipath device failure that reads "blk_cloned_rq_check_limits: over max size limit" which can occur when kernel function bio_get_nr_vecs() returns a value which is larger than the value of max sectors of the block device. -jf On Sat, Feb 11, 2017 at 7:32

Re: [gpfsug-discuss] nodes being ejected out of the cluster

2017-01-11 Thread Jan-Frode Myklebust
this off? > > Thanks, > Damir > > On Wed, Jan 11, 2017 at 12:38 PM Jan-Frode Myklebust <janfr...@tanso.net> > wrote: > > And there you have: > > [ems1-fdr,compute,gss_ppc64] > verbsRdmaSend yes > > Try turning this off. > > > -jf > ons. 11. jan. 2017

Re: [gpfsug-discuss] nodes being ejected out of the cluster

2017-01-11 Thread Jan-Frode Myklebust
And there you have: [ems1-fdr,compute,gss_ppc64] verbsRdmaSend yes Try turning this off. -jf ons. 11. jan. 2017 kl. 18.54 skrev Damir Krstic : > Thanks for all the suggestions. Here is our mmlsconfig file. We just > purchased another GL6. During the installation of the
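A sketch of turning it off for the same node classes (the parameter normally takes effect after GPFS is restarted on those nodes):

    mmchconfig verbsRdmaSend=no -N ems1-fdr,compute,gss_ppc64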

Re: [gpfsug-discuss] CES log files

2017-01-11 Thread Jan-Frode Myklebust
I also struggle with where to look for CES log files.. but maybe the new "mmprotocoltrace" command can be useful? # mmprotocoltrace start smb ### reproduce problem # mmprotocoltrace stop smb Check log files it has collected. -jf On Wed, Jan 11, 2017 at 10:27 AM, Sobey, Richard A

Re: [gpfsug-discuss] replication and no failure groups

2017-01-09 Thread Jan-Frode Myklebust
Yaron, doesn't "-1" make each of these disks an independent failure group? From 'man mmcrnsd': "The default is -1, which indicates this disk has no point of failure in common with any other disk." -jf man. 9. jan. 2017 kl. 21.54 skrev Yaron Daniel : > Hi > > So - do u

Re: [gpfsug-discuss] AFM Migration Issue

2017-01-09 Thread Jan-Frode Myklebust
Untested, and I have no idea if it will work on the number of files and directories you have, but maybe you can fix it by rsyncing just the directories? rsync -av --dry-run --include='*/' --exclude='*' source/ destination/ -jf man. 9. jan. 2017 kl. 16.09 skrev : >

Re: [gpfsug-discuss] What is LTFS/EE now called, and what version should I be on?

2017-01-03 Thread Jan-Frode Myklebust
This looks like Spectrum Archive v1.2.1.0 (Build 10230). Newest version available on fixcentral is v1.2.2.0, but it doesn't support GPFS v4.2.2.x yet. -jf On Tue, Jan 3, 2017 at 11:56 PM, Valdis Kletnieks wrote: > So we have GPFS Advanced 4.2.1 installed, and the

Re: [gpfsug-discuss] correct way of taking IO server down for maintenance

2016-12-20 Thread Jan-Frode Myklebust
and things (filesystem performance) > seems to be OK. Load average on both io servers is quite high (250avg) and > does not seem to be going down. > > I really wish that maintenance procedures were documented somewhere on IBM > website. This experience this morning has really shaken my confid

Re: [gpfsug-discuss] GPFS 3.5 to 4.1 Upgrade Question

2016-12-06 Thread Jan-Frode Myklebust
om 3.5 to 4.1 and roughly how many clients/servers do you > have in your cluster? > > -Aaron > > On 12/5/16 5:52 PM, Jan-Frode Myklebust wrote: > >> I read it as "do your best". I doubt there can be problems that shows up >> after 3 weeks, that wouldn't also be

Re: [gpfsug-discuss] CES services on an existing GPFS cluster

2016-12-05 Thread Jan-Frode Myklebust
No, the first time you define it I'm pretty sure it can be done online. But when changing it later, it will require stopping the full cluster first. -jf man. 5. des. 2016 kl. 15.26 skrev Sander Kuusemets : > Hello, > > I have been thinking about setting up a CES

Re: [gpfsug-discuss] Upgrading kernel on RHEL

2016-11-29 Thread Jan-Frode Myklebust
I think GPFS upgrades are a fine opportunity to check the FAQ and update to latest tested/supported OS versions. But please remember to check all components in the "Functional Support Matrices", and latest kernel tested. -jf On Tue, Nov 29, 2016 at 10:59 AM, Sobey, Richard A

[gpfsug-discuss] filesystem thresholds in gui alerting

2016-10-26 Thread Jan-Frode Myklebust
Does anybody know if there is any way to define what thresholds are to be used for alerting in the GUI? F.ex. we have some filesystems that are very full, but won't be getting any more data added.. we'd like to turn off monitoring of these, or raise the threshold to allow them to be ~100% full.

Re: [gpfsug-discuss] GPFS Upgrade 3.5 -> 4.1

2016-10-10 Thread Jan-Frode Myklebust
I've also always been worried about that one, but never experienced it taking any time, I/O or interruption. I've interpreted it as just starting to use new features, without really changing anything in the existing metadata. Things needing on-disk changes are probably put in mmmigratefs. I have

Re: [gpfsug-discuss] Blocksize

2016-09-22 Thread Jan-Frode Myklebust
https://www.ibm.com/developerworks/community/forums/html/topic?id=----14774266 "Use 256K. Anything smaller makes allocation blocks for the inode file inefficient. Anything larger wastes space for directories. These are the two largest consumers of metadata space."

Re: [gpfsug-discuss] big difference between output of 'mmlsquota' and 'du'?

2016-09-12 Thread Jan-Frode Myklebust
Maybe you have a huge file open, that's been unlinked and still growing? -jf
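One quick way to check for that on a node that has the filesystem mounted (plain Linux, not GPFS-specific; the mount point is a placeholder):

    # list open files on the filesystem whose link count is 0, i.e. deleted but still open
    lsof +L1 /gpfs/fs1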

Re: [gpfsug-discuss] Weirdness with 'mmces address add'

2016-09-07 Thread Jan-Frode Myklebust
I believe your first guess is correct. The ces-ip needs to be resolvable for some reason... Just put a name for it in /etc/hosts, if you can't add it to your dns. -jf ons. 7. sep. 2016 kl. 20.45 skrev Valdis Kletnieks : > We're in the middle of deploying Spectrum
