[gpfsug-discuss] AFM does too-small NFS writes, and I don't see parallel writes

2021-11-23 Thread Billich Heinrich Rainer (ID SD)
Hello, We are currently moving data to a new AFM fileset and I see poor performance, so I ask for advice and insight: The migration to AFM home seems slow. I note: AFM writes a whole file of ~100MB in far too many small chunks. My assumption: The many small writes reduce performance as we

Re: [gpfsug-discuss] /tmp/mmfs vanishes randomly?

2021-11-16 Thread Billich Heinrich Rainer (ID SD)
14562 / WEEE-Reg.-Nr. DE 99369940 - Original message - From: "Billich Heinrich Rainer (ID SD)" Sent by: gpfsug-discuss-boun...@spectrumscale.org To: "gpfsug main discussion list" CC: Subject: [EXTERNAL] [gpfsug-discuss] /tmp/mmfs vanishes randomly? Date

[gpfsug-discuss] /tmp/mmfs vanishes randomly?

2021-11-08 Thread Billich Heinrich Rainer (ID SD)
Hello, We use /tmp/mmfs as the dataStructureDump directory. For a while now I have noticed that this directory randomly vanishes. Mmhealth does not complain but just notes that it will no longer monitor the directory. Still, I doubt that trace collection and similar will create the directory when needed?

Re: [gpfsug-discuss] Metadata usage almost doubled after policy run with migration rule

2021-06-17 Thread Billich Heinrich Rainer (ID SD)
t 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. "Billich Heinrich Rainer (ID SD)" ---2021/06/08 05:19:32 PM--- Hello, From

[gpfsug-discuss] Metadata usage almost doubled after policy run with migration rule

2021-06-08 Thread Billich Heinrich Rainer (ID SD)
Hello, A policy run with ‘-I defer’ and a placement rule almost doubled the metadata usage of a filesystem. This filled the metadata disks to a critical level. I would like to understand if this is to be expected and ‘as designed’, or if I face some issue or bug. I hope a subsequent

Re: [gpfsug-discuss] Mmrestripefs -R --metadata-only - how to estimate remaining execution time

2021-05-28 Thread Billich Heinrich Rainer (ID SD)
5%)       1342 ( 0%).  <<<<<< 356190MB used From: on behalf of "Billich Heinrich Rainer (ID SD)" Reply to: gpfsug main discussion list Date: Friday, 28 May 2021 at 10:25 To: gpfsug main discussion list Subject: [gpfsug-discuss] Mmrestripefs -R --metadat-

[gpfsug-discuss] Mmrestripefs -R --metadata-only - how to estimate remaining execution time

2021-05-28 Thread Billich Heinrich Rainer (ID SD)
Hello, I want to estimate how much longer a running mmrestripefs -R --metadata-only --qos maintenance job will take to finish. We switched from ‘-m 1’ to ‘-m 2’ and now run mmrestripefs to get the second copy of all metadata. I know that the % values in the output are useless, but
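One crude way to get an estimate, offered here only as a hedged sketch (not something from the thread itself): sample the metadata "used" value from mmdf twice, assume the second copy grows roughly linearly, and extrapolate towards a target of roughly twice the pre-restripe metadata usage. All numbers below are placeholders.

    #!/usr/bin/env python3
    # Hedged sketch: estimate remaining mmrestripefs runtime from two mmdf
    # metadata-usage samples, assuming roughly linear progress. For '-m 2' the
    # target is assumed to be about twice the pre-restripe metadata usage.
    def eta_hours(used_t0_mb, used_t1_mb, hours_between, target_mb):
        rate = (used_t1_mb - used_t0_mb) / hours_between  # MB written per hour
        if rate <= 0:
            raise ValueError("no measurable progress between samples")
        return (target_mb - used_t1_mb) / rate

    # Example with made-up numbers: 300 GB -> 356 GB used over 12 h, target 600 GB.
    print(f"estimated remaining: {eta_hours(300_000, 356_190, 12, 600_000):.1f} h")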

[gpfsug-discuss] migrate to new metadata GNR systems while maintaining a fallback possibility

2021-05-10 Thread Billich Heinrich Rainer (ID SD)
Hello, I need to update/replace our metadata disks, but I want to keep old and new in parallel for a while before I remove the old storage: as an active/active pair with double copies. This will allow an immediate fall-back if we ever need it. Maybe you want to comment on this procedure – I

[gpfsug-discuss] Mmapplypolicy with -I defer doesn't sort the resulting list? Is this intentional?

2021-02-08 Thread Billich Heinrich Rainer (ID SD)
Hello, I want to migrate data with mmapplypolicy to a different pool: 1. create a file list with "-I defer -f /some/path" 2. execute with "-I yes -r /some/path" I noted that the file list created in step 1 is not sorted. I asked to sort by kb_allocated; the idea is to migrate the largest files
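As an illustration, a minimal Python sketch that sorts the deferred file list externally before feeding it back with '-I yes -r'. It assumes kb_allocated was emitted (e.g. via a SHOW(...) clause) as a whitespace-separated token in front of the ' -- ' path separator; the field index is a guess and must be adjusted to the actual list layout.

    #!/usr/bin/env python3
    # Hedged sketch: sort an mmapplypolicy '-I defer' file list by a numeric
    # field so that the largest files migrate first. FIELD is a placeholder.
    import sys

    FIELD = 3  # hypothetical position of the kb_allocated value

    def sort_key(line):
        head = line.split(" -- ", 1)[0]
        try:
            return int(head.split()[FIELD])
        except (IndexError, ValueError):
            return -1  # malformed lines sort last and are easy to spot

    with open(sys.argv[1]) as f:
        lines = f.readlines()
    lines.sort(key=sort_key, reverse=True)  # largest first
    with open(sys.argv[2], "w") as f:
        f.writelines(lines)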

[gpfsug-discuss] 'ganesha_mgr display_export' - client not listed

2020-10-30 Thread Billich Heinrich Rainer (ID SD)
Hello, Some NFSv4 client of ganesha does not show up in the output of 'ganesha_mgr display_export'. The client has an active mount, but also shows some NFS issues: some commands hang, the process just stays in state D (uninterruptible sleep) according to 'ps', but not the whole mount. I

[gpfsug-discuss] Best of Spectrum Scale

2020-09-07 Thread Billich Heinrich Rainer (ID SD)
Hi, just came across this: /usr/lpp/mmfs/bin/mmafmctl fs3101 getstate mmafmctl: Invalid current working directory detected: /tmp/A The command may fail in an unexpected way. Processing continues .. It’s like a bus driver telling you that the brakes don’t work and then speeding up even more.

[gpfsug-discuss] AFM cache rolling upgrade with minimal impact / no directory scan

2020-08-25 Thread Billich Heinrich Rainer (ID SD)
Hello, We will upgrade a pair of AFM cache nodes which serve about 40 SW filesets. I want to do a rolling upgrade. I wonder if I can minimize the impact of the failover when filesets move to the other AFM node. I can't stop replication during the upgrade: The update will take too long (OS,

[gpfsug-discuss] Tune OS for Mellanox IB/ETH HCA on Power hardware - should I run mlnx_affinity or mlnx_tune or sysctl tuning?

2020-08-19 Thread Billich Heinrich Rainer (ID SD)
Hello, We run Spectrum Scale on Power hardware - le and be - and Mellanox IB and VPI cards. We did not enable any automatic tuning at system start by the usual Mellanox scripts. /etc/infiniband/openib.conf contains # Run /usr/sbin/mlnx_affinity RUN_AFFINITY_TUNER=no # Run /usr/sbin/mlnx_tune

[gpfsug-discuss] Example /var/mmfs/etc/eventsCallback script?

2020-06-24 Thread Billich Heinrich Rainer (ID SD)
Hello, I’m looking for an example script /var/mmfs/etc/eventsCallback to add callbacks for system health events. I searched the installation and googled but didn’t find one. As there is just one script to handle all events, the script probably should be a small mediator that just checks if
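For illustration only, a hypothetical sketch of such a mediator in Python; the argument layout (event name as the second argument) and the drop-in handler directory are assumptions, not the documented callback interface, so check the mmhealth callback documentation before relying on it.

    #!/usr/bin/env python3
    # Hypothetical /var/mmfs/etc/eventsCallback mediator: look up a per-event
    # handler script and hand the event over to it, so this entry point stays
    # tiny. Argument positions are an assumption.
    import os
    import subprocess
    import sys

    HANDLER_DIR = "/var/mmfs/etc/eventsCallback.d"  # hypothetical drop-in directory

    def main(argv):
        if len(argv) < 2:
            return 0  # nothing to do; never block the health monitor
        event = argv[1]  # assumed position of the event name
        handler = os.path.join(HANDLER_DIR, event)
        if os.access(handler, os.X_OK):
            # Run the handler detached so a slow handler cannot stall monitoring.
            subprocess.Popen([handler] + argv[1:])
        return 0

    if __name__ == "__main__":
        sys.exit(main(sys.argv))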

[gpfsug-discuss] IJ24518: NVME SCSI EMULATION ISSUE - what to do with this announcement, all I get is an APAR number

2020-06-02 Thread Billich Heinrich Rainer (ID SD)
Hello, I’m quite upset about the form and usefulness of some IBM announcements like this one: IJ24518: NVME SCSI EMULATION ISSUE. How do I translate an APAR number to the Spectrum Scale or ESS release which fixes it? And which versions are affected? Do I need to download all Readmes and grep for the

[gpfsug-discuss] Parse -Y command output

2020-05-27 Thread Billich Heinrich Rainer (ID SD)
Hello, I wonder if any python or bash functions exist to parse the output of the mm-commands' -Y format, i.e. colon-separated with HEADER rows. It would be nice to convert the output to a python list of named tuples or, even better, a pandas dataframe. I would like to access the values by column
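A minimal Python sketch of such a parser, assuming the usual -Y layout (command:section:HEADER:version:reserved:reserved:col1:col2:... header rows, followed by data rows with the same leading columns and percent-encoded field values). The resulting list of dicts converts to named tuples or a pandas DataFrame in one line.

    #!/usr/bin/env python3
    # Sketch: parse mm-command -Y output into a list of dicts, one per data row.
    import subprocess
    import sys
    from urllib.parse import unquote

    def parse_mm_y(text):
        headers = {}  # section name -> list of column names
        rows = []
        for line in text.splitlines():
            cols = line.split(":")
            if len(cols) < 3:
                continue
            section = cols[1]
            if cols[2] == "HEADER":
                headers[section] = cols[6:]  # names after version/reserved/reserved
            else:
                values = [unquote(v) for v in cols[6:]]
                rows.append(dict(zip(headers.get(section, []), values), _section=section))
        return rows

    if __name__ == "__main__":
        # e.g.: ./parse_y.py /usr/lpp/mmfs/bin/mmlsfileset fs1 -Y
        out = subprocess.run(sys.argv[1:], capture_output=True, text=True, check=True)
        for row in parse_mm_y(out.stdout):
            print(row)

With pandas installed, pandas.DataFrame(parse_mm_y(out.stdout)) then gives the dataframe with values accessible by column name.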

[gpfsug-discuss] Mmhealth events longwaiters_found and deadlock_detected

2020-04-16 Thread Billich Heinrich Rainer (ID SD)
Hello, I’m puzzled about the difference between the two mmhealth events longwaiters_found ERROR "Detected Spectrum Scale long-waiters" and deadlock_detected WARNING "The cluster detected a Spectrum Scale filesystem deadlock". Especially why the latter has level WARNING only while the

Re: [gpfsug-discuss] GUI timeout when running HW_INVENTORY on little endian ESS server

2020-03-25 Thread Billich Heinrich Rainer (ID SD)
CmdRunTask.doExecute nas12io04b-i: Error executing rinv command. Exit code = 1; Command output = ; Command error =***: [**]: Error: timeout On 25.03.20, 16:35, "Billich Heinrich Rainer (ID SD)" wrote: Hello, I did ask about these timeouts when the GUI runs HW

[gpfsug-discuss] GUI timeout when running HW_INVENTORY on little endian ESS server

2020-03-25 Thread Billich Heinrich Rainer (ID SD)
Hello, I did ask about these timeouts when the GUI runs HW_INVENTORY before. Now I would like to know what the exact timeout value in the GUI code is and if we can change it. I want to argue: if an xCAT command takes X seconds but the GUI code times out after Y, we know the command will

[gpfsug-discuss] Spectrum Scale yum repos - any chance to reduce the number of repos

2020-02-10 Thread Billich Heinrich Rainer (ID SD)
Hello, Does it work to merge “all” Spectrum Scale rpms of one version into one yum repo? Can I merge rpms from different versions in the same repo, even different architectures? Yum repos for RedHat, SUSE, Debian or application repos like EPEL all manage to keep many rpms and all different

Re: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes

2020-02-03 Thread Billich Heinrich Rainer (ID SD)
On Behalf Of Ulrich Sibiller Sent: Thursday, January 30, 2020 9:44 AM To: gpfsug-discuss@spectrumscale.org Subject: Re: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes On 1/29/20 2:05 PM, Billich Heinrich Raine

[gpfsug-discuss] When is a file system log recovery triggered

2020-02-03 Thread Billich Heinrich Rainer (ID SD)
Hello, Does mmshutdown or mmumount trigger a file system log recovery, the same as a node failure or daemon crash does? Last week we got this advisory: IBM Spectrum Scale (GPFS) 5.0.4 levels: possible metadata or data corruption during file system log recovery

Re: [gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes

2020-01-30 Thread Billich Heinrich Rainer (ID SD)
-- === Heinrich Billich ETH Zürich Informatikdienste Tel.: +41 44 632 72 56 heinrich.bill...@id.ethz.ch On 30.01.20, 15:44, "gpfsug-discuss-boun...@spectrumscale.org on behalf of Ulrich Sibiller" wrote: On 1/29/20 2:05 PM, Billich Heinrich

[gpfsug-discuss] gui_refresh_task_failed for HW_INVENTORY with two active GUI nodes

2020-01-29 Thread Billich Heinrich Rainer (ID SD)
Hello, Can I change the times at which the GUI runs HW_INVENTORY and related tasks? We frequently get messages like: gui_refresh_task_failed GUI WARNING 12 hours ago The following GUI refresh task(s) failed: HW_INVENTORY. The tasks fail due to timeouts. Running the

Re: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected?

2020-01-20 Thread Billich Heinrich Rainer (ID SD)
performance issue. Flushing the pending queue entries is not available as of today (5.0.4), we are currently working on this feature. ~Venkat (vpuvv...@in.ibm.com) From: "Billich Heinrich Rainer (ID SD)" To: gpfsug main discussion list Date: 01/13/2020 05:29 PM S

[gpfsug-discuss] Does an AFM recovery stop AFM from recalling files?

2020-01-20 Thread Billich Heinrich Rainer (ID SD)
Hello, Do AFM recalls from home to cache still work when a fileset is in state ‘Recovery’? Are there any other states that allow writing/reading from cache but won’t allow recalls from home? We announced to users that they can continue to work on cache while a recovery is running. But we got

Re: [gpfsug-discuss] How to install efix with yum ?

2020-01-20 Thread Billich Heinrich Rainer (ID SD)
Thank you, this worked. I installed efix9 for 5.0.4.1 using yum, just with a plain “yum update” after installing the base version. I placed the efix and base rpms in different yum repos and disabled the efix repo while installing the base version, and vice versa. Kind regards, Heiner

[gpfsug-discuss] How to install efix with yum ?

2020-01-15 Thread Billich Heinrich Rainer (ID SD)
Hello, I will install efix9 on 5.0.4.1. The instructions ask to use rpm --force -U gpfs.*.rpm but give no yum command. I assume that this is not specific to this efix. I wonder if installing an efix with yum is supported and what the proper commands are? Using yum would make deployment much

Re: [gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected?

2020-01-13 Thread Billich Heinrich Rainer (ID SD)
ided. There are some issues fixed in this regard. What is the Scale version? https://www-01.ibm.com/support/docview.wss?uid=isg1IJ15436 ~Venkat (vpuvv...@in.ibm.com) From: "Billich Heinrich Rainer (ID SD)" To: gpfsug main discussion list Date: 01/08/2020 10:3

[gpfsug-discuss] AFM Recovery of SW cache does a full scan of home - is this to be expected?

2020-01-08 Thread Billich Heinrich Rainer (ID SD)
Hello, still new to AFM, so a basic question on how recovery works for an SW cache: we have an AFM SW cache in recovery mode – recovery first ran policies on the cache cluster, but now I see a ‘tcpcachescan’ process on cache slowly scanning home via NFS. Single host, single process, no

[gpfsug-discuss] Max number of vdisks in a recovery group - is it 64?

2019-12-12 Thread Billich Heinrich Rainer (ID SD)
Hello, I remember that a GNR/ESS recovery group can hold up to 64 vdisks, but I can’t find a citation to prove it. Now I wonder if 64 is the actual limit? And where is it documented? And did the limit change with versions? Thank you. I spent quite some time searching the documentation,

[gpfsug-discuss] mmvdisk - how to see which recovery groups are managed by mmvdisk?

2019-11-04 Thread Billich Heinrich Rainer (ID SD)
Hello, I am trying to get acquainted with mmvdisk: can I choose the names of the vdisks/NSDs which mmvdisk creates? Tools like mmdf still show NSD devices, not vdisk sets, hence proper naming helps. RG001VS001 isn’t always what I would choose. Of course I can just not use mmvdisk where

Re: [gpfsug-discuss] Ganesha all IPv6 sockets - is this to be expected?

2019-10-01 Thread Billich Heinrich Rainer (ID SD)
/2963091 From: on behalf of "Billich Heinrich Rainer (ID SD)" Reply to: gpfsug main discussion list Date: Monday, 16 September 2019 at 17:51 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Ganesha all IPv6 sockets - is this to be expected? Hello Olaf, Thank you, so

Re: [gpfsug-discuss] Ganesha daemon has 400'000 open files - is this unusual?

2019-09-24 Thread Billich Heinrich Rainer (ID SD)
2019 15:20, Billich Heinrich Rainer (ID SD) wrote: > Hello, > > Is it usual to see 200’000-400’000 open files for a single ganesha > process? Or does this indicate that something is wrong? > > We have some issues with ganesha (on spectrum scale protoc

Re: [gpfsug-discuss] Ganesha daemon has 400'000 open files - is this unusual?

2019-09-23 Thread Billich Heinrich Rainer (ID SD)
O failure happens or when the open fd count is high, you could do the following: 1. ganesha_mgr set_log COMPONENT_CACHE_INODE_LRU FULL_DEBUG 2. wait for 90 seconds, then run 3. ganesha_mgr set_log COMPONENT_CACHE_INODE_LRU EVENT Regards, Malahal. - Original message - From: "Billich

Re: [gpfsug-discuss] Ganesha daemon has 400'000 open files - is this unusual?

2019-09-23 Thread Billich Heinrich Rainer (ID SD)
2019 15:20, Billich Heinrich Rainer (ID SD) wrote: > Hello, > > Is it usual to see 200’000-400’000 open files for a single ganesha > process? Or does this indicate that something is wrong? > > We have some issues with ganesha (on spectrum scale protoc

[gpfsug-discuss] Ganesha daemon has 400'000 open files - is this unusual?

2019-09-19 Thread Billich Heinrich Rainer (ID SD)
Hello, Is it usual to see 200’000-400’000 open files for a single ganesha process? Or does this indicate that something is wrong? We have some issues with ganesha (on Spectrum Scale protocol nodes) reporting NFS3ERR_IO in the log. I noticed that the affected nodes have a large number of
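To cross-check such numbers, a small Python sketch that counts the entries in /proc/<pid>/fd for the ganesha process (run as root on the protocol node); the comm name 'ganesha.nfsd' is an assumption and may differ on your build.

    #!/usr/bin/env python3
    # Sketch: count open file descriptors per ganesha process via /proc.
    import os

    def pids_by_comm(name):
        for pid in filter(str.isdigit, os.listdir("/proc")):
            try:
                with open(f"/proc/{pid}/comm") as f:
                    if f.read().strip() == name:
                        yield int(pid)
            except OSError:
                continue  # process exited while scanning

    for pid in pids_by_comm("ganesha.nfsd"):  # assumed daemon name
        try:
            nfds = len(os.listdir(f"/proc/{pid}/fd"))
            print(f"pid {pid}: {nfds} open file descriptors")
        except PermissionError:
            print(f"pid {pid}: run as root to read /proc/{pid}/fd")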

[gpfsug-discuss] Ganesha all IPv6 sockets - is this to be expected?

2019-09-13 Thread Billich Heinrich Rainer (ID SD)
Hello, I just noted that our ganesha daemons offer IPv6 sockets only; IPv4 traffic gets encapsulated. But all traffic to samba is IPv4, and smbd offers both IPv4 and IPv6 sockets. I just wonder whether this is to be expected? The protocols support IPv4 only, so why run on IPv6 sockets only for

[gpfsug-discuss] How to prove that data is in inode

2019-07-17 Thread Billich Heinrich Rainer (ID SD)
Hello, How can I prove that the data of a small file is stored in the inode (and not on a data NSD)? We have a filesystem with 4k inodes on Scale 5.0.2, but it seems there is no file data in the inodes? I would expect that 'stat' reports 'Blocks: 0' for a small file, but I see 'Blocks: 1'.
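A rough heuristic to automate that check, in Python; it is not authoritative (whether Scale really reports zero allocated blocks for in-inode data is exactly the open question here), it only observes that st_blocks counts 512-byte units, so any non-zero value means some allocation outside the inode.

    #!/usr/bin/env python3
    # Heuristic sketch: report files whose stat() shows zero allocated blocks,
    # which is what one would expect if the data fits entirely into the inode.
    import os
    import sys

    for path in sys.argv[1:]:
        st = os.stat(path)
        where = "likely in inode" if st.st_blocks == 0 else "allocated outside the inode"
        print(f"{path}: size={st.st_size} blocks={st.st_blocks} -> {where}")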