> The queues are steadily rising and we've seen them over 1000000 ...

There is definitely a song here...  I see you playing the blues guitar...

I can't answer your questions directly.
As I recall you are on the latest version? We recently had to update to 4.2.3.4 
due to an AFM issue where, if the home NFS share was disconnected, a read 
operation would finish early and not restart.

One thing I would do is look at where the 'real' NFS mount is being done 
(apologies - I'm assuming an NFS home).
Log on to bber-afmgw01 and find where the home filesystem is being mounted, 
which is below /var/mmfs/afm.
Have a ferret around in there - do you still have that filesystem mounted?
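
Something along these lines is what I'd try (only a rough sketch - the exact 
directory names under /var/mmfs/afm depend on your filesystem/fileset, so 
treat those as placeholders):

    # on bber-afmgw01
    ls /var/mmfs/afm                    # AFM's internal mount point for home
    mount | grep /var/mmfs/afm          # is the home export still mounted there?
    ls /var/mmfs/afm/<fileset-dir>      # does a listing respond, or does it hang?

If the listing hangs or the mount has gone, that would point at the home side 
rather than the queue itself.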




-----Original Message-----
From: gpfsug-discuss-boun...@spectrumscale.org 
[mailto:gpfsug-discuss-boun...@spectrumscale.org] On Behalf Of Simon Thompson 
(IT Research Support)
Sent: Monday, October 09, 2017 2:57 PM
To: gpfsug-discuss@spectrumscale.org
Subject: [gpfsug-discuss] AFM fun (more!)


Hi All,

We're having fun (ok not fun ...) with AFM.

We have a fileset where the queue length isn't shortening. Watching it over 
5-second periods, the queue length increases by ~600-1000 items and numExec 
goes up by about 15k.

The queues are steadily rising and we've seen them over 1000000 ...
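
(For reference, this is roughly how we're sampling it - a minimal loop, 
assuming getstate's -j option to restrict output to the one fileset; adjust 
the interval and fileset name as needed:)

    while true; do
        date
        mmafmctl rds-cache getstate -j rds-projects-facility
        sleep 5
    done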

This is on one particular fileset, e.g.:

mmafmctl rds-cache getstate
                       Mon Oct  9 08:43:58 2017

Fileset Name           Fileset Target                  Cache State  Gateway Node  Queue Length  Queue numExec
---------------------  ------------------------------  -----------  ------------  ------------  -------------
rds-projects-facility  gpfs:///rds/projects/facility   Dirty        bber-afmgw01  3068953       520504
rds-projects-2015      gpfs:///rds/projects/2015       Active       bber-afmgw01  0             3
rds-projects-2016      gpfs:///rds/projects/2016       Dirty        bber-afmgw01  1482          70
rds-projects-2017      gpfs:///rds/projects/2017       Dirty        bber-afmgw01  713           9104
bear-apps              gpfs:///rds/bear-apps           Dirty        bber-afmgw02  3             2472770871
user-homes             gpfs:///rds/homes               Active       bber-afmgw02  0             19
bear-sysapps           gpfs:///rds/bear-sysapps        Active       bber-afmgw02  0             4



This is having the effect that other filesets on the same "Gateway" are not 
getting their queues processed.

Question 1.
Can we force the gateway node for the other filesets over to our "02" node, 
i.e. so that the queues for those filesets get serviced?

Question 2.
How can we make AFM actually work for the "facility" fileset? If we shut down 
GPFS on the node, then on the secondary node we see log entries like:
2017-10-09_13:35:30.330+0100: [I] AFM: Found 1069575 local remove operations...

So I'm assuming the massive queue is all file remove operations?

Alarmingly, we are also seeing entries like:
2017-10-09_13:54:26.591+0100: [E] AFM: WriteSplit file system rds-cache fileset 
rds-projects-2017 file IDs [5389550.5389550.-1.-1,R] name  remote error 5

Anyone any suggestions?

Thanks

Simon


_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
