[gpfsug-discuss] Looking for experiences with Huawei Oceanstore and GPFS / Spectrum Scale

2016-10-17 Thread Christoph Krafft
Hi folks, has anyone had experience with Huawei Oceanstore and GPFS - and would be willing to share some details with me? Any helpful hints are deeply appreciated - THANK you in advance! Kind regards / Sincerely Christoph Krafft Client Technical Specialist - Power Systems, IBM

[gpfsug-discuss] CES: IP address won't assign: "handleNetworkProblem with lock held"

2016-10-17 Thread Oesterlin, Robert
Can anyone help me pinpoint the issue here? These messages repeat and the IP addresses never get assigned. [root@tct-gw01 ~]# tail /var/mmfs/gen/mmfslog Mon Oct 17 10:57:55 EDT 2016: mmcesnetworkmonitor: Found unassigned address 10.30.22.178 Mon Oct 17 10:57:55 EDT 2016: mmcesnetworkmonitor: Foun
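A rough sketch of checks that could help narrow this down, assuming standard Spectrum Scale CES tooling; the hostname and subnet below come from the thread and are used only for illustration:

    # Addresses CES knows about and where it believes they are assigned
    mmces address list

    # CES state per protocol node
    mmces state show -a

    # Compare with what is actually configured on the node's interfaces
    ssh tct-gw01 'ip -4 addr show | grep 10.30.22'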

[gpfsug-discuss] Disk can't be recovered due to uncorrectable read error in vdisk (GSS)

2016-10-17 Thread Kenneth Waegeman
Hi, Currently our file system is down due to down/unrecovered disks. We try to start the disks again with mmchdisk, but when we do this, we see this error in our mmfs.log: Mon Oct 17 15:28:18.122 2016: [E] smallRead VIO failed due to uncorrectable read error in vdisk nsd11_MetaData_8M_3p_2 v
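A hedged sketch of the kind of GNR/GSS checks that might come before another mmchdisk attempt; 'fs0' and 'rg_name' are placeholders, and the actual recovery steps for an uncorrectable read error would normally be worked out with support:

    # pdisks that are not healthy (GSS/ESS declustered RAID layer)
    mmlspdisk all --not-ok

    # Recovery group status, including any rebuild activity (rg_name is a placeholder)
    mmlsrecoverygroup
    mmlsrecoverygroup rg_name -L

    # Disk availability as seen from the file system side (fs0 is a placeholder)
    mmlsdisk fs0 -e

    # Retry starting the down disks
    mmchdisk fs0 start -a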

Re: [gpfsug-discuss] CES and NFS Tuning suggestions

2016-10-17 Thread Bryan Banister
One major issue is the maxFilesToCache and maybe the maxStatCache (though I hear that Linux negates the use of this parameter now? I don’t quite remember). Ganesha apparently likes to hold open a large number of files and this means that it will quickly fill up the maxFilesToCache. When this
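As an illustration of the tuning being discussed, raising these caches on the protocol nodes might look roughly like this; the values are placeholders rather than recommendations, and maxFilesToCache changes generally only take effect after GPFS is restarted on the affected nodes:

    # Raise the file/stat caches on the CES nodes only (values are illustrative)
    mmchconfig maxFilesToCache=1000000 -N cesNodes
    mmchconfig maxStatCache=10000 -N cesNodes

    # Verify the configured values
    mmlsconfig maxFilesToCache
    mmlsconfig maxStatCache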

Re: [gpfsug-discuss] Disk can't be recovered due to uncorrectable read error in vdisk (GSS)

2016-10-17 Thread Ralph A Becker-szendy
"Kenneth Waegeman" wrote:   > Currently our file system is down due to down/unrecovered disks. We > try to start the disks again with mmchdisk, but when we do this, we > see this error in our mmfs.log: > ... > This is a 3-way replicated vdisk, and not one of the recovering disks, but > this disk

Re: [gpfsug-discuss] Disk can't be recovered due to uncorrectable read error in vdisk (GSS)

2016-10-17 Thread Stijn De Weirdt
hi ralph, >>Currently our file system is down due to down/unrecovered disks. We >> try to start the disks again with mmchdisk, but when we do this, we >> see this error in our mmfs.log: >> ... >> This is a 3-way replicated vdisk, and not one of the recovering disks,but >> this disk is in 'up' stat

Re: [gpfsug-discuss] CES and NFS Tuning suggestions

2016-10-17 Thread Olaf Weiser
In addition ... depending on your block size and the multi-threaded NFS server, IOs may not arrive at GPFS in the right order, so GPFS can't recognize sequential or random IO access patterns correctly... therefore adjust: nfsPrefetchStrategy, default [0], to [1-10]. It tells GPFS to consider all in-flight
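A small sketch of the adjustment described above; the value 5 is just an example from the 1-10 range mentioned, not a recommendation:

    # Widen the window GPFS uses to recognize sequential access from
    # multi-threaded NFS clients (0 = default, 1-10 per the note above)
    mmchconfig nfsPrefetchStrategy=5
    mmlsconfig nfsPrefetchStrategy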

Re: [gpfsug-discuss] CES: IP address won't assign: "handleNetworkProblem with lock held"

2016-10-17 Thread Olaf Weiser
Simple question - sorry for that - your nodes: do they have an IP address in the same subnet as the IP address listed here? And if so, is this network up and running so that GPFS can find/detect it? What does mmlscluster --ces tell you? From each node - assuming a class C /24 network - do an ip a | grep 10.30.22
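Written out as a rough sequence (the 10.30.22.0/24 subnet comes from the thread; run on each protocol node):

    # CES nodes and the address pool GPFS knows about
    mmlscluster --ces

    # Does this node have an address in the same subnet as the CES pool?
    ip a | grep 10.30.22

    # Are the relevant interfaces up?
    ip link show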

[gpfsug-discuss] CES: IP address won't assign: "handleNetworkProblem with lock held"

2016-10-17 Thread Oesterlin, Robert
Yes - so interesting - it looks like the nodes have the addresses assigned but CES doesn’t know that. [root@tct-gw01 ~]# mmlscluster --ces GPFS cluster information GPFS cluster name: nrg1-tct.nrg1.us.grid.nuance.com GPFS cluster id: 1786951463969941

Re: [gpfsug-discuss] CES: IP address won't assign: "handleNetworkProblem with lock held"

2016-10-17 Thread Olaf Weiser
Ah .. I see.. it seems that you already have IP aliases around .. GPFS doesn't like that... e.g. your node tct-gw01.infra.us.grid.nuance.com: inet 10.30.22.160/24 already has an alias - 10.30.22.176 ... if I understand your answers correctly... from the docs: [...] you need to provide a static IP
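A minimal sketch of how one might spot a pre-existing alias that collides with the CES address pool; eth0 is an assumed interface name:

    # All IPv4 addresses (including alias/label entries) on the node; before CES
    # assigns anything, only the static base address should be present here
    ip -4 addr show dev eth0

    # The floating addresses CES is supposed to manage, for comparison
    mmces address list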

Re: [gpfsug-discuss] CES: IP address won't assign: "handleNetworkProblem with lock held"

2016-10-17 Thread Simon Thompson (Research Computing - IT Services)
Does it strictly follow this? We were doing some testing with tap interfaces into VXLAN networks and found that if we simulated taking down the VXLAN interface (which appears in ifconfig as a physical interface, really), then it moved the CES IP onto the box's primary NIC, which was on a different subnet

Re: [gpfsug-discuss] [EXTERNAL] Re: CES: IP address won't assign: "handleNetworkProblem with lock held"

2016-10-17 Thread Oesterlin, Robert
No - the :0 and :1 addresses are floating addresses *assigned by CES* - it created those interfaces. The issue seems to be that these are assigned and CES doesn't know it. Bob Oesterlin Sr Storage Engineer, Nuance HPC Grid

Re: [gpfsug-discuss] [EXTERNAL] Re: CES: IP address won't assign: "handleNetworkProblem with lock held"

2016-10-17 Thread Olaf Weiser
Ah .. I see. Sorry, should have checked that. So, to stay with this example, the IP address 10.30.22.176 is set by CES as a floating service IP .. something is insane.. are the smb/NFS services running (systemctl ...) and can you access the exports from outside?
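A hedged sketch of checking the protocol services; mmces reports the CES view, while the systemd unit names vary by release and are assumptions here:

    # CES view of the protocol services on every node
    mmces service list -a

    # Local service state (unit names are assumptions; they differ between releases)
    systemctl status nfs-ganesha
    systemctl status gpfs-smb

    # Quick external check of an NFS export from a client (hostname from the thread)
    showmount -e tct-gw01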

Re: [gpfsug-discuss] CES: IP address won't assign: "handleNetworkProblem with lock held"

2016-10-17 Thread Olaf Weiser
Strange.. this should not happen. If you can recreate it, please open a PMR for this..

[gpfsug-discuss] Any spaces left for a user-presentation at the UG, SC16?

2016-10-17 Thread Jake Carroll
Hi, I have something interesting that I think the user group might find novel, or at least, hopefully interesting, at the UG meetup at SC16. It would entail an unusual use-case for AFM and some of the unusual things we are doing with it. All good if no spots left – but I’d be happy to present s