Hi Salvatore,

Are you using Ethernet or InfiniBand as the GPFS interconnect to your
clients?

If 10/40GbE - do you have a separate admin network? I have seen behaviour
similar to this where the storage traffic causes congestion and the "admin"
traffic gets lost or delayed, causing expels.
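As a first check, this is roughly how I'd confirm whether daemon and admin
traffic share the same interface (a minimal sketch; the subnet below is a
made-up example, and changing subnets needs a daemon restart, so treat it
as illustration only):

    # Compare the "Daemon node name" and "Admin node name" columns; if they
    # resolve to the same interface, storage and admin traffic share one wire.
    mmlscluster

    # Is daemon (data) traffic already steered onto its own subnet?
    mmlsconfig subnets

    # If not, one option is a dedicated data subnet, e.g.:
    #   mmchconfig subnets="10.30.0.0"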

Vic

On 21 Aug 2014, at 10:04, Salvatore Di Nardo <[email protected]> wrote:

> Thanks for the feedback, but we managed to find a scenario that excludes
> network problems.
>
> We have a file called input_file of nearly 100GB.
>
> If from client A we do:
>
>     cat input_file >> output_file
>
> it starts copying. We see the waiters go up a bit for a few seconds, but
> then they flush back to 0, so we can say that the copy proceeds well.
>
> If we now do the same from another client (or just another shell on the
> same client), client B:
>
>     cat input_file >> output_file
>
> (in other words, we are trying to write to the same destination), all the
> waiters climb until one node gets expelled.
>
> Now, while it's understandable that the destination file is locked by one
> of the "cat" processes, so the other has to wait (and since the file is
> big, it has to wait a while), it's not understandable why the lease
> renewal stops. Why doesn't it just return a timeout error on the copy
> instead of expelling the node? We can reproduce this every time, and since
> our users do operations like this on files over 100GB each, you can
> imagine the result.
>
> Even if it's a bit silly to write to the same destination at the same
> time, it's also quite common when dumping logs to a shared file: one of
> the writers writes for a long time and keeps the file locked. Our expels
> are not due to network congestion, but because one write attempt has to
> wait for another. What I really don't understand is why such an extreme
> measure as expelling a node is taken just because a process has been
> waiting "too long".
>
> I have a ticket open with IBM for this and the issue is under
> investigation, but no luck so far.
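> While the two cats run, we watch the waiters on the clients and the NSD
> servers with something like this (a rough sketch - assumes a GPFS level
> that has mmdiag; the file names above are just examples):
>
>     # Print the current waiters every 5 seconds. On a single-writer copy
>     # they appear briefly and drain back to zero; with two writers they
>     # just keep growing until a node is expelled.
>     while true; do
>         date
>         mmdiag --waiters
>         sleep 5
>     done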
>
> Regards,
> Salvatore
>
> On 21/08/14 09:20, Jez Tucker (Chair) wrote:
>> Hi there,
>>
>> I've seen this on several 'stock'? 'core'? GPFS systems (we need a better
>> term now GSS is out): ping 'working', but ejections from the cluster
>> happening alongside it. The GPFS internode 'ping' is somewhat more
>> circumspect than unix ping - and rightly so.
>>
>> In my experience this has _always_ been a network issue of one sort or
>> another. If the network is experiencing issues, nodes will be ejected.
>> Of course it could be an unresponsive mmfsd or high loadavg, but I've
>> seen that only twice in 10 years over many versions of GPFS.
>>
>> You need to follow the logs through from each machine in time order to
>> determine who could not see whom, and in what order. Your best way
>> forward is to log a SEV2 case with IBM support, directly or via your OEM,
>> and collect and supply a snap and traces as required by support.
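>> A crude way to get that time-ordered view (a rough sketch: assumes
>> passwordless ssh, the default log location, and a hand-made nodes.txt
>> with one hostname per line):
>>
>>     # Prefix every log line with its source host, then merge.
>>     for n in $(cat nodes.txt); do
>>         ssh "$n" cat /var/adm/ras/mmfs.log.latest | sed "s/^/$n /"
>>     done > all-nodes.log
>>
>>     # With the hostname prepended, field 5 is the HH:MM:SS.mmm timestamp;
>>     # a plain text sort on it is good enough within a single day.
>>     sort -k5,5 all-nodes.log > all-nodes.by-time.log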
>>
>> Without knowing your full setup, it's hard to help further.
>>
>> Jez
>>
>> On 20/08/14 08:57, Salvatore Di Nardo wrote:
>>> Still problems. Here are some more detailed examples:
>>>
>>> EXAMPLE 1:
>>>
>>> EBI5-220 (CLIENT):
>>> Tue Aug 19 11:03:04.980 2014: Timed out waiting for a reply from node <GSS02B IP> gss02b
>>> Tue Aug 19 11:03:04.981 2014: Request sent to <GSS02A IP> (gss02a in GSS.ebi.ac.uk) to expel <GSS02B IP> (gss02b in GSS.ebi.ac.uk) from cluster GSS.ebi.ac.uk
>>> Tue Aug 19 11:03:04.982 2014: This node will be expelled from cluster GSS.ebi.ac.uk due to expel msg from <EBI5-220 IP> (ebi5-220)
>>> Tue Aug 19 11:03:09.319 2014: Cluster Manager connection broke. Probing cluster GSS.ebi.ac.uk
>>> Tue Aug 19 11:03:10.321 2014: Unable to contact any quorum nodes during cluster probe.
>>> Tue Aug 19 11:03:10.322 2014: Lost membership in cluster GSS.ebi.ac.uk. Unmounting file systems.
>>> Tue Aug 19 11:03:10 BST 2014: mmcommon preunmount invoked. File system: gpfs1 Reason: SGPanic
>>> Tue Aug 19 11:03:12.066 2014: Connecting to <GSS02A IP> gss02a <c1p687>
>>> Tue Aug 19 11:03:12.070 2014: Connected to <GSS02A IP> gss02a <c1p687>
>>> Tue Aug 19 11:03:17.071 2014: Connecting to <GSS02B IP> gss02b <c1p686>
>>> Tue Aug 19 11:03:17.072 2014: Connecting to <GSS03B IP> gss03b <c1p685>
>>> Tue Aug 19 11:03:17.079 2014: Connecting to <GSS03A IP> gss03a <c1p684>
>>> Tue Aug 19 11:03:17.080 2014: Connecting to <GSS01B IP> gss01b <c1p683>
>>> Tue Aug 19 11:03:17.079 2014: Connecting to <GSS01A IP> gss01a <c1p1>
>>> Tue Aug 19 11:04:23.105 2014: Connected to <GSS02B IP> gss02b <c1p686>
>>> Tue Aug 19 11:04:23.107 2014: Connected to <GSS03B IP> gss03b <c1p685>
>>> Tue Aug 19 11:04:23.112 2014: Connected to <GSS03A IP> gss03a <c1p684>
>>> Tue Aug 19 11:04:23.115 2014: Connected to <GSS01B IP> gss01b <c1p683>
>>> Tue Aug 19 11:04:23.121 2014: Connected to <GSS01A IP> gss01a <c1p1>
>>> Tue Aug 19 11:12:28.992 2014: Node <GSS02A IP> (gss02a in GSS.ebi.ac.uk) is now the Group Leader.
>>>
>>> GSS02B (NSD SERVER):
>>> ...
>>> Tue Aug 19 11:03:17.070 2014: Killing connection from <EBI5-220 IP> because the group is not ready for it to rejoin, err 46
>>> Tue Aug 19 11:03:25.016 2014: Killing connection from <EBI5-102 IP> because the group is not ready for it to rejoin, err 46
>>> Tue Aug 19 11:03:28.080 2014: Killing connection from <EBI5-220 IP> because the group is not ready for it to rejoin, err 46
>>> Tue Aug 19 11:03:36.019 2014: Killing connection from <EBI5-102 IP> because the group is not ready for it to rejoin, err 46
>>> Tue Aug 19 11:03:39.083 2014: Killing connection from <EBI5-220 IP> because the group is not ready for it to rejoin, err 46
>>> Tue Aug 19 11:03:47.023 2014: Killing connection from <EBI5-102 IP> because the group is not ready for it to rejoin, err 46
>>> Tue Aug 19 11:03:50.088 2014: Killing connection from <EBI5-220 IP> because the group is not ready for it to rejoin, err 46
>>> Tue Aug 19 11:03:52.218 2014: Killing connection from <EBI5-043 IP> because the group is not ready for it to rejoin, err 46
>>> Tue Aug 19 11:03:58.030 2014: Killing connection from <EBI5-102 IP> because the group is not ready for it to rejoin, err 46
>>> Tue Aug 19 11:04:01.092 2014: Killing connection from <EBI5-220 IP> because the group is not ready for it to rejoin, err 46
>>> Tue Aug 19 11:04:03.220 2014: Killing connection from <EBI5-043 IP> because the group is not ready for it to rejoin, err 46
>>> Tue Aug 19 11:04:09.034 2014: Killing connection from <EBI5-102 IP> because the group is not ready for it to rejoin, err 46
>>> Tue Aug 19 11:04:12.096 2014: Killing connection from <EBI5-220 IP> because the group is not ready for it to rejoin, err 46
>>> Tue Aug 19 11:04:14.224 2014: Killing connection from <EBI5-043 IP> because the group is not ready for it to rejoin, err 46
>>> Tue Aug 19 11:04:20.037 2014: Killing connection from <EBI5-102 IP> because the group is not ready for it to rejoin, err 46
>>> Tue Aug 19 11:04:23.103 2014: Accepted and connected to <EBI5-220 IP> ebi5-220 <c0n618>
>>> ...
>>>
>>> GSS02A (NSD SERVER):
>>> Tue Aug 19 11:03:04.980 2014: Expel <GSS02B IP> (gss02b) request from <EBI5-220 IP> (ebi5-220 in ebi-cluster.ebi.ac.uk). Expelling: <EBI5-220 IP> (ebi5-220 in ebi-cluster.ebi.ac.uk)
>>> Tue Aug 19 11:03:12.069 2014: Accepted and connected to <EBI5-220 IP> ebi5-220 <c0n618>
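>>>
>>> (For reference, a quick way to check which node is the cluster manager
>>> handling these expel requests, and which nodes are quorum nodes - a
>>> minimal sketch using standard commands:
>>>
>>>     mmlsmgr -c      # prints the current cluster manager
>>>     mmlscluster     # lists the nodes and their quorum designation
>>>
>>> In the logs above the expel request goes to gss02a.)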
>>>
>>> ===============================================
>>> EXAMPLE 2:
>>>
>>> EBI5-038:
>>> Tue Aug 19 11:32:34.227 2014: Disk lease period expired in cluster GSS.ebi.ac.uk. Attempting to reacquire lease.
>>> Tue Aug 19 11:33:34.258 2014: Lease is overdue. Probing cluster GSS.ebi.ac.uk
>>> Tue Aug 19 11:35:24.265 2014: Close connection to <GSS02A IP> gss02a <c1n2> (Connection reset by peer). Attempting reconnect.
>>> Tue Aug 19 11:35:24.865 2014: Close connection to <EBI5-014 IP> ebi5-014 <c1n457> (Connection reset by peer). Attempting reconnect.
>>> ...
>>> [lots more "Connection reset by peer" lines]
>>> ...
>>> Tue Aug 19 11:35:25.096 2014: Close connection to <EBI5-167 IP> ebi5-167 <c1n155> (Connection reset by peer). Attempting reconnect.
>>> Tue Aug 19 11:35:25.267 2014: Connecting to <GSS02A IP> gss02a <c1n2>
>>> Tue Aug 19 11:35:25.268 2014: Close connection to <GSS02A IP> gss02a <c1n2> (Connection failed because destination is still processing previous node failure)
>>> Tue Aug 19 11:35:26.267 2014: Retry connection to <GSS02A IP> gss02a <c1n2>
>>> Tue Aug 19 11:35:26.268 2014: Close connection to <GSS02A IP> gss02a <c1n2> (Connection failed because destination is still processing previous node failure)
>>> Tue Aug 19 11:36:24.276 2014: Unable to contact any quorum nodes during cluster probe.
>>> Tue Aug 19 11:36:24.277 2014: Lost membership in cluster GSS.ebi.ac.uk. Unmounting file systems.
>>>
>>> GSS02A:
>>> Tue Aug 19 11:35:24.263 2014: Node <EBI5-038 IP> (ebi5-038 in ebi-cluster.ebi.ac.uk) is being expelled because of an expired lease. Pings sent: 60. Replies received: 60.
>>>
>>> In example 1 it seems that an NSD server was not replying to the
>>> client, but the servers themselves seem to be working fine. How can I
>>> trace this better, in order to solve it?
>>>
>>> In example 2 it seems to me that for some reason the managers are not
>>> renewing the lease in time, and when this happens it's not a single
>>> client: lots of them fail to get the lease renewed. Why is this
>>> happening? How can I trace it back to the source of the problem?
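>>>
>>> In case it helps anyone suggest where to look, this is roughly what we
>>> can gather while the problem is live (a rough sketch; standard commands,
>>> and support will no doubt want a snap and traces on top of this):
>>>
>>>     # How aggressive are the failure-detection/lease settings here?
>>>     mmlsconfig failureDetectionTime
>>>     mmlsconfig leaseRecoveryWait
>>>
>>>     # Daemon network state per connection, and anything stuck waiting
>>>     mmdiag --network
>>>     mmdiag --waiters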
>>>
>>> Thanks in advance for any tips.
>>>
>>> Regards,
>>> Salvatore

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
