Yes, we have upgraded to 5.0.1-0.5, which has the patch for the issue. The related IBM case number was TS001010674.
Regards,
Lohit

On Nov 2, 2018, 12:27 PM -0400, Mazurkova, Svetlana/Information Systems <[email protected]>, wrote:
> Hi Damir,
>
> It was related to specific user jobs and mmap (?). We opened a PMR with IBM and
> have a patch from IBM; since then we don't see the issue.
>
> Regards,
>
> Sveta.
>
> > On Nov 2, 2018, at 11:55 AM, Damir Krstic <[email protected]> wrote:
> >
> > Hi,
> >
> > Did you ever figure out the root cause of the issue? We recently (at the end
> > of June) upgraded our storage to: gpfs.base-5.0.0-1.1.3.ppc64
> >
> > In the last few weeks we have seen an increasing number of ps hangs across
> > compute and login nodes on our cluster. The filesystem version (of all
> > filesystems on our cluster) is:
> > -V 15.01 (4.2.0.0) File system version
> >
> > I am just wondering if anyone has seen this type of issue since you first
> > reported it and if there is a known fix for it.
> >
> > Damir
> >
> > > On Tue, May 22, 2018 at 10:43 AM <[email protected]> wrote:
> > > > Hello All,
> > > >
> > > > We upgraded from GPFS 4.2.3.2 to GPFS 5.0.0-2 about a month ago.
> > > > We have not yet converted the 4.2.2.2 filesystem version to 5
> > > > (that is, we have not run the mmchconfig release=LATEST command).
> > > > Right after the upgrade, we are seeing many "ps hangs" across the
> > > > cluster. All the "ps hangs" happen when jobs run related to a Java
> > > > process or many Java threads (example: GATK).
> > > > The hangs are pretty random and have no particular pattern, except that
> > > > we know they are related to Java or to jobs reading from
> > > > directories with about 600,000 files.
> > > >
> > > > I raised an IBM critical service request about a month ago related
> > > > to this - PMR: 24090,L6Q,000.
> > > > However, according to the ticket, they seemed to feel that it might
> > > > not be related to GPFS.
> > > > We are sure, though, that these hangs started to appear only after we
> > > > upgraded GPFS from 4.2.3.2 to 5.0.0.2.
> > > >
> > > > One of the other reasons we are not able to prove that it is GPFS is
> > > > that we are unable to capture any logs/traces from GPFS once the
> > > > hang happens.
> > > > Even GPFS trace commands hang once "ps hangs", so it is getting
> > > > difficult to get any dumps from GPFS.
> > > >
> > > > Also, according to the IBM ticket, they seem to have seen a "ps
> > > > hang" issue before, and their suggestion is that running the
> > > > mmchconfig release=LATEST command will resolve it.
> > > > However, we are not comfortable making the permanent change to
> > > > filesystem version 5, and since we don't see any near-term solution to
> > > > these hangs, we are thinking of downgrading to GPFS 4.2.3.2, the previous
> > > > state in which we know the cluster was stable.
> > > >
> > > > Can downgrading GPFS take us back to exactly the previous GPFS config
> > > > state?
> > > > With respect to downgrading from 5 to 4.2.3.2: is it just a matter of
> > > > reinstalling all rpms at the previous version, or is there anything else
> > > > I need to check with respect to the GPFS configuration?
> > > > Because I think GPFS 5.0 might have updated internal default GPFS
> > > > configuration parameters, and I am not sure whether downgrading GPFS will
> > > > change them back to what they were in GPFS 4.2.3.2.
> > > >
> > > > Our previous state:
> > > >
> > > > 2 Storage clusters - 4.2.3.2
> > > > 1 Compute cluster - 4.2.3.2 (remote mounts the above 2 storage clusters)
> > > >
> > > > Our current state:
> > > >
> > > > 2 Storage clusters - 5.0.0.2 (filesystem version - 4.2.2.2)
> > > > 1 Compute cluster - 5.0.0.2
> > > >
> > > > Do I need to downgrade all the clusters to go back to the previous state,
> > > > or is it OK if we just downgrade the compute cluster to the previous
> > > > version?
> > > >
> > > > Any advice on the best steps forward would greatly help.
> > > >
> > > > Thanks,
> > > >
> > > > Lohit
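
For the version questions in the quoted thread, it may help to separate three things that can legitimately differ: the installed package level on each node, the cluster-wide minimum release level (which is what mmchconfig release=LATEST raises), and the on-disk filesystem format version. A minimal sketch of the checks, assuming an RPM-based install and using gpfs0 as a placeholder device name:

   # Installed package level on this node (assumes an RPM-based install)
   rpm -qa | grep -i gpfs

   # Code version reported by the running daemon
   mmdiag --version

   # Cluster-wide minimum release level (raised by mmchconfig release=LATEST)
   mmlsconfig minReleaseLevel

   # On-disk filesystem format version; "gpfs0" is a placeholder device name
   mmlsfs gpfs0 -V

As far as I know, the minimum release level and the on-disk format version only move forward; once raised they cannot be lowered again, which is one reason to be cautious about release=LATEST while the hangs are still unexplained.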
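On the question of whether reinstalling the older rpms returns the cluster to exactly its previous configuration state: one way to make that checkable (a sketch only; the output file names below are arbitrary examples, not anything GPFS-specific) is to record the configuration before touching the packages and diff it again afterwards:

   # Record the current cluster configuration and filesystem attributes
   mmlsconfig  > /root/mmlsconfig.before
   mmlsfs all  > /root/mmlsfs-all.before
   mmlscluster > /root/mmlscluster.before

   # After the package change, capture the same output and compare
   mmlsconfig  > /root/mmlsconfig.after
   diff /root/mmlsconfig.before /root/mmlsconfig.after

Note that mmlsconfig mostly shows explicitly set attributes, so a default that changed between releases may not appear in either file; saving mmdiag --config output from a node as well should catch those.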
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
