10G Ethernet.

Thanks,
Lohit

On May 22, 2018, 11:55 AM -0400, [email protected] wrote:
> Hi Lohit,
>
> What type of network are you using on the back end to transfer the GPFS 
> traffic?
>
> Best,
> Dwayne
>
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of 
> [email protected]
> Sent: Tuesday, May 22, 2018 1:13 PM
> To: gpfsug main discussion list <[email protected]>
> Subject: [gpfsug-discuss] Critical Hang issues with GPFS 5.0. Downgrading 
> from GPFS 5.0.0-2 to GPFS 4.2.3.2
>
> Hello All,
>
> We upgraded from GPFS 4.2.3.2 to GPFS 5.0.0-2 about a month ago. We have not 
> yet converted the filesystem from version 4.2.2.2 to 5 (that is, we have not 
> run the mmchconfig release=LATEST command).
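> 
> For reference, a quick way to confirm the current cluster release level and 
> filesystem format version (the device name gpfs0 below is illustrative, not 
> our actual filesystem name):
> 
>     mmlsconfig minReleaseLevel   # effective cluster release level
>     mmlsfs gpfs0 -V              # on-disk filesystem format version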
> Right after the upgrade, we started seeing many “ps” hangs across the 
> cluster. All of these hangs happen when jobs run a Java process or many Java 
> threads (for example, GATK).
> The hangs are fairly random and have no particular pattern, except that we 
> know they are tied to Java workloads or to jobs reading from directories 
> containing about 600,000 files.
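> 
> When one of these processes wedges, a quick way to see where it is blocked 
> in the kernel (a sketch only; <pid> is a placeholder, and the second line 
> assumes sysrq is enabled on the node) is:
> 
>     cat /proc/<pid>/stack          # kernel stack of the hung process
>     echo w > /proc/sysrq-trigger   # dump all blocked tasks to the kernel log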
>
> I raised an IBM critical service request about a month ago related to this: 
> PMR 24090,L6Q,000.
> However, according to the ticket, they seem to feel that it might not be 
> related to GPFS.
> We are sure, though, that these hangs started to appear only after we 
> upgraded from GPFS 4.2.3.2 to GPFS 5.0.0-2.
>
> Another reason we are not able to prove that it is GPFS is that we are 
> unable to capture any logs or traces from GPFS once the hang happens.
> Even the GPFS trace commands hang once “ps” hangs, so it is difficult to get 
> any dumps from GPFS.
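> 
> One thing we are considering (a sketch only; we have not verified that this 
> survives the hang) is pre-arming tracing in recycle mode before the hang 
> occurs, so trace data is already being collected when the trace commands 
> themselves stop responding:
> 
>     mmtracectl --start --trace-recycle=global
>     # ...wait for a hang to reproduce, then...
>     mmtracectl --stop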
>
> Also, according to the IBM ticket, they seem to have seen a “ps” hang issue 
> before, and their suggestion is that running the mmchconfig release=LATEST 
> command will resolve the issue.
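> 
> (As I understand the documented procedure, that change and the filesystem 
> format upgrade are two separate, irreversible steps; gpfs0 is again 
> illustrative:)
> 
>     mmchconfig release=LATEST   # raise the cluster release level
>     mmchfs gpfs0 -V full        # migrate the filesystem to the latest format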
> However, we are not comfortable making that permanent change to filesystem 
> version 5, and since we do not see any imminent solution to these hangs, we 
> are thinking of downgrading to GPFS 4.2.3.2, the previous state in which we 
> know the cluster was stable.
>
> Can downgrading GPFS take us back to exactly the previous GPFS config state?
> With respect to downgrading from 5 to 4.2.3.2: is it just a matter of 
> reinstalling all the RPMs at the previous version, or is there anything else 
> I need to take care of with respect to the GPFS configuration?
> I ask because GPFS 5.0 may have updated internal default GPFS configuration 
> parameters, and I am not sure whether downgrading GPFS will change them back 
> to what they were in GPFS 4.2.3.2.
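> 
> One way to compare, if anyone can confirm it is sufficient, would be to 
> snapshot the effective configuration before and after the downgrade and 
> diff the two (filenames are illustrative):
> 
>     mmlsconfig > config-5.0.0-2.txt
>     # ...downgrade the RPMs...
>     mmlsconfig > config-4.2.3.2.txt
>     diff config-5.0.0-2.txt config-4.2.3.2.txt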
>
> Our previous state:
>
> 2 Storage clusters - 4.2.3.2
> 1 Compute cluster - 4.2.3.2 (remote mounts the above 2 storage clusters)
>
> Our current state:
>
> 2 Storage clusters - 5.0.0-2 (filesystem version 4.2.2.2)
> 1 Compute cluster - 5.0.0-2
>
> Do I need to downgrade all of the clusters to go back to the previous state, 
> or is it OK to downgrade just the compute cluster to the previous version?
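> 
> For completeness, the remote-mount relationships as seen from the compute 
> cluster can be listed with the standard commands (output omitted here):
> 
>     mmremotecluster show all   # storage clusters this cluster knows about
>     mmremotefs show all        # remote filesystems and their mount points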
>
> Any advice on the best steps forward would greatly help.
>
> Thanks,
>
> Lohit
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
