Hi All,

We've recently had an issue where a job on our client GPFS cluster caused our main storage to run extremely slowly. The job was running relion using MPI (https://www2.mrc-lmb.cam.ac.uk/relion/index.php?title=Main_Page).
It caused waiters across the cluster and caused the load to spike on the NSDs one at a time. When the spike ended on one NSD, it immediately started on another.

There were no obvious errors in the logs, and the issues cleared immediately after the job was cancelled.

Has anyone else seen any issues with relion using GPFS storage?

Michael

Michael Holliday RITTech MBCS
Senior HPC & Research Data Systems Engineer | eMedLab Operations Team
Scientific Computing STP | The Francis Crick Institute
1, Midland Road | London | NW1 1AT | United Kingdom
Tel: 0203 796 3167

The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 1 Midland Road London NW1 1AT
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
