I reported it, but the problem was not solved. I thought ccp4bb has a much bigger user base and maybe someone has experienced this issue. It is something related to the OS (Rocky Linux) or MPI on my computer. The data set is not the problem, as I can process the same data set with exactly the same parameters on a much less powerful computer without any problem. Also, 2D classification on my computer uses the GPU without any problem. It is only the MPI processes of RELION that are failing. CryoSPARC is not a problem either.

Thanks
Dhiraj

________________________________
From: Takanori Nakane <tnakane.prot...@osaka-u.ac.jp>
Sent: Friday, December 22, 2023 5:35 PM
To: Srivastava, Dhiraj <dhiraj-srivast...@uiowa.edu>
Cc: CCP4BB@JISCMAIL.AC.UK <CCP4BB@JISCMAIL.AC.UK>
Subject: [External] Re: [ccp4bb] Relion issue with MPI
Hi,

First of all, please report details of your hardware and your job.

- Type of GPU
- Number of GPUs
- GPU memory size
- Box size
- Number of threads
- Number of MPI processes
- Full command line

Do you get the same error in ALL datasets (including our tutorial dataset) or only on this particular dataset?

A very similar issue was reported in https://github.com/3dem/relion/issues/1056 but I do not know what the cause is at the moment.

Best regards,

Takanori Nakane

On 12/23/23 03:31, Srivastava, Dhiraj wrote:
> Hi
> I am trying to use relion and I am getting an error when trying to use MPI
> (for 3D classification and 3D auto-refine).
>
> ERROR: out of memory in
> /home/lvantol/relion5/relion/src/acc/cuda/custom_allocator.cuh at line
> 436 (error-code 2)
>
> in: /home/lvantol/relion5/relion/src/acc/cuda/cuda_settings.h, line 65
>
> ERROR:
>
> A GPU-function failed to execute.
>
> 2D classification is working fine with significant GPU usage. I tried 3
> different versions (4, 4 beta and 5 beta), one installed by the vendor
> (Exxact), and all have the same issue. I am able to do 3D auto-refine
> and 3D classification on the same data set using our cluster without any
> problem. Did anyone encounter a similar issue earlier? How can I fix
> this problem?
>
> Thank you
>
> Dhiraj
>
> ------------------------------------------------------------------------
>
> To unsubscribe from the CCP4BB list, click the following link:
> https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1
>

########################################################################

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list hosted by www.jiscmail.ac.uk, terms & conditions are available at https://www.jiscmail.ac.uk/policyandsecurity/
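
The hardware and job details requested in the reply above (type, number and memory size of the GPUs, plus box size, threads, MPI processes and command line) could be collected with a short script along these lines. This is only a sketch and not part of RELION: it assumes an NVIDIA driver with `nvidia-smi` on the PATH, the function names `parse_gpu_csv`, `query_gpus` and `format_report` are made up for illustration, and the job parameters have to be typed in by hand from your own job.

```python
import shutil
import subprocess

# Fields requested from nvidia-smi; the selection is our choice for this report.
QUERY_FIELDS = "index,name,memory.total,memory.used"


def parse_gpu_csv(text):
    """Parse `nvidia-smi --format=csv,noheader,nounits` output into dicts.

    Assumes every field is reported as a plain value (no "[N/A]" entries)
    and that memory sizes are given in MiB, as nvidia-smi normally does.
    """
    gpus = []
    for line in text.strip().splitlines():
        if not line.strip():
            continue
        index, name, total, used = [field.strip() for field in line.split(",")]
        gpus.append({
            "index": int(index),
            "name": name,
            "memory_total_mib": int(total),
            "memory_used_mib": int(used),
        })
    return gpus


def query_gpus():
    """Ask nvidia-smi for the GPU list; return [] if the tool is not installed."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_csv(result.stdout)


def format_report(gpus, box_size, threads, mpi_procs, command):
    """Build a plain-text report that can be pasted into an email or issue."""
    lines = [f"Number of GPUs: {len(gpus)}"]
    for gpu in gpus:
        lines.append(
            f"GPU {gpu['index']}: {gpu['name']}, "
            f"{gpu['memory_total_mib']} MiB total, "
            f"{gpu['memory_used_mib']} MiB in use"
        )
    lines.append(f"Box size: {box_size}")
    lines.append(f"Number of threads: {threads}")
    lines.append(f"Number of MPI processes: {mpi_procs}")
    lines.append(f"Full command line: {command}")
    return "\n".join(lines)
```

Calling `print(format_report(query_gpus(), box_size, threads, mpi_procs, command))` with the values from the failing job gives one block of text covering every item in the list. Running it while the job is still active also shows how much memory is already in use on each GPU, which may help when several MPI processes share one card.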