I reported it, but the problem was not solved. I thought ccp4bb has a much
bigger user base and maybe someone has experienced this issue.
It seems to be related to the OS (Rocky Linux) or to MPI on my computer. The data
set is not the problem, as I can process the same data set with exactly the same
parameters on a much less powerful computer without any problem. Also, 2D
classification on my computer uses the GPU without any problem. It is only the
MPI processes of RELION that are failing. CryoSPARC is not a problem either.
Thanks
Dhiraj
________________________________
From: Takanori Nakane <tnakane.prot...@osaka-u.ac.jp>
Sent: Friday, December 22, 2023 5:35 PM
To: Srivastava, Dhiraj <dhiraj-srivast...@uiowa.edu>
Cc: CCP4BB@JISCMAIL.AC.UK <CCP4BB@JISCMAIL.AC.UK>
Subject: [External] Re: [ccp4bb] Relion issue with MPI

Hi,

First of all, please report details of your hardware and your job.

- Type of GPU
- Number of GPUs
- GPU memory size
- Box size
- Number of threads
- Number of MPI processes
- Full command line
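
The items above could be collected with a small helper such as the
following. This is only a sketch, not part of RELION: the function names
(parse_gpu_csv, query_gpus, format_report) are hypothetical, and it assumes
that nvidia-smi is on the PATH and prints one "name, memory.total" line per
GPU. Box size, threads, MPI processes and the command line are not detected;
they have to be filled in by hand from the job.

```python
import subprocess


def parse_gpu_csv(text):
    """Parse 'name, memory.total' lines (assumed nvidia-smi CSV output)."""
    gpus = []
    for line in text.strip().splitlines():
        if not line.strip():
            continue
        # Split on the last comma so a comma inside the GPU name is kept.
        name, memory = [field.strip() for field in line.rsplit(",", 1)]
        gpus.append({"name": name, "memory": memory})
    return gpus


def query_gpus():
    """Ask nvidia-smi for the name and total memory of every GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total",
         "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_csv(result.stdout)


def format_report(gpus, box_size, threads, mpi_procs, command):
    """Build a plain-text report; job values are supplied by the user."""
    names = sorted({gpu["name"] for gpu in gpus})
    sizes = sorted({gpu["memory"] for gpu in gpus})
    lines = [
        "Type of GPU: " + (", ".join(names) or "unknown"),
        "Number of GPUs: " + str(len(gpus)),
        "GPU memory size: " + (", ".join(sizes) or "unknown"),
        "Box size: " + str(box_size),
        "Number of threads: " + str(threads),
        "Number of MPI processes: " + str(mpi_procs),
        "Full command line: " + command,
    ]
    return "\n".join(lines)
```

For example, print(format_report(query_gpus(), box_size, threads, mpi_procs,
command)) with the values taken from the failing job would give text that can
be pasted into a reply.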

Do you get the same error in ALL datasets (including our
tutorial dataset) or only on this particular dataset?

A very similar issue was reported in
https://github.com/3dem/relion/issues/1056
but I do not know what the cause is at the moment.

Best regards,

Takanori Nakane

On 12/23/23 03:31, Srivastava, Dhiraj wrote:
> Hi
> I am trying to use RELION and I am getting an error when trying to use MPI
> (for 3D classification and 3D auto-refine).
>
>
> ERROR: out of memory in
> /home/lvantol/relion5/relion/src/acc/cuda/custom_allocator.cuh at line
> 436 (error-code 2)
>
> in: /home/lvantol/relion5/relion/src/acc/cuda/cuda_settings.h, line 65
>
> ERROR:
>
> A GPU-function failed to execute.
>
>
> 2D classification is working fine with significant GPU usage. I tried 3
> different versions (4, 4 beta and 5 beta), one installed by the vendor
> (Exxact), and all have the same issue.  I am able to run 3D auto-refine
> and 3D classification on the same data set using our cluster without any
> problem.  Did anyone encounter a similar issue earlier? How can I fix
> this problem?
>
>
> Thank you
>
> Dhiraj
>
>
>
> ------------------------------------------------------------------------
>
> To unsubscribe from the CCP4BB list, click the following link:
> https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1
>

