You just need to check your code to ensure it returns a zero status when it 
completes correctly - or ignore the message.

BTW: I’m assuming you are using Open MPI here, as that message looks like it came 
from there. If not, then someone more familiar with the implementation you are 
using should speak up.
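
One way to find out which process returns a non-zero status is to wrap the application in a small shell function that reports the exit status of each rank. This is only a sketch, assuming Open MPI: the function name `report_status` is made up for illustration, and `OMPI_COMM_WORLD_RANK` is the environment variable Open MPI sets for each process it launches.

```shell
# report_status: run the given command, print its exit status to stderr,
# and return that same status so the launcher still sees it.
# (Hypothetical helper, not part of Slurm or Open MPI.)
report_status() {
    rc=0
    # "|| rc=$?" captures a failure without tripping "set -e" in the caller.
    "$@" || rc=$?
    # OMPI_COMM_WORLD_RANK is set by Open MPI; print "?" when it is unset.
    echo "rank ${OMPI_COMM_WORLD_RANK:-?} exited with status $rc" >&2
    return "$rc"
}
```

Saved in a script whose last line is `report_status "$@"` (the file name `report_status.sh` is an assumption), it could be launched as `mpirun ./report_status.sh mpiocl`. Any rank that prints a status other than 0 is the one whose return value needs checking.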


> On Nov 30, 2015, at 10:41 AM, Fany Pagés Díaz <[email protected]> wrote:
> 
> But the results are fine. What can I do to fix this? Thank you.
> Ing. Fany Pages Díaz
>  
> From: Ralph Castain [mailto:[email protected]] On behalf of Ralph Castain
> Sent: Monday, November 30, 2015 13:10
> To: slurm-dev
> Subject: [slurm-dev] Re: Messages of mpirun noticed that the job aborted, but 
> has no info as to the process
>  
> It means that at least one process exited with a non-zero status, indicating 
> that a problem occurred in the application
>  
> On Nov 30, 2015, at 9:39 AM, Fany Pagés Díaz <[email protected]> wrote:
>  
> When I submit a job with Slurm, I always get these messages when the job 
> finishes. Does anyone know what this means?
>  
> [root@cluster bin]# salloc -n 3 -N 2 --exclusive --gres=gpu:2 mpirun mpiocl 
> salloc: Granted job allocation 133
>   We have 3 processors
>   Spawning from compute-0-0.local 
>   OpenCL MPI
>  
>   Probing nodes...
>      Node        Psid   Cards (devID)
>      ----------- ----- ---- ----------
> Available platforms:
> platform 0: Intel(R) OpenCL
> platform 1: NVIDIA CUDA
> selected platform 1
> Nombre dispositivo: GeForce GTX 260 
> Nombre dispositivo: GeForce GTX 260 
> Available platforms:
> platform 0: Intel(R) OpenCL
> platform 1: NVIDIA CUDA
> selected platform 1
> Nombre dispositivo: GeForce GTX 260 
> Nombre dispositivo: GeForce GTX 260 
>  
> --------------------------------------------------------------------------
> mpirun noticed that the job aborted, but has no info as to the process
> that caused that situation.
> --------------------------------------------------------------------------
> salloc: Relinquishing job allocation 133
> salloc: Job allocation 133 has been revoked.
>  
> Thank you,
> Ing. Fany Pages Diaz
