What version of OMPI are you using?
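
Alongside the Open MPI version, it may help to report the per-user limits on the node, since the message below mentions checking only the maximum number of processes. The following is a minimal sketch using Python's standard `resource` module; the helper names `describe_limits` and `fmt` are made up for illustration. Including the open-file limit is an assumption on my part: nothing in this thread establishes that it is the cause of "Unknown error: 1".

```python
import resource


def describe_limits():
    """Return (soft, hard) limits that matter when launching many processes on one node."""
    names = {
        "max user processes": resource.RLIMIT_NPROC,
        # Assumption: the descriptor limit is worth reporting too; the thread
        # does not confirm that it is related to the reported error.
        "max open files": resource.RLIMIT_NOFILE,
    }
    return {label: resource.getrlimit(which) for label, which in names.items()}


def fmt(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


if __name__ == "__main__":
    for label, (soft, hard) in describe_limits().items():
        print(f"{label}: soft={fmt(soft)} hard={fmt(hard)}")
```

Running this as the same user, and in the same kind of shell or batch job that starts the MPI application, shows the limits that the launched processes would inherit.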

On Nov 26, 2012, at 1:02 AM, George Markomanolis <geo...@markomanolis.com> 
wrote:

> Dear all,
> 
> First, I would like advice on how to determine the maximum number of MPI
> processes that can run on a single node when oversubscribing. When I try to
> run an application with 4096 MPI processes on a 24-core node with 48 GB of
> memory, I get the error "Unknown error: 1", even though less than half of
> the memory is in use. I can run the same application with 2048 MPI processes
> in less than one minute. I have checked the Linux settings for the maximum
> number of processes, and the limit is much larger than 4096.
> 
> A second, more general question is about discovering nodes with faulty
> memory. Is there any way to identify them? I found by accident that one node
> could not run an MPI application once it used more than 12 GB of RAM, while a
> second node with exactly the same hardware could use all 48 GB. With 500+
> nodes it is difficult to check all of them, and I am not aware of any
> efficient solution. I first thought of memtester, but it takes a lot of time.
> I know this is not exactly on topic for this mailing list, but I thought
> that an Open MPI user might know something about it.
> 
> 
> Best regards,
> George Markomanolis
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
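
On the faulty-memory question quoted above, one possible quick screen is to allocate a chosen amount of memory, write known byte patterns into it, and read them back. The sketch below is only that: the function name `pattern_check` and its parameters are hypothetical, it is not a replacement for memtester, and a user-space process cannot choose which physical pages it receives, so a clean pass does not prove the memory is healthy. A failed allocation or a reported mismatch would simply mark the node for a full test.

```python
def pattern_check(total_mib, chunk_mib=64, patterns=(0x00, 0xFF, 0xAA, 0x55)):
    """Fill about total_mib MiB with byte patterns and verify them.

    Returns a list of (chunk_index, pattern) pairs that failed to read back.
    """
    chunk_bytes = chunk_mib * 1024 * 1024
    # Round up so that at least total_mib MiB are covered.
    n_chunks = max(1, -(-total_mib // chunk_mib))
    # Keep every chunk alive at once so the whole amount is allocated together.
    chunks = [bytearray(chunk_bytes) for _ in range(n_chunks)]

    failures = []
    for pattern in patterns:
        expected = bytes([pattern]) * chunk_bytes
        # Write the pattern into every chunk first ...
        for buf in chunks:
            buf[:] = expected
        # ... then read everything back and compare.
        for index, buf in enumerate(chunks):
            if buf != expected:
                failures.append((index, pattern))
    return failures


if __name__ == "__main__":
    # 1024 MiB is an arbitrary example size; pick a value close to the node's RAM.
    bad = pattern_check(1024)
    print("mismatches:", bad if bad else "none")
```

Such a script could be started once per node through whatever job launcher the cluster already uses, with the size raised past the point where the suspect node failed (12 GB in the report above).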

