Have you run your application under a debugger, or examined the core files to 
see exactly where the segv is occurring?  That may shed some light on what 
the exact problem is.
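
For example, a rough sketch of the usual steps on Linux (assumptions: bash, gdb 
installed, <my_app> standing in for your executable; the exact core file name 
depends on your kernel's core_pattern setting):

    # rebuild with debug symbols first, e.g. for ifort add -g (and -traceback)
    # then allow core dumps in the shell that launches the job
    ulimit -c unlimited
    mpiexec -n 4 <my_app>

    # after the crash, open the core file in gdb and get a backtrace
    gdb <my_app> core        # may be named core.<pid> on your system
    (gdb) bt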

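Regarding the ulimit question below: limits set in an interactive shell are not 
necessarily inherited by MPI processes started through ssh; they have to be set 
in the environment those processes actually get. A rough sketch of one common 
way to do that, assuming bash and Ubuntu's stock ~/.bashrc (which returns early 
for non-interactive shells):

    # in ~/.bashrc, *before* the "If not running interactively" early return,
    # so that ssh-launched non-interactive shells also pick it up
    ulimit -s unlimited

    # or raise it system-wide via pam_limits in /etc/security/limits.conf:
    #   *  soft  stack  unlimited
    #   *  hard  stack  unlimited

    # verify what the MPI processes actually see
    mpiexec -n 4 bash -c 'ulimit -s'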

On Dec 16, 2010, at 4:20 AM, Vaz, Guilherme wrote:

> Ok, ok. It is indeed a CFD program, and Gus got it right. Number of cells per 
> core means memory per core (sorry for the inaccuracy).
> My PC has 12GB of RAM, and the same calculation runs fine on an old 
> Ubuntu 8.04 32-bit machine with 4GB of RAM.
> What I find strange is that the same problem runs on 1 core (without 
> invoking mpiexec) and also for a large number of cores/processes, for instance 
> mpiexec -n 32, but anything in between does not. And it is not a bug in the 
> program, because it runs on other machines and the code has not been changed.
> 
> Any more hints?
> 
> Thanks in advance.
> 
> Guilherme
> 
> 
> 
> 
> dr. ir. Guilherme Vaz
> CFD Researcher
> Research & Development
> E g....@marin.nl
> T +31 317 49 33 25
> 
> MARIN
> 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands
> T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl
> 
> -----Original Message-----
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On 
> Behalf Of Gus Correa
> Sent: Thursday, December 16, 2010 12:46 AM
> To: Open MPI Users
> Subject: Re: [OMPI users] segmentation fault
> 
> Maybe CFD jargon?
> Perhaps the number (not size) of cells in a mesh/grid being handled
> by each core/CPU?
> 
> Ralph Castain wrote:
>> I have no idea what you mean by "cell sizes per core". Certainly not any
>> terminology within OMPI...
>> 
>> 
>> On Dec 15, 2010, at 3:47 PM, Vaz, Guilherme wrote:
>> 
>>> 
>>> Dear all,
>>> 
>>> I have a problem with Open MPI 1.3, ifort + MKL v11.1 on Ubuntu 10.04
>>> systems (32- or 64-bit). My code worked on Ubuntu 8.04 and works on
>>> RedHat-based systems, with slightly different versions of MKL and
>>> ifort. There were no changes in the source code.
>>> The problem is that the application works for small cell sizes per
>>> core, but not for large cell sizes per core. And it always works for 1
>>> core.
>>> Example: a grid with 1.2 million cells does not work with mpiexec -n 4
>>> <my_app>, but it works with mpiexec -n 32 <my_app>. It seems there is
>>> a maximum number of cells per core. And it works with just <my_app> (no mpiexec).
>>> 
>>> Is this a stack-size (or other memory) problem? Should I set ulimit
>>> -s unlimited not only in my bashrc but also in the ssh environment
>>> (and how)? Or is it something else?
>>> Any clues/tips?
>>> 
>>> Thanks for any help.
>>> 
>>> Gui
>>> 
>>> 
>>> 
>>> 
>>> dr. ir. Guilherme Vaz
>>> CFD Researcher
>>> Research & Development
>>> E g....@marin.nl
>>> T +31 317 49 33 25
>>> 
>>> MARIN
>>> 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands
>>> T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl
>>> 
>>> 
>>> _______________________________________________
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> 
>> 
>> 
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

