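A segfault inside malloc usually means that you have a memory error of some 
kind in your application: something corrupted the heap earlier (a buffer 
overrun, a double free, a write through a stale pointer), and malloc is just 
where it surfaced.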

Have you tried running your application through a memory-checking debugger, 
such as valgrind?
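
With Open MPI, the usual way to do that is to put valgrind between mpirun and 
the executable, something like `mpirun -np 4 valgrind ./txserver` (the exact 
options are an assumption here; adjust them for your setup). To show the kind 
of bug such a tool looks for, here is a minimal sketch in C of a guard-byte 
check around malloc. The names guarded_malloc, guarded_intact and guarded_free 
are hypothetical helpers made up for this illustration; they are not part of 
Open MPI, libc or valgrind, and valgrind itself works quite differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helpers for illustration only (not Open MPI or libc API). */
#define HEADER_LEN sizeof(max_align_t)  /* keeps the user pointer aligned */
#define GUARD_LEN  8
#define GUARD_BYTE 0xAB

/* Block layout: [header holding n][n user bytes][GUARD_LEN guard bytes] */
void *guarded_malloc(size_t n)
{
    unsigned char *raw = malloc(HEADER_LEN + n + GUARD_LEN);
    if (raw == NULL)
        return NULL;
    memcpy(raw, &n, sizeof n);                       /* remember the size */
    memset(raw + HEADER_LEN + n, GUARD_BYTE, GUARD_LEN);
    return raw + HEADER_LEN;
}

/* Returns 1 if the guard bytes are untouched, 0 if a write ran past the end. */
int guarded_intact(const void *user)
{
    assert(user != NULL);
    const unsigned char *raw = (const unsigned char *)user - HEADER_LEN;
    size_t n;
    memcpy(&n, raw, sizeof n);                       /* recover the size */
    const unsigned char *guard = raw + HEADER_LEN + n;
    for (size_t i = 0; i < GUARD_LEN; i++) {
        if (guard[i] != GUARD_BYTE)
            return 0;
    }
    return 1;
}

/* Releases a block obtained from guarded_malloc. */
void guarded_free(void *user)
{
    if (user != NULL)
        free((unsigned char *)user - HEADER_LEN);
}
```

If you allocate 10 bytes with guarded_malloc and then write to index 10, 
guarded_intact reports 0 even though nothing has crashed yet. Without the 
guard bytes, that same stray write would land in the allocator's own 
bookkeeping, and the crash would show up in some later, unrelated malloc call.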


On Sep 5, 2011, at 3:48 AM, Jai Dayal wrote:

> Hi all,
>   I've been beating my head against this for quite a while now.  I don't 
> have this problem when running with 1, 2, or 3 procs; however, once I get 
> to 4 or beyond, I have a problem.
> 
> When I call malloc, I get this error:
> 
> txserver:11055 terminated with signal 11 at PC=2b46886bc18a SP=7fff20f51030.  
> Backtrace:
> /apps/x86_64/mpi/openmpi/gcc-4.3.4/openmpi-1.4.3_oobpr/lib/libopen-pal.so.0(opal_memory_ptmalloc2_int_malloc+0x54a)[0x2b46886bc18a]
> /apps/x86_64/mpi/openmpi/gcc-4.3.4/openmpi-1.4.3_oobpr/lib/libopen-pal.so.0[0x2b46886bd4f3]
> txserver[0x415769]
> txserver[0x40da8c]
> txserver[0x4344bb]
> txserver[0x4351cd]
> txserver[0x40e3d4]
> /lib64/libc.so.6(__libc_start_main+0xf4)[0x2b468a5af994]
> txserver(_ZNSt8ios_base4InitD1Ev+0x39)[0x40c889]
> 
> However, only rank 0 calls this function, which is strange.  I can just put 
> a dummy malloc in there (for example, int * dummy = (int *)malloc(10);) and 
> it will still crash.
> 
> Again, this does not happen with n < 4 procs.  The crash happens on rank 0, 
> as it's the only rank that calls this code...
> 
> I'm perplexed.
> 
> Thanks a lot,
> J.D.
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

