Hello Jack, list
As others mentioned, this may be a problem with dynamic
memory allocation.
It could also be a violation of statically allocated memory,
I guess.
You say:
My program can run well for 1, 2, or 10 processors, but fails when the
number of tasks cannot
be divided evenly by the number of processes.
Oftentimes, when the division of the number of "tasks"
(or the global problem size) by the number of "processors" is not even,
one processor gets a lighter/heavier workload than the others,
allocates less/more memory than the others,
and accesses smaller/larger arrays than the others.
In general, integer division and remainder/modulo calculations
are used to control the memory allocation, the array sizes, etc.,
on different processors.
These formulas tend to use the MPI communicator size
(i.e., effectively the number of processors if you are using
MPI_COMM_WORLD) to split the workload across the processors.
I would search for the lines of code where those calculations are done,
and where the arrays are allocated and accessed,
to make sure the algorithm works both when
they are of the same size
(even workload across the processors),
as when they are of different sizes
(uneven workload across the processors).
You may be violating memory access by only a few bytes, due to a small
mistake in one of those integer division / remainder/modulo formulas,
perhaps where an array index's upper or lower bound is calculated.
It happened to me before, probably to others too.
This type of code inspection can be done without a debugger,
or before you get to the debugger phase.
I hope this helps,
Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------
Jeff Squyres wrote:
Also see http://www.open-mpi.org/faq/?category=debugging.
On Jul 1, 2010, at 3:17 AM, Asad Ali wrote:
Hi Jack,
Debugging OpenMPI with traditional debuggers is a pain.
From your error message it sounds like you have a memory allocation
problem. Do you use dynamic memory allocation (allocate and then free)?
I use printf() statements together with the MPI rank. That tells me which
process is giving the segmentation fault.
Cheers,
Asad
On Thu, Jul 1, 2010 at 4:13 PM, Jack Bryan <dtustud...@hotmail.com> wrote:
thanks
I am not familiar with OpenMPI.
Would you please help me with how to ask Open MPI to show where the fault
occurs?
The GNU debugger?
Any help is appreciated.
thanks!!!
Jack
June 30 2010
Date: Wed, 30 Jun 2010 16:13:09 -0400
From: amja...@gmail.com
To: us...@open-mpi.org
Subject: Re: [OMPI users] Open MPI, Segmentation fault
Based on my experience, I would FULLY endorse (100% agree with) David Zhang.
It is usually a coding or typo mistake.
First, ensure that the array sizes and dimensions are correct.
In my experience, if Open MPI is compiled with the GNU compilers (not with
Intel), it also points out exactly which subroutine the fault occurs in.
Have a try.
best,
AA
On Wed, Jun 30, 2010 at 12:43 PM, David Zhang <solarbik...@gmail.com> wrote:
When I have gotten segmentation faults, they have always been my own coding
mistakes. Perhaps your code is not robust against a number of processes not
divisible by 2?
On Wed, Jun 30, 2010 at 8:47 AM, Jack Bryan <dtustud...@hotmail.com> wrote:
Dear All,
I am using Open MPI, I got the error:
[n337:37664] *** Process received signal ***
[n337:37664] Signal: Segmentation fault (11)
[n337:37664] Signal code: Address not mapped (1)
[n337:37664] Failing at address: 0x7fffcfe90000
[n337:37664] [ 0] /lib64/libpthread.so.0 [0x3c50e0e4c0]
[n337:37664] [ 1] /lustre/home/rhascheduler/RhaScheduler-0.4.1.1/mytest/nmn2
[0x414ed7]
[n337:37664] [ 2] /lib64/libc.so.6(__libc_start_main+0xf4) [0x3c5021d974]
[n337:37664] [ 3]
/lustre/home/rhascheduler/RhaScheduler-0.4.1.1/mytest/nmn2(__gxx_personality_v0+0x1f1)
[0x412139]
[n337:37664] *** End of error message ***
After searching answers, it seems that some functions fail.
My program can run well for 1, 2, or 10 processors, but fails when the number
of tasks cannot be divided evenly by the number of processes.
Any help is appreciated.
thanks
Jack
June 30 2010
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
--
David Zhang
University of California, San Diego
--
"Statistical thinking will one day be as necessary for efficient citizenship as the
ability to read and write." - H.G. Wells