On Thu, 5 Sep 2002, khoa nguyen wrote:

> I'm using oscar1.2 for RedHat 7.2 , and as I try to
> compile and run our programs w/ LAM, here is the error
> message when I try to run our codes w/ mpirun: (all
> compiling steps w/ lam-mpicc working fine):
<SNIP CODE>
> ********************
> any suggestions about this? I do run lamboot in every
> node manually before calling mpirun, so I wonder is
> that because I haven't set up LAM environment
> correctly or somehow?

First, the problem with the segfault. The failure is occurring in one of the internal send functions, inside a call to MPI_Bcast(). Given the parameters to MPI_Bcast, there are two likely causes. First, at some earlier point in the application, you scribbled on the memory that holds the MPI communicator. Second, your buffer / length (the length is determined by the count and datatype parameters together) was invalid, so when LAM went to read out of that buffer, things went "badly". Without seeing the code, it is not possible for me to tell exactly what is wrong. Using a debugger might be of some use. Using a memory-checking tool like Purify would probably expose the problem.

When you say "run lamboot in every node", are you running a separate "lamboot" command on every node, or running lamboot once with a hostfile with all the nodes in it? lamboot should only be run once; it takes care of starting the LAM environment on each node in the machine. You might want to take a look at the LAM/MPI FAQ:

http://www.lam-mpi.org/faq/

One thing just occurred to me (how's that for thinking while you type...): it is possible that the problem is that your application has fewer nodes than it is expecting. If you have hard-coded assumptions about the size of MPI_COMM_WORLD that aren't being met, that could cause some problems. I only bring this up because if you are running separate copies of lamboot on each node, then your world size will be at most 1, which could cause problems...

Hope this helps,

Brian

--
Brian Barrett
Graduate Student, Open Systems Lab, Indiana University
http://www.osl.iu.edu/~brbarret/

_______________________________________________
Oscar-users mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/oscar-users
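As a rough illustration of the two checks discussed in the reply (buffer / count consistency before MPI_Bcast, and hard-coded assumptions about the size of MPI_COMM_WORLD), here is a minimal sketch in plain C. The helper names bcast_args_ok and world_size_ok are made up for this example and are not part of LAM/MPI or the MPI standard. The sketch assumes the caller knows how many bytes were really allocated for the buffer and the size of one element of the datatype (for example sizeof(int) for MPI_INT).

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not an MPI or LAM function).
 * Returns 1 if the arguments about to be passed to MPI_Bcast look sane,
 * 0 otherwise.
 *   buf, buf_bytes : the buffer and the number of bytes actually allocated
 *   count          : the count argument intended for MPI_Bcast
 *   elem_bytes     : size of one element of the datatype
 *   root           : the root rank intended for MPI_Bcast
 *   world_size     : the value reported by MPI_Comm_size()
 */
int bcast_args_ok(const void *buf, size_t buf_bytes,
                  int count, size_t elem_bytes,
                  int root, int world_size)
{
    if (world_size < 1)
        return 0;                       /* no usable communicator size */
    if (root < 0 || root >= world_size)
        return 0;                       /* root rank does not exist */
    if (count < 0)
        return 0;                       /* negative count is never valid */
    if (count > 0 && buf == NULL)
        return 0;                       /* nothing to read from or write to */
    if (elem_bytes != 0 && (size_t)count > buf_bytes / elem_bytes)
        return 0;                       /* count * elem_bytes overruns buf */
    return 1;
}

/* Hypothetical helper: returns 1 if the world is at least as large as the
 * program was written to expect, 0 (with a message on stderr) otherwise. */
int world_size_ok(int world_size, int expected)
{
    if (world_size < expected) {
        fprintf(stderr,
                "expected at least %d processes, but world size is %d\n",
                expected, world_size);
        return 0;
    }
    return 1;
}
```

In a real program, world_size would come from MPI_Comm_size(MPI_COMM_WORLD, &world_size) called after MPI_Init, and both checks would run just before the MPI_Bcast call. Note that checks like these cannot detect a communicator that has been overwritten in memory; for that case a debugger or a memory checker is still the right tool.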
