Re: [OMPI users] debugging problem
Hi Ralph,

Sorry about this. I understood that -d would direct the output to the xterm, but my expectation was to get a separate xterm for each running process, which I could then debug. Am I completely off track? Where can I find more information about debugging multi-process, multi-threaded programs using gdb? I have the -np processes created by mpirun, and each process then runs a number of threads in parallel, largely independently (some semaphores are used). Will I end up with a different xterm for each process (and, hopefully, for each thread within it)? I am totally lost in this debugging scenario and really need some basic help about what to expect.

Thank you for your reply.

Best Regards,
Manal

Date: Thu, 09 Nov 2006 21:58:57 -0700
From: Ralph Castain <r...@lanl.gov>
Subject: Re: [OMPI users] debugging problem
To: Open MPI Users <us...@open-mpi.org>

Hi Manal

The output you are seeing is caused by the "-d" flag you put on the mpirun command line - it shows normal operation. Could you tell us something more about why you believe there was an error?

Ralph

On 11/9/06 9:34 PM, "Manal Helal" <manalor...@gmail.com> wrote:

Hi

I am trying to run the following command:

mpirun -np XX -d xterm -e gdb

and I am receiving these errors:

[leo01:02141] [0,0,0] setting up session dir with
[leo01:02141] universe default-universe
[leo01:02141] user mhelal
[leo01:02141] host leo01
[leo01:02141] jobid 0
[leo01:02141] procid 0
[leo01:02141] procdir: /tmp/openmpi-sessions-mhelal@leo01_0/default-universe/0/0
[leo01:02141] jobdir: /tmp/openmpi-sessions-mhelal@leo01_0/default-universe/0
[leo01:02141] unidir: /tmp/openmpi-sessions-mhelal@leo01_0/default-universe
[leo01:02141] top: openmpi-sessions-mhelal@leo01_0
[leo01:02141] tmp: /tmp
[leo01:02141] [0,0,0] contact_file /tmp/openmpi-sessions-mhelal@leo01_0/default-universe/universe-setup.txt
[leo01:02141] [0,0,0] wrote setup file
[leo01:02141] pls:rsh: local csh: 0, local bash: 1
[leo01:02141] pls:rsh: assuming same remote shell as local shell
[leo01:02141] pls:rsh: remote csh: 0, remote bash: 1
[leo01:02141] pls:rsh: final template argv:
[leo01:02141] pls:rsh:     /usr/bin/ssh orted --debug --bootproxy 1 --name --num_procs 2 --vpid_start 0 --nodename --universe mhelal@leo01:default-universe --nsreplica "0.0.0;tcp://129.94.242.77:40738" --gprreplica "0.0.0;tcp://129.94.242.77:40738" --mpi-call-yield 0
[leo01:02141] pls:rsh: launching on node localhost
[leo01:02141] pls:rsh: oversubscribed -- setting mpi_yield_when_idle to 1 (1 4)
[leo01:02141] pls:rsh: localhost is a LOCAL node
[leo01:02141] pls:rsh: changing to directory /import/eno/1/mhelal
[leo01:02141] pls:rsh: executing: orted --debug --bootproxy 1 --name 0.0.1 --num_procs 2 --vpid_start 0 --nodename localhost --universe mhelal@leo01:default-universe --nsreplica "0.0.0;tcp://129.94.242.77:40738" --gprreplica "0.0.0;tcp://129.94.242.77:40738" --mpi-call-yield 1
[leo01:02143] [0,0,1] setting up session dir with
[leo01:02143] universe default-universe
[leo01:02143] user mhelal
[leo01:02143] host localhost
[leo01:02143] jobid 0
[leo01:02143] procid 1
[leo01:02143] procdir: /tmp/openmpi-sessions-mhelal@localhost_0/default-universe/0/1
[leo01:02143] jobdir: /tmp/openmpi-sessions-mhelal@localhost_0/default-universe/0
[leo01:02143] unidir: /tmp/openmpi-sessions-mhelal@localhost_0/default-universe
[leo01:02143] top: openmpi-sessions-mhelal@localhost_0
[leo01:02143] tmp: /tmp
[leo01:02143] sess_dir_finalize: proc session dir not empty - leaving
[leo01:02143] sess_dir_finalize: proc session dir not empty - leaving
[leo01:02143] sess_dir_finalize: proc session dir not empty - leaving
[leo01:02143] sess_dir_finalize: proc session dir not empty - leaving
[leo01:02143] orted: job_state_callback(jobid = 1, state = ORTE_PROC_STATE_TERMINATED)
[leo01:02143] sess_dir_finalize: job session dir not empty - leaving
[leo01:02143] sess_dir_finalize: found proc session dir empty - deleting
[leo01:02143] sess_dir_finalize: found job session dir empty - deleting
[leo01:02143] sess_dir_finalize: found univ session dir empty - deleting
[leo01:02143] sess_dir_finalize: found top session dir empty - deleting

Will you please have a look and advise, if possible, where I could change these paths? When I checked the paths, they were not there at all.

Best Regards,
Manal
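To set expectations for this debugging scenario: with the xterm approach, mpirun opens one xterm per MPI process, each running its own gdb session; threads do not get separate windows, but within any one gdb session the threads of that process can be listed and selected with gdb's "info threads" and "thread N" commands. An alternative that needs no X display is to make every rank print its PID and then wait, so a gdb can be attached to each process from a separate terminal. A minimal sketch of that well-known pattern in C (the flag name "holding" and the message text are illustrative, not part of any MPI API):

    #include <stdio.h>
    #include <unistd.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank;
        volatile int holding = 1;   /* cleared by hand from the debugger */

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Tell the user where to attach, then spin until gdb intervenes:
           gdb -p <pid>, then "set var holding = 0" and "continue".       */
        printf("rank %d: attach gdb to pid %d\n", rank, (int)getpid());
        fflush(stdout);
        while (holding)
            sleep(1);

        /* ... the real work of the program goes here ... */

        MPI_Finalize();
        return 0;
    }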
[OMPI users] MPI_Finalize runtime error
Hi

After I finish execution, and all results are reported, and both processes are about to call MPI_Finalize, I get this runtime error. Any help is appreciated.

Thanks,
Manal

Signal:11 info.si_errno:0(Success) si_code:1(SEGV_MAPERR)
Failing at addr:0xa
[0] func:/usr/local/bin/openmpi/lib/libopal.so.0 [0x3e526c]
[1] func:[0x4bfc7440]
[2] func:/usr/local/bin/openmpi/lib/libopal.so.0(free+0xb4) [0x3e9ff4]
[3] func:/usr/local/bin/openmpi/lib/libmpi.so.0 [0x70484e]
[4] func:/usr/local/bin/openmpi//lib/openmpi/mca_btl_tcp.so(mca_btl_tcp_component_close+0x278) [0xc78a58]
[5] func:/usr/local/bin/openmpi/lib/libopal.so.0(mca_base_components_close+0x6a) [0x3d93fa]
[6] func:/usr/local/bin/openmpi/lib/libmpi.so.0(mca_btl_base_close+0xbd) [0x75154d]
[7] func:/usr/local/bin/openmpi/lib/libmpi.so.0(mca_bml_base_close+0x17) [0x751427]
[8] func:/usr/local/bin/openmpi//lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_component_close+0x3a) [0x625a0a]
[9] func:/usr/local/bin/openmpi/lib/libopal.so.0(mca_base_components_close+0x6a) [0x3d93fa]
[10] func:/usr/local/bin/openmpi/lib/libmpi.so.0(mca_pml_base_close+0x65) [0x7580e5]
[11] func:/usr/local/bin/openmpi/lib/libmpi.so.0(ompi_mpi_finalize+0x1b4) [0x71e984]
[12] func:/usr/local/bin/openmpi/lib/libmpi.so.0(MPI_Finalize+0x4b) [0x73cb5b]
[13] func:master/mmMaster(main+0x3cc) [0x804b2dc]
[14] func:/lib/libc.so.6(__libc_start_main+0xdc) [0x4bffa724]
[15] func:master/mmMaster [0x8049b91]
*** End of error message ***
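A note on reading this trace: the crash is inside free() (frame [2]), reached from the TCP BTL's cleanup during MPI_Finalize (frames [4]-[11]). A segfault in free() at teardown time is the classic signature of heap corruption committed earlier in the application: a write past the end of a malloc'd buffer silently damages the allocator's bookkeeping, and nothing fails until some unrelated free() touches the damaged metadata much later. A hypothetical illustration of the pattern in C (not taken from the actual program):

    #include <stdlib.h>
    #include <string.h>

    void corrupt_now_crash_later(void)
    {
        char *a = malloc(8);
        char *b = malloc(8);

        /* 28 bytes written into an 8-byte buffer: this tramples the heap
           metadata of the neighbouring chunk, but nothing fails here.    */
        strcpy(a, "far longer than eight bytes");

        /* The SIGSEGV surfaces later, in an innocent-looking free() --
           for example inside the TCP BTL's cleanup at MPI_Finalize.      */
        free(b);
        free(a);
    }

Running the job under a memory checker, e.g. mpirun -np 2 valgrind master/mmMaster, will usually point at the original out-of-bounds write rather than the much later crash.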
Re: [OMPI users] OpenMPI, debugging, and Portland Group's
Hi

Sorry - adding the path of my source code to the search path solved this problem.

Thanks,
Manal

On Fri, 2006-07-14 at 18:01 +1000, Manal Helal wrote:
> Hi
>
> I tried your suggestion:
>
> mpirun --debug -np 4 a.out
>
> and I have TV and Open MPI 1.1, and it worked fine; that's on Fedora Core 5
> and an x86 Intel chip, single machine.
>
> However, my problem is that this starts TV with the mpirun program itself
> being debugged; it then starts my program and I see the output up to the
> deadlock I have. I need to place a breakpoint in my source file, but when
> I click "Action Point" then "At Location", I am only allowed to type a
> function name or a line number, not a file name as I would expect. When I
> try to use one of my functions, it says it is not found and finds the
> closest match to it from the mpirun source.
>
> How can I place action points in my program's source then?
>
> I appreciate your help, thanks,
>
> Manal
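For anyone hitting the same wall under gdb rather than TotalView, the analogous fix is gdb's source search path, after which breakpoints can name a file and line directly (the path and file name below are placeholders):

    (gdb) directory /path/to/my/sources
    (gdb) break myfile.c:123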
Re: [OMPI users] OpenMPI, debugging, and Portland Group's
Hi

I tried your suggestion:

mpirun --debug -np 4 a.out

and I have TV and Open MPI 1.1, and it worked fine; that's on Fedora Core 5 and an x86 Intel chip, single machine.

However, my problem is that this starts TV with the mpirun program itself being debugged; it then starts my program and I see the output up to the deadlock I have. I need to place a breakpoint in my source file, but when I click "Action Point" then "At Location", I am only allowed to type a function name or a line number, not a file name as I would expect. When I try to use one of my functions, it says it is not found and finds the closest match to it from the mpirun source.

How can I place action points in my program's source then?

I appreciate your help, thanks,

Manal
Re: [OMPI users] debugging with mpirun
Or, is there an Open MPI under Windows where I could do some visual debugging? I'd appreciate any hints, because my application is getting too big for printf debugging to do any good.

Thanks,
Manal

On Fri, 2006-07-07 at 14:10 +1000, Manal Helal wrote:
> Hi
>
> I see that XMPI will do all that I need, but it says it works with
> LAM/MPI up to versions 6.3.2 & 6.5.9. I am not sure whether trying that
> with Open MPI will work or not.
>
> Thanks again,
>
> Manal
> On Fri, 2006-07-07 at 12:27 +1000, Manal Helal wrote:
> > thing that can show me varia
Re: [OMPI users] debugging with mpirun
Hi

I see that XMPI will do all that I need, but it says it works with LAM/MPI up to versions 6.3.2 & 6.5.9. I am not sure whether trying that with Open MPI will work or not.

Thanks again,

Manal

On Fri, 2006-07-07 at 12:27 +1000, Manal Helal wrote:
> thing that can show me varia
[OMPI users] debugging with mpirun
Hi

I am trying to debug my MPI program, but printf debugging is not doing much, and I need something that can show me variable values and the line currently executing (and where it was called from) - something like gdb with MPI. Is there anything like that?

Thank you very much for your help,

Manal
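One widely used answer (the approach tried in the first thread of this digest) is to have mpirun open an xterm per process, each running gdb on the program, so every rank gets an interactive source-level debugger. A sketch, assuming a working X display and with ./myprogram standing in for the real executable:

    mpirun -np 4 xterm -e gdb ./myprogram

Each of the four xterms then holds a gdb prompt for one rank; typing "run" in each starts that process under the debugger.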
[OMPI users] runtime error
Hi

Sorry for posting too much. I tried running and I got this error; I assume this is the stack of the calls before the error:

Signal:11 info.si_errno:0(Success) si_code:2(SEGV_ACCERR)
Failing at addr:0x8059b73
[0] func:/usr/local/bin/openmpi/lib/libopal.so.0 [0xb7e76ed0]
[1] func:[0xe440]
[2] func:/lib/tls/i686/cmov/libc.so.6(_IO_vfprintf+0x34b1) [0xb7d283a1]
[3] func:/lib/tls/i686/cmov/libc.so.6(vsprintf+0x8b) [0xb7d4041b]
[4] func:/lib/tls/i686/cmov/libc.so.6(sprintf+0x2b) [0xb7d2d76b]
[5] func:./moaDist(mprintf+0x4a) [0x8056e96]
[6] func:./moaDist(processArguments+0x90c) [0x804d60c]
[7] func:./moaDist(main+0x197) [0x804aa4b]
[8] func:/lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xd2) [0xb7cfeea2]
[9] func:./moaDist [0x804a079]
*** End of error message ***

I am not sure if it's an I/O error. I tried to search the mailing archives on the web site, but it takes me to a Google search, and I am not sure whether there is any documentation for understanding these error messages.

Thanks again,
Manal
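A note on reading the trace: frames [7]-[2] show the crash happening inside sprintf, called from the program's own mprintf wrapper in processArguments, so this is almost certainly a bug in the application rather than in Open MPI or its I/O layer. SEGV_ACCERR means an access-permission fault - typically sprintf writing through a bad or read-only destination pointer (for example a string literal), or formatting a %s argument that is not NUL-terminated. A hypothetical sketch of the failure mode in C, together with a safer variant (mprintf's real signature is unknown; this is illustrative only):

    #include <stdio.h>

    void sketch(const char *name)
    {
        char line[16];

        /* Unsafe: if the formatted text exceeds 16 bytes, or if `name`
           is not a valid NUL-terminated string, this writes (or reads)
           out of bounds.                                                */
        sprintf(line, "argument: %s", name);

        /* Safer: snprintf never writes more than sizeof line bytes. */
        snprintf(line, sizeof line, "argument: %s", name);
    }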
Re: [OMPI users] Can I install OpenMPI on a machine where I have mpich2
Hi Eric

Thank you very much for your reply. I am a PhD student, and I do need this comparison for academic purposes; a fairly generic one will do, and I guess after running on both, I might have my own application/hardware-specific points to add.

Thanks again, I appreciate it,

Manal

On Mon, 2006-07-03 at 23:17 -0400, Eric Thibodeau wrote:
> See comments below:
>
> On Monday 3 July 2006 at 23:01, Manal Helal wrote:
> > Hi
> >
> > I am having problems running a multi-threaded application using MPICH2,
> > and am considering moving to Open MPI. I already have MPICH2 installed
> > and don't want to uninstall it yet. Can I have both installed and
> > working fine on the same machine?
>
> Yes, simply run the configure script with something like:
>
> ./configure --prefix=$HOME/openmpi-`uname -m`
>
> You will then be able to compile applications with:
>
> ~/openmpi-i686/bin/mpicc app.c -o app
>
> And run them with:
>
> ~/openmpi-i686/bin/mpirun -np 3 app
>
> > Also, I searched for a comparison of the features of MPICH vs LAM/MPI
> > vs Open MPI and didn't find any so far. Will you please help me find
> > one?
>
> Comparisons are only relevant on your hardware with your application. Any
> other comparisons are mostly for academic purposes and grad assignments ;)
>
> > Thank you for your help in advance,
> >
> > Regards,
> >
> > Manal
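When both MPI installations coexist like this, the usual pitfall is picking up one implementation's mpirun with the other's libraries. A sketch of shell setup that puts the Open MPI prefix from Eric's example first on the search paths (the openmpi-i686 directory is assumed from his ./configure line):

    export PATH=$HOME/openmpi-i686/bin:$PATH
    export LD_LIBRARY_PATH=$HOME/openmpi-i686/lib:$LD_LIBRARY_PATH

With that in place, plain mpicc and mpirun resolve to the Open MPI versions; dropping those lines (or putting the MPICH2 prefix first) switches back.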