Re: [OMPI users] Help: Program Terminated
2008/5/30 Jeff Squyres:
> I'd also like to re-emphasize something Andreas said earlier: SIGTERM
> *usually* means that some external entity is killing your
> application. It *could* be coming from within the application itself,
> but that's not too common.
>
> You might want to look into that to find out where the SIGTERM is
> coming from. The Microtar maintainers might have some better ideas.
>
> On May 30, 2008, at 9:17 AM, Andreas Schäfer wrote:
>
> > On 12:28 Fri 30 May , Lee Amy wrote:
> >> 2008/5/29 Andreas Schäfer:
> >> Thank you very much. If I do a shorter job it seems to run well. And
> >> the job doesn't repeatedly fail at the same time, but it does fail
> >> with these error messages. Anyway, I'm not using a scheduling system.
> >> So any suggestions?
> >
> > At least no easy ones, sorry. ;-) You could ask the Microtar guys if
> > they know anything about that problem. And of course you could use a
> > debugger to dig into Microtar and find the problem yourself. ^^ Open
> > MPI has some documentation on how to attach gdb to a parallel job
> > (and how to use valgrind etc.):
> >
> > http://www.open-mpi.org/faq/?category=debugging
> >
> > Good luck!
> > -Andi
>
> --
> Jeff Squyres
> Cisco Systems
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

Thank you very much. I will try.

Amy
Re: [OMPI users] eigenvalue problem
ARPACK: http://www.caam.rice.edu/software/ARPACK/
Maybe some of the functions from ScaLAPACK as well.

Any MPI eigenvalue package should work with Open MPI, as it's just another MPI library (though my current favorite).

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985

On May 30, 2008, at 4:05 PM, Radovan Herchel wrote:

> Hello, does anyone have a Fortran program or a link to Fortran code to
> calculate eigenvalues of real/complex symmetric matrices using the
> Open MPI package? I would be very thankful for help.
[OMPI users] eigenvalue problem
Hello, does anyone have a Fortran program or a link to Fortran code to calculate eigenvalues of real/complex symmetric matrices using the Open MPI package? I would be very thankful for help.
Re: [OMPI users] specifying hosts in mpi_spawn()
I'm using Open MPI 1.2.6 from the Open MPI site, but I can switch to another version if necessary.

2008/5/30 Ralph H Castain:
> I'm afraid I cannot answer that question without first knowing what version
> of Open MPI you are using. Could you provide that info?
>
> Thanks
> Ralph
>
> On 5/29/08 6:41 PM, "Bruno Coutinho" wrote:
>
> > How does MPI handle the host string passed in the info argument to
> > mpi_comm_spawn()?
> >
> > If I set host to:
> > "host1,host2,host3,host2,host2,host1"
> >
> > then will ranks 0 and 5 run on host1, ranks 1, 3 and 4 on host2, and
> > rank 2 on host3?
[OMPI users] Problem with X forwarding
Hi,

I have a problem running DistributedData.cxx (it is a VTK file). I need to be able to see the rendering from my computer, but I have trouble running the executable. I copied the executable onto 2 machines and I am accessing them from my computer (DHCP enabled). I use Open MPI and run the following command:

mpirun -hostfile myhostfile -np 2 -bynode ./DistributedData

and I keep getting these errors:

ERROR: In /home/kalpanak/Installation_Files/VTKProject/VTK/Rendering/vtkXOpenGLRenderWindow.cxx, line 326
vtkXOpenGLRenderWindow (0x8664438): bad X server connection.
ERROR: In /home/kalpanak/Installation_Files/VTKProject/VTK/Rendering/vtkXOpenGLRenderWindow.cxx, line 169
vtkXOpenGLRenderWindow (0x8664438): bad X server connection.
[vrc1:27394] *** Process received signal ***
[vrc1:27394] Signal: Segmentation fault (11)
[vrc1:27394] Signal code: Address not mapped (1)
[vrc1:27394] Failing at address: 0x84
[vrc1:27394] [ 0] [0xe440]
[vrc1:27394] [ 1] ./DistributedData(_ZN22vtkXOpenGLRenderWindow20GetDesiredVisualInfoEv+0x229) [0x8227e7d]
[vrc1:27394] [ 2] ./DistributedData(_ZN22vtkXOpenGLRenderWindow16WindowInitializeEv+0x340) [0x8226812]
[vrc1:27394] [ 3] ./DistributedData(_ZN22vtkXOpenGLRenderWindow10InitializeEv+0x29) [0x82234f9]
[vrc1:27394] [ 4] ./DistributedData(_ZN22vtkXOpenGLRenderWindow5StartEv+0x29) [0x82235eb]
[vrc1:27394] [ 5] ./DistributedData(_ZN15vtkRenderWindow14DoStereoRenderEv+0x1a) [0x82342ac]
[vrc1:27394] [ 6] ./DistributedData(_ZN15vtkRenderWindow10DoFDRenderEv+0x427) [0x8234757]
[vrc1:27394] [ 7] ./DistributedData(_ZN15vtkRenderWindow10DoAARenderEv+0x5b7) [0x8234d19]
[vrc1:27394] [ 8] ./DistributedData(_ZN15vtkRenderWindow6RenderEv+0x690) [0x82353b4]
[vrc1:27394] [ 9] ./DistributedData(_ZN22vtkXOpenGLRenderWindow6RenderEv+0x52) [0x82245e2]
[vrc1:27394] [10] ./DistributedData [0x819e355]
[vrc1:27394] [11] ./DistributedData(_ZN16vtkMPIController19SingleMethodExecuteEv+0x1ab) [0x837a447]
[vrc1:27394] [12] ./DistributedData(main+0x180) [0x819de78]
[vrc1:27394] [13] /lib/libc.so.6(__libc_start_main+0xe0) [0xb79c0fe0]
[vrc1:27394] [14] ./DistributedData [0x819dc21]
[vrc1:27394] *** End of error message ***
mpirun noticed that job rank 0 with PID 27394 on node exited on signal 11 (Segmentation fault).

Maybe I am not doing the X forwarding properly, but has anyone encountered the same problem? It works fine on one PC. I read the mailing list, but I don't know whether my problem is similar to the ones reported there. I even tried changing the DISPLAY environment variable.

This is what I want to do: mpirun should run on 2 machines (A and B), and I should be able to view the output on my PC. Are there any specific commands to use?
Re: [OMPI users] Problem with NFS + PVFS2 + OpenMPI
Hi,

Sorry, but I made a mistake... I'm not trying to use PVFS over NFS but PVFS over EXT3. I still don't understand this error message...

On Thu, May 29, 2008 at 5:33 PM, Robert Latham wrote:
> On Thu, May 29, 2008 at 04:48:49PM -0300, Davi Vercillo C. Garcia wrote:
>> > Oh, I see you want to use ordered i/o in your application. PVFS
>> > doesn't support that mode. However, since you know how much data each
>> > process wants to write, a combination of MPI_Scan (to compute each
>> > process's offset) and MPI_File_write_at_all (to carry out the
>> > collective i/o) will give you the same result with likely better
>> > performance (and has the nice side effect of working with pvfs).
>>
>> I don't understand this very well... what do I need to change in my code?
>
> MPI_File_write_ordered has an interesting property (which you probably
> know since you use it, but I'll spell it out anyway): writes end up
> in the file in rank-order, but are not necessarily carried out in
> rank-order.
>
> Once each process knows the offsets and lengths of the writes the
> other processes will do, that process can write its data. Observe that
> rank 0 can write immediately. Rank 1 only needs to know how much data
> rank 0 will write, and so on.
>
> Rank N can compute its offset by knowing how much data the preceding
> processes (ranks 0 through N-1) want to write. The most efficient way
> to collect this is to use MPI_Scan and collect a sum of data:
>
> http://www.mpi-forum.org/docs/mpi-11-html/node84.html#Node84
>
> Once you've computed these offsets, MPI_File_write_at_all has enough
> information to carry out a collective write of the data.
>
> ==rob
>
> --
> Rob Latham
> Mathematics and Computer Science Division
> Argonne National Lab, IL USA
> A215 0178 EA2D B059 8CDF B29D F333 664A 4280 315B

--
Davi Vercillo Carneiro Garcia
Universidade Federal do Rio de Janeiro
Departamento de Ciência da Computação
DCC-IM/UFRJ - http://www.dcc.ufrj.br

"Good things come to those who... wait." - Debian Project
"A computer is like air conditioning: it becomes useless when you open windows." - Linus Torvalds
"There are two infinite things, the universe and human stupidity. And I have my doubts about the first." - Albert Einstein
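The offset arithmetic Rob describes can be sketched in a few lines. This is a single-process model in Python rather than real MPI code: `inclusive_scan` only imitates what MPI_Scan with a sum would return on each rank, and the function names and the in-memory "file" are made up for illustration. One detail worth noting is that MPI_Scan is inclusive (each rank's result includes its own contribution), so a rank would subtract its own size to get its starting offset.

```python
from itertools import accumulate


def inclusive_scan(sizes):
    """Model MPI_Scan with a sum: entry r is the total of sizes[0..r]."""
    return list(accumulate(sizes))


def write_offsets(sizes):
    """Byte offset at which each rank starts writing.

    The scan is inclusive, so each rank subtracts its own size from
    its scan result to get the amount written by the lower ranks.
    """
    return [total - own for total, own in zip(inclusive_scan(sizes), sizes)]


def simulate_collective_write(chunks):
    """Place every rank's chunk at its offset in one shared buffer."""
    sizes = [len(chunk) for chunk in chunks]
    offsets = write_offsets(sizes)
    out = bytearray(sum(sizes))
    # Visit ranks in reverse to show that the order in which the
    # writes are carried out does not affect the resulting layout.
    for rank in reversed(range(len(chunks))):
        start = offsets[rank]
        out[start:start + sizes[rank]] = chunks[rank]
    return bytes(out)
```

In an actual MPI program, each rank would pass its own offset from this calculation to MPI_File_write_at_all instead of slicing into a shared buffer.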
[OMPI users] File download sizes
I notice that on the download page all file sizes are listed as 0KB; this is presumably an error somewhere.

http://www.open-mpi.org/software/ompi/v1.2/

Ashley
Re: [OMPI users] specifying hosts in mpi_spawn()
I'm afraid I cannot answer that question without first knowing what version of Open MPI you are using. Could you provide that info?

Thanks
Ralph

On 5/29/08 6:41 PM, "Bruno Coutinho" wrote:

> How does MPI handle the host string passed in the info argument to
> mpi_comm_spawn()?
>
> If I set host to:
> "host1,host2,host3,host2,host2,host1"
>
> then will ranks 0 and 5 run on host1, ranks 1, 3 and 4 on host2, and
> rank 2 on host3?
Re: [OMPI users] Open MPI instructional videos
I've never really dug into Open MPI's guts, not because I wasn't interested, but mainly because the time required to get my bearings seemed just too much. Until now. I've watched a couple of the videos while coding and it was pretty awesome. Easy to understand, structured and well spoken.

On 12:41 Tue 27 May , Jeff Squyres wrote:
> Note that "receiving permission" is very different than receiving
> funding or additional staff to publish said training material. :-)

Put them on iTunes and talk some lecturers into making them suggested materials for their parallel computing courses. ;-)

> - Do you like the format?
> - Is the (slides+narration) format useful?

Yes, I like it a lot. I guess a pure podcast would be insufficient for complex issues where you simply need diagrams.

Maybe a small suggestion: maybe it's just me, but I'd actually prefer (even) leaner slides. Currently you're basically duplicating on screen what you're saying, which is good when you're a nervous, mumbling college student and might lose your audience somewhere. But when you're an experienced speaker (which you obviously are), the audience rarely needs this redundancy and might rather get confused when trying to digest both streams of information (visual and auditory) simultaneously. But this is of course a question of personal preference.

> - Would terminal screen-scrape sessions be useful?

I'd prefer how-to pages for this, as you can copy the commands directly into your own shell.

> - ...other [low-budget] suggestions?

Maybe a tad higher audio bitrate. And some people don't like the .mov format, but that isn't really important.

Thanks!
-Andreas

--
Andreas Schäfer
Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany
PGP/GPG key via keyserver
I'm a bright... http://www.the-brights.net

(\___/)
(+'.'+)
(")_(")
This is Bunny. Copy and paste Bunny into your
signature to help him gain world domination!
Re: [OMPI users] Help: Program Terminated
I'd also like to re-emphasize something Andreas said earlier: SIGTERM *usually* means that some external entity is killing your application. It *could* be coming from within the application itself, but that's not too common.

You might want to look into that to find out where the SIGTERM is coming from. The Microtar maintainers might have some better ideas.

On May 30, 2008, at 9:17 AM, Andreas Schäfer wrote:

> On 12:28 Fri 30 May , Lee Amy wrote:
>> 2008/5/29 Andreas Schäfer:
>> Thank you very much. If I do a shorter job it seems to run well. And
>> the job doesn't repeatedly fail at the same time, but it does fail
>> with these error messages. Anyway, I'm not using a scheduling system.
>> So any suggestions?
>
> At least no easy ones, sorry. ;-) You could ask the Microtar guys if
> they know anything about that problem. And of course you could use a
> debugger to dig into Microtar and find the problem yourself. ^^ Open
> MPI has some documentation on how to attach gdb to a parallel job
> (and how to use valgrind etc.):
>
> http://www.open-mpi.org/faq/?category=debugging
>
> Good luck!
> -Andi

--
Jeff Squyres
Cisco Systems
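One possible way to find out where a SIGTERM comes from, sketched here under assumptions: on Linux, a process that blocks the signal and then waits for it with sigwaitinfo can read the sender's PID and UID from the returned signal info. The sketch below uses Python's standard `signal` module for brevity; the helper names are made up, Microtar itself is not a Python program, and the same idea would have to be ported to the application's own language before it could be used there.

```python
import os
import signal


def block_sigterm():
    """Block SIGTERM in the calling thread so it stays pending
    instead of terminating the process."""
    signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGTERM})


def wait_for_sigterm_sender():
    """Wait for a pending or future SIGTERM.

    Returns (sender PID, sender UID) as reported in the signal info.
    """
    info = signal.sigwaitinfo({signal.SIGTERM})
    return info.si_pid, info.si_uid
```

A process would call `block_sigterm()` early on and `wait_for_sigterm_sender()` from a dedicated monitoring thread or at shutdown; the returned PID can then be looked up with `ps` while the sender is still alive.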
Re: [OMPI users] Help: Program Terminated
On 12:28 Fri 30 May , Lee Amy wrote:
> 2008/5/29 Andreas Schäfer:
> Thank you very much. If I do a shorter job it seems to run well. And the job
> doesn't repeatedly fail at the same time, but it does fail with these error
> messages. Anyway, I'm not using a scheduling system. So any suggestions?

At least no easy ones, sorry. ;-) You could ask the Microtar guys if they know anything about that problem. And of course you could use a debugger to dig into Microtar and find the problem yourself. ^^ Open MPI has some documentation on how to attach gdb to a parallel job (and how to use valgrind etc.):

http://www.open-mpi.org/faq/?category=debugging

Good luck!
-Andi

--
Andreas Schäfer
Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany
PGP/GPG key via keyserver
I'm a bright... http://www.the-brights.net

(\___/)
(+'.'+)
(")_(")
This is Bunny. Copy and paste Bunny into your
signature to help him gain world domination!
Re: [OMPI users] Help: Program Terminated
2008/5/29 Andreas Schäfer:
> Hi Amy,
>
> On 16:10 Thu 29 May , Lee Amy wrote:
> > MicroTar parallel version was terminated after 463 minutes with following
> > error messages:
> >
> > [gnode5:31982] [ 0] /lib64/tls/libpthread.so.0 [0x345460c430]
> > [gnode5:31982] [ 1] microtar(LocateNuclei+0x137) [0x403037]
> > [gnode5:31982] [ 2] microtar(main+0x4ac) [0x40431c]
> > [gnode5:31982] [ 3] /lib64/tls/libc.so.6(__libc_start_main+0xdb) [0x3453b1c3fb]
> > [gnode5:31982] [ 4] microtar [0x402e6a]
> > [gnode5:31982] *** End of error message ***
> > mpirun noticed that job rank 0 with PID 18710 on node gnode1 exited on
> > signal 15 (Terminated).
> > 19 additional processes aborted (not shown)
>
> If I'm not mistaken, signal 15 is SIGTERM, which is sent to processes
> to terminate them. To me this sounds like your application is
> terminated by an external entity, maybe because your job exceeded
> the wall clock time limit of your scheduling system. Does the job
> repeatedly fail at the same time? Do shorter jobs finish successfully?
>
> Just my 0.02 Euros (-8
>
> Cheers
> -Andreas

Thank you very much. If I do a shorter job it seems to run well. And the job doesn't repeatedly fail at the same time, but it does fail with these error messages. Anyway, I'm not using a scheduling system. So any suggestions?

Thank you again.

Regards,

Amy
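For anyone who wants to double-check which signal a number in an mpirun message refers to, here is a small sketch using Python's standard `signal` module. The helper name `signal_name` is made up for illustration, and the mapping is the one of the machine the script runs on (on Linux, 15 is SIGTERM and 11 is SIGSEGV).

```python
import signal


def signal_name(number):
    """Return the symbolic name for a signal number,
    or a placeholder string if the number is not a known signal."""
    try:
        return signal.Signals(number).name
    except ValueError:
        return f"unknown signal {number}"
```

Calling `signal_name(15)` on the node where the job ran should confirm Andreas's reading of the "exited on signal 15 (Terminated)" message.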