Re: [OMPI users] Open MPI instructional videos
Jeff Squyres wrote:
> Over the past year or two, I have been slowly creating a large set of
> Open MPI training material that I've used to present to my company's
> customers and partners. I have just recently received permission to
> release all of my slides to the greater HPC community. Woo hoo!

Great idea Jeff, sounds really useful. But where do I find them?

--
Graham Jenkins
Senior Software Specialist, eResearch
Monash University (Clayton Campus, Bldg 11, Rm S503)
Email: graham.jenk...@its.monash.edu.au
Tel: +613 9905-5942 (office) +614 4850-2491 (mobile)
[OMPI users] Different Interfaces on Different Nodes .. OpenMPI 1.2.3, 1.2.4 ..
We're moving from using a single (eth0) interface on our execute nodes to using a bond interface (bond0) for resilience. What we're seeing on the nodes which have been upgraded is:
--
[0,1,1][btl_tcp_component.c:349:mca_btl_tcp_component_create_instances] invalid interface "eth0"
--
This, of course, is because all nodes share a common copy of openmpi-mca-params.conf, in which it says:
--
btl_tcp_if_include=eth0
--
So .. does anybody have a suggestion for a way around this during our migration/upgrade period? If we place "bond0" in there as well, then we get error messages about whichever one is absent on the node where execution is happening.

Regards ..

--
Graham Jenkins
Senior Software Specialist, eResearch
Monash University (Clayton Campus, Bldg 11, Rm S503)
Email: graham.jenk...@its.monash.edu.au
Tel: +613 9905-5942 (office) +614 4850-2491 (mobile)
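One possible way to avoid naming an absent interface is to decide per node which interface to list. The following is only a sketch under stated assumptions: the function name `pick_tcp_if` is made up for illustration, it assumes the kernel lists interfaces under /sys/class/net, and it assumes each node can be given its own copy of the parameter file rather than the shared one.

```sh
#!/bin/sh
# Hypothetical helper: print the btl_tcp_if_include line suited to this node.
# Prefers bond0 when the node has it, otherwise falls back to eth0.
# $1 is the directory that lists network interfaces (default: /sys/class/net);
# it is a parameter only so the function can be tried against a scratch tree.
pick_tcp_if() {
    net_dir="${1:-/sys/class/net}"
    if [ -e "$net_dir/bond0" ]; then
        echo "btl_tcp_if_include=bond0"
    else
        echo "btl_tcp_if_include=eth0"
    fi
}
```

The output could be written at boot time to a node-local copy of openmpi-mca-params.conf; where such a copy would live depends on how Open MPI was installed, so that part is left open here.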
[OMPI users] Excessive Use of CPU System Resources with OpenMPI 1.2.4 using TCP only ..
We've observed an excessive use of CPU system resources with OpenMPI 1.2.4 using TCP connections only on our SL5 x86_64 Cluster. Typically, for a simple Canonical Ring Program, we're seeing between 30 and 70% system usage.

Has anybody else noticed this sort of behaviour? And does anybody have some suggestions for resolving the issue?

Present values we have are:
--
ompi_info --param btl tcp |grep MCA
  MCA btl: parameter "btl_base_debug" (current value: "0")
  MCA btl: parameter "btl" (current value: )
  MCA btl: parameter "btl_base_verbose" (current value: "0")
  MCA btl: parameter "btl_tcp_if_include" (current value: "eth0")
  MCA btl: parameter "btl_tcp_if_exclude" (current value: "lo")
  MCA btl: parameter "btl_tcp_free_list_num" (current value: "8")
  MCA btl: parameter "btl_tcp_free_list_max" (current value: "-1")
  MCA btl: parameter "btl_tcp_free_list_inc" (current value: "32")
  MCA btl: parameter "btl_tcp_sndbuf" (current value: "131072")
  MCA btl: parameter "btl_tcp_rcvbuf" (current value: "131072")
  MCA btl: parameter "btl_tcp_endpoint_cache" (current value: "30720")
  MCA btl: parameter "btl_tcp_exclusivity" (current value: "0")
  MCA btl: parameter "btl_tcp_eager_limit" (current value: "65536")
  MCA btl: parameter "btl_tcp_min_send_size" (current value: "65536")
  MCA btl: parameter "btl_tcp_max_send_size" (current value: "131072")
  MCA btl: parameter "btl_tcp_min_rdma_size" (current value: "131072")
  MCA btl: parameter "btl_tcp_max_rdma_size" (current value: "2147483647")
  MCA btl: parameter "btl_tcp_flags" (current value: "122")
  MCA btl: parameter "btl_tcp_priority" (current value: "0")
  MCA btl: parameter "btl_base_warn_component_unused" (current value: "1")
--
Graham Jenkins
Senior Software Specialist, eResearch
Monash University
Email: graham.jenk...@its.monash.edu.au
Tel: +613 9905-5942 (office) +614 4850-2491 (mobile)
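When comparing settings like those above between nodes or releases, it can help to reduce the listing to plain name=value lines that sort and diff cleanly. The filter below is a sketch only: the function name `mca_params_to_kv` is made up for illustration, and it assumes the output keeps the `parameter "name" (current value: "value")` layout shown in the listing.

```sh
#!/bin/sh
# Hypothetical filter: read `ompi_info --param ...` output on stdin and
# print one name=value line per parameter. Lines that do not match the
# expected layout are dropped; an empty value is printed as "name=".
mca_params_to_kv() {
    sed -n 's/.*parameter "\([^"]*\)" (current value: "\{0,1\}\([^")]*\)"\{0,1\}).*/\1=\2/p'
}
```

For example, `ompi_info --param btl tcp | mca_params_to_kv | sort` run on two nodes gives two lists that can be compared with diff.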
Re: [OMPI users] NAMD/Charm++ Looking for libmpich
Brock Palen wrote:
> I have done work before to make namd and charm++ work with openMPI I
> dont remember what but it is doable. Something like removing -lmpich
> was enough i think, maybe a hack to use mpiCC and -fPIC (pgi compilers).
>
> I could look more if you want.
--
I'd really appreciate that Brock, thanks!

Where would one remove "-lmpich" from? I've had some difficulty finding it.

It actually builds OK using:
  ./build charm++ mpi-linux-amd64 ifort \
      --basedir /opt/sw/openmpi-1.2.3-i

But it barfs when you try to do:
  "try out a sample program like tests/charm++/simplearrayhello"

You can actually make the test compile by doing:
  cd /opt/sw/openmpi-1.2.3-i/lib ; ln -s libmpi.so.0.0.0 libmpich.so
.. but I'm not sure that it's legit! :)

--
Graham Jenkins
Senior Software Specialist, E-Research
Email: graham.jenk...@its.monash.edu.au
Tel: +613 9905-5942 Mob: +614 4850-2491
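To track down where "-lmpich" is set, one approach is simply to search the build tree for the literal string. This is a minimal sketch: the function name `find_flag` is made up for illustration, it assumes a grep that supports recursive search (`-r`), and it makes no claim about which charm++ file actually holds the flag.

```sh
#!/bin/sh
# Hypothetical helper: list files under a directory that contain a given
# literal string.
# $1: directory to search
# $2: fixed string to look for, e.g. -lmpich
# -F treats the string literally; -e keeps a leading dash from being
# parsed as a grep option.
find_flag() {
    grep -rlF -e "$2" "$1"
}
```

Run from the directory holding the charm-5.9 tree, `find_flag charm-5.9 -lmpich` prints each file that mentions the flag, which narrows down where it would need to be removed or replaced.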
[OMPI users] NAMD/Charm++ Looking for libmpich
This item was originally sent to the NAMD mailing list, but it occurred to me that it's something you guys may have seen in another vein .. and may have a solution for ..

I'm trying to build charm++ on a SL5 x86_64 machine on which the openmpi-1.1.1-5.el5.x86_64 RPM has been installed. So here's the sequence:
--
cd charm-5.9
module load openmpi-intel
./build charm++ mpi-linux-amd64 --libdir=/usr/lib64/openmpi \
    --incdir=/usr/include/openmpi
..
cd tests/charm++/simplearrayhello
make
../../../bin/charmc -language charm++ -o hello hello.o
/usr/bin/ld: cannot find -lmpich
collect2: ld returned 1 exit status
--
Bottom line .. charm++ doesn't know about libmpi, even though it exists thus:
  ls -1 /opt/sw/openmpi-1.2.3-i/lib/libmpi.??
  /opt/sw/openmpi-1.2.3-i/lib/libmpi.la
  /opt/sw/openmpi-1.2.3-i/lib/libmpi.so

So .. anybody got a solution .. please?

--
Graham Jenkins
Senior Software Specialist, E-Research
Email: graham.jenk...@its.monash.edu.au
Tel: +613 9905-5942 Mob: +614 4850-2491
[OMPI users] Unable to find any HCAs ..
I'm using the openmpi-1.1.1-5.el5.x86_64 RPM on a Scientific Linux 5 cluster, with no installed HCAs. A simple MPI job submitted to that cluster runs OK .. except that it issues messages for each node like the one shown below. Is there some way I can suppress these, perhaps by an appropriate entry in /etc/openmpi-mca-params.conf ?
--
libibverbs: Fatal: couldn't open sysfs class 'infiniband_verbs'.
--
[0,1,0]: OpenIB on host localhost was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.
--

--
Graham Jenkins
Senior Software Specialist, E-Research
Email: graham.jenk...@its.monash.edu.au
Tel: +613 9905-5942 Mob: +614 4850-2491
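A commonly used entry for this kind of situation is `btl = ^openib`, which asks Open MPI not to load the openib BTL component at all. The sketch below adds such a line to a parameter file only if no `btl` line is present yet. The function name `ensure_mca_param` is made up for illustration, and it is an assumption that the component in this particular RPM is named openib (the message text suggests so). The libibverbs line is printed by libibverbs itself, so whether it also disappears would need checking.

```sh
#!/bin/sh
# Hypothetical helper: append "name = value" to an MCA parameter file
# unless the file already has a line setting that name.
# $1: parameter file, $2: parameter name, $3: value
ensure_mca_param() {
    file="$1"; name="$2"; value="$3"
    # Match "name" followed directly by "=", so that "btl" does not
    # match longer names such as "btl_tcp_if_include".
    if ! grep -q "^[[:space:]]*$name[[:space:]]*=" "$file" 2>/dev/null; then
        printf '%s = %s\n' "$name" "$value" >> "$file"
    fi
}
```

For the file mentioned above this would be called as `ensure_mca_param /etc/openmpi-mca-params.conf btl '^openib'`; calling it a second time leaves the file unchanged.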