Re: [OMPI users] Gigabit ethernet (PCI Express) and openmpi v1.2.4
Hi George,

The following test peaks at 8392 Mbps on a1:

   mpirun --prefix /opt/openmpi124b --host a1,a1 -mca btl tcp,sm,self -np 2 ./NPmpi

and the same test on a2:

   mpirun --prefix /opt/openmpi124b --host a2,a2 -mca btl tcp,sm,self -np 2 ./NPmpi

gives 8565 Mbps.  --(a)

Without the btl list, on a1:

   mpirun --prefix /opt/openmpi124b --host a1,a1 -np 2 ./NPmpi

gives 8424 Mbps, and on a2:

   mpirun --prefix /opt/openmpi124b --host a2,a2 -np 2 ./NPmpi

gives 8372 Mbps.

So, judging especially from --(a), there is enough memory and processor bandwidth between a1 and a2 to feed 2.7 Gbps to three PCI Express ethernet cards? Thank you for your help. Any assistance would be greatly appreciated!

Regards,
Allan Menezes

> You should run a shared memory test, to see what's the max memory
> bandwidth you can get.
>
>   Thanks,
>     george.
>
> On Dec 17, 2007, at 7:14 AM, Gleb Natapov wrote:
>
>> On Sun, Dec 16, 2007 at 06:49:30PM -0500, Allan Menezes wrote:
>>> Hi,
>>> How many PCI Express gigabit ethernet cards does Open MPI version 1.2.4
>>> support with a corresponding linear increase in bandwidth, measured with
>>> NetPIPE's NPmpi and Open MPI's mpirun? With two PCI Express cards I get
>>> a bandwidth of 1.75 Gbps, at 892 Mbps each, and with three PCI Express
>>> cards (one built into the motherboard) I get 1.95 Gbps. They are all
>>> around 890 Mbps individually, measured with NetPIPE's NPtcp and NPmpi
>>> under Open MPI. For two cards there seems to be a linear increase in
>>> bandwidth, but not for three. I have tuned the cards using NetPIPE and
>>> the $HOME/.openmpi/mca-params.conf file for latency and percentage
>>> bandwidth. Please advise.
>>
>> What is in your $HOME/.openmpi/mca-params.conf? Maybe you are hitting
>> your chipset limit here. What is your HW configuration? Can you try to
>> run NPtcp on each interface simultaneously and see what BW you get?
>>
>> --
>> 			Gleb.
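For scale, the rough arithmetic behind Allan's question, using only numbers from this thread:

   3 NICs x ~890 Mbps   ~=  2.67 Gbps    (aggregate TCP target)
   shared-memory NPmpi  ~=  8.4-8.6 Gbps (a1 and a2, runs above)

So, as Allan says, memory and processor bandwidth should be well clear of what three gigabit cards need.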
Re: [OMPI users] users Digest, Vol 770, Issue 1
--

Message: 2
Date: Sun, 16 Dec 2007 18:49:30 -0500
From: Allan Menezes
Subject: [OMPI users] Gigabit ethernet (PCI Express) and openmpi v1.2.4
To: us...@open-mpi.org
Message-ID: <4765b98a.30...@sympatico.ca>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Hi,
How many PCI Express gigabit ethernet cards does Open MPI version 1.2.4 support with a corresponding linear increase in bandwidth, measured with NetPIPE's NPmpi and Open MPI's mpirun? With two PCI Express cards I get a bandwidth of 1.75 Gbps, at 892 Mbps each, and with three PCI Express cards (one built into the motherboard) I get 1.95 Gbps. They are all around 890 Mbps individually, measured with NetPIPE's NPtcp and NPmpi under Open MPI. For two cards there seems to be a linear increase in bandwidth, but not for three. I have tuned the cards using NetPIPE and the $HOME/.openmpi/mca-params.conf file for latency and percentage bandwidth. Please advise.

Regards,
Allan Menezes

--

Message: 3
Date: Mon, 17 Dec 2007 14:14:42 +0200
From: gl...@voltaire.com (Gleb Natapov)
Subject: Re: [OMPI users] Gigabit ethernet (PCI Express) and openmpi v1.2.4
To: Open MPI Users
Message-ID: <20071217121442.gd28...@minantech.com>
Content-Type: text/plain; charset=us-ascii

On Sun, Dec 16, 2007 at 06:49:30PM -0500, Allan Menezes wrote:
> [snip - the question from Message 2 above]

What is in your $HOME/.openmpi/mca-params.conf? Maybe you are hitting your chipset limit here. What is your HW configuration? Can you try to run NPtcp on each interface simultaneously and see what BW you get?

--
			Gleb.

Hi,

My mca-params.conf file is:

   btl_tcp_latency_eth0=171
   btl_tcp_latency_eth2=50
   btl_tcp_latency_eth3=71
   btl_tcp_bandwidth_eth0=34
   btl_tcp_bandwidth_eth2=33
   btl_tcp_bandwidth_eth3=33

HW config:

host a1: in the x4 PCI Express slot, a Syskonnect PCI Express x1 gigabit ethernet card; in the x16 PCI Express slot, an Intel Pro/1000 PT PCI Express x1 gigabit ethernet card; plus the motherboard's built-in PCI Express gigabit ethernet port (Intel 82566DM chipset, e1000 driver). All MTUs = 1500.

host a2: same hardware config as host a1.

I measure the latency and bandwidth this way:

   a1#> ./NPtcp
   a2#> ./NPtcp -h 192.168.1.1 -n 50     (for eth0)
   a2#> ./NPtcp -h 192.168.5.1 -n 50     (for eth2)
   a2#> ./NPtcp -h 192.168.8.1 -n 50     (for eth3)

and I take the latency reading straight at 64 bytes (171 microseconds for eth0, etc.) and the bandwidth as the highest value measured:

   eth0, Syskonnect:             892 Mbps, latency 171 microseconds
   eth2, Intel Pro/1000 PT:      892 Mbps, latency 50 microseconds
   eth3, Intel built-in PCI-E:   888 Mbps, latency 71 microseconds

Linux: FC8, kernel 2.6.23.11, with the Marvell drivers patch 10.22.4.3 and the Intel e1000 driver version 7.6.12 from the Intel website.

This is how I use /opt/openmpi124b to check the bandwidth - the maximum I measure is 1950 Mbps for three ~890 Mbps gigabit PCI Express ethernet cards, with a gigabit switch for each subnet:
   a1$> mpirun --prefix /opt/openmpi124b --host a1,a2 -mca btl tcp,sm,self \
            -mca btl_tcp_if_include eth0,eth3,eth2 -mca btl_tcp_if_exclude lo,eth1,eth4 \
            -mca oob_tcp_include eth0,eth3,eth2 -mca oob_tcp_exclude lo,eth1,eth4 \
            -np 2 ./NPmpi

The motherboards are Asus P5B-VM DO, with Pentium D processors (Intel 945) and 2 GB of DDR2 667 MHz RAM each. Any help would be appreciated.

Thank you,
Allan Menezes
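Regarding Gleb's question about loading all the interfaces simultaneously: the NPtcp numbers above were taken one interface at a time. A rough sketch of a concurrent test would be to run three single-NIC NPmpi jobs in parallel from a1 and compare the sum of the three results against the 1950 Mbps striped figure. This assumes NetPIPE's -o output-file option; the np.eth* filenames are placeholders:

   # on a1: drive all three NICs at once, one NPmpi job per subnet
   mpirun --prefix /opt/openmpi124b --host a1,a2 -mca btl tcp,self \
       -mca btl_tcp_if_include eth0 -np 2 ./NPmpi -o np.eth0 &
   mpirun --prefix /opt/openmpi124b --host a1,a2 -mca btl tcp,self \
       -mca btl_tcp_if_include eth2 -np 2 ./NPmpi -o np.eth2 &
   mpirun --prefix /opt/openmpi124b --host a1,a2 -mca btl tcp,self \
       -mca btl_tcp_if_include eth3 -np 2 ./NPmpi -o np.eth3 &
   wait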
Re: [OMPI users] Bug in oob_tcp_[in|ex]clude?
Hi Marco and Jeff,

My own knowledge of Open MPI's internals is limited, but I thought I'd add my less-than-two-cents...

>> I've found only a way in order to have tcp connections bound only to
>> the eth1 interface, using both the following MCA directives in the
>> command line:
>>
>>    mpirun --mca oob_tcp_include eth1 --mca oob_tcp_include lo,eth0,ib0,ib1 ...
>>
>> This sounds to me like a bug.
>
> Yes, it does. Specifying the same MCA param twice on the command line
> results in undefined behavior -- it will only take one of them, and I
> assume it'll take the first (but I'd have to check the code to be sure).

I *think* that Marco intended to write:

   mpirun --mca oob_tcp_include eth1 --mca oob_tcp_exclude lo,eth0,ib0,ib1 ...

Is this correct? So you're not specifying include twice, you're specifying include *and* exclude, so each interface is explicitly stated in one list or the other. I remember encountering this behaviour as well, in a slightly different format, but I can't seem to reproduce it now either.

That said, with these options, won't the MPI traffic (as opposed to the OOB traffic) still use the eth0, ib0 and ib1 interfaces? You'd need to add '-mca btl_tcp_include eth1' in order to say it should only go over that NIC, I think.

As for the 'connection errors', two bizarre things to check: first, that all of your nodes using eth1 actually have correct /etc/hosts mappings to the other nodes. One system I ran on had this problem when some nodes had one IP address for node002, and another node had a different IP address for node002. This should be easy enough to check by running on one node first, then on two nodes that you're sure have the correct addresses.

The second situation is if you're launching an MPMD program. Here, you need to use '-gmca' instead of '-mca'.

Hope some of that is at least a tad useful. :)

Cheers,
- Brian
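For what it's worth, putting Brian's two pieces together on one command line might look like the sketch below. oob_tcp_include/oob_tcp_exclude are the 1.2.2-era names used in this thread (1.2.3 renamed them with an _if_, as noted elsewhere in this digest), btl_tcp_if_include is the BTL spelling that appears in the other threads here, and ./my_app is a placeholder:

   mpirun --mca oob_tcp_include eth1 \
          --mca oob_tcp_exclude lo,eth0,ib0,ib1 \
          --mca btl_tcp_if_include eth1 \
          -np 4 ./my_app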
Re: [OMPI users] Gigabit ethernet (PCI Express) and openmpi v1.2.4
You should run a shared memory test, to see what's the max memory bandwidth you can get.

  Thanks,
    george.

On Dec 17, 2007, at 7:14 AM, Gleb Natapov wrote:

> On Sun, Dec 16, 2007 at 06:49:30PM -0500, Allan Menezes wrote:
>> [snip - the original question, quoted in full elsewhere in this digest]
>
> What is in your $HOME/.openmpi/mca-params.conf? Maybe you are hitting
> your chipset limit here. What is your HW configuration? Can you try to
> run NPtcp on each interface simultaneously and see what BW you get?
>
> --
> 			Gleb.
Re: [OMPI users] Bug in oob_tcp_[in|ex]clude?
On Dec 17, 2007, at 8:35 AM, Marco Sbrighi wrote:

> I'm using Open MPI 1.2.2 over OFED 1.2 on a 256-node, dual-Opteron,
> dual-core Linux cluster. Of course, with Infiniband 4x interconnect.
> Each cluster node is equipped with 4 (or more) ethernet interfaces,
> namely 2 gigabit ones plus 2 IPoIB. The two gig are named eth0 and eth1,
> while the two IPoIB are named ib0 and ib1. It happens that eth0 is a
> management network with poor performance, and furthermore we don't want
> the ib* interfaces to carry MPI traffic (neither OOB nor TCP), so we
> would like eth1 to be used for Open MPI OOB and TCP.
>
> In order to drive the OOB over only eth1 I've tried various combinations
> of oob_tcp_[ex|in]clude MCA statements, starting from the obvious
>
>    oob_tcp_exclude = lo,eth0,ib0,ib1
>
> then trying the other obvious:
>
>    oob_tcp_include = eth1

This one statement (_include) should be sufficient. Assumedly this(these) statement(s) are in a config file that is being read by Open MPI, such as $HOME/.openmpi/mca-params.conf?

> and both at the same time. Next I've tried the following:
>
>    oob_tcp_exclude = eth0
>
> but after the job starts, I still have a lot of tcp connections
> established using eth0 or ib0 or ib1. Furthermore, the following error
> happens:
>
>    [node191:03976] [0,1,14]-[0,1,12] mca_oob_tcp_peer_complete_connect:
>    connection failed: Connection timed out (110) - retrying

This is quite odd. :-(

> I've found only a way in order to have tcp connections bound only to
> the eth1 interface, using both the following MCA directives in the
> command line:
>
>    mpirun --mca oob_tcp_include eth1 --mca oob_tcp_include lo,eth0,ib0,ib1 ...
>
> This sounds to me like a bug.

Yes, it does. Specifying the same MCA param twice on the command line results in undefined behavior -- it will only take one of them, and I assume it'll take the first (but I'd have to check the code to be sure).

> Is there someone able to reproduce this behaviour? If this is a bug,
> are there fixes?

I'm unfortunately unable to reproduce this behavior. I have a test cluster with 2 IP interfaces: ib0, eth0. I have tried several combinations of MCA params with 1.2.2:

   --mca oob_tcp_include ib0
   --mca oob_tcp_include ib0,bogus
   --mca oob_tcp_include eth0
   --mca oob_tcp_include eth0,bogus
   --mca oob_tcp_exclude ib0
   --mca oob_tcp_exclude ib0,bogus
   --mca oob_tcp_exclude eth0
   --mca oob_tcp_exclude eth0,bogus

All do as they are supposed to -- including or excluding ib0 or eth0.

I do note, however, that the handling of these parameters changed in 1.2.3 -- as well as their names. The names changed to "oob_tcp_if_include" and "oob_tcp_if_exclude" to match the MCA parameter name conventions of other components. Could you try with 1.2.3 or 1.2.4? (1.2.4 is the most recent; 1.2.5 is due out "soon" -- it *may* get out before the holiday break, but no promises...)

If you can't upgrade, let me know and I can provide a debugging patch that will give us a little more insight into what is happening on your machines.

Thanks.

--
Jeff Squyres
Cisco Systems
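For reference, the same restriction in the per-user config file with the renamed 1.2.3+ parameters would be a couple of lines in $HOME/.openmpi/mca-params.conf (eth1 per Marco's setup; a sketch, not tested here):

   # Open MPI >= 1.2.3 parameter names
   oob_tcp_if_include = eth1
   btl_tcp_if_include = eth1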
Re: [MTT users] MTT database access
I see... I wasn't aware of the protocol with regards to these accounts. Having a 'ubc' account is sufficient for us. You can remove the kmroz and penoff accounts. Thanks, and sorry for the confusion.

Ethan Mallove wrote:
> I thought maybe there was a reason Karol wanted separate
> accounts that I hadn't thought of. Karol, which accounts do
> you plan on using?
>
> -Ethan
>
> On Fri, Dec/14/2007 08:04:37PM, Jeff Squyres wrote:
>> Do we really need all 3 database accounts for ubc?
>>
>> On Dec 14, 2007, at 5:52 PM, Ethan Mallove wrote:
>>
>>> Hi Karol,
>>>
>>> I added three accounts:
>>>
>>> * ubc
>>> * kmroz
>>> * penoff
>>>
>>> OMPI MTT users use their organization name (e.g., "ubc" for
>>> you folks), though your local UNIX username is recorded in
>>> the results as well to help you sort through whose results
>>> belong to whom.
>>>
>>> Cheers,
>>> Ethan
>>>
>>> On Fri, Dec/14/2007 02:15:05PM, Karol Mroz wrote:
>>>> Hi... I was wondering if it was possible to have 2 accounts
>>>> (kmroz, penoff) added to the MTT database access list? Thanks.
>>>>
>>>> --
>>>> Karol Mroz
>>>> km...@cs.ubc.ca
>>
>> --
>> Jeff Squyres
>> Cisco Systems

--
Karol Mroz
km...@cs.ubc.ca
Re: [OMPI users] MPI::Intracomm::Spawn and cluster configuration
On 12/17/07 8:19 AM, "Elena Zhebel" wrote:

> Hello Ralph,
>
> Thank you for your answer.
>
> I'm using OpenMPI 1.2.3, compiler glibc232, Linux SuSE 10.0.
> My "master" executable runs only on the one local host; it then spawns
> "slaves" (with MPI::Intracomm::Spawn).
> My question was: how do I determine the hosts where these "slaves" will
> be spawned? You said: "You have to specify all of the hosts that can be
> used by your job in the original hostfile". How can I specify the host
> file? I cannot find it in the documentation.

Hmmm... sorry about the lack of documentation. I always assumed that the MPI folks in the project would document such things, since it has little to do with the underlying run-time, but I guess that fell through the cracks.

There are two parts to your question:

1. How to specify the hosts to be used for the entire job. I believe that is somewhat covered here:

   http://www.open-mpi.org/faq/?category=running#simple-spmd-run

That FAQ tells you what a hostfile should look like, though you may already know that. Basically, we require that you list -all- of the nodes that both your master and slave programs will use.

2. How to specify which nodes are available for the master, and which for the slaves. You would specify the host for your master on the mpirun command line with something like:

   mpirun -n 1 -hostfile my_hostfile -host my_master_host my_master.exe

This directs Open MPI to map that specified executable onto the specified host - note that my_master_host must have been in my_hostfile.

Inside your master, you would create an MPI_Info key "host" that has a value consisting of a string "host1,host2,host3" identifying the hosts you want your slaves to execute upon. Those hosts must have been included in my_hostfile. Include that key in the MPI_Info array passed to your Spawn.

We don't currently support providing a hostfile for the slaves (as opposed to the host-at-a-time string above). This may become available in a future release - TBD.

Hope that helps
Ralph

> Thanks and regards,
> Elena
>
> [snip]
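A minimal master-side sketch of what Ralph describes in point 2, using the C++ bindings from this thread. The names my_slave.exe and host1/host2/host3, and the count of 3 slaves, are placeholders, and error handling is omitted:

   #include <mpi.h>

   int main(int argc, char* argv[])
   {
       MPI::Init(argc, argv);

       // Name the hosts the slaves may use; each must also appear
       // in the my_hostfile given to mpirun.
       MPI::Info info = MPI::Info::Create();
       info.Set("host", "host1,host2,host3");

       // Spawn 3 slaves; Spawn() returns an intercommunicator to them.
       MPI::Intercomm children =
           MPI::COMM_WORLD.Spawn("./my_slave.exe", MPI::ARGV_NULL, 3,
                                 info, 0 /* root */);

       info.Free();

       // ... communicate with the slaves over 'children' ...

       MPI::Finalize();
       return 0;
   }

The master itself would then be launched the way Ralph shows above: mpirun -n 1 -hostfile my_hostfile -host my_master_host ./my_master.exe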
Re: [OMPI users] unable to open osc pt2pt
If you care, this is actually the result of a complex issue that was just recently discussed on the OMPI devel list. You can see a full explanation there if you're interested.

On Dec 17, 2007, at 10:46 AM, Brian Granger wrote:

> This should be fixed in the subversion trunk of mpi4py. Can you do an
> update to that version and retry? If it still doesn't work, post to the
> mpi4py list and we will see what we can do.
>
> Brian
>
> On Dec 17, 2007 8:25 AM, de Almeida, Valmor F. wrote:
>> Hello,
>>
>> I am getting these messages (below) when running mpi4py python codes.
>> Always one message per mpi process. The codes seem to run correctly.
>> Any ideas why this is happening and how to avoid it?
>>
>> [snip]

--
Jeff Squyres
Cisco Systems
Re: [OMPI users] unable to open osc pt2pt
This should be fixed in the subversion trunk of mpi4py. Can you do an update to that version and retry? If it still doesn't work, post to the mpi4py list and we will see what we can do.

Brian

On Dec 17, 2007 8:25 AM, de Almeida, Valmor F. wrote:
>
> Hello,
>
> I am getting these messages (below) when running mpi4py python codes.
> Always one message per mpi process. The codes seem to run correctly. Any
> ideas why this is happening and how to avoid it?
>
> Thanks,
>
> --
> Valmor de Almeida
>
> >mpirun -np 2 python helloworld.py
> [xeon0:05998] mca: base: component_find: unable to open osc pt2pt: file
> not found (ignored)
> [xeon0:05999] mca: base: component_find: unable to open osc pt2pt: file
> not found (ignored)
> Hello, World!! I am process 0 of 2 on xeon0.
> Hello, World!! I am process 1 of 2 on xeon0.
[OMPI users] unable to open osc pt2pt
Hello,

I am getting these messages (below) when running mpi4py python codes. Always one message per mpi process. The codes seem to run correctly. Any ideas why this is happening and how to avoid it?

Thanks,

--
Valmor de Almeida

>mpirun -np 2 python helloworld.py
[xeon0:05998] mca: base: component_find: unable to open osc pt2pt: file not found (ignored)
[xeon0:05999] mca: base: component_find: unable to open osc pt2pt: file not found (ignored)
Hello, World!! I am process 0 of 2 on xeon0.
Hello, World!! I am process 1 of 2 on xeon0.
Re: [OMPI users] MPI::Intracomm::Spawn and cluster configuration
On 12/12/07 5:46 AM, "Elena Zhebel" wrote:

> Hello,
>
> I'm working on an MPI application where I'm using OpenMPI instead of MPICH.
>
> In my "master" program I call the function MPI::Intracomm::Spawn which
> spawns "slave" processes. It is not clear to me how to spawn the "slave"
> processes over the network. Currently "master" creates "slaves" on the
> same host.
>
> If I use 'mpirun --hostfile openmpi.hosts' then processes are spawned over
> the network as expected. But now I need to spawn processes over the
> network from my own executable using MPI::Intracomm::Spawn. How can I
> achieve it?

I'm not sure from your description exactly what you are trying to do, nor in what environment this is all operating within, or what version of Open MPI you are using. Setting aside the environment and version issue, I'm guessing that you are running your executable over some specified set of hosts, but want to provide a different hostfile that specifies the hosts to be used for the "slave" processes. Correct?

If that is correct, then I'm afraid you can't do that in any version of Open MPI today. You have to specify all of the hosts that can be used by your job in the original hostfile. You can then specify a subset of those hosts to be used by your original "master" program, and then specify a different subset to be used by the "slaves" when calling Spawn.

But the system requires that you tell it -all- of the hosts that are going to be used at the beginning of the job.

At the moment, there is no plan to remove that requirement, though there has been occasional discussion about doing so at some point in the future. No promises that it will happen, though - managed environments, in particular, currently object to the idea of changing the allocation on-the-fly. We may, though, make a provision for purely hostfile-based environments (i.e., unmanaged) at some time in the future.

Ralph

> Thanks in advance for any help.
>
> Elena
Re: [OMPI users] Gigabit ethernet (PCI Express) and openmpi v1.2.4
On Sun, Dec 16, 2007 at 06:49:30PM -0500, Allan Menezes wrote:
> Hi,
> How many PCI Express gigabit ethernet cards does Open MPI version 1.2.4
> support with a corresponding linear increase in bandwidth, measured with
> NetPIPE's NPmpi and Open MPI's mpirun? With two PCI Express cards I get
> a bandwidth of 1.75 Gbps, at 892 Mbps each, and with three PCI Express
> cards (one built into the motherboard) I get 1.95 Gbps. They are all
> around 890 Mbps individually, measured with NetPIPE's NPtcp and NPmpi
> under Open MPI. For two cards there seems to be a linear increase in
> bandwidth, but not for three. I have tuned the cards using NetPIPE and
> the $HOME/.openmpi/mca-params.conf file for latency and percentage
> bandwidth. Please advise.

What is in your $HOME/.openmpi/mca-params.conf? Maybe you are hitting your chipset limit here. What is your HW configuration? Can you try to run NPtcp on each interface simultaneously and see what BW you get?

--
			Gleb.