Re: [OMPI devel] Trunk launch scaling
Ralph, If you plan to compare OMPI to MVAPICH, make sure to use version 1.0.0 (or above). In 1.0.0 OSU introduced a new launcher that works much faster than the previous one. Regards, Pasha Ralph H Castain wrote: Per this morning's telecon, I have added the latest scaling test results to the wiki: https://svn.open-mpi.org/trac/ompi/wiki/ORTEScalabilityTesting As you will see upon review, the trunk is scaling about an order of magnitude better than 1.2.x, both in terms of sheer speed and in the strength of the non-linear components of the scaling law. Those of us working on scaling issues expect to make additional improvements over the next few weeks. Updated results will be posted to the wiki as they become available. Ralph -- Pavel Shamis (Pasha) Mellanox Technologies
[OMPI devel] --disable-ipv6 broken on trunk
It seems that builds configured with '--disable-ipv6' are broken on the trunk. I suspect r18055 for this break since the tarball from two nights ago worked fine and it is the only significant change in this code in the past week. The build error is: --- oob_tcp.c: In function `mca_oob_tcp_fini': oob_tcp.c:1364: error: structure has no member named `tcp6_listen_sd' oob_tcp.c:1365: error: structure has no member named `tcp6_recv_event' --- Can someone take a look at this? Cheers, Josh
Re: [OMPI devel] --disable-ipv6 broken on trunk
On Wed, Apr 02, 2008 at 06:36:02AM -0400, Josh Hursey wrote: > It seems that builds configured with '--disable-ipv6' are broken on > the trunk. I suspect r18055 for this break since the tarball from two > --- > oob_tcp.c: In function `mca_oob_tcp_fini': > oob_tcp.c:1364: error: structure has no member named `tcp6_listen_sd' > oob_tcp.c:1365: error: structure has no member named `tcp6_recv_event' > --- > Can someone take a look at this? Fixed in r18071. Thanks for the observation. -- Cluster and Metacomputing Working Group Friedrich-Schiller-Universität Jena, Germany private: http://adi.thur.de
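The r18071 change itself is not shown here, but the usual fix for this class of error is to keep IPv6-only struct members and every line that touches them behind the same configure-time guard. A minimal, self-contained illustration of that pattern (the macro name WANT_IPV6 and the toy struct are assumptions for illustration, not the actual OMPI code, where the guard would be the configure-generated IPv6 macro):
---
/* Toy example of the guard pattern: compile with -DWANT_IPV6=1 or -DWANT_IPV6=0 */
#include <unistd.h>

#ifndef WANT_IPV6
#define WANT_IPV6 1
#endif

struct toy_oob_component {
    int tcp_listen_sd;            /* always present */
#if WANT_IPV6
    int tcp6_listen_sd;           /* only exists when IPv6 support is built */
#endif
};

static void toy_fini(struct toy_oob_component *c)
{
    if (c->tcp_listen_sd >= 0) {
        close(c->tcp_listen_sd);
    }
#if WANT_IPV6
    /* touching tcp6_listen_sd outside this guard is exactly the
     * "structure has no member named `tcp6_listen_sd'" error seen
     * with --disable-ipv6 */
    if (c->tcp6_listen_sd >= 0) {
        close(c->tcp6_listen_sd);
    }
#endif
}
---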
Re: [OMPI devel] --disable-ipv6 broken on trunk
Great. Thanks for the fix. On Apr 2, 2008, at 6:54 AM, Adrian Knoth wrote: On Wed, Apr 02, 2008 at 06:36:02AM -0400, Josh Hursey wrote: It seems that builds configured with '--disable-ipv6' are broken on the trunk. I suspect r18055 for this break since the tarball from two --- oob_tcp.c: In function `mca_oob_tcp_fini': oob_tcp.c:1364: error: structure has no member named `tcp6_listen_sd' oob_tcp.c:1365: error: structure has no member named `tcp6_recv_event' --- Can someone take a look at this? Fixed in r18071. Thanks for observation. -- Cluster and Metacomputing Working Group Friedrich-Schiller-Universität Jena, Germany private: http://adi.thur.de ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel
[OMPI devel] [PATCH] Fix typo in configure helptext
Hi, * config/ompi_configure_options.m4: Fix typo in helptext Please apply. TIA, Bernhard Index: ompi-trunk/config/ompi_configure_options.m4 === --- ompi-trunk/config/ompi_configure_options.m4 (revision 18069) +++ ompi-trunk/config/ompi_configure_options.m4 (working copy) @@ -711,7 +711,7 @@ AC_MSG_RESULT([$with_ident_string]) AC_MSG_CHECKING([if ConnectX XRC support should be enabled]) AC_ARG_ENABLE([connectx-xrc], [AC_HELP_STRING([--enable-connectx-xrc], -[Enable features required for ConnectX XRC support. If you don't have Infiniband ConnectX adapters you may disable the ConnectX XRC support. If you don't know which Infiniband adapter is installed on you cluster - leave it enabled (default: enabled)])]) +[Enable features required for ConnectX XRC support. If you don't have Infiniband ConnectX adapters you may disable the ConnectX XRC support. If you don't know which Infiniband adapter is installed on your cluster - leave it enabled (default: enabled)])]) if test "$enable_connectx_xrc" = "no" ; then AC_MSG_RESULT([no]) ompi_want_connectx_xrc=0
[OMPI devel] [PATCH] Fix compilation error without XRC
Hi, * ompi/mca/btl/openib/btl_openib_component.c (init_one_hca): mca_btl_openib_open_xrc_domain and mca_btl_openib_close_xrc_domain depend on XRC Fixes the compilation failure as in the head of attached patch. TIA, Bernhard CXX -g -finline-functions -o .libs/ompi_info components.o ompi_info.o output.o param.o version.o -Wl,--export-dynamic ../../../ompi/.libs/libmpi.so -L/opt/infiniband/lib /opt/infiniband/lib/libibverbs.so -lpthread -lrt /home/bernhard/src/openmpi/ompi-trunk/orte/.libs/libopen-rte.so /home/bernhard/src/openmpi/ompi-trunk/opal/.libs/libopen-pal.so -ldl -lnuma -lnsl -lutil -Wl,-rpath,/opt/libs//openmpi-1.3.0.a1.r18069-INTEL-10.1.013-64/lib -Wl,-rpath,/opt/infiniband/lib ../../../ompi/.libs/libmpi.so: undefined reference to `mca_btl_openib_close_xrc_domain' ../../../ompi/.libs/libmpi.so: undefined reference to `mca_btl_openib_open_xrc_domain' make[2]: *** [ompi_info] Error 1 Index: ompi-trunk/ompi/mca/btl/openib/btl_openib_component.c === --- ompi-trunk/ompi/mca/btl/openib/btl_openib_component.c (revision 18069) +++ ompi-trunk/ompi/mca/btl/openib/btl_openib_component.c (working copy) @@ -1012,12 +1012,14 @@ static int init_one_hca(opal_list_t *btl goto error; } +#if HAVE_XRC if (MCA_BTL_XRC_ENABLED) { if (OMPI_SUCCESS != mca_btl_openib_open_xrc_domain(hca)) { BTL_ERROR(("XRC Internal error. Failed to open xrc domain")); goto error; } } +#endif mpool_resources.reg_data = (void*)hca; mpool_resources.sizeof_reg = sizeof(mca_btl_openib_reg_t); @@ -1103,11 +1105,13 @@ error: #endif if(hca->mpool) mca_mpool_base_module_destroy(hca->mpool); +#if HAVE_XRC if (MCA_BTL_XRC_ENABLED) { if(OMPI_SUCCESS != mca_btl_openib_close_xrc_domain(hca)) { BTL_ERROR(("XRC Internal error. Failed to close xrc domain")); } } +#endif if(hca->ib_pd) ibv_dealloc_pd(hca->ib_pd); if(hca->ib_dev_context)
[OMPI devel] FW: [devel-core] [RFC] Add an alias name to MCA parameter
-Original Message- From: devel-core-boun...@open-mpi.org [mailto:devel-core-boun...@open-mpi.org] On Behalf Of Jeff Squyres Sent: Wednesday, April 02, 2008 3:44 PM To: Open MPI Core Developers Subject: Re: [devel-core] [RFC] Add an alias name to MCA parameter BTW, these mails can go across devel. devel-core is only for "private" stuff, like dialup phone numbers, etc. On Apr 2, 2008, at 9:34 AM, Jeff Squyres wrote: > I agree that it would be beneficial to support an arbitrary number of > aliases. > > Also, some points about ompi_info: > > - say you register "opal_paffinity_base_alone", and later register > "mpi_paffinity_alone" as an alias. When you "ompi_info --param mpi > all", the alias should still show up > > - aliases that are displayed through ompi_info's --param option should > somehow indicate that they are aliases, and show the "real" name as > well. Perhaps something like this? (just an idea) > > MCA btl: parameter "mpi_paffinity_alone" (current > value: 1, > alias for: opal_paffinity_base_alone) > If nonzero, assume that this job is the > only (set of) > process(es) running on each node and bind > processes to > processors, starting with processor ID 0 > > > On Apr 2, 2008, at 8:37 AM, Josh Hursey wrote: >> This sounds great. I have a couple questions though: >> - Is there a patch for this that we can look at/test? >> - Do you require that the parameter be registered before adding an >> alias? >> - Is the 'index' argument referencing the original MCA parameter, or >> are aliases given individual 'index' values? >> - Does this support more than one alias for a single MCA parameter? >> If so then there should be a way to specify that in the remove >> function. >> >> Cheers, >> Josh >> >> On Apr 2, 2008, at 9:22 AM, Sharon Melamed wrote: >>> WHAT: Add an alias name to MCA parameter. >>> >>> WHY: There is a parameter that we need to register and use in OPAL >>> (before ompi_init) but historically the parameter name is >>> ompi_something_ With the alias name we can register this >>> parameter in OPAL and call it opal_something_ and then add an >>> alias name: ompi_something.. Now the user can use this parameter >>> with its real name or with its alias. >>> >>> WHERE: in /opal/mca/base/ >>> >>> TIMOUT: Thursday - April, 10. >>> >>> DESCRIPTION: >>> Add two Interfaces to the MCA system: >>> >>> OPAL_DECLSPEC int mca_base_param_add_alias (int index, const char* >>> aliase); >>> OPAL_DECLSPEC int mca_base_param_remove_alias (int index); >>> >>> These functions could be called any where in the code after the >>> registration of the MCA parameter. (mca_base_register) >>> >>> This change includes: >>> . Adding a member to mca_base_param_t structure. >>> . Modifying the find functions >>> . Modifying ompi_info. >>> >>> >>> ___ >>> devel-core mailing list >>> devel-c...@open-mpi.org >>> http://www.open-mpi.org/mailman/listinfo.cgi/devel-core >> >> >> ___ >> devel-core mailing list >> devel-c...@open-mpi.org >> http://www.open-mpi.org/mailman/listinfo.cgi/devel-core > > > -- > Jeff Squyres > Cisco Systems > > > ___ > devel-core mailing list > devel-c...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/devel-core -- Jeff Squyres Cisco Systems ___ devel-core mailing list devel-c...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel-core
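A rough sketch of how the proposed API from the RFC might be used, following Jeff's opal_paffinity_base_alone / mpi_paffinity_alone example above. mca_base_param_add_alias() is the call proposed in the RFC (not implemented yet), and the registration call is assumed to have the mca_base_param_reg_int_name() signature of that era, so treat this as an illustration rather than working tree code:
---
#include <stdbool.h>
#include "opal/mca/base/mca_base_param.h"

static int register_paffinity_alone(void)
{
    int value, index;

    /* register the "real" OPAL-level parameter: opal_paffinity_base_alone */
    index = mca_base_param_reg_int_name("opal", "paffinity_base_alone",
                                        "If nonzero, assume this job is alone on "
                                        "each node and bind processes to processors",
                                        false, false, 0, &value);
    if (index < 0) {
        return index;
    }

    /* proposed RFC call: keep the historical name working as an alias */
    return mca_base_param_add_alias(index, "mpi_paffinity_alone");
}
---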
Re: [OMPI devel] FW: [devel-core] [RFC] Add an alias name to MCA parameter
An arbitrary number of aliases is useful in a number of ways. For example you mention wanting to register an OPAL MCA parameter and later alias it as an OMPI MCA parameter. What if we also wanted to alias it as an ORTE level parameter. The best example I can think of is the TCP include/exclude interface MCA paramters. OPAL may need to know them for the 'if' functionality, ORTE for the OOB/tcp component, and OMPI for the BTL/tcp component. Another example might be the TMP_DIR MCA parameter discussion that has been going on in another thread. Why not have an OPAL/ORTE/OMPI variant of this parameter? Another reason for multiple aliases is general code development. Going forward we may want to re-alias an already aliased MCA parameter to give it a 'better' name. -- Josh On Apr 2, 2008, at 10:50 AM, Sharon Melamed wrote: -Original Message- From: devel-core-boun...@open-mpi.org [mailto:devel-core-boun...@open-mpi.org] On Behalf Of Jeff Squyres Sent: Wednesday, April 02, 2008 3:44 PM To: Open MPI Core Developers Subject: Re: [devel-core] [RFC] Add an alias name to MCA parameter BTW, these mails can go across devel. devel-core is only for "private" stuff, like dialup phone numbers, etc. On Apr 2, 2008, at 9:34 AM, Jeff Squyres wrote: I agree that it would be beneficial to support an arbitrary number of aliases. Also, some points about ompi_info: - say you register "opal_paffinity_base_alone", and later register "mpi_paffinity_alone" as an alias. When you "ompi_info --param mpi all", the alias should still show up - aliases that are displayed through ompi_info's --param option should somehow indicate that they are aliases, and show the "real" name as well. Perhaps something like this? (just an idea) MCA btl: parameter "mpi_paffinity_alone" (current value: 1, alias for: opal_paffinity_base_alone) If nonzero, assume that this job is the only (set of) process(es) running on each node and bind processes to processors, starting with processor ID 0 On Apr 2, 2008, at 8:37 AM, Josh Hursey wrote: This sounds great. I have a couple questions though: - Is there a patch for this that we can look at/test? - Do you require that the parameter be registered before adding an alias? - Is the 'index' argument referencing the original MCA parameter, or are aliases given individual 'index' values? - Does this support more than one alias for a single MCA parameter? If so then there should be a way to specify that in the remove function. Cheers, Josh On Apr 2, 2008, at 9:22 AM, Sharon Melamed wrote: WHAT: Add an alias name to MCA parameter. WHY: There is a parameter that we need to register and use in OPAL (before ompi_init) but historically the parameter name is ompi_something_ With the alias name we can register this parameter in OPAL and call it opal_something_ and then add an alias name: ompi_something.. Now the user can use this parameter with its real name or with its alias. WHERE: in /opal/mca/base/ TIMOUT: Thursday - April, 10. DESCRIPTION: Add two Interfaces to the MCA system: OPAL_DECLSPEC int mca_base_param_add_alias (int index, const char* aliase); OPAL_DECLSPEC int mca_base_param_remove_alias (int index); These functions could be called any where in the code after the registration of the MCA parameter. (mca_base_register) This change includes: . Adding a member to mca_base_param_t structure. . Modifying the find functions . Modifying ompi_info. 
[OMPI devel] RFC: changes to modex
WHAT: Changes to MPI layer modex API WHY: To be mo' betta scalable WHERE: ompi/mpi/runtime/ompi_module_exchange.* and everywhere that calls ompi_modex_send() and/or ompi_modex_recv() TIMEOUT: COB Fri 4 Apr 2008 DESCRIPTION: Per some of the scalability discussions that have been occurring (some on-list and some off-list), and per the e-mail I sent out last week about ongoing work in the openib BTL, Ralph and I put together a loose proposal this morning to make the modex more scalable. The timeout is fairly short because Ralph wanted to start implementing in the near future, and we didn't anticipate that this would be a contentious proposal. The theme is to break the modex into two different kinds of data: - Modex data that is specific to a given proc - Modex data that is applicable to all procs on a given node For example, in the openib BTL, the majority of modex data is applicable to all processes on the same node (GIDs and LIDs and whatnot). It is much more efficient to send only one copy of such node-specific data to each process (vs. sending ppn copies to each process). The spreadsheet I included in last week's e-mail clearly shows this. 1. Add new modex API functions. The exact function signatures are TBD, but they will be generally of the form: * int ompi_modex_proc_send(...): send modex data that is specific to this process. It is just about exactly the same as the current API call (ompi_modex_send). * int ompi_modex_proc_recv(...): receive modex data from a specified peer process (indexed on ompi_proc_t*). It is just about exactly the same as the current API call (ompi_modex_recv). * int ompi_modex_node_send(...): send modex data that is relevant for all processes in this job on this node. It is intended that only one process in a job on a node will call this function. If more than one process in a job on a node calls _node_send(), then only one will "win" (meaning that the data sent by the others will be overwritten). * int ompi_modex_node_recv(...): receive modex data that is relevant for a whole peer node; receive the ["winning"] blob sent by _node_send() from the source node. We haven't yet decided what the node index will be; it may be (ompi_proc_t*) (i.e., _node_recv() would figure out what node the (ompi_proc_t*) resides on and then give you the data). 2. Make the existing modex API calls (ompi_modex_send, ompi_modex_recv) be wrappers around the new "proc" send/receive calls. This will provide exactly the same functionality as the current API (but be sub-optimal at scale). It will give BTL authors (etc.) time to update to the new API, potentially taking advantage of common data across multiple processes on the same node. We'll likely put in some opal_output()'s in the wrappers to help identify code that is still calling the old APIs. 3. Remove the old API calls (ompi_modex_send, ompi_modex_recv) before v1.3 is released. -- Jeff Squyres Cisco Systems
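To make the proposed split concrete, here is a rough sketch of how a BTL might divide its data between the two kinds of calls. The RFC leaves the exact signatures TBD, so the signatures below simply mirror today's ompi_modex_send/recv and are assumptions, as are the toy payload structs and function names:
---
#include <stdint.h>
#include <string.h>
#include "ompi/mpi/runtime/ompi_module_exchange.h"   /* path as given in the RFC */

static int my_btl_publish_modex(mca_base_component_t *c)
{
    /* identical for every proc on this node (e.g. LIDs/GIDs): node blob */
    struct { uint16_t lid; uint64_t gid; } node_data;
    /* unique to this proc (e.g. a QP number or listening port): proc blob */
    struct { uint32_t qp_num; } proc_data;

    memset(&node_data, 0, sizeof(node_data));   /* filled in by real code */
    memset(&proc_data, 0, sizeof(proc_data));

    int rc = ompi_modex_node_send(c, &node_data, sizeof(node_data));
    if (OMPI_SUCCESS != rc) {
        return rc;
    }
    return ompi_modex_proc_send(c, &proc_data, sizeof(proc_data));
}

static int my_btl_lookup_modex(mca_base_component_t *c, ompi_proc_t *peer)
{
    void *node_blob, *proc_blob;
    size_t node_len, proc_len;

    /* one node blob per peer node, one proc blob per peer process */
    int rc = ompi_modex_node_recv(c, peer, &node_blob, &node_len);
    if (OMPI_SUCCESS != rc) {
        return rc;
    }
    return ompi_modex_proc_recv(c, peer, &proc_blob, &proc_len);
}
---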
Re: [OMPI devel] RFC: changes to modex
On Wed, Apr 02, 2008 at 10:21:12AM -0400, Jeff Squyres wrote: > * int ompi_modex_proc_send(...): send modex data that is specific to > this process. It is just about exactly the same as the current API > call (ompi_modex_send). > [skip] > > * int ompi_modex_node_send(...): send modex data that is relevant > for all processes in this job on this node. It is intended that only > one process in a job on a node will call this function. If more than > one process in a job on a node calls _node_send(), then only one will > "win" (meaning that the data sent by the others will be overwritten). > In the case of the openib BTL, what part of the modex are you going to send using proc_send() and what part using node_send()? -- Gleb.
Re: [OMPI devel] RFC: changes to modex
On Apr 2, 2008, at 10:27 AM, Gleb Natapov wrote: In the case of openib BTL what part of modex are you going to send using proc_send() and what part using node_send()? In the /tmp-public/openib-cpc2 branch, almost all of it will go to the node_send(). The CPC's will likely now get 2 buffers: one for node_send, and one for proc_send. The ibcm CPC, for example, can do everything in node_send (the service_id that I use in the ibcm calls is the proc's PID; ORTE may supply peer PIDs directly -- haven't decided if that's a good idea yet or not -- if it doesn't, the PID can be sent in the proc_send data). The rdmacm CPC may need a proc_send for the listening TCP port number; still need to figure that one out. If we use carto to limit which hcas/ports are used on a given host on a per-proc basis, then we can include some proc_send data to say "this proc only uses indexes X,Y,Z from the node data". The indexes can be either uint8_ts, or maybe even a variable-length bitmap. -- Jeff Squyres Cisco Systems
Re: [OMPI devel] RFC: changes to modex
Jeff Squyres wrote: WHAT: Changes to MPI layer modex API WHY: To be mo' betta scalable WHERE: ompi/mpi/runtime/ompi_module_exchange.* and everywhere that calls ompi_modex_send() and/or ompi_modex_recv() TIMEOUT: COB Fri 4 Apr 2008 DESCRIPTION: [...snip...] * int ompi_modex_node_send(...): send modex data that is relevant for all processes in this job on this node. It is intended that only one process in a job on a node will call this function. If more than one process in a job on a node calls _node_send(), then only one will "win" (meaning that the data sent by the others will be overwritten). * int ompi_modex_node_recv(...): receive modex data that is relevant for a whole peer node; receive the ["winning"] blob sent by _node_send() from the source node. We haven't yet decided what the node index will be; it may be (ompi_proc_t*) (i.e., _node_recv() would figure out what node the (ompi_proc_t*) resides on and then give you the data). The above sounds like there could be race conditions if more than one process on a node is doing ompi_modex_node_send. That is, are you really going to be able to be sure, when ompi_modex_node_recv is done, that one of the processes is not in the middle of doing ompi_modex_node_send? I assume there must be some sort of gate that allows you to make sure no one is in the middle of overwriting your data. --td
Re: [OMPI devel] RFC: changes to modex
Is there a reason to rename ompi_modex_{send,recv} to ompi_modex_proc_{send,recv}? It seems simpler (and no more confusing and less work) to leave the names alone and add ompi_modex_node_{send,recv}. Another question: Does the receiving process care that the information received applies to a whole node? I ask because maybe we could get the same effect by simply adding a parameter to ompi_modex_send which specifies if the data applies to just the proc or a whole node. So, if we have ranks 1 & 2 on n1, and rank 3 on n2, then rank 1 would do: ompi_modex_send("arch", arch, ); then rank 3 would do: ompi_modex_recv(rank 1, "arch"); ompi_modex_recv(rank 2, "arch"); I don't really care either way, just wanted to throw out the idea. Tim Jeff Squyres wrote: WHAT: Changes to MPI layer modex API WHY: To be mo' betta scalable WHERE: ompi/mpi/runtime/ompi_module_exchange.* and everywhere that calls ompi_modex_send() and/or ompi_modex_recv() TIMEOUT: COB Fri 4 Apr 2008 DESCRIPTION: Per some of the scalability discussions that have been occurring (some on-list and some off-list), and per the e-mail I sent out last week about ongoing work in the openib BTL, Ralph and I put together a loose proposal this morning to make the modex more scalable. The timeout is fairly short because Ralph wanted to start implementing in the near future, and we didn't anticipate that this would be a contentious proposal. The theme is to break the modex into two different kinds of data: - Modex data that is specific to a given proc - Modex data that is applicable to all procs on a given node For example, in the openib BTL, the majority of modex data is applicable to all processes on the same node (GIDs and LIDs and whatnot). It is much more efficient to send only one copy of such node-specific data to each process (vs. sending ppn copies to each process). The spreadsheet I included in last week's e-mail clearly shows this. 1. Add new modex API functions. The exact function signatures are TBD, but they will be generally of the form: * int ompi_modex_proc_send(...): send modex data that is specific to this process. It is just about exactly the same as the current API call (ompi_modex_send). * int ompi_modex_proc_recv(...): receive modex data from a specified peer process (indexed on ompi_proc_t*). It is just about exactly the same as the current API call (ompi_modex_recv). * int ompi_modex_node_send(...): send modex data that is relevant for all processes in this job on this node. It is intended that only one process in a job on a node will call this function. If more than one process in a job on a node calls _node_send(), then only one will "win" (meaning that the data sent by the others will be overwritten). * int ompi_modex_node_recv(...): receive modex data that is relevant for a whole peer node; receive the ["winning"] blob sent by _node_send() from the source node. We haven't yet decided what the node index will be; it may be (ompi_proc_t*) (i.e., _node_recv() would figure out what node the (ompi_proc_t*) resides on and then give you the data). 2. Make the existing modex API calls (ompi_modex_send, ompi_modex_recv) be wrappers around the new "proc" send/receive calls. This will provide exactly the same functionality as the current API (but be sub-optimal at scale). It will give BTL authors (etc.) time to update to the new API, potentially taking advantage of common data across multiple processes on the same node. 
We'll likely put in some opal_output()'s in the wrappers to help identify code that is still calling the old APIs. 3. Remove the old API calls (ompi_modex_send, ompi_modex_recv) before v1.3 is released.
Re: [OMPI devel] RFC: changes to modex
On Wed, Apr 02, 2008 at 10:35:03AM -0400, Jeff Squyres wrote: > If we use carto to limit hcas/ports are used on a given host on a per- > proc basis, then we can include some proc_send data to say "this proc > only uses indexes X,Y,Z from the node data". The indexes can be > either uint8_ts, or maybe even a variable length bitmap. > So you propose that each proc will send info (using node_send()) about every hca/port on a host, even those that are excluded from use by the proc, just in case? And then each proc will have to send additional info (using proc_send() this time) to indicate what hcas/ports it is actually using? -- Gleb.
Re: [OMPI devel] RFC: changes to modex
On 4/2/08 8:52 AM, "Terry Dontje" wrote: > Jeff Squyres wrote: >> WHAT: Changes to MPI layer modex API >> >> WHY: To be mo' betta scalable >> >> WHERE: ompi/mpi/runtime/ompi_module_exchange.* and everywhere that >> calls ompi_modex_send() and/or ompi_modex_recv() >> >> TIMEOUT: COB Fri 4 Apr 2008 >> >> DESCRIPTION: >> >> > [...snip...] >> * int ompi_modex_node_send(...): send modex data that is relevant >> for all processes in this job on this node. It is intended that only >> one process in a job on a node will call this function. If more than >> one process in a job on a node calls _node_send(), then only one will >> "win" (meaning that the data sent by the others will be overwritten). >> >> >> * int ompi_modex_node_recv(...): receive modex data that is relevant >> for a whole peer node; receive the ["winning"] blob sent by >> _node_send() from the source node. We haven't yet decided what the >> node index will be; it may be (ompi_proc_t*) (i.e., _node_recv() would >> figure out what node the (ompi_proc_t*) resides on and then give you >> the data). >> >> > The above sounds like there could be race conditions if more than one > process on a node is doing > ompi_modex_node_send. That is are you really going to be able to be > assured when ompi_modex_node_recv > is done that one of the processes is not in the middle of doing > ompi_modex_node_send? I assume > there must be some sort of gate that allows you to make sure no one is > in the middle of overwriting your data. The nature of the modex actually precludes this. The modex is implemented as a barrier, so the timing actually looks like this: 1. each proc registers its modex_node[proc]_send calls early in MPI_Init. All this does is collect the data locally in a buffer 2. each proc hits the orte_grpcomm.modex call in MPI_Init. At this point, the collected data is sent to the local daemon. The proc "barriers" at this point and can go no further until the modex is completed. 3. when the daemon detects that all local procs have sent it a modex buffer, it enters an "allgather" operation across all daemons. When that operation completes, each daemon has a complete modex buffer spanning the job. 4. each daemon "drops" the collected buffer into each local proc 5. each proc, upon receiving the modex buffer, decodes it and sets up its data structs to respond to future modex_recv calls. Once that is completed, the proc returns from the orte_grpcomm.modex call and is released from the "barrier". So we resolve the race condition by including a "barrier" inside the modex. This is the current behavior as well - so this represents no change, just a different organization of the modex'd data. > > --td > > ___ > devel mailing list > de...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/devel
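Restating that sequence from one process's point of view as a rough sketch; it assumes the proposed calls keep ompi_modex_send-style signatures and that the blocking collective step is still reached through orte_grpcomm.modex(), so the function below is illustrative only, not code from the tree:
---
#include <stddef.h>
#include "orte/mca/grpcomm/grpcomm.h"                /* assumed header path */
#include "ompi/mpi/runtime/ompi_module_exchange.h"   /* path as given in the RFC */

static int modex_exchange_sketch(mca_base_component_t *c,
                                 const void *node_blob, size_t node_len,
                                 const void *proc_blob, size_t proc_len)
{
    int rc;

    /* step 1: queue the blobs locally; nothing leaves the process yet */
    if (OMPI_SUCCESS != (rc = ompi_modex_node_send(c, node_blob, node_len))) {
        return rc;
    }
    if (OMPI_SUCCESS != (rc = ompi_modex_proc_send(c, proc_blob, proc_len))) {
        return rc;
    }

    /* steps 2-4: hand the buffer to the local daemon, the daemons allgather,
     * and the merged buffer is dropped back into each local proc.  This call
     * blocks until the exchange completes, which is the "barrier" that rules
     * out the race: no peer can still be writing its node blob afterwards. */
    if (OMPI_SUCCESS != (rc = orte_grpcomm.modex(NULL))) {
        return rc;
    }

    /* step 5: subsequent modex_recv calls are purely local reads of the
     * merged buffer */
    return OMPI_SUCCESS;
}
---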
[OMPI devel] Ssh tunnelling broken in trunk?
I am noticing that ssh seems to be broken on trunk (and my cpc branch, as it is based on trunk). When I try to use xterm and gdb to debug, I only successfully get 1 xterm. I have tried this on 2 different setups. I can successfully get the xterm's on the 1.2 svn branch. I am running the following command: mpirun --n 2 --host vic12,vic20 -mca btl tcp,self -d xterm -e gdb /usr/mpi/gcc/openmpi-1.2-svn/tests/IMB-3.0/IMB-MPI1 Is anyone else seeing this problem? Thanks, Jon
Re: [OMPI devel] Ssh tunnelling broken in trunk?
I regressed my tree and it looks like it happened between 17590:17917 On Wednesday 02 April 2008 10:22:52 am Jon Mason wrote: > I am noticing that ssh seems to be broken on trunk (and my cpc branch, as > it is based on trunk). When I try to use xterm and gdb to debug, I only > successfully get 1 xterm. I have tried this on 2 different setups. I can > successfully get the xterm's on the 1.2 svn branch. > > I am running the following command: > mpirun --n 2 --host vic12,vic20 -mca btl tcp,self -d xterm -e > gdb /usr/mpi/gcc/openmpi-1.2-svn/tests/IMB-3.0/IMB-MPI1 > > Is anyone else seeing this problem? > > Thanks, > Jon > ___ > devel mailing list > de...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/devel
Re: [OMPI devel] RFC: changes to modex
On Apr 2, 2008, at 11:10 AM, Tim Prins wrote: Is there a reason to rename ompi_modex_{send,recv} to ompi_modex_proc_{send,recv}? It seems simpler (and no more confusing and less work) to leave the names alone and add ompi_modex_node_{send,recv}. If the arguments don't change, I don't have a strong objection to leaving the names alone. I think the rationale for a new names is: - the arguments may change - completely clear names, and good symmetry with *_node_* and *_proc_* If the args change, then I think it is best to use new names so that BTL authors (etc.) have time to adapt. If not, then I minorly prefer the new names, but don't care too much. Another question: Does the receiving process care that the information received applies to a whole node? I ask because maybe we could get the same effect by simply adding a parameter to ompi_modex_send which specifies if the data applies to just the proc or a whole node. So, if we have ranks 1 & 2 on n1, and rank 3 on n2, then rank 1 would do: ompi_modex_send("arch", arch, ); then rank 3 would do: ompi_modex_recv(rank 1, "arch"); ompi_modex_recv(rank 2, "arch"); I'm not sure I understand what you mean. Proc 3 would get the one blob that was sent from proc 1? In the openib btl, I'll likely have both node and proc portions to send. I don't really care either way, just wanted to throw out the idea. Tim Jeff Squyres wrote: WHAT: Changes to MPI layer modex API WHY: To be mo' betta scalable WHERE: ompi/mpi/runtime/ompi_module_exchange.* and everywhere that calls ompi_modex_send() and/or ompi_modex_recv() TIMEOUT: COB Fri 4 Apr 2008 DESCRIPTION: Per some of the scalability discussions that have been occurring (some on-list and some off-list), and per the e-mail I sent out last week about ongoing work in the openib BTL, Ralph and I put together a loose proposal this morning to make the modex more scalable. The timeout is fairly short because Ralph wanted to start implementing in the near future, and we didn't anticipate that this would be a contentious proposal. The theme is to break the modex into two different kinds of data: - Modex data that is specific to a given proc - Modex data that is applicable to all procs on a given node For example, in the openib BTL, the majority of modex data is applicable to all processes on the same node (GIDs and LIDs and whatnot). It is much more efficient to send only one copy of such node-specific data to each process (vs. sending ppn copies to each process). The spreadsheet I included in last week's e-mail clearly shows this. 1. Add new modex API functions. The exact function signatures are TBD, but they will be generally of the form: * int ompi_modex_proc_send(...): send modex data that is specific to this process. It is just about exactly the same as the current API call (ompi_modex_send). * int ompi_modex_proc_recv(...): receive modex data from a specified peer process (indexed on ompi_proc_t*). It is just about exactly the same as the current API call (ompi_modex_recv). * int ompi_modex_node_send(...): send modex data that is relevant for all processes in this job on this node. It is intended that only one process in a job on a node will call this function. If more than one process in a job on a node calls _node_send(), then only one will "win" (meaning that the data sent by the others will be overwritten). * int ompi_modex_node_recv(...): receive modex data that is relevant for a whole peer node; receive the ["winning"] blob sent by _node_send() from the source node. 
We haven't yet decided what the node index will be; it may be (ompi_proc_t*) (i.e., _node_recv() would figure out what node the (ompi_proc_t*) resides on and then give you the data). 2. Make the existing modex API calls (ompi_modex_send, ompi_modex_recv) be wrappers around the new "proc" send/receive calls. This will provide exactly the same functionality as the current API (but be sub-optimal at scale). It will give BTL authors (etc.) time to update to the new API, potentially taking advantage of common data across multiple processes on the same node. We'll likely put in some opal_output()'s in the wrappers to help identify code that is still calling the old APIs. 3. Remove the old API calls (ompi_modex_send, ompi_modex_recv) before v1.3 is released. ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel -- Jeff Squyres Cisco Systems
Re: [OMPI devel] Ssh tunnelling broken in trunk?
Are these r numbers relevant on the /tmp-public branch, or the trunk? On Apr 2, 2008, at 11:59 AM, Jon Mason wrote: I regressed my tree and it looks like it happened between 17590:17917 On Wednesday 02 April 2008 10:22:52 am Jon Mason wrote: I am noticing that ssh seems to be broken on trunk (and my cpc branch, as it is based on trunk). When I try to use xterm and gdb to debug, I only successfully get 1 xterm. I have tried this on 2 different setups. I can successfully get the xterm's on the 1.2 svn branch. I am running the following command: mpirun --n 2 --host vic12,vic20 -mca btl tcp,self -d xterm -e gdb /usr/mpi/gcc/openmpi-1.2-svn/tests/IMB-3.0/IMB-MPI1 Is anyone else seeing this problem? Thanks, Jon ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel -- Jeff Squyres Cisco Systems
Re: [OMPI devel] RFC: changes to modex
On Apr 2, 2008, at 11:13 AM, Gleb Natapov wrote: On Wed, Apr 02, 2008 at 10:35:03AM -0400, Jeff Squyres wrote: If we use carto to limit hcas/ports are used on a given host on a per-proc basis, then we can include some proc_send data to say "this proc only uses indexes X,Y,Z from the node data". The indexes can be either uint8_ts, or maybe even a variable length bitmap. So you propose that each proc will send info (using node_send()) about every hca/proc on a host even about those that are excluded from use by the proc just in case? And then each proc will have to send additional info (using proc_send() this time) to indicate what hcas/ports it is actually using? No, I think it would be fine to only send the output after btl_openib_if_in|exclude is applied. Perhaps we need an MCA param to say "always send everything" in the case that someone applies a non-homogeneous if_in|exclude set of values...? When is carto stuff applied? Is that what you're really asking about? -- Jeff Squyres Cisco Systems
Re: [OMPI devel] Ssh tunnelling broken in trunk?
On Wednesday 02 April 2008 11:07:18 am Jeff Squyres wrote: > Are these r numbers relevant on the /tmp-public branch, or the trunk? I pulled it out of the command used to update the branch, which was: svn merge -r 17590:17917 https://svn.open-mpi.org/svn/ompi/trunk . In the cpc tmp branch, it happened at r17920. Thanks, Jon > On Apr 2, 2008, at 11:59 AM, Jon Mason wrote: > > I regressed my tree and it looks like it happened between 17590:17917 > > > > On Wednesday 02 April 2008 10:22:52 am Jon Mason wrote: > >> I am noticing that ssh seems to be broken on trunk (and my cpc > >> branch, as > >> it is based on trunk). When I try to use xterm and gdb to debug, I > >> only > >> successfully get 1 xterm. I have tried this on 2 different > >> setups. I can > >> successfully get the xterm's on the 1.2 svn branch. > >> > >> I am running the following command: > >> mpirun --n 2 --host vic12,vic20 -mca btl tcp,self -d xterm -e > >> gdb /usr/mpi/gcc/openmpi-1.2-svn/tests/IMB-3.0/IMB-MPI1 > >> > >> Is anyone else seeing this problem? > >> > >> Thanks, > >> Jon > >> ___ > >> devel mailing list > >> de...@open-mpi.org > >> http://www.open-mpi.org/mailman/listinfo.cgi/devel > > > > ___ > > devel mailing list > > de...@open-mpi.org > > http://www.open-mpi.org/mailman/listinfo.cgi/devel
Re: [OMPI devel] Ssh tunnelling broken in trunk?
I'm using this feature on the trunk with the version from yesterday. It works without problems ... george. On Apr 2, 2008, at 12:14 PM, Jon Mason wrote: On Wednesday 02 April 2008 11:07:18 am Jeff Squyres wrote: Are these r numbers relevant on the /tmp-public branch, or the trunk? I pulled it out of the command used to update the branch, which was: svn merge -r 17590:17917 https://svn.open-mpi.org/svn/ompi/trunk . In the cpc tmp branch, it happened at r17920. Thanks, Jon On Apr 2, 2008, at 11:59 AM, Jon Mason wrote: I regressed my tree and it looks like it happened between 17590:17917 On Wednesday 02 April 2008 10:22:52 am Jon Mason wrote: I am noticing that ssh seems to be broken on trunk (and my cpc branch, as it is based on trunk). When I try to use xterm and gdb to debug, I only successfully get 1 xterm. I have tried this on 2 different setups. I can successfully get the xterm's on the 1.2 svn branch. I am running the following command: mpirun --n 2 --host vic12,vic20 -mca btl tcp,self -d xterm -e gdb /usr/mpi/gcc/openmpi-1.2-svn/tests/IMB-3.0/IMB-MPI1 Is anyone else seeing this problem? Thanks, Jon ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel smime.p7s Description: S/MIME cryptographic signature
Re: [OMPI devel] Ssh tunnelling broken in trunk?
I remember that someone had found a bug that caused orte_debug_flag to not get properly set (local var covering over a global one) - could be that your tmp-public branch doesn't have that patch in it. You might try updating to the latest trunk On 4/2/08 10:41 AM, "George Bosilca" wrote: > I'm using this feature on the trunk with the version from yesterday. > It works without problems ... > >george. > > On Apr 2, 2008, at 12:14 PM, Jon Mason wrote: >> On Wednesday 02 April 2008 11:07:18 am Jeff Squyres wrote: >>> Are these r numbers relevant on the /tmp-public branch, or the trunk? >> >> I pulled it out of the command used to update the branch, which was: >> svn merge -r 17590:17917 https://svn.open-mpi.org/svn/ompi/trunk . >> >> In the cpc tmp branch, it happened at r17920. >> >> Thanks, >> Jon >> >>> On Apr 2, 2008, at 11:59 AM, Jon Mason wrote: I regressed my tree and it looks like it happened between 17590:17917 On Wednesday 02 April 2008 10:22:52 am Jon Mason wrote: > I am noticing that ssh seems to be broken on trunk (and my cpc > branch, as > it is based on trunk). When I try to use xterm and gdb to debug, I > only > successfully get 1 xterm. I have tried this on 2 different > setups. I can > successfully get the xterm's on the 1.2 svn branch. > > I am running the following command: > mpirun --n 2 --host vic12,vic20 -mca btl tcp,self -d xterm -e > gdb /usr/mpi/gcc/openmpi-1.2-svn/tests/IMB-3.0/IMB-MPI1 > > Is anyone else seeing this problem? > > Thanks, > Jon > ___ > devel mailing list > de...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/devel ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel >> >> >> ___ >> devel mailing list >> de...@open-mpi.org >> http://www.open-mpi.org/mailman/listinfo.cgi/devel > > ___ > devel mailing list > de...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/devel
Re: [OMPI devel] RFC: changes to modex
On Wed, Apr 02, 2008 at 12:08:47PM -0400, Jeff Squyres wrote: > On Apr 2, 2008, at 11:13 AM, Gleb Natapov wrote: > > On Wed, Apr 02, 2008 at 10:35:03AM -0400, Jeff Squyres wrote: > >> If we use carto to limit hcas/ports are used on a given host on a > >> per- > >> proc basis, then we can include some proc_send data to say "this proc > >> only uses indexes X,Y,Z from the node data". The indexes can be > >> either uint8_ts, or maybe even a variable length bitmap. > >> > > So you propose that each proc will send info (using node_send()) > > about every > > hca/proc on a host even about those that are excluded from use by > > the proc > > just in case? And then each proc will have to send additional info > > (using > > proc_send() this time) to indicate what hcas/ports it is actually > > using? > > > No, I think it would be fine to only send the output after > btl_openib_if_in|exclude is applied. Perhaps we need an MCA param to > say "always send everything" in the case that someone applies a non- > homogeneous if_in|exclude set of values...? > > When is carto stuff applied? Is that what you're really asking about? > There is no difference between carto and include/exclude. I can specify different openib_if_include values for different procs on the same host. -- Gleb.
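For concreteness, one hypothetical way to produce the heterogeneous case Gleb describes is mpirun's MPMD syntax with per-context MCA parameters, so that two ranks on the same host are restricted to different HCAs. The host and device names (host1, mthca0/mthca1) are made up, and -mca (as opposed to -gmca) is assumed to apply only to its own app context:
---
mpirun --host host1,host1 \
    -np 1 -mca btl openib,self -mca btl_openib_if_include mthca0 ./a.out : \
    -np 1 -mca btl openib,self -mca btl_openib_if_include mthca1 ./a.out
---
With a layout like that, a single per-node blob describing one fixed set of ports cannot match what both ranks actually use, which is why the per-proc index data (or an "always send everything" escape hatch) comes up above.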
Re: [OMPI devel] Ssh tunnelling broken in trunk?
On Wednesday 02 April 2008 11:54:50 am Ralph H Castain wrote: > I remember that someone had found a bug that caused orte_debug_flag to not > get properly set (local var covering over a global one) - could be that > your tmp-public branch doesn't have that patch in it. > > You might try updating to the latest trunk I updated my ompi-trunk tree, did a clean build, and I still seem the same problem. I regressed trunk to rev 17589 and everything works as I expect. So I think the problem is still there in the top of trunk. I don't discount user error, but I don't think I am doing anyting different. Did some setting change that perhaps I did not modify? Thanks, Jon > On 4/2/08 10:41 AM, "George Bosilca" wrote: > > I'm using this feature on the trunk with the version from yesterday. > > It works without problems ... > > > >george. > > > > On Apr 2, 2008, at 12:14 PM, Jon Mason wrote: > >> On Wednesday 02 April 2008 11:07:18 am Jeff Squyres wrote: > >>> Are these r numbers relevant on the /tmp-public branch, or the trunk? > >> > >> I pulled it out of the command used to update the branch, which was: > >> svn merge -r 17590:17917 https://svn.open-mpi.org/svn/ompi/trunk . > >> > >> In the cpc tmp branch, it happened at r17920. > >> > >> Thanks, > >> Jon > >> > >>> On Apr 2, 2008, at 11:59 AM, Jon Mason wrote: > I regressed my tree and it looks like it happened between > 17590:17917 > > On Wednesday 02 April 2008 10:22:52 am Jon Mason wrote: > > I am noticing that ssh seems to be broken on trunk (and my cpc > > branch, as > > it is based on trunk). When I try to use xterm and gdb to debug, I > > only > > successfully get 1 xterm. I have tried this on 2 different > > setups. I can > > successfully get the xterm's on the 1.2 svn branch. > > > > I am running the following command: > > mpirun --n 2 --host vic12,vic20 -mca btl tcp,self -d xterm -e > > gdb /usr/mpi/gcc/openmpi-1.2-svn/tests/IMB-3.0/IMB-MPI1 > > > > Is anyone else seeing this problem? > > > > Thanks, > > Jon > > ___ > > devel mailing list > > de...@open-mpi.org > > http://www.open-mpi.org/mailman/listinfo.cgi/devel > > ___ > devel mailing list > de...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/devel > >> > >> ___ > >> devel mailing list > >> de...@open-mpi.org > >> http://www.open-mpi.org/mailman/listinfo.cgi/devel > > > > ___ > > devel mailing list > > de...@open-mpi.org > > http://www.open-mpi.org/mailman/listinfo.cgi/devel > > ___ > devel mailing list > de...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/devel
Re: [OMPI devel] [PATCH] Fix typo in configure helptext
Thanks! We have a general rule to not apply autogen-worthy changes during the US workday, so I'll commit this tonight. On Apr 2, 2008, at 8:20 AM, Bernhard Fischer wrote: Hi, * config/ompi_configure_options.m4: Fix typo in helptext Please apply. TIA, Bernhard -- Jeff Squyres Cisco Systems
Re: [OMPI devel] RFC: changes to modex
On Apr 2, 2008, at 1:58 PM, Gleb Natapov wrote: No, I think it would be fine to only send the output after btl_openib_if_in|exclude is applied. Perhaps we need an MCA param to say "always send everything" in the case that someone applies a non- homogeneous if_in|exclude set of values...? When is carto stuff applied? Is that what you're really asking about? There is no difference between carto and include/exclude. You mean in terms of when they are applied? I can specify different openib_if_include values for different procs on the same host. I know you *can*, but it is certainly uncommon. The common case is that it's the same for all procs on all hosts. I guess there's a few cases: 1. homogeneous include/exclude, no carto: send all in node info; no proc info 2. homogeneous include/exclude, carto is used: send all ports in node info; send index in proc info for which node info port index it will use 3. heterogeneous include/exclude, no cart: need user to tell us that this situation exists (e.g., use another MCA param), but then is same as #2 4. heterogeneous include/exclude, cart is used, same as #3 Right? -- Jeff Squyres Cisco Systems
Re: [OMPI devel] [PATCH] Fix compilation error without XRC
Thanks; applied https://svn.open-mpi.org/trac/ompi/changeset/18076. On Apr 2, 2008, at 8:21 AM, Bernhard Fischer wrote: Hi, * ompi/mca/btl/openib/btl_openib_component.c (init_one_hca): mca_btl_openib_open_xrc_domain and mca_btl_openib_close_xrc_domain depend on XRC Fixes the compilation failure as in the head of attached patch. TIA, Bernhard -- Jeff Squyres Cisco Systems
Re: [OMPI devel] RFC: changes to modex
On Wed, Apr 02, 2008 at 03:45:20PM -0400, Jeff Squyres wrote: > On Apr 2, 2008, at 1:58 PM, Gleb Natapov wrote: > >> No, I think it would be fine to only send the output after > >> btl_openib_if_in|exclude is applied. Perhaps we need an MCA param to > >> say "always send everything" in the case that someone applies a non- > >> homogeneous if_in|exclude set of values...? > >> > >> When is carto stuff applied? Is that what you're really asking > >> about? > >> > > There is no difference between carto and include/exclude. > > You mean in terms of when they are applied? I mean that there are multiple ways to use different hca/port in different procs on the same host. > > > I can specify > > different openib_if_include values for different procs on the same > > host. > > > I know you *can*, but it is certainly uncommon. The common case is Uncommon - yes, but do you want to make it unsupported? > that it's the same for all procs on all hosts. I guess there's a few > cases: > > 1. homogeneous include/exclude, no carto: send all in node info; no > proc info > 2. homogeneous include/exclude, carto is used: send all ports in node > info; send index in proc info for which node info port index it will use This may actually increase modex size. Think about two procs using two different hcas. We'll send all the data we send today + indexes. > 3. heterogeneous include/exclude, no cart: need user to tell us that > this situation exists (e.g., use another MCA param), but then is same > as #2 > 4. heterogeneous include/exclude, cart is used, same as #3 > > Right? > Looks like it. FWIW I don't like the idea of coding all those special cases. The way it works now I can be pretty sure that any crazy setup I'll come up with will work. By the way, how much data is moved during the modex stage? What if the modex used compression? -- Gleb.
Re: [OMPI devel] Ssh tunnelling broken in trunk?
On Wednesday 02 April 2008 01:21:31 pm Jon Mason wrote: > On Wednesday 02 April 2008 11:54:50 am Ralph H Castain wrote: > > I remember that someone had found a bug that caused orte_debug_flag to not > > get properly set (local var covering over a global one) - could be that > > your tmp-public branch doesn't have that patch in it. > > > > You might try updating to the latest trunk > > I updated my ompi-trunk tree, did a clean build, and I still seem the same > problem. I regressed trunk to rev 17589 and everything works as I expect. > So I think the problem is still there in the top of trunk. I stepped through the revs of trunk and found the first failing rev to be 17632. Its a big patch, so I'll defer to those more in the know to determine what is breaking in there. > I don't discount user error, but I don't think I am doing anyting different. > Did some setting change that perhaps I did not modify? > > Thanks, > Jon > > > On 4/2/08 10:41 AM, "George Bosilca" wrote: > > > I'm using this feature on the trunk with the version from yesterday. > > > It works without problems ... > > > > > >george. > > > > > > On Apr 2, 2008, at 12:14 PM, Jon Mason wrote: > > >> On Wednesday 02 April 2008 11:07:18 am Jeff Squyres wrote: > > >>> Are these r numbers relevant on the /tmp-public branch, or the trunk? > > >> > > >> I pulled it out of the command used to update the branch, which was: > > >> svn merge -r 17590:17917 https://svn.open-mpi.org/svn/ompi/trunk . > > >> > > >> In the cpc tmp branch, it happened at r17920. > > >> > > >> Thanks, > > >> Jon > > >> > > >>> On Apr 2, 2008, at 11:59 AM, Jon Mason wrote: > > I regressed my tree and it looks like it happened between > > 17590:17917 > > > > On Wednesday 02 April 2008 10:22:52 am Jon Mason wrote: > > > I am noticing that ssh seems to be broken on trunk (and my cpc > > > branch, as > > > it is based on trunk). When I try to use xterm and gdb to debug, I > > > only > > > successfully get 1 xterm. I have tried this on 2 different > > > setups. I can > > > successfully get the xterm's on the 1.2 svn branch. > > > > > > I am running the following command: > > > mpirun --n 2 --host vic12,vic20 -mca btl tcp,self -d xterm -e > > > gdb /usr/mpi/gcc/openmpi-1.2-svn/tests/IMB-3.0/IMB-MPI1 > > > > > > Is anyone else seeing this problem? > > > > > > Thanks, > > > Jon > > > ___ > > > devel mailing list > > > de...@open-mpi.org > > > http://www.open-mpi.org/mailman/listinfo.cgi/devel > > > > ___ > > devel mailing list > > de...@open-mpi.org > > http://www.open-mpi.org/mailman/listinfo.cgi/devel > > >> > > >> ___ > > >> devel mailing list > > >> de...@open-mpi.org > > >> http://www.open-mpi.org/mailman/listinfo.cgi/devel > > > > > > ___ > > > devel mailing list > > > de...@open-mpi.org > > > http://www.open-mpi.org/mailman/listinfo.cgi/devel > > > > ___ > > devel mailing list > > de...@open-mpi.org > > http://www.open-mpi.org/mailman/listinfo.cgi/devel > > > ___ > devel mailing list > de...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/devel >
Re: [OMPI devel] Ssh tunnelling broken in trunk?
Can you diagnose a little further: 1. in the case where it works, can you verify that the ssh to launch the orteds is still running? 2. in the case where it doesn't work, can you verify that the ssh to launch the orteds has actually died? On Apr 2, 2008, at 4:58 PM, Jon Mason wrote: On Wednesday 02 April 2008 01:21:31 pm Jon Mason wrote: On Wednesday 02 April 2008 11:54:50 am Ralph H Castain wrote: I remember that someone had found a bug that caused orte_debug_flag to not get properly set (local var covering over a global one) - could be that your tmp-public branch doesn't have that patch in it. You might try updating to the latest trunk I updated my ompi-trunk tree, did a clean build, and I still seem the same problem. I regressed trunk to rev 17589 and everything works as I expect. So I think the problem is still there in the top of trunk. I stepped through the revs of trunk and found the first failing rev to be 17632. Its a big patch, so I'll defer to those more in the know to determine what is breaking in there. I don't discount user error, but I don't think I am doing anyting different. Did some setting change that perhaps I did not modify? Thanks, Jon On 4/2/08 10:41 AM, "George Bosilca" wrote: I'm using this feature on the trunk with the version from yesterday. It works without problems ... george. On Apr 2, 2008, at 12:14 PM, Jon Mason wrote: On Wednesday 02 April 2008 11:07:18 am Jeff Squyres wrote: Are these r numbers relevant on the /tmp-public branch, or the trunk? I pulled it out of the command used to update the branch, which was: svn merge -r 17590:17917 https://svn.open-mpi.org/svn/ompi/trunk . In the cpc tmp branch, it happened at r17920. Thanks, Jon On Apr 2, 2008, at 11:59 AM, Jon Mason wrote: I regressed my tree and it looks like it happened between 17590:17917 On Wednesday 02 April 2008 10:22:52 am Jon Mason wrote: I am noticing that ssh seems to be broken on trunk (and my cpc branch, as it is based on trunk). When I try to use xterm and gdb to debug, I only successfully get 1 xterm. I have tried this on 2 different setups. I can successfully get the xterm's on the 1.2 svn branch. I am running the following command: mpirun --n 2 --host vic12,vic20 -mca btl tcp,self -d xterm -e gdb /usr/mpi/gcc/openmpi-1.2-svn/tests/IMB-3.0/IMB-MPI1 Is anyone else seeing this problem? Thanks, Jon ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel -- Jeff Squyres Cisco Systems
Re: [OMPI devel] Ssh tunnelling broken in trunk?
Here's a real simple diagnostic you can do: set -mca plm_base_verbose 1 and look at the cmd line being executed (send it here). It will look like: [[xxx,1],0] plm:rsh: executing: jjkljks;jldfsaj; If the cmd line has --daemonize on it, then the ssh will close and xterm won't work. Ralph On 4/2/08 3:14 PM, "Jeff Squyres" wrote: > Can you diagnose a little further: > > 1. in the case where it works, can you verify that the ssh to launch > the orteds is still running? > > 2. in the case where it doesn't work, can you verify that the ssh to > launch the orteds has actually died? > > > On Apr 2, 2008, at 4:58 PM, Jon Mason wrote: >> On Wednesday 02 April 2008 01:21:31 pm Jon Mason wrote: >>> On Wednesday 02 April 2008 11:54:50 am Ralph H Castain wrote: I remember that someone had found a bug that caused orte_debug_flag to not get properly set (local var covering over a global one) - could be that your tmp-public branch doesn't have that patch in it. You might try updating to the latest trunk >>> >>> I updated my ompi-trunk tree, did a clean build, and I still seem >>> the same >>> problem. I regressed trunk to rev 17589 and everything works as I >>> expect. >>> So I think the problem is still there in the top of trunk. >> >> >> I stepped through the revs of trunk and found the first failing rev >> to be >> 17632. Its a big patch, so I'll defer to those more in the know to >> determine >> what is breaking in there. >> >> >>> I don't discount user error, but I don't think I am doing anyting >>> different. >>> Did some setting change that perhaps I did not modify? >>> >>> Thanks, >>> Jon >>> On 4/2/08 10:41 AM, "George Bosilca" wrote: > I'm using this feature on the trunk with the version from > yesterday. > It works without problems ... > > george. > > On Apr 2, 2008, at 12:14 PM, Jon Mason wrote: >> On Wednesday 02 April 2008 11:07:18 am Jeff Squyres wrote: >>> Are these r numbers relevant on the /tmp-public branch, or the >>> trunk? >> >> I pulled it out of the command used to update the branch, which >> was: >> svn merge -r 17590:17917 https://svn.open-mpi.org/svn/ompi/trunk . >> >> In the cpc tmp branch, it happened at r17920. >> >> Thanks, >> Jon >> >>> On Apr 2, 2008, at 11:59 AM, Jon Mason wrote: I regressed my tree and it looks like it happened between 17590:17917 On Wednesday 02 April 2008 10:22:52 am Jon Mason wrote: > I am noticing that ssh seems to be broken on trunk (and my cpc > branch, as > it is based on trunk). When I try to use xterm and gdb to > debug, I > only > successfully get 1 xterm. I have tried this on 2 different > setups. I can > successfully get the xterm's on the 1.2 svn branch. > > I am running the following command: > mpirun --n 2 --host vic12,vic20 -mca btl tcp,self -d xterm -e > gdb /usr/mpi/gcc/openmpi-1.2-svn/tests/IMB-3.0/IMB-MPI1 > > Is anyone else seeing this problem? 
> > Thanks, > Jon > ___ > devel mailing list > de...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/devel ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel >> >> ___ >> devel mailing list >> de...@open-mpi.org >> http://www.open-mpi.org/mailman/listinfo.cgi/devel > > ___ > devel mailing list > de...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/devel ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel >>> >>> >>> ___ >>> devel mailing list >>> de...@open-mpi.org >>> http://www.open-mpi.org/mailman/listinfo.cgi/devel >>> >> >> >> ___ >> devel mailing list >> de...@open-mpi.org >> http://www.open-mpi.org/mailman/listinfo.cgi/devel >
Re: [OMPI devel] Ssh tunnelling broken in trunk?
On Wednesday 02 April 2008 05:04:47 pm Ralph Castain wrote:
> Here's a real simple diagnostic you can do: set -mca plm_base_verbose 1
> and look at the cmd line being executed (send it here). It will look
> like:
>
> [[xxx,1],0] plm:rsh: executing: jjkljks;jldfsaj;
>
> If the cmd line has --daemonize on it, then the ssh will close and xterm
> won't work.

[vic20:01863] [[40388,0],0] plm:rsh: executing: (//usr/bin/ssh) [/usr/bin/ssh
vic12 orted --daemonize -mca ess env -mca orte_ess_jobid 2646867968 -mca
orte_ess_vpid 1 -mca orte_ess_num_procs 2 --hnp-uri
"2646867968.0;tcp://192.168.70.150:39057;tcp://10.10.0.150:39057;tcp://86.75.30.10:39057"
--nodename vic12 -mca btl openib,self --mca btl_openib_receive_queues
P,65536,256,128,128 -mca plm_base_verbose 1 -mca mca_base_param_file_path
/usr/mpi/gcc/ompi-trunk/share/openmpi/amca-param-sets:/root -mca
mca_base_param_file_path_force /root]

It looks like what you say is happening. Is this configured somewhere, so
that I can remove it?

Thanks,
Jon
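Jeff's two questions earlier in the thread can be checked with ordinary
process listings; a sketch along these lines (hostnames are just the ones
from this thread) should show whether the ssh that launched the orted is
still alive and whether the orted has detached from it:

  # On the node where mpirun runs: is the launching ssh still around?
  ps auxww | grep '[s]sh vic12'

  # On the remote node: is the orted still a child of that ssh, or has it
  # daemonized and re-parented to init?
  ssh vic12 'ps -ef | grep [o]rted'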
[OMPI devel] Mercurial demo OMPI repository
Thanks to the sysadmins at IU, I put up a sample Mercurial OMPI repository
here:

http://www.open-mpi.org/hg/hgwebdir.cgi/

I converted the entire SVN ompi repository history (/trunk, /tags, and
/branches only) as of r17921. Note that it shows some commits on the 0.9
branch as the most recent activity only because it converts the branches in
reverse order -- the entire trunk is there as of r17921.

You can clone this repository with the following:

hg clone http://www.open-mpi.org/hg/hgwebdir.cgi/jsquyres/hg/ompi-svn-conversion-r17921/

Enjoy.

--
Jeff Squyres
Cisco Systems
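If you keep a clone around, the usual Mercurial commands apply; for example,
something like the following should pull in whatever gets converted next and
show the most recent changesets (assuming the clone directory name matches
the tail of the URL above):

  cd ompi-svn-conversion-r17921
  hg pull -u        # fetch new changesets and update the working copy
  hg log -l 5       # list the five most recent changesets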
Re: [OMPI devel] RFC: changes to modex
On Apr 2, 2008, at 4:12 PM, Gleb Natapov wrote:

>>> I can specify different openib_if_include values for different procs
>>> on the same host.
>>
>> I know you *can*, but it is certainly uncommon. The common case is
>> that it's the same for all procs on all hosts.
>
> Uncommon - yes, but do you want to make it unsupported?

No, there's no need for that.

>> I guess there's a few cases:
>>
>> 1. homogeneous include/exclude, no carto: send all in node info; no
>>    proc info
>> 2. homogeneous include/exclude, carto is used: send all ports in node
>>    info; send index in proc info for which node info port index it
>>    will use
>
> This may actually increase modex size. Think about two procs using two
> different HCAs. We'll send all the data we send today + indexes.

It'll increase it compared to the optimization that we're about to make.
But it will certainly be a large decrease compared to what we're doing
today (see the spreadsheet that I sent last week).

Indeed, we can even put in the optimization that if there's only one
process on a host, it can publish only the ports that it will use (and
therefore there's no need for the proc data).

>> 3. heterogeneous include/exclude, no carto: need user to tell us that
>>    this situation exists (e.g., use another MCA param), but then is
>>    same as #2
>> 4. heterogeneous include/exclude, carto is used: same as #3
>>
>> Right?
>
> Looks like it. FWIW I don't like the idea of coding all those special
> cases. The way it works now I can be pretty sure that any crazy setup
> I'll come up with will work.

And so it will with the new scheme. The only place it won't work is if
the user specifies a heterogeneous include/exclude (i.e., we'll require
that the user tells us when they do that), which nobody does. I guess I
don't see the problem...?

> By the way, how much data is moved during the modex stage? What if the
> modex used compression?

The spreadsheet I listed was just the openib part of the modex, and it
was fairly hefty. I have no idea how well (or not) it would compress.

--
Jeff Squyres
Cisco Systems
Re: [OMPI devel] Ssh tunnelling broken in trunk?
Hmmm...something isn't making sense. Can I see the command line you used to
generate this?

I'll tell you why I'm puzzled. If orte_debug_flag is set, then the
"--daemonize" should NOT be there, and you should see "--debug" on that
command line. What I see is the reverse, which implies to me that
orte_debug_flag is NOT being set to "true".

When I tested here and on odin, though, I found that the -d option correctly
set the flag and everything works just fine.

So there is something in your environment or setup that is messing up that
orte_debug_flag. I have no idea what it could be - the command line should
override anything in your environment, but you could check. Otherwise, if
this diagnostic output came from a command line that included -d or
--debug-devel, or had OMPI_MCA_orte_debug=1 in the environment, then I am at
a loss - everywhere I've tried it, it works fine.

Ralph


On 4/2/08 5:41 PM, "Jon Mason" wrote:

> On Wednesday 02 April 2008 05:04:47 pm Ralph Castain wrote:
>> Here's a real simple diagnostic you can do: set -mca plm_base_verbose 1
>> and look at the cmd line being executed (send it here). It will look
>> like:
>>
>> [[xxx,1],0] plm:rsh: executing: jjkljks;jldfsaj;
>>
>> If the cmd line has --daemonize on it, then the ssh will close and
>> xterm won't work.
>
> [vic20:01863] [[40388,0],0] plm:rsh: executing: (//usr/bin/ssh)
> [/usr/bin/ssh vic12 orted --daemonize -mca ess env -mca orte_ess_jobid
> 2646867968 -mca orte_ess_vpid 1 -mca orte_ess_num_procs 2 --hnp-uri
> "2646867968.0;tcp://192.168.70.150:39057;tcp://10.10.0.150:39057;tcp://86.75.30.10:39057"
> --nodename vic12 -mca btl openib,self --mca btl_openib_receive_queues
> P,65536,256,128,128 -mca plm_base_verbose 1 -mca mca_base_param_file_path
> /usr/mpi/gcc/ompi-trunk/share/openmpi/amca-param-sets:/root -mca
> mca_base_param_file_path_force /root]
>
> It looks like what you say is happening. Is this configured somewhere, so
> that I can remove it?
>
> Thanks,
> Jon
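One quick way to act on the "check your environment" suggestion is something
like the following sketch (the amca-param-sets path is the one from Jon's
verbose output; none of this is a known fix, just a way to narrow things
down):

  # Anything in the environment already setting (or clobbering) the flag?
  env | grep OMPI_MCA_orte

  # Anything in the user or site param files mentioning orte_debug?
  grep -s orte_debug $HOME/.openmpi/mca-params.conf \
      /usr/mpi/gcc/ompi-trunk/share/openmpi/amca-param-sets/*

  # Re-run with the debug flag forced explicitly as an MCA param, plus the
  # verbose launcher output, and check whether --daemonize disappears:
  mpirun -mca orte_debug 1 -mca plm_base_verbose 1 --n 2 \
      --host vic12,vic20 -mca btl tcp,self xterm -e \
      gdb /usr/mpi/gcc/openmpi-1.2-svn/tests/IMB-3.0/IMB-MPI1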
Re: [OMPI devel] Ssh tunnelling broken in trunk?
There is one other thing you can check - check for stale libraries on your
backend nodes. The options on the daemons changed. They used to always
daemonize unless told otherwise. They now do NOT daemonize unless told to do
so. If the orted executables back there are "stale", then you will get the
incorrect behavior.

I don't think that is the problem here as your command line looks simply
wrong per my comments below, but it might be worth checking out anyway.

Ralph


On 4/2/08 7:04 PM, "Ralph Castain" wrote:

> Hmmm...something isn't making sense. Can I see the command line you used
> to generate this?
>
> I'll tell you why I'm puzzled. If orte_debug_flag is set, then the
> "--daemonize" should NOT be there, and you should see "--debug" on that
> command line. What I see is the reverse, which implies to me that
> orte_debug_flag is NOT being set to "true".
>
> When I tested here and on odin, though, I found that the -d option
> correctly set the flag and everything works just fine.
>
> So there is something in your environment or setup that is messing up
> that orte_debug_flag. I have no idea what it could be - the command line
> should override anything in your environment, but you could check.
> Otherwise, if this diagnostic output came from a command line that
> included -d or --debug-devel, or had OMPI_MCA_orte_debug=1 in the
> environment, then I am at a loss - everywhere I've tried it, it works
> fine.
>
> Ralph
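Checking for a stale orted on a backend node is easy to script; a sketch
along these lines compares what the head node and a remote node would
actually run (hostnames and paths are only the ones used in this thread):

  # Which orted does each side pick up, and when was it built?
  which orted && ls -l $(which orted)
  ssh vic12 'which orted && ls -l $(which orted)'
  # (non-interactive ssh may use a different PATH; adjust if orted is not
  # found that way)

  # If the sizes or dates differ, the remote node is likely running an old
  # orted that still daemonizes by default.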