Re: [OMPI devel] hang with launch including remote nodes

2012-06-21 Thread Ralph Castain
Now fixed with r26631

On Jun 20, 2012, at 9:48 PM, Eugene Loh wrote:
> On 06/19/12 23:11, Ralph Castain wrote:
>> Also, how did you configure this version?
> --enable-heterogeneous
> --enable-cxx-exceptions
> --enable-shared
> --enable-orterun-prefix-by-default
> --with-sge
>

Re: [OMPI devel] OpenIB compile error

2012-06-21 Thread Jeff Squyres
On Jun 21, 2012, at 11:10 AM, Shamis, Pavel wrote:
> OpenIB BTL is the primary source cause for existence of the OOB UD component.

Err.. I'm confused. These are two unrelated things. I.e., there's no reason to think that OOB and MPI transport/protocol types are related. Indeed, we've been

Re: [OMPI devel] OpenIB compile error

2012-06-21 Thread Shamis, Pavel
BTW, if people want to rename openib btl to something else and then change the configure scripts - I'm ok. About naming - I would agree with Terry, it makes sense to name it after network API used for this btl - "verbs" (it is not ibverbs). Bottom line, I would recommend to keep configure

Re: [OMPI devel] OpenIB compile error

2012-06-21 Thread Shamis, Pavel
> On Jun 20, 2012, at 4:25 PM, Shamis, Pavel wrote:
>
>> I hate it ...
>>
>> As far as I understand it is not reason to rename it. The OFED-lovin
>> components should look at $with_openib.
>
> Ah, sorry -- I didn't think this would be controversial.

It is not controversial. The "hate" was

Re: [hwloc-devel] HWLOC_NBMAXCPUS

2012-06-21 Thread TERRY DONTJE
I guess I was looking at the wrong version of code since I now see that topology-linux.c has fixed this issue. I guess I will need to look to port this change over to solaris-topology.c

--td

On 6/21/2012 9:46 AM, TERRY DONTJE wrote:
> I see a couple places where HWLOC_NBMAXCPUS is defined with

Re: [hwloc-devel] HWLOC_NBMAXCPUS

2012-06-21 Thread Samuel Thibault
TERRY DONTJE, on Thu 21 Jun 2012 15:47:22 +0200, wrote:
> I see a couple places where HWLOC_NBMAXCPUS is defined with a comment of
> "FIXME: drop".  This static size just bit me on a machine that has 1440 CPUs.
> I can bump up the define in my clone but I was wondering if this fixed size
>

[hwloc-devel] HWLOC_NBMAXCPUS

2012-06-21 Thread TERRY DONTJE
I see a couple places where HWLOC_NBMAXCPUS is defined with a comment of "FIXME: drop". This static size just bit me on a machine that has 1440 CPUs. I can bump up the define in my clone but I was wondering if this fixed size might change in the near future? -- Terry D. Dontje | Principal

Re: [OMPI devel] OpenIB compile error

2012-06-21 Thread Jeff Squyres
On Jun 21, 2012, at 8:40 AM, TERRY DONTJE wrote:
> So you specify --with-ofed and you get mca_btl_openib generated? ICK!!! I
> think that will just make things more confusing. I am against this unless
> you change the btl name.

We already have this situation. There are 4 components that

Re: [OMPI devel] OpenIB compile error

2012-06-21 Thread TERRY DONTJE
On 6/21/2012 6:38 AM, Jeff Squyres wrote: On Jun 21, 2012, at 6:11 AM, TERRY DONTJE wrote: As far as I understand it is not reason to rename it. The OFED-lovin components should look at $with_openib. I agree with Pasha that the reason you give for renaming openib btl seem orthogonal to

Re: [OMPI devel] SVN / Trac appears to be down

2012-06-21 Thread Jeff Squyres
SVN and Trac are back.

On Jun 21, 2012, at 7:17 AM, Jeff Squyres wrote:
> We're investigating with IU.
>
> Sorry for the interruption, folks...
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>

Re: [OMPI devel] hang with launch including remote nodes

2012-06-21 Thread Ralph Castain
Got it! Will take a little thinking to fix - it's basically a conflict between rollup and tree spawn. For now, you can run with:

-mca orte_use_common_port 0 -mca plm_rsh_no_tree_spawn 1

Sorry about that - thanks for letting me know!
Ralph

On Jun 20, 2012, at 9:48 PM, Eugene Loh wrote:
> On
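
[Editor's sketch] For anyone applying the workaround above, here is roughly how the two MCA settings could be passed, either on the mpirun command line or through the standard OMPI_MCA_<param> environment variables. The host names (node1, node2), process count, and binary (./a.out) are hypothetical placeholders, not taken from the thread; only the two MCA parameters come from Ralph's message. The command is built and printed rather than executed, so it can be checked before use.

```shell
#!/bin/sh
# Workaround parameters quoted from the message above.
WORKAROUND="-mca orte_use_common_port 0 -mca plm_rsh_no_tree_spawn 1"

# Hypothetical job description (assumptions for illustration only).
HOSTS="node1,node2"
APP="./a.out"

# Option 1: pass the parameters on the mpirun command line.
CMD="mpirun $WORKAROUND -np 2 --host $HOSTS $APP"
echo "$CMD"

# Option 2: set the same parameters via the environment, so that a
# plain mpirun invocation in this shell would pick them up.
export OMPI_MCA_orte_use_common_port=0
export OMPI_MCA_plm_rsh_no_tree_spawn=1
```

The environment form may be more convenient when mpirun is invoked from scripts that are awkward to edit; either way the settings are meant as a temporary measure until the rollup/tree-spawn conflict is fixed.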

[OMPI devel] SVN / Trac appears to be down

2012-06-21 Thread Jeff Squyres
We're investigating with IU.

Sorry for the interruption, folks...

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

Re: [OMPI devel] OpenIB compile error

2012-06-21 Thread Jeff Squyres
On Jun 21, 2012, at 6:11 AM, TERRY DONTJE wrote:
>>> As far as I understand it is not reason to rename it. The OFED-lovin
>>> components should look at $with_openib.
>>>
> I agree with Pasha that the reason you give for renaming openib btl seem
> orthogonal to renaming the btl. Note that

Re: [OMPI devel] OpenIB compile error

2012-06-21 Thread TERRY DONTJE
On 6/20/2012 5:02 PM, Jeff Squyres wrote: On Jun 20, 2012, at 4:25 PM, Shamis, Pavel wrote: I hate it ... As far as I understand it is not reason to rename it. The OFED-lovin components should look at $with_openib. I agree with Pasha that the reason you give for renaming openib btl seem

Re: [OMPI devel] hang with launch including remote nodes

2012-06-21 Thread Eugene Loh
On 06/19/12 23:11, Ralph Castain wrote:
> Also, how did you configure this version?

--enable-heterogeneous
--enable-cxx-exceptions
--enable-shared
--enable-orterun-prefix-by-default
--with-sge
--enable-mpi-f90
--with-mpi-f90-size=small
--disable-peruse