On Jul 23, 2013, at 3:56 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:

> On Jul 21, 2013, at 8:50 AM, Kevin H. Hobbs <hob...@ohio.edu> wrote:
> 
>>> Ah! That would indicate an issue with the external hwloc
>>> package they provided, which is the big reason we don't
>>> recommend installing from packages.
>> 
>> I'll happily report the bug to the hwloc developers.
> 
> I don't think that this is necessarily an hwloc bug.
> 
>> I'll also add what we've found here to the bug on the Fedora
>> bugzilla.
>> 
>> Is there anything more I can do on this list to figure out the
>> nature of the bug?
>> 
>>> We have internal copies of hwloc and libevent that ensure (a)
>>> they are at the proper level, and (b) they are configured
>>> properly for OMPI's use.
>> 
>> It does look like Fedora's hwloc is ahead of OMPI's.
>> 
>> Fedora 18 has openmpi-1.6.3 and hwloc-1.4.2.
>> 
>> The source of openmpi-1.6.5 has hwloc-1.3.2.
> 
> Hypothetically, hwloc 1.4.x is backwards source-compatible with hwloc 1.3.x, 
> but we have not tested this.  I don't know if hwloc has, either (I'm sure 
> they haven't tested with Open MPI 1.6.x).
> 
>> How can I tell what the configuration differences are?
>> 
>> The entire configure section of the .spec file in
>> hwloc-1.4.2-2.fc18.src.rpm is:
>> 
>> %configure
>> %{__make} %{?_smp_mflags} V=1
> 
> OMPI builds hwloc in "embedded" mode, which means that OMPI's configure line 
> is used to build hwloc (vs. having a separate configure invocation for 
> hwloc).  They're hypothetically the moral equivalent of each other, but 
> perhaps something is different somehow...
> 
>> I don't see anything that looks like any hwloc configure options
>> are being set.
>> 
>> How do I tell how OMPI configures its bundled hwloc?
> 
> With this embedded mechanism, we're calling hwloc's configury with the moral 
> equivalent of:
> 
> ./configure --disable-cairo --disable-libxml2 --enable-xml --with-hwloc-symbol-prefix=opal_hwloc152_ --enable-embedded-mode
> 
>> Better yet, I'd like to figure out the actual nature of the bug
>> and report it in the proper place.
> 
> 
> Yes, it's curious that they can't reproduce your issue,

Guess I missed this - where does it say that they can't reproduce the issue?
I'm suspicious because build-from-source produced a working result.

> which suggests that the hwloc issue is a red herring (because, as stated 
> above, hwloc *should* be backwards compatible).
> 
> Ralph: is there an easy way to find out more detail on why 
> orte_util_nidmap_init() failed without attaching a debugger?

A debugger would be the best way.

> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 

