[hwloc-devel] Git testing of hwloc

2013-09-06 Thread Jeff Squyres (jsquyres)
Brice / Samuel --

We seem to be in pretty good shape:

- github commits work as expected
- we're getting emails sent upon pushes to github
- DongInn got the "refs #X" and "closes #X" stuff working (i.e., putting tokens 
in git commit messages affects Trac tickets -- see 
http://trac.edgewall.org/wiki/CommitTicketUpdater)

Do you want to do some testing?  What are your github IDs?

The test Trac instance is located here:

https://git.open-mpi.org/trac/hwloc/

Once we're happy with all this setup, we can do the conversion for real and 
start working forward with github and leave SVN behind.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI devel] (no subject)

2013-09-06 Thread Jeff Squyres (jsquyres)
On Sep 6, 2013, at 8:06 AM, Alex Margolin  wrote:

> Sorry for the title and the html... send button got pressed too early.
> 
> Anyway, I tried to build OMPI without threads at all with the following 
> command:
> 
> ./configure --prefix=/cs/mosna/alexam02/ompi CFLAGS=-m64 CXXFLAGS=-m64 
> --without-threads --without-hwloc --enable-mca-no-build=maffinity,paffinity 
> --enable-contrib-no-build=libnbc,vt

FWIW, there's no maffinity/paffinity any more.  And you can just --disable-vt.

> Sadly, the build failed very early:
> 
>   CC runtime/opal_info_support.lo
> runtime/opal_info_support.c: In function 'opal_info_do_params':
> runtime/opal_info_support.c:444:9: error: 'errno' undeclared (first use in 
> this function)
> runtime/opal_info_support.c:444:9: note: each undeclared identifier is 
> reported only once for each function it appears in
> make[2]: *** [runtime/opal_info_support.lo] Error 1
> make[2]: Leaving directory `/a/store-04/h/lab/mosix/alexam02/ompi-jeff/opal'
> make[1]: *** [all-recursive] Error 1
> make[1]: Leaving directory `/a/store-04/h/lab/mosix/alexam02/ompi-jeff/opal'
> make: *** [all-recursive] Error 1
> 
> Should this be a trac ticket?

Seems like it should be an easy fix (e.g., a missing header file?) -- can you 
submit a patch?
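
The compiler message quoted above is what one typically sees when errno is 
used without including <errno.h>. As a minimal sketch of that pattern and its 
fix (this is not the actual content of opal_info_support.c, and the helper 
parse_long_or_fail is a hypothetical name used only for illustration):

```c
#include <errno.h>   /* declares errno; without it: "'errno' undeclared" */
#include <stdlib.h>  /* strtol */

/* Hypothetical helper, for illustration only: parse a decimal integer and
 * detect failure through errno, as much C library-facing code does.
 * Returns 0 on success and stores the value in *out, -1 otherwise. */
int parse_long_or_fail(const char *text, long *out)
{
    char *end = NULL;

    errno = 0;                              /* clear any stale error first */
    long value = strtol(text, &end, 10);
    if (errno != 0 || end == text || *end != '\0') {
        return -1;                          /* overflow, empty input, or trailing junk */
    }
    *out = value;
    return 0;
}
```

If the failing line in opal_info_support.c follows the same pattern, adding 
the include near the top of that file would likely be the whole patch.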

-- 
Jeff Squyres
jsquy...@cisco.com



Re: [hwloc-devel] [mpich-core] hwloc-1.6.x tainted with GPL?

2013-09-06 Thread Jeff Squyres (jsquyres)
On Sep 6, 2013, at 10:50 AM, Pavan Balaji  wrote:

> Aha!  I was never happier to be wrong.  :-)  Woohoo!
> 
> Thanks for the clarification.

No problem.  Go smack your partner for not doing their due diligence properly. 
:-)

> FWIW, I said that you guys have LGPL-2.1 code, so technically it's still 
> correct ;-).  But I blame Fortran for this oversight.

LOL!

(that was an MPI Forum/inside joke for those of you wondering WTF it meant :-) )

-- 
Jeff Squyres
jsquy...@cisco.com



Re: [hwloc-devel] [mpich-core] hwloc-1.6.x tainted with GPL?

2013-09-06 Thread Pavan Balaji

Aha!  I was never happier to be wrong.  :-)  Woohoo!

Thanks for the clarification.

FWIW, I said that you guys have LGPL-2.1 code, so technically it's still 
correct ;-).  But I blame Fortran for this oversight.

 -- Pavan

On Sep 6, 2013, at 9:41 AM, Jeff Squyres (jsquyres) wrote:

> On Sep 6, 2013, at 10:30 AM, Pavan Balaji  wrote:
> 
> >> The mpich-3.0.x series was released with hwloc-1.6.x.  One of our partners 
>> just brought it to our attention that this version of hwloc has LGPL-2.1 
>> code in src/libltdl.
> 
> Because GPL violations are quite serious, I want to be totally clear: ***your 
> statement is totally incorrect***.
> 
> libltdl distributed in hwloc is not LGPL.
> 
> I cite the comments in src/libltdl/ltdl.h:
> 
> -
> As a special exception to the GNU Lesser General Public License,
> if you distribute this file as part of a program or library that
> is built using GNU Libtool, you may include this file under the
> same distribution terms that you use for the rest of that program.
> -
> 
> This special exception is included in every single file under src/libltdl 
> except README and COPYING.LIB.
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> 

--
Pavan Balaji
http://www.mcs.anl.gov/~balaji



Re: [hwloc-devel] hwloc-1.6.x tainted with GPL?

2013-09-06 Thread Jeff Squyres (jsquyres)
On Sep 6, 2013, at 10:30 AM, Pavan Balaji  wrote:

> The mpich-3.0.x series was released with hwloc-1.6.x.  One of our partners 
> just brought it to our attention that this version of hwloc has LGPL-2.1 code 
> in src/libltdl.

Because GPL violations are quite serious, I want to be totally clear: ***your 
statement is totally incorrect***.

libltdl distributed in hwloc is not LGPL.

I cite the comments in src/libltdl/ltdl.h:

-
As a special exception to the GNU Lesser General Public License,
if you distribute this file as part of a program or library that
is built using GNU Libtool, you may include this file under the
same distribution terms that you use for the rest of that program.
-

This special exception is included in every single file under src/libltdl 
except README and COPYING.LIB.

-- 
Jeff Squyres
jsquy...@cisco.com



[hwloc-devel] hwloc-1.6.x tainted with GPL?

2013-09-06 Thread Pavan Balaji
Folks,

The mpich-3.0.x series was released with hwloc-1.6.x.  One of our partners just 
brought it to our attention that this version of hwloc has LGPL-2.1 code in 
src/libltdl.

We are upgrading our code to use hwloc-1.7.x, which doesn't seem to have this 
problem, but if 1.6.x is still supported, you guys might want to get that fixed 
in there.  Or at least put up a warning on the website about this.

Regards,

 -- Pavan

--
Pavan Balaji
http://www.mcs.anl.gov/~balaji



Re: [OMPI devel] (no subject)

2013-09-06 Thread Alex Margolin
Sorry for the title and the html... send button got pressed too early.

Anyway, I tried to build OMPI without threads at all with the following
command:

./configure --prefix=/cs/mosna/alexam02/ompi CFLAGS=-m64 CXXFLAGS=-m64
--without-threads --without-hwloc --enable-mca-no-build=maffinity,paffinity
--enable-contrib-no-build=libnbc,vt

Sadly, the build failed very early:

  CC runtime/opal_info_support.lo
runtime/opal_info_support.c: In function 'opal_info_do_params':
runtime/opal_info_support.c:444:9: error: 'errno' undeclared (first use in
this function)
runtime/opal_info_support.c:444:9: note: each undeclared identifier is
reported only once for each function it appears in
make[2]: *** [runtime/opal_info_support.lo] Error 1
make[2]: Leaving directory `/a/store-04/h/lab/mosix/alexam02/ompi-jeff/opal'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/a/store-04/h/lab/mosix/alexam02/ompi-jeff/opal'
make: *** [all-recursive] Error 1

Should this be a trac ticket?

Alex



On Fri, Sep 6, 2013 at 1:22 PM, Alex Margolin  wrote:

> Hi,
>
> I'm building ompi r29104 with the following command:
>
> make distclean && ./autogen.sh && ./configure
> --prefix=/cs/mosna/alexam02/ompi CFLAGS=-m64 CXXFLAGS=-m64 --without-hwloc
> --disable-mpi-threads --disable-progress-threads
> --enable-mca-no-build=maffinity,paffinity
> --enable-contrib-no-build=libnbc,vt && make && make install
>
> When I build and run any MPI app, I'm getting the following error (and the
> app fails):
>
> mpirun: Symbol `orte_process_info' has different size in shared object,
> consider re-linking
> mpirun: Symbol `orte_plm' has different size in shared object, consider
> re-linking
> mpirun: symbol lookup error: mpirun: undefined symbol:
> orte_trigger_event_t_class
>
> Anybody ever stumbled on this or something similar in the past?
>
> Thanks,
> Alex
>


[OMPI devel] (no subject)

2013-09-06 Thread Alex Margolin
Hi,

I'm building ompi r29104 with the following command:

make distclean && ./autogen.sh && ./configure
--prefix=/cs/mosna/alexam02/ompi CFLAGS=-m64 CXXFLAGS=-m64 --without-hwloc
--disable-mpi-threads --disable-progress-threads
--enable-mca-no-build=maffinity,paffinity
--enable-contrib-no-build=libnbc,vt && make && make install

When I build and run any MPI app, I'm getting the following error (and the
app fails):

mpirun: Symbol `orte_process_info' has different size in shared object,
consider re-linking
mpirun: Symbol `orte_plm' has different size in shared object, consider
re-linking
mpirun: symbol lookup error: mpirun: undefined symbol:
orte_trigger_event_t_class

Anybody ever stumbled on this or something similar in the past?

Thanks,
Alex


Re: [OMPI devel] Open-MPI build of NAMD launched from srun over 20% slower than with mpirun

2013-09-06 Thread Christopher Samuel

On 06/09/13 14:14, Christopher Samuel wrote:

> However, modifying the test program confirms that variable is getting
> propagated as expected with both mpirun and srun for 1.6.5 and the 1.7
> snapshot. :-(

Investigating further by setting:

export OMPI_MCA_orte_report_bindings=1
export SLURM_CPU_BIND=core
export SLURM_CPU_BIND_VERBOSE=verbose

reveals that only OMPI 1.6.5 with mpirun reports bindings being set
(see below).   We cannot understand why Slurm doesn't *appear* to be
setting bindings as we have the correct settings according to the
documentation.

Whilst it may explain the difference between 1.6.5 mpirun and srun,
it doesn't explain why the 1.7 snapshot is so much better, as you'd
expect them to be hurt in the same way.


==OPENMPI 1.6.5==
==mpirun==
[barcoo003:03633] System has detected external process binding to cores 0001
[barcoo003:03633] MCW rank 0 bound to socket 0[core 0]: [B]
[barcoo004:04504] MCW rank 1 bound to socket 0[core 0]: [B]
Hello, World, I am 0 of 2 on host barcoo003 from app number 0 universe size 2 
universe envar 2
Hello, World, I am 1 of 2 on host barcoo004 from app number 0 universe size 2 
universe envar 2
==srun==
Hello, World, I am 0 of 2 on host barcoo003 from app number 1 universe size 2 
universe envar NULL
Hello, World, I am 1 of 2 on host barcoo004 from app number 1 universe size 2 
universe envar NULL
=
==OPENMPI 1.7.3==
DANGER: YOU ARE LOADING A TEST VERSION OF OPENMPI. THIS MAY BE BAD.
==mpirun==
Hello, World, I am 0 of 2 on host barcoo003 from app number 0 universe size 2 
universe envar 2
Hello, World, I am 1 of 2 on host barcoo004 from app number 0 universe size 2 
universe envar 2
==srun==
Hello, World, I am 0 of 2 on host barcoo003 from app number 0 universe size 2 
universe envar NULL
Hello, World, I am 1 of 2 on host barcoo004 from app number 0 universe size 2 
universe envar NULL
=
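
Since only one of the four runs reports bindings, it may help to check what 
the kernel itself says, independently of either launcher's reporting. The 
following is a minimal sketch under stated assumptions: it is Linux-specific, 
it reads the Cpus_allowed_list field from /proc/self/status, and the function 
name allowed_cpu_list is hypothetical (it is not part of Open MPI or Slurm).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy the kernel's list of CPUs the calling process
 * may run on (for example "0" or "0-15") into buf.
 * Returns 0 on success, -1 on failure. Linux-specific. */
int allowed_cpu_list(char *buf, size_t buflen)
{
    static const char key[] = "Cpus_allowed_list:";
    char line[512];
    int rc = -1;

    if (buf == NULL || buflen == 0) {
        return -1;
    }
    FILE *fp = fopen("/proc/self/status", "r");
    if (fp == NULL) {
        return -1;
    }
    while (fgets(line, sizeof(line), fp) != NULL) {
        if (strncmp(line, key, sizeof(key) - 1) == 0) {
            const char *value = line + sizeof(key) - 1;
            value += strspn(value, " \t");          /* skip padding after the key */
            size_t len = strcspn(value, "\n");      /* drop the trailing newline */
            if (len >= buflen) {
                len = buflen - 1;                   /* truncate rather than overflow */
            }
            memcpy(buf, value, len);
            buf[len] = '\0';
            rc = 0;
            break;
        }
    }
    fclose(fp);
    return rc;
}
```

Calling this from each rank of the test program (for instance right after 
MPI_Init) and printing the result next to the hostname would show whether a 
process launched via srun is restricted to a single core or is free to run on 
every core of the node.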



-- 
 Christopher Samuel        Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computation Initiative
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
 http://www.vlsci.org.au/  http://twitter.com/vlsci



Re: [OMPI devel] Open-MPI build of NAMD launched from srun over 20% slower than with mpirun

2013-09-06 Thread Christopher Samuel

On 06/09/13 00:23, Hjelm, Nathan T wrote:

> I assume that process binding is enabled for both mpirun and srun?
> If not that could account for a difference between the runtimes.

You raise an interesting point, we have been doing that with:

[samuel@barcoo ~]$ module show openmpi 2>&1 | grep binding
setenv   OMPI_MCA_orte_process_binding core

However, modifying the test program confirms that variable is getting
propagated as expected with both mpirun and srun for 1.6.5 and the 1.7
snapshot. :-(

cheers,
Chris
-- 
 Christopher Samuel        Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computation Initiative
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
 http://www.vlsci.org.au/  http://twitter.com/vlsci
