Re: [Numpy-discussion] performance of numpy.array()

2015-04-29 Thread Nick Papior Andersen
You could try and install your own numpy to check whether that resolves the
problem.

2015-04-29 17:40 GMT+02:00 simona bellavista afy...@gmail.com:

 on cluster A 1.9.0 and on cluster B 1.8.2

 2015-04-29 17:18 GMT+02:00 Nick Papior Andersen nickpap...@gmail.com:

 Compile it yourself to know the limitations/benefits of the dependency
 libraries.

 Otherwise, have you checked which versions of numpy they are, i.e. are
 they the same version?

 2015-04-29 17:05 GMT+02:00 simona bellavista afy...@gmail.com:

 I work on two distinct scientific clusters. I have run the same python
 code on the two clusters and I have noticed that one is faster by an order
 of magnitude than the other (1min vs 10min, this is important because I run
 this function many times).

 I investigated with a profiler and found that the cause (same code and
 same data) is the function numpy.array, which is called 10^5 times. On
 cluster A it takes 2 s in total, whereas on cluster B it takes ~6 min. As
 for the other functions, they are generally faster on cluster A. I
 understand that the clusters are quite different, in both hardware and
 installed libraries. It strikes me that the performance differs so much
 for this particular function. I would have thought that this is due to a
 difference in available memory, but according to `top` only 0.1% of the
 memory is in use on cluster B. In theory numpy is compiled with ATLAS on
 cluster B; on cluster A it is not clear, because numpy.__config__.show()
 returns NOT AVAILABLE for everything.

 Does anybody have any insight into this, and into whether I can improve
 the performance on cluster B?

 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion




 --
 Kind regards Nick








-- 
Kind regards Nick


Re: [Numpy-discussion] performance of numpy.array()

2015-04-29 Thread Nick Papior Andersen
Compile it yourself to know the limitations/benefits of the dependency
libraries.

Otherwise, have you checked which versions of numpy they are, i.e. are they
the same version?
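
The version and build check suggested above, together with a timing of
numpy.array itself, could be scripted roughly as follows and run on both
clusters. This is only a sketch: the helper names are made up for
illustration, and the 100-element list is an assumption standing in for the
real input data, which the thread does not describe.

```python
import timeit


def seconds_per_call(stmt, setup="pass", n_calls=1000, repeats=3):
    """Return the best-of-`repeats` average time of one run of `stmt`, in seconds."""
    timer = timeit.Timer(stmt, setup=setup)
    # Each entry returned by repeat() is the total time for `n_calls` executions.
    return min(timer.repeat(repeat=repeats, number=n_calls)) / n_calls


def numpy_report(n_calls=10_000):
    """Print the numpy version, its build configuration and a numpy.array timing."""
    import numpy  # imported lazily so seconds_per_call() also works without numpy

    print("numpy version:", numpy.__version__)
    numpy.__config__.show()  # the same call mentioned in the original message
    # Assumption: a 100-element Python list stands in for the real input data.
    per_call = seconds_per_call(
        "numpy.array(data)",
        setup="import numpy; data = list(range(100))",
        n_calls=n_calls,
    )
    print("numpy.array: %.2e s per call" % per_call)
```

Calling numpy_report() on cluster A and on cluster B gives directly
comparable numbers; if the per-call time already differs by orders of
magnitude for such a small input, the data itself is unlikely to be the
cause.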

2015-04-29 17:05 GMT+02:00 simona bellavista afy...@gmail.com:

 I work on two distinct scientific clusters. I have run the same python
 code on the two clusters and I have noticed that one is faster by an order
 of magnitude than the other (1min vs 10min, this is important because I run
 this function many times).

 I investigated with a profiler and found that the cause (same code and
 same data) is the function numpy.array, which is called 10^5 times. On
 cluster A it takes 2 s in total, whereas on cluster B it takes ~6 min. As
 for the other functions, they are generally faster on cluster A. I
 understand that the clusters are quite different, in both hardware and
 installed libraries. It strikes me that the performance differs so much
 for this particular function. I would have thought that this is due to a
 difference in available memory, but according to `top` only 0.1% of the
 memory is in use on cluster B. In theory numpy is compiled with ATLAS on
 cluster B; on cluster A it is not clear, because numpy.__config__.show()
 returns NOT AVAILABLE for everything.

 Does anybody have any insight into this, and into whether I can improve
 the performance on cluster B?





-- 
Kind regards Nick


[Numpy-discussion] PR, extended site.cfg capabilities

2015-02-24 Thread Nick Papior Andersen
Dear all,

I have opened PR-5597 https://github.com/numpy/numpy/pull/5597,
which enables reading new flags from the site.cfg file.
@rgommers requested that I post some information on this list; possibly
somebody could test it on their setup.

So the PR basically enables reading these extra options in each section:
runtime_library_dirs: adds runtime library directories to the shared
libraries (overrides the dreaded LD_LIBRARY_PATH)
extra_compile_args: adds extra compile flags to the compilation
extra_link_args: adds extra flags when linking to libraries

Note that this PR will fix a lot of issues down the line: all software
which uses numpy's distutils will benefit from it.
As an example, I have successfully set runtime_library_dirs in numpy's
site.cfg; scipy, petsc4py, pygsl and slepc4py pick up these flags, which
enables me to create environments without the need for LD_LIBRARY_PATH.

The other options simply add to the flexibility of the compilation, e.g. to
test different optimisations.

For instance my OpenBLAS section looks like this:
[openblas]
library_dirs = /opt/openblas/0.2.13/gnu-4.9.2/lib
include_dirs = /opt/openblas/0.2.13/gnu-4.9.2/include
runtime_library_dirs = /opt/openblas/0.2.13/gnu-4.9.2/lib
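
To make the section above concrete, here is a rough sketch of how such a
section can be read with Python's standard configparser module. It is not the
code from the PR (numpy.distutils has its own site.cfg handling); the helper
name read_dirs is hypothetical, and splitting several directories on
os.pathsep is an assumption about the list format.

```python
import configparser
import os

# The [openblas] section quoted above, embedded as a string for the example.
SITE_CFG = """
[openblas]
library_dirs = /opt/openblas/0.2.13/gnu-4.9.2/lib
include_dirs = /opt/openblas/0.2.13/gnu-4.9.2/include
runtime_library_dirs = /opt/openblas/0.2.13/gnu-4.9.2/lib
"""


def read_dirs(cfg_text, section, option):
    """Return the directories listed under `option` in `section` (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    # A missing section or option yields an empty list instead of an error.
    raw = parser.get(section, option, fallback="")
    # Assumption: multiple directories are separated by os.pathsep.
    return [p.strip() for p in raw.split(os.pathsep) if p.strip()]


print(read_dirs(SITE_CFG, "openblas", "runtime_library_dirs"))
```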

I hope this can be of use to somebody other than me :)

Feel free to test it and provide feedback!

-- 
Kind regards Nick


Re: [Numpy-discussion] PR, extended site.cfg capabilities

2015-02-24 Thread Nick Papior Andersen
2015-02-24 14:31 GMT+01:00 Julian Taylor jtaylor.deb...@googlemail.com:

 On 02/24/2015 02:16 PM, Nick Papior Andersen wrote:
  Dear all,
 
  I have opened PR-5597 https://github.com/numpy/numpy/pull/5597,
  which enables reading new flags from the site.cfg file.
  @rgommers requested that I post some information on this list;
  possibly somebody could test it on their setup.

 I do not fully understand the purpose of these changes. Can you give
 some more detailed use cases?


 
  So the PR basically enables reading these extra options in each section:
  runtime_library_dirs : Add runtime library directories to the shared
  libraries (overrides the dreaded LD_LIBRARY_PATH)

 LD_LIBRARY_PATH should not be used during compilation, this is a runtime
 flag that numpy.distutils has no control over.
 Can you explain in more detail what you intend to do with this flag?

Yes, but in my case I almost never set LD_LIBRARY_PATH; instead I link with
the runtime library directory so that LD_LIBRARY_PATH need not be set.
Consider this output for linalg/lapack_lite.so:
$ echo $LD_LIBRARY_PATH

$ ldd lapack_lite.so
  libopenblas.so.0 => not found
$ echo $LD_LIBRARY_PATH
/path/to/openblas/lib
$ ldd lapack_lite.so
  libopenblas.so.0 => /path/to/openblas/lib/libopenblas.so.0

However, if I compile numpy with
runtime_library_dirs = /path/to/openblas/lib
in the openblas section, then the output would be
$ echo $LD_LIBRARY_PATH

$ ldd lapack_lite.so
  libopenblas.so.0 => /path/to/openblas/lib/libopenblas.so.0

I.e. screw-ups in LD_LIBRARY_PATH can be circumvented.
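
The ldd check can also be automated. The sketch below scans ldd output for
libraries reported as "not found"; the function names are hypothetical, and it
assumes the usual `name => target` line layout of ldd on Linux.

```python
import subprocess


def unresolved_libraries(ldd_output):
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Lines of interest look like "libfoo.so.0 => /path/libfoo.so.0"
        # or "libfoo.so.0 => not found".
        name, arrow, target = line.strip().partition("=>")
        if arrow and target.strip().startswith("not found"):
            missing.append(name.strip())
    return missing


def check_shared_object(path):
    """Run ldd on `path` and return its unresolved libraries (needs ldd on PATH)."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=True
    )
    return unresolved_libraries(result.stdout)
```

With an empty LD_LIBRARY_PATH, check_shared_object("lapack_lite.so") would be
expected to return an empty list only if the runtime library directory was
linked in.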


  extra_compile_args: Adds extra compile flags to the compilation

 extra flags to which compilation?
 site.cfg lists libraries that already are compiled. The only influence
 compiler flags could have is for header only libraries that profit from
 some flags. But numpy has no support for such libraries currently. E.g.
 cblas.h (which is just a header with signatures) is bundled with numpy.
 I guess third parties may make use of this, an example would be good.

The way I see distutils in numpy is that it extends the generic distutils
package so that packages relying on numpy can compile their software the
way they want.
In some of the extra software I work with, using numpy's distutils to link
to lapack/blas is easy, but adding specific compilation flags to sources is
not so easy (it requires editing the compiler sources).
Also, when numpy compiles lapack_litemodule.c it uses the generic flags of
the compilers specified in the numpy distribution; with this PR it also
uses the flags provided in extra_compile_args from the lapack section of
site.cfg.
In that regard I would not say that numpy has no support, as some packages
do in fact use it.


  extra_link_args: Adds extra flags when linking to libraries

 This flag may be useful.
 It could be used to pass options required during linking, like
 -Wl,--no-as-needed that is sometimes needed to link with gsl.
 Possibly also useful for link time optimizations.

Exactly. runtime_library_dirs can be considered shorthand for:
extra_link_args = -Wl,-rpath=dir1 -Wl,-rpath=dir2
So you might consider it superfluous, but the standard distutils package
allows both abstractions, so why not allow both here?
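
The equivalence can be spelled out in a few lines of Python. This is an
illustration only, not code from the PR: the function name is hypothetical,
and the `-Wl,-rpath=` spelling assumes a GCC-style compiler driver with GNU ld.

```python
def rpath_link_args(runtime_library_dirs):
    """Translate runtime library directories into equivalent linker flags."""
    # Assumption: GCC-style driver, which forwards -Wl,... options to the linker.
    return ["-Wl,-rpath=" + d for d in runtime_library_dirs]


# The two placeholder directories used in the message above.
print(" ".join(rpath_link_args(["dir1", "dir2"])))
```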





I hope this clarified a bit.
Thanks for the questions.

-- 
Kind regards Nick


Re: [Numpy-discussion] PR, extended site.cfg capabilities

2015-02-24 Thread Nick Papior Andersen
2015-02-24 14:56 GMT+01:00 Julian Taylor jtaylor.deb...@googlemail.com:

 On 02/24/2015 02:31 PM, Julian Taylor wrote:
  On 02/24/2015 02:16 PM, Nick Papior Andersen wrote:
  Dear all,
 
  I have opened PR-5597 https://github.com/numpy/numpy/pull/5597,
  which enables reading new flags from the site.cfg file.
  @rgommers requested that I post some information on this list;
  possibly somebody could test it on their setup.
 
  I do not fully understand the purpose of these changes. Can you give
  some more detailed use cases?

 I think I understand better now, so this is intended as a site.cfg
 equivalent (and possibly more portable) variant of the environment
 variables that control these options?
 e.g. runtime_lib_dirs would be equivalent to LD_RUN_PATH env. variable
 and build_ext --rpath
 and the compile_extra_opts equivalent to the OPT env variable?

Yes, but with per-library (per-section) flexibility, instead of applying the
environment variables globally.
Also, the site.cfg file is reused by scipy, so the user is not forced to
build numpy AND scipy with build_ext --rpath, etc.
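
As a rough illustration of the per-section point: each library section can
carry its own flags, whereas an environment variable applies to everything.
The section names and flag values below are made up for the example, and the
parsing uses the standard configparser and shlex modules rather than numpy's
own site.cfg reader.

```python
import configparser
import shlex

# Hypothetical site.cfg with different compile flags per library section.
SITE_CFG = """
[openblas]
extra_compile_args = -O3 -march=native

[fftw]
extra_compile_args = -O2
"""


def compile_args_by_section(cfg_text):
    """Map each section name to its list of extra compile flags."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return {
        # shlex.split handles quoting, so a flag with spaces stays one argument.
        section: shlex.split(parser.get(section, "extra_compile_args", fallback=""))
        for section in parser.sections()
    }


print(compile_args_by_section(SITE_CFG))
```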


 
 
  So the PR basically enables reading these extra options in each section:
  runtime_library_dirs : Add runtime library directories to the shared
  libraries (overrides the dreaded LD_LIBRARY_PATH)
 
  LD_LIBRARY_PATH should not be used during compilation, this is a runtime
  flag that numpy.distutils has no control over.
  Can you explain in more detail what you intend to do with this flag?
 
  extra_compile_args: Adds extra compile flags to the compilation
 
  extra flags to which compilation?
  site.cfg lists libraries that already are compiled. The only influence
  compiler flags could have is for header only libraries that profit from
  some flags. But numpy has no support for such libraries currently. E.g.
  cblas.h (which is just a header with signatures) is bundled with numpy.
  I guess third parties may make use of this, an example would be good.
 
  extra_link_args: Adds extra flags when linking to libraries
 
  This flag may be useful.
  It could be used to pass options required during linking, like
  -Wl,--no-as-needed that is sometimes needed to link with gsl.
  Possibly also useful for link time optimizations.
 





-- 
Kind regards Nick