Re: [Numpy-discussion] ATLAS build errors

2016-03-28 Thread Ian Henriksen
On Sat, Mar 26, 2016 at 3:06 PM Matthew Brett 
wrote:

> Hi,
>
> I'm working on building manylinux wheels for numpy, and I ran into
> unexpected problems with a numpy built against the ATLAS 3.8 binaries
> supplied by CentOS 5.
>
> I'm working on the manylinux docker container [1]
>
> To get ATLAS, I'm doing `yum install atlas-devel` which gets the
> default CentOS 5 ATLAS packages.
>
> I then build numpy.  Local tests work fine, but when I test on travis,
> I get these errors [2]:
>
> ======================================================================
> ERROR: test_svd_build (test_regression.TestRegression)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File
> "/home/travis/build/matthew-brett/manylinux-testing/venv/lib/python2.7/site-packages/numpy/linalg/tests/test_regression.py",
> line 56, in test_svd_build
> u, s, vh = linalg.svd(a)
>   File
> "/home/travis/build/matthew-brett/manylinux-testing/venv/lib/python2.7/site-packages/numpy/linalg/linalg.py",
> line 1359, in svd
> u, s, vt = gufunc(a, signature=signature, extobj=extobj)
> ValueError: On entry to DGESDD parameter number 12 had an illegal value
>
> ======================================================================
> FAIL: test_lapack (test_build.TestF77Mismatch)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File
> "/home/travis/build/matthew-brett/manylinux-testing/venv/lib/python2.7/site-packages/numpy/testing/decorators.py",
> line 146, in skipper_func
> return f(*args, **kwargs)
>   File
> "/home/travis/build/matthew-brett/manylinux-testing/venv/lib/python2.7/site-packages/numpy/linalg/tests/test_build.py",
> line 56, in test_lapack
> information.""")
> AssertionError: Both g77 and gfortran runtimes linked in lapack_lite !
> This is likely to
> cause random crashes and wrong results. See numpy INSTALL.txt for more
> information.
>
>
> Sure enough, scipy built the same way segfaults or fails to import (see
> [2]).
>
> I get no errors for an openblas build.
>
> Does anyone recognize these?   How should I modify the build to avoid them?
>
> Cheers,
>
> Matthew
>
>
> [1] https://github.com/pypa/manylinux
> [2] https://travis-ci.org/matthew-brett/manylinux-testing/jobs/118712090
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion


The error regarding parameter 12 of dgesdd sounds a lot like
https://github.com/scipy/scipy/issues/5039 where the issue was that the
LAPACK
version was too old. CentOS 5 is pretty old, so I wouldn't be surprised if
that were
the case here too.
In general, you can't expect Linux distros to have a uniform shared object
interface
for LAPACK, so you don't gain much by using the version that ships with
CentOS 5 beyond not having to compile it all yourself. It might be better
to use a
newer LAPACK built from source with the older toolchains already there.
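As a quick way to check a built wheel for the mixed-runtime problem that test_lapack flags, something like the following sketch can help. This is Linux-only and makes a couple of assumptions: that `ldd` is on the PATH, and that `libg2c`/`libgfortran` are the soname fragments of the g77 and gfortran runtimes respectively.

```python
# Hedged sketch: scan numpy's extension modules for Fortran runtime
# dependencies. A single extension linking BOTH libg2c (g77) and
# libgfortran would reproduce the TestF77Mismatch failure above.
# Assumes a Linux system with the ldd tool available.
import glob
import os
import subprocess

import numpy as np

pkgdir = os.path.dirname(np.__file__)
hits = {}
for so in glob.glob(os.path.join(pkgdir, '**', '*.so'), recursive=True):
    out = subprocess.run(['ldd', so], capture_output=True, text=True).stdout
    runtimes = sorted(lib for lib in ('libg2c', 'libgfortran') if lib in out)
    if runtimes:
        hits[os.path.basename(so)] = runtimes

for name, runtimes in sorted(hits.items()):
    marker = 'MIXED!' if len(runtimes) == 2 else 'ok'
    print(name, runtimes, marker)
```

Any module printed with "MIXED!" is a candidate for the random crashes the numpy test warns about.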
Best,
-Ian Henriksen


Re: [Numpy-discussion] Make np.bincount output same dtype as weights

2016-03-28 Thread Jaime Fernández del Río
I have modified the PR to do the "promote integers to at least long" promotion
that we do in np.sum.
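For concreteness, a small sketch contrasting np.sum's integer promotion with bincount's long-standing cast-to-double (the exact integer result dtype of np.sum is platform-dependent; it is at least the platform long):

```python
import numpy as np

w = np.array([100, 100, 100], dtype=np.int8)  # sum would overflow int8

# np.sum promotes small integer dtypes to at least the platform long,
# so the result does not wrap around:
s = np.sum(w)
print(s, s.dtype)  # 300, with an integer dtype wider than int8

# np.bincount has always cast weights to double and returned double:
b = np.bincount([0, 0, 0], weights=w)
print(b, b.dtype)  # [300.] float64
```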

Jaime

On Mon, Mar 28, 2016 at 9:55 PM, CJ Carey  wrote:

> Another +1 for Josef's interpretation from me. Consistency with np.sum
> seems like the best option.
>
> On Sat, Mar 26, 2016 at 11:12 PM, Juan Nunez-Iglesias 
> wrote:
>
>> Thanks for clarifying, Jaime, and fwiw I agree with Josef: I would expect
>> np.bincount to behave like np.sum with regards to promoting weights dtypes.
>> Including bool.
>>
>> On Sun, Mar 27, 2016 at 1:58 PM,  wrote:
>>
>>> On Sat, Mar 26, 2016 at 9:54 PM, Joseph Fox-Rabinovitz
>>>  wrote:
>>> > Would it make sense to just make the output type large enough to hold
>>> the
>>> > cumulative sum of the weights?
>>> >
>>> >
>>> > - Joseph Fox-Rabinovitz
>>> >
>>> > -- Original message--
>>> >
>>> > From: Jaime Fernández del Río
>>> >
>>> > Date: Sat, Mar 26, 2016 16:16
>>> >
>>> > To: Discussion of Numerical Python;
>>> >
>>> > Subject:[Numpy-discussion] Make np.bincount output same dtype as
>>> weights
>>> >
>>> > Hi all,
>>> >
>>> > I have just submitted a PR (#7464) that fixes an enhancement request
>>> > (#6854), making np.bincount return an array of the same type as the
>>> weights
>>> > parameter.  This is an important deviation from current behavior, which
>>> > always casts weights to double, and always returns a double array, so I
>>> > would like to hear what others think about the worthiness of this.
>>> Main
>>> > discussion points:
>>> >
>>> > np.bincount now works with complex weights (yay!), I guess this should
>>> be a
>>> > pretty uncontroversial enhancement.
>>> > The return is of the same type as weights, which means that small
>>> integers
>>> > are very likely to overflow.  This is exactly what #6854 requested, but
>>> > perhaps we should promote the output for integers to a long, as we do
>>> in
>>> > np.sum?
>>>
>>> I always thought of bincount with weights just as a group-by sum. So
>>> it would be easier to remember and have fewer surprises if it matches
>>> the behavior of np.sum.
>>>
>>> > Boolean arrays stay boolean, and OR, rather than sum, the weights. Is
>>> this
>>> > what one would want? If we decide that integer promotion is the way to
>>> go,
>>> > perhaps booleans should go in the same pack?
>>>
>>> Isn't this calculating the sum, i.e. count of True by group, already?
>>> Based on a quick example with numpy 1.9.2, I don't think I ever used
>>> bool weights before.
>>>
>>>
>>> > This new implementation currently supports all of the reasonable native
>>> > types, but has no fallback for user defined types.  I guess we should
>>> > attempt to cast the array to double as before if no native loop can be
>>> > found? It would be good to have a way of testing this though, any
>>> thoughts
>>> > on how to go about this?
>>> > Does a behavior change like this require some deprecation period? What
>>> would
>>> > that look like?
>>> > I have also added broadcasting of weights to the full size of list, so
>>> that
>>> > one can do e.g. np.bincount([1, 2, 3], weights=2j) without having to
>>> tile
>>> > the single weight to the size of the bins list.
>>> >
>>> > Any other thoughts are very welcome as well!
>>>
>>> (2-D weights ?)
>>>
>>>
>>> Josef
>>>
>>>
>>> >
>>> > Jaime
>>> >
>>> > --
>>> > (__/)
>>> > ( O.o)
>>> > ( > <) This is Bunny. Copy Bunny into your signature and help him
>>> > with his plans for world domination.
>>> >
>>>
>>
>>
>>
>>
>
>
>
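On the boolean question quoted above, Josef's reading matches what a released numpy already does: bool weights are cast to double and summed per bin, i.e. a per-group count of True values:

```python
import numpy as np

w = np.array([True, True, False, True])
out = np.bincount([0, 0, 1, 1], weights=w)
print(out)        # [2. 1.] -- number of True values in each bin
print(out.dtype)  # float64
```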


-- 
(\__/)
( O.o)
( > <) This is Bunny. Copy Bunny into your signature and help him with
his plans for world domination.


[Numpy-discussion] Using OpenBLAS for manylinux wheels

2016-03-28 Thread Matthew Brett
Hi,

Olivier Grisel and I are working on building and testing manylinux
wheels for numpy and scipy.

We first thought that we should use ATLAS BLAS, but Olivier found that
my build of these could be very slow [1].  I set up a testing grid [2]
which found test errors for numpy and scipy using ATLAS wheels.

On the other hand, the same testing grid finds no errors or failures
[3] using latest OpenBLAS (0.2.17) and running tests for:

numpy
scipy
scikit-learn
numexpr
pandas
statsmodels

This is on the travis-ci ubuntu VMs.

Please do test on your own machines with something like this script [4]:

source test_manylinux.sh

We have worried in the past about the reliability of OpenBLAS, but I
find these tests reassuring.

Are there any other tests of OpenBLAS that we should run to assure
ourselves that it is safe to use?
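For anyone testing locally, one extra smoke test worth running (my own suggestion, not part of the grid) is to confirm which BLAS the wheel actually linked, then run the same matrix product from several threads and compare against a single-threaded reference, since threaded GEMM is historically where BLAS bugs have surfaced:

```python
# Hedged sketch of a local smoke test for an OpenBLAS-backed numpy.
import threading

import numpy as np

np.show_config()  # the library lists should mention openblas

rng = np.random.RandomState(0)
a = rng.rand(200, 200)
b = rng.rand(200, 200)
expected = a.dot(b)  # single-threaded reference result

results = [None] * 8

def run(i):
    # each thread repeats the same product; all results must agree
    results[i] = a.dot(b)

threads = [threading.Thread(target=run, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert all(np.allclose(r, expected) for r in results)
print('threaded dgemm results consistent')
```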

Matthew

[1] https://github.com/matthew-brett/manylinux-builds/issues/4#issue-143530908
[2] https://travis-ci.org/matthew-brett/manylinux-testing/builds/118780781
[3] I disabled a few pandas tests which were failing for reasons not
related to BLAS.  Some of the statsmodels test runs time out.
[4] https://gist.github.com/matthew-brett/2fd9d9a29e022c297634


[Numpy-discussion] numpy in python callback from threaded c++

2016-03-28 Thread Burlen Loring

Hi All,

In my C++ code I've added Python bindings via SWIG. One scenario is to
pass in a Python function to do some computational work. The Python
program runs serially in the main thread, but the work is handled by a
thread pool, and the callback is invoked from another thread on unique
data. Before a thread invokes the Python callback it acquires Python's
GIL. I also call PyEval_InitThreads during module initialization, and
have run SWIG with the -threads flag. However, I'm experiencing frequent
crashes when the thread pool size is greater than 1, and valgrind
reports errors from numpy even in the case where the thread pool size
is 1.


Here's the essence of the error reported by valgrind:

   ==10316== Invalid read of size 4
   ==10316==    at 0x4ED7D73: PyObject_Free (obmalloc.c:1013)
   ==10316==    by 0x10D540B0: NpyIter_Deallocate (nditer_constr.c:699)

   ==10316==  Address 0x20034020 is 3,856 bytes inside a block of size 4,097 free'd
   ==10316==    at 0x4C29E00: free (vg_replace_malloc.c:530)
   ==10316==    by 0x4F57B22: import_module_level (import.c:2278)
   ==10316==    by 0x4F57B22: PyImport_ImportModuleLevel (import.c:2292)
   ==10316==    by 0x4F36597: builtin___import__ (bltinmodule.c:49)
   ==10316==    by 0x4E89AC2: PyObject_Call (abstract.c:2546)
   ==10316==    by 0x4E89C1A: call_function_tail (abstract.c:2578)
   ==10316==    by 0x4E89C1A: PyObject_CallFunction (abstract.c:2602)
   ==10316==    by 0x4F58735: PyImport_Import (import.c:2890)
   ==10316==    by 0x4F588B9: PyImport_ImportModule (import.c:2133)
   ==10316==    by 0x10D334C2: get_forwarding_ndarray_method (methods.c:57)
   ==10316==    by 0x10D372C0: array_mean (methods.c:1932)
   ==10316==    by 0x4F40AC7: call_function (ceval.c:4350)

There are a few of these reported. I'll attach the full output. This is 
from the simplest scenario, where the thread pool has a size of 1. 
Although there are 2 threads, the program is serial as the main thread 
passes work tasks to the thread pool and waits for work to finish.


Here is the work function where the above error occurs:

   def execute(port, data_in, req):
       sys.stderr.write('descriptive_stats::execute MPI %d\n' % (rank))

       mesh = as_teca_cartesian_mesh(data_in[0])

       table = teca_table.New()
       table.declare_columns(['step', 'time'], ['ul', 'd'])
       table << mesh.get_time_step() << mesh.get_time()

       for var_name in var_names:

           table.declare_columns(['min '+var_name, 'avg '+var_name, \
               'max '+var_name, 'std '+var_name, 'low_q '+var_name, \
               'med '+var_name, 'up_q '+var_name], ['d']*7)

           var = mesh.get_point_arrays().get(var_name).as_array()

           table << float(np.min(var)) << float(np.average(var)) \
               << float(np.max(var)) << float(np.std(var)) \
               << map(float, np.percentile(var, [25., 50., 75.]))

       return table

Again, I'm acquiring the GIL, so this should execute serially. What am
I doing wrong? Have I missed some key aspect of using numpy in this
scenario? Is there any documentation on using numpy this way? Any help
is greatly appreciated!
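One way to narrow this down (a diagnostic sketch, not a fix; `callback` below is a hypothetical stand-in for the real SWIG-invoked function) is to drive a similar numpy-using callback from pure-Python threads, which always hold the GIL while executing bytecode. If this is stable while the C++ thread pool is not, the suspect is the GIL handling in the embedding rather than numpy itself:

```python
# Diagnostic sketch: exercise a numpy-using callback from pure-Python
# threads, where GIL handling is guaranteed correct by the interpreter.
import threading

import numpy as np

def callback(data):
    # stand-in for the real work function: a few numpy reductions
    return float(np.min(data)) + float(np.std(data))

def worker(results, i):
    results[i] = callback(np.random.rand(10000))

results = [None] * 8
threads = [threading.Thread(target=worker, args=(results, i))
           for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert all(r is not None for r in results)
print('pure-Python threaded callbacks completed cleanly')
```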


Thanks
Burlen

==10316== Thread 2:
==10316== Invalid read of size 4
==10316==    at 0x4ED7D73: PyObject_Free (obmalloc.c:1013)
==10316==    by 0x10D540B0: NpyIter_Deallocate (nditer_constr.c:699)
==10316==    by 0x112B01EE: iterator_loop (ufunc_object.c:1511)
==10316==    by 0x112B01EE: execute_legacy_ufunc_loop (ufunc_object.c:1660)
==10316==    by 0x112B01EE: PyUFunc_GenericFunction (ufunc_object.c:2627)
==10316==    by 0x112B0D95: ufunc_generic_call (ufunc_object.c:4253)
==10316==    by 0x4E89AC2: PyObject_Call (abstract.c:2546)
==10316==    by 0x4F3E009: do_call (ceval.c:4568)
==10316==    by 0x4F3E009: call_function (ceval.c:4373)
==10316==    by 0x4F3E009: PyEval_EvalFrameEx (ceval.c:2987)
==10316==    by 0x4F41F9B: PyEval_EvalCodeEx (ceval.c:3582)
==10316==    by 0x4F3E3DE: fast_function (ceval.c:4446)
==10316==    by 0x4F3E3DE: call_function (ceval.c:4371)
==10316==    by 0x4F3E3DE: PyEval_EvalFrameEx (ceval.c:2987)
==10316==    by 0x4F41F9B: PyEval_EvalCodeEx (ceval.c:3582)
==10316==    by 0x4F3E3DE: fast_function (ceval.c:4446)
==10316==    by 0x4F3E3DE: call_function (ceval.c:4371)
==10316==    by 0x4F3E3DE: PyEval_EvalFrameEx (ceval.c:2987)
==10316==    by 0x4F41F9B: PyEval_EvalCodeEx (ceval.c:3582)
==10316==    by 0x4F3E3DE: fast_function (ceval.c:4446)
==10316==    by 0x4F3E3DE: call_function (ceval.c:4371)
==10316==    by 0x4F3E3DE: PyEval_EvalFrameEx (ceval.c:2987)
==10316==    by 0x4F41F9B: PyEval_EvalCodeEx (ceval.c:3582)
==10316==    by 0x4EBA5DB: function_call (funcobject.c:526)
==10316==    by 0x4E89AC2: PyObject_Call (abstract.c:2546)
==10316==    by 0x4F38096: PyEval_CallObjectWithKeywords (ceval.c:4219)
==10316==    by 0x1E8A088A: 
teca_py_algorithm::execute_callback::operator()(unsigned int, 
std::vector > const&, teca_metadata 
const&) 
