Re: [Numpy-discussion] Integers to integer powers

2016-05-24 Thread Eric Moore
Yes, I'm fully aware of that.  I'm speaking toward changing the default
result dtype.  Raising an error for negative exponents is a fine idea.
Changing np.arange(10)**3 to have a non-integer dtype seems like a big
change.

Speaking of this, the fact that some of the integer array operation errors
can be controlled via np.seterr and some cannot should also be addressed
longer term.
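
For example, the inconsistency in miniature (a quick sketch; exact behavior
may vary across numpy versions):

import numpy as np

np.seterr(divide='raise')
np.array([1, 2]) // np.array([0, 1])    # integer divide by zero: raises

# ...but integer overflow is silent, and no seterr setting catches it:
np.array([2**62], dtype=np.int64) * 2   # wraps around without any warning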

Eric

On Tue, May 24, 2016 at 3:05 PM, Nathaniel Smith <n...@pobox.com> wrote:

> On Tue, May 24, 2016 at 10:36 AM, Eric Moore <e...@redtetrahedron.org>
> wrote:
> > I'd say the most compelling case for it is that that is how it has always
> > worked.  How much code will break if we make that change? (Or if not
> > break, at least have a change in dtype?)  Is that worth it?
>
> The current behavior for arrays is:
>
> # Returns int
> In [2]: np.arange(10) ** 2
> Out[2]: array([ 0,  1,  4,  9, 16, 25, 36, 49, 64, 81])
>
> # Returns nonsensical/useless results
> In [3]: np.arange(10) ** -1
> /home/njs/.user-python3.5-64bit/bin/ipython:1: RuntimeWarning: divide by
> zero encountered in power
>   #!/home/njs/.user-python3.5-64bit/bin/python3.5
> /home/njs/.user-python3.5-64bit/bin/ipython:1: RuntimeWarning: invalid
> value encountered in power
>   #!/home/njs/.user-python3.5-64bit/bin/python3.5
> Out[3]:
> array([-9223372036854775808,                    1,                    0,
>                           0,                    0,                    0,
>                           0,                    0,                    0,
>                           0])
>
> -n
>
> --
> Nathaniel J. Smith -- https://vorpus.org
>


Re: [Numpy-discussion] Integers to integer powers

2016-05-24 Thread Eric Moore
I'd say the most compelling case for it is that that is how it has always
worked.  How much code will break if we make that change? (Or if not break,
at least have a change in dtype?)  Is that worth it?

Eric

On Tue, May 24, 2016 at 1:31 PM, Alan Isaac  wrote:

> On 5/24/2016 1:19 PM, Stephan Hoyer wrote:
>
>> the int ** 2 example feels quite compelling to me
>>
>
>
> Yes, but that one case is trivial: a*a
>
> And at least as compelling is not having a**-2 fail
> and not being tricked by, say, np.arange(10)**10.
> The latter is a promise of hidden errors.
>
> Alan
>
>


Re: [Numpy-discussion] Proposal: numpy.random.random_seed

2016-05-17 Thread Eric Moore
On Tue, May 17, 2016 at 9:40 AM, Sturla Molden 
wrote:

> Stephan Hoyer  wrote:
> > I have recently encountered several use cases for randomly generated
> > random number seeds:
> >
> > 1. When writing a library of stochastic functions that take a seed as an
> > input argument, and some of these functions call multiple other such
> > stochastic functions. Dask is one such example [1].
> >
> > 2. When a library needs to produce results that are reproducible after
> > calling numpy.random.seed, but that do not want to use the functions in
> > numpy.random directly. This came up recently in a pandas pull request
> [2],
> > because we want to allow using RandomState objects as an alternative to
> > global state in numpy.random. A major advantage of this approach is that
> it
> > provides an obvious alternative to reusing the private
> numpy.random._mtrand
> > [3].
>
>
> What about making numpy.random a finite state machine, and keeping a stack
> of RandomState seeds? That is, something similar to what OpenGL does for
> its matrices? Then we get two functions, numpy.random.push_seed and
> numpy.random.pop_seed.
>

I don't like the idea of adding this kind of internal state.  Having it
built into the module means that it is shared by all callers: libraries,
user code, etc. That's not the right choice when a stack of seeds could be
easily built around the RandomState object if that is really what someone
needs.
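
For instance, a seed stack along the lines Sturla describes is only a few
lines when built around RandomState instances (a minimal sketch, untested;
the class and method names are mine):

import numpy as np

class SeedStack(object):
    # Push/pop reproducible streams without touching numpy.random's
    # global state.
    def __init__(self):
        self._stack = []

    def push_seed(self, seed):
        self._stack.append(np.random.RandomState(seed))

    def pop_seed(self):
        return self._stack.pop()

    @property
    def current(self):
        return self._stack[-1]

stack = SeedStack()
stack.push_seed(12345)
values = stack.current.uniform(size=3)  # drawn only from the pushed stream
stack.pop_seed()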

Eric


Re: [Numpy-discussion] Multidimension array access in C via Python API

2016-04-05 Thread Eric Moore
Eh. The order of the outputs will be different than your code, if that
makes a difference.

On Tue, Apr 5, 2016 at 3:31 PM, Eric Moore <e...@redtetrahedron.org> wrote:

> def reduce_data(buffer, resolution):
>     thinned_buffer = np.zeros((resolution**3, 3))
>
>     min_xyz = buffer.min(axis=0)
>     max_xyz = buffer.max(axis=0)
>     delta_xyz = max_xyz - min_xyz
>
>     inds_xyz = np.floor(resolution * (buffer - min_xyz) /
>                         delta_xyz).astype(int)
>
>     # handle values right at the max
>     inds_xyz[inds_xyz == resolution] -= 1
>
>     # convert to linear indices so that we can use np.add.at
>     inds_lin = inds_xyz[:,0]
>     inds_lin += inds_xyz[:,1] * resolution
>     inds_lin += inds_xyz[:,2] * resolution**2
>
>     np.add.at(thinned_buffer, inds_lin, buffer)
>     counts = np.bincount(inds_lin, minlength=resolution**3)
>
>     thinned_buffer[counts != 0, :] /= counts[counts != 0, None]
>     return thinned_buffer
>
>
> The bulk of the time is spent in np.add.at, so just over 5 s here with
> your 1e7 to 1e6 example.
>
> On Tue, Apr 5, 2016 at 2:09 PM, mpc <matt.p.co...@gmail.com> wrote:
>
>> This wasn't intended to be a histogram, but you're right in that it would
>> be
>> much better if I can just go through each point once and bin the results,
>> that makes more sense, thanks!


Re: [Numpy-discussion] Multidimension array access in C via Python API

2016-04-05 Thread Eric Moore
import numpy as np

def reduce_data(buffer, resolution):
    thinned_buffer = np.zeros((resolution**3, 3))

    min_xyz = buffer.min(axis=0)
    max_xyz = buffer.max(axis=0)
    delta_xyz = max_xyz - min_xyz

    inds_xyz = np.floor(resolution * (buffer - min_xyz) /
                        delta_xyz).astype(int)

    # handle values right at the max
    inds_xyz[inds_xyz == resolution] -= 1

    # convert to linear indices so that we can use np.add.at
    inds_lin = inds_xyz[:,0]
    inds_lin += inds_xyz[:,1] * resolution
    inds_lin += inds_xyz[:,2] * resolution**2

    np.add.at(thinned_buffer, inds_lin, buffer)
    counts = np.bincount(inds_lin, minlength=resolution**3)

    thinned_buffer[counts != 0, :] /= counts[counts != 0, None]
    return thinned_buffer


The bulk of the time is spent in np.add.at, so just over 5 s here with your
1e7 to 1e6 example.
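
If np.add.at dominates, one alternative worth trying (a sketch only, not
tested against your data; the function name is mine) is accumulating each
column with np.bincount and its weights argument, which is usually much
faster than np.add.at:

import numpy as np

def accumulate_columns(inds_lin, buffer, n_bins):
    # same result as np.add.at(out, inds_lin, buffer) on a zeroed out,
    # for a 2-D buffer
    out = np.empty((n_bins, buffer.shape[1]))
    for j in range(buffer.shape[1]):
        out[:, j] = np.bincount(inds_lin, weights=buffer[:, j],
                                minlength=n_bins)
    return out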

On Tue, Apr 5, 2016 at 2:09 PM, mpc  wrote:

> This wasn't intended to be a histogram, but you're right in that it would
> be
> much better if I can just go through each point once and bin the results,
> that makes more sense, thanks!


Re: [Numpy-discussion] Multidimension array access in C via Python API

2016-04-05 Thread Eric Moore
It's difficult to say why your code is slow without seeing it.  E.g., are you
generating large temporaries?  Or doing loops in Python that can be pushed
down to C via vectorizing? It may or may not be necessary to leave Python
to get things to run fast enough.
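
As a generic illustration of what I mean by pushing a loop down to C (a toy
example, not your code):

import numpy as np

x = np.random.rand(10**6)

# Python-level loop: every element goes through the interpreter.
total = 0.0
for v in x:
    total += v * v

# Vectorized: one call, and the loop runs in C.
total = np.dot(x, x)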

-Eric


On Tue, Apr 5, 2016 at 11:39 AM, mpc  wrote:

> This is the reason I'm doing this in the first place, because I made a pure
> python version but it runs really slow for larger data sets, so I'm
> basically rewriting the same function but using the Python and Numpy C API,
> but if you're saying it won't run any faster then maybe I'm going at it the
> wrong way. (Why use the C function version if it's the same speed anyway?)
>
> You're suggesting perhaps a cython approach, or perhaps a strictly C/C++
> approach given the raw data?


Re: [Numpy-discussion] Multidimension array access in C via Python API

2016-04-04 Thread Eric Moore
Yes, PySlice_New(NULL, NULL, NULL) is the same as ':'.   Depending on what
exactly you want to do with the column once you've extracted it, this may
not be the best way to do it.  Are you absolutely certain that you actually
need a PyArrayObject that points to the column?

Eric

On Mon, Apr 4, 2016 at 3:59 PM, mpc  wrote:

> Thanks for responding.
>
> It looks like you made/found these yourself, since I can't find anything
> like this in the API. I can't believe it isn't there, it's so convenient!
>
> By the way, from what I understand, the ':' is represented as
> PySlice_New(NULL, NULL, NULL) in the C API when accessing by index,
> correct?
>
>
> Therefore the final result will be something like:
>
> PyObject* first_column_tuple = PyTuple_New(2);
> PyTuple_SET_ITEM(first_column_tuple, 0, PySlice_New(NULL, NULL, NULL));
> PyTuple_SET_ITEM(first_column_tuple, 1, PyInt_FromLong(0));
> PyObject* first_column_buffer = PyObject_GetItem(src_buffer,
> first_column_tuple);


Re: [Numpy-discussion] Multidimension array access in C via Python API

2016-04-04 Thread Eric Moore
/* obj[ind] */
PyObject* DoIndex(PyObject* obj, int ind)
{
    PyObject *oind, *ret;
    oind = PyLong_FromLong(ind);
    if (!oind) {
        return NULL;
    }
    ret = PyObject_GetItem(obj, oind);
    Py_DECREF(oind);
    return ret;
}

/* obj[inds[0], inds[1], ... inds[n_ind-1]] */
PyObject* DoMultiIndex(PyObject* obj, int *inds, int n_ind)
{
    PyObject *ret, *oind, *temp;
    oind = PyTuple_New(n_ind);
    if (!oind)
        return NULL;

    for (int k = 0; k < n_ind; ++k)
    {
        temp = PyLong_FromLong(inds[k]);
        if (!temp) {
            Py_DECREF(oind);
            return NULL;
        }
        PyTuple_SET_ITEM(oind, k, temp);  /* steals the reference to temp */
    }
    ret = PyObject_GetItem(obj, oind);
    Py_DECREF(oind);
    return ret;
}

/* obj[b:e:step] */
PyObject* DoSlice(PyObject* obj, int b, int e, int step)
{
    PyObject *oind, *ret, *ob, *oe, *ostep;
    ob = PyLong_FromLong(b);
    if (!ob)
        return NULL;
    oe = PyLong_FromLong(e);
    if (!oe) {
        Py_DECREF(ob);
        return NULL;
    }
    ostep = PyLong_FromLong(step);
    if (!ostep) {
        Py_DECREF(ob);
        Py_DECREF(oe);
        return NULL;
    }
    oind = PySlice_New(ob, oe, ostep);
    Py_DECREF(ob);
    Py_DECREF(oe);
    Py_DECREF(ostep);

    if (!oind)
        return NULL;

    ret = PyObject_GetItem(obj, oind);
    Py_DECREF(oind);
    return ret;
}

-Eric

On Mon, Apr 4, 2016 at 1:35 PM, mpc  wrote:

> Hello,
>
> is there a C-API function for numpy that can implement Python's
> multidimensional indexing?
>
> For example, if I had a 2d array:
>
>PyArrayObject * M;
>
> and an index
>
>int i;
>
> how do I extract the i-th row M[i,:] or i-th column M[:,i]?
>
> Ideally it would be great if it returned another PyArrayObject* object (not
> a newly allocated one, but whose data will point to the correct memory
> locations of M).
>
> I've searched everywhere in the API documentation, Google, and SO, but no
> luck.
>
> Any help is greatly appreciated.
>
> Thank you.
>
> -Matthew


Re: [Numpy-discussion] Multi-dimensional array of splitted array

2016-03-23 Thread Eric Moore
Try just calling np.array_split on the full 2D array.  It splits along a
particular axis, which is selected using the axis argument of
np.array_split.  The axis to split along defaults to the first so the two
calls to np.array_split below are exactly equivalent.

In [16]: a = np.c_[:10, 10:20, 20:30]

In [17]: np.array_split(a, [2,5,8])
Out[17]:
[array([[ 0, 10, 20],
        [ 1, 11, 21]]), array([[ 2, 12, 22],
        [ 3, 13, 23],
        [ 4, 14, 24]]), array([[ 5, 15, 25],
        [ 6, 16, 26],
        [ 7, 17, 27]]), array([[ 8, 18, 28],
        [ 9, 19, 29]])]

In [18]: np.array_split(a, [2,5,8], 0)
Out[18]:
[array([[ 0, 10, 20],
        [ 1, 11, 21]]), array([[ 2, 12, 22],
        [ 3, 13, 23],
        [ 4, 14, 24]]), array([[ 5, 15, 25],
        [ 6, 16, 26],
        [ 7, 17, 27]]), array([[ 8, 18, 28],
        [ 9, 19, 29]])]


Eric


On Wed, Mar 23, 2016 at 9:06 AM, Ibrahim EL MEREHBI 
wrote:

> Hello,
>
> I have a multi-dimensional array whose columns I would like to split.
>
> For example consider,
>
> dat = np.array([np.arange(10),np.arange(10,20), np.arange(20,30)]).T
>
> array([[ 0, 10, 20],
>[ 1, 11, 21],
>[ 2, 12, 22],
>[ 3, 13, 23],
>[ 4, 14, 24],
>[ 5, 15, 25],
>[ 6, 16, 26],
>[ 7, 17, 27],
>[ 8, 18, 28],
>[ 9, 19, 29]])
>
>
> I already can split one column at a time:
>
> np.array_split(dat[:,0], [2,5,8])
>
> [array([0, 1]), array([2, 3, 4]), array([5, 6, 7]), array([8, 9])]
>
>
> How can I extend this for all columns and (overwrite or) have a new
> multi-dimensional array?
>
> Thank you,
> Bob


Re: [Numpy-discussion] Fast vectorized arithmetic with ~32 significant digits under Numpy

2015-12-11 Thread Eric Moore
I have a mostly complete wrapping of the double-double type from the QD
library (http://crd-legacy.lbl.gov/~dhbailey/mpdist/) into a numpy dtype.
The real problem is, as David pointed out, user dtypes aren't quite full
equivalents of the builtin dtypes.  I can post the code if there is
interest.

Something along the lines of what's being discussed here would be nice,
since the extended type is subject to such variation.
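
The basic building block from Dekker's paper (referenced below) vectorizes
naturally; e.g. an error-free two-sum over whole arrays is just (a sketch of
the idea only):

import numpy as np

def two_sum(a, b):
    # Error-free transformation: a + b == hi + lo exactly (Knuth/Dekker).
    hi = a + b
    t = hi - a
    lo = (a - (hi - t)) + (b - t)
    return hi, lo

hi, lo = two_sum(np.array([1.0, 1e16]), np.array([1e-16, 1.0]))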

Eric

On Fri, Dec 11, 2015 at 12:51 PM, Charles R Harris <
charlesr.har...@gmail.com> wrote:

>
>
> On Fri, Dec 11, 2015 at 10:45 AM, Nathaniel Smith  wrote:
>
>> On Dec 11, 2015 7:46 AM, "Charles R Harris" 
>> wrote:
>> >
>> >
>> >
>> > On Fri, Dec 11, 2015 at 6:25 AM, Thomas Baruchel 
>> wrote:
>> >>
>> >> From time to time it is asked on forums how to extend precision of
>> computation on Numpy array. The most common answer
>> >> given to this question is: use the dtype=object with some arbitrary
>> precision module like mpmath or gmpy.
>> >> See
>> http://stackoverflow.com/questions/6876377/numpy-arbitrary-precision-linear-algebra
>> or
>> http://stackoverflow.com/questions/21165745/precision-loss-numpy-mpmath
>> or
>> http://stackoverflow.com/questions/15307589/numpy-array-with-mpz-mpfr-values
>> >>
>> >> While this is obviously the most relevant answer for many users
>> because it will allow them to use Numpy arrays exactly
>> >> as they would have used them with native types, the wrong thing is
>> that from some point of view "true" vectorization
>> >> will be lost.
>> >>
>> >> With years I got very familiar with the extended double-double type
>> which has (for usual architectures) about 32 accurate
>> >> digits with faster arithmetic than "arbitrary precision types". I even
>> used it for research purpose in number theory and
>> >> I got convinced that it is a very wonderful type as long as such
>> precision is suitable.
>> >>
>> >> I often implemented it partially under Numpy, most of the time by
>> trying to vectorize at a low-level the libqd library.
>> >>
>> >> But I recently thought that a very nice and portable way of
>> implementing it under Numpy would be to use the existing layer
>> >> of vectorization on floats for computing the arithmetic operations by
>> "columns containing half of the numbers" rather than
>> >> by "full numbers". As a proof of concept I wrote the following file:
>> https://gist.github.com/baruchel/c86ed748939534d8910d
>> >>
>> >> I converted and vectorized the Algol 60 codes from
>> http://szmoore.net/ipdf/documents/references/dekker1971afloating.pdf
>> >> (Dekker, 1971).
>> >>
>> >> A test is provided at the end; for inverting 100,000 numbers, my type
>> is about 3 or 4 times faster than GMPY and almost
>> >> 50 times faster than MPmath. It should be even faster for some other
>> operations since I had to create another np.ones
>> >> array for testing this type because inversion isn't implemented here
>> (which could of course be done). You can run this file by yourself
>> >> (maybe you will have to discard mpmath or gmpy if you don't have it).
>> >>
>> >> I would like to discuss about the way to make available something
>> related to that.
>> >>
>> >> a) Would it be relevant to include that in Numpy ? (I would think to
>> some "contribution"-tool rather than including it in
>> >> the core of Numpy because it would be painful to code all ufuncs; on
>> the other hand I am pretty sure that many would be happy
>> >> to perform several arithmetic operations by knowing that they can't
>> use cos/sin/etc. on this type; in other words, I am not
>> >> sure it would be a good idea to embed it as an every-day type but I
>> think it would be nice to have it quickly available
>> >> in some way). If you agree with that, in which way should I code it
>> (the current link only is a "proof of concept"; I would
>> >> be very happy to code it in some cleaner way)?
>> >>
>> >> b) Do you think such attempt should remain something external to Numpy
>> itself and be released on my Github account without being
>> >> integrated to Numpy?
>> >
>> >
>> > I think astropy does something similar for time and dates. There has
>> also been some talk of adding a user type for ieee 128 bit doubles. I've
>> looked once for relevant code for the latter and, IIRC, the available
>> packages were GPL :(.
>>
>> You're probably thinking of the __float128 support in gcc, which relies
>> on a LGPL (not GPL) runtime support library. (LGPL = any patches to the
>> support library itself need to remain open source, but no restrictions are
>> imposed on code that merely uses it.)
>>
>> Still, probably something that should be done outside of numpy itself for
>> now.
>>
>
> No, there are several other software packages out there. I know of the gcc
> version, but was looking for something more portable.
>
> Chuck
>

Re: [Numpy-discussion] Failed numpy.test() with numpy-1.10.1 on RHEL 6

2015-11-09 Thread Eric Moore
This fails because numpy uses the function `cacosh` from the system libm,
and on RHEL6 this function has a bug.  As long as you don't care about
getting the sign right at the branch cut in this function, it's harmless.
If you do care, the easiest solution will be to install something like
Anaconda that does not link against the relatively old libm that RHEL6
ships.
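
Concretely, the test is checking the sign of the imaginary part on either
side of the branch cut, along these lines (a rough illustration, not the
exact test):

import numpy as np

# arccosh has a branch cut on the real axis for x < 1; the imaginary part
# of the result should flip sign as the cut is crossed.
np.arccosh(-1 + 1e-13j)   # approximately +3.14159j
np.arccosh(-1 - 1e-13j)   # approximately -3.14159j; a buggy cacosh
                          # gets the sign wrong on one side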

On Mon, Nov 9, 2015 at 1:11 AM, Lintula  wrote:

> Hello,
>
> I'm setting up numpy 1.10.1 on RHEL6 (python 2.6.6, atlas-3.8.4,
> lapack-3.2.1, gcc-4.4.7), and this test fails for me. I notice that
> someone else has had the same at
> https://github.com/numpy/numpy/issues/6063 in July.
>
> Is this harmless or is it of concern?
>
>
> ======================================================================
> FAIL: test_special (test_umath.TestExpm1)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/usr/lib/python2.6/site-packages/nose/case.py", line 182, in
> runTest
>     self.test(*self.arg)
>   File
> "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_umath.py",
> line 1748, in _check_branch_cut
>     assert_(np.all(np.absolute(y0.imag - yp.imag) < atol), (y0, yp))
>   File "/usr/lib64/python2.6/site-packages/numpy/testing/utils.py", line
> 53, in assert_
>     raise AssertionError(smsg)
> AssertionError: (array([  0.00000000e+00+3.14159265j,
>         1.11022302e-16-1.04719755j]), array([  4.71216091e-07+3.14159218j,
>         1.28119737e-13+1.04719755j]))
>
> ----------------------------------------------------------------------
> Ran 5955 tests in 64.284s
>
> FAILED (KNOWNFAIL=3, SKIP=2, failures=1)


Re: [Numpy-discussion] querying backend information

2015-11-05 Thread Eric Moore
On Thu, Nov 5, 2015 at 1:37 AM, Ralf Gommers  wrote:

>
>
> On Thu, Nov 5, 2015 at 5:11 AM, Nathaniel Smith  wrote:
>
>> On Wed, Nov 4, 2015 at 4:40 PM, Stefan Seefeld 
>> wrote:
>> > Hello,
>> >
>> > is there a way to query Numpy for information about backends (BLAS,
>> > LAPACK, etc.) that it was compiled against, including compiler / linker
>> > flags that were used ?
>> > Consider the use-case where instead of calling a function such as
>> > numpy.dot() I may want to call the appropriate backend directly using
>> > the C API as an optimization technique. Is there a straight-forward way
>> > to do that ?
>> >
>> > In a somewhat related line of thought: Is there a way to see what
>> > backends are available during Numpy compile-time ? I'm looking for a
>> > list of flags to pick ATLAS/OpenBLAS/LAPACK/MKL or any other backend
>> > that might be available, combined with variables (compiler and linker
>> > flags, notably) I might have to set. Is that information available at
>> all ?
>>
>> NumPy does reveal some information about its configuration and
>> numpy.distutils does provide helper methods, but I'm not super
>> familiar with it so I'll let others answer that part.
>>
>
> np.show_config()
>
> Gives:
>
> lapack_opt_info:
> libraries = ['lapack', 'f77blas', 'cblas', 'atlas']
> library_dirs = ['/usr/lib/atlas-base/atlas', '/usr/lib/atlas-base']
> define_macros = [('NO_ATLAS_INFO', -1)]
> language = f77
> include_dirs = ['/usr/include/atlas']
> openblas_lapack_info:
>   NOT AVAILABLE
> 
> 
>
> It's a function with no docstring and not in the html docs (I think), so
> that can certainly be improved.
>
> Ralf
>

I don't think that show_config is what you want.  Those are build-time
values that aren't necessarily true at run time. For instance, numpy from
conda references directories that are not on my machine.

Eric


Re: [Numpy-discussion] Video meeting this week

2015-07-10 Thread Eric Moore
It looks like all of the numpy failures there are due to a poor
implementation of hypot.  One solution would be to force the use of the
hypot code in npymath for this toolchain.  Currently this is done in
numpy/core/src/private/npy_config.h for both MSVC and mingw32.

-Eric

On Fri, Jul 10, 2015 at 7:40 AM, Olivier Grisel olivier.gri...@ensta.org
wrote:

 Good news,

 The segfaults on scikit-learn and scipy test suites are caused by a bug
 in openblas core type detection: setting the OPENBLAS_CORETYPE
 environment variable to Nehalem can make the test suite complete
 without any failure for scikit-learn.

 I will update my gist with the new test results for scipy.

 --
 Olivier


Re: [Numpy-discussion] Aternative to PyArray_SetBaseObject in NumPy 1.6?

2015-06-16 Thread Eric Moore
You have to do it by hand in numpy 1.6.  For example see
https://github.com/scipy/scipy/blob/master/scipy/signal/lfilter.c.src#L285-L292

-Eric

On Sun, Jun 14, 2015 at 10:33 PM, Sturla Molden sturla.mol...@gmail.com
wrote:

 What would be the best alternative to PyArray_SetBaseObject in NumPy 1.6?

 Purpose: Keep alive an object owning data passed to
 PyArray_SimpleNewFromData.


 Sturla



Re: [Numpy-discussion] Automatic number of bins for numpy histograms

2015-04-15 Thread Eric Moore
This blog post and the links within also seem relevant.  It appears to have
Python code available to try things out as well.

https://dataorigami.net/blogs/napkin-folding/19055451-percentile-and-quantile-estimation-of-big-data-the-t-digest

-Eric

On Wed, Apr 15, 2015 at 11:24 AM, Benjamin Root ben.r...@ou.edu wrote:

 Then you can set about convincing matplotlib and friends to
 use it by default

 Just to note, this proposal was originally made over in the matplotlib
 project. We sent it over here where its benefits would have wider reach.
 Matplotlib's plan is not to change the defaults, but to offload as much as
 possible to numpy so that it can support these new features if they are
 available. We might need to do some input validation so that users running
 older version of numpy can get a sensible error message.

 Cheers!
 Ben Root


 On Tue, Apr 14, 2015 at 7:12 PM, Nathaniel Smith n...@pobox.com wrote:

 On Mon, Apr 13, 2015 at 8:02 AM, Neil Girdhar mistersh...@gmail.com
 wrote:
  Can I suggest that we instead add the P-square algorithm for the dynamic
  calculation of histograms?
  (
 http://pierrechainais.ec-lille.fr/Centrale/Option_DAD/IMPACT_files/Dynamic%20quantiles%20calcultation%20-%20P2%20Algorythm.pdf
 )
 
  This is already implemented in C++'s boost library
  (
 http://www.boost.org/doc/libs/1_44_0/boost/accumulators/statistics/extended_p_square.hpp
 )
 
  I implemented it in Boost Python as a module, which I'm happy to share.
  This is much better than fixed-width histograms in practice.  Rather
 than
  adjusting the number of bins, it adjusts what you really want, which is
 the
  resolution of the bins throughout the domain.

 This definitely sounds like a useful thing to have in numpy or scipy
 (though if it's possible to do without using Boost/C++ that would be
 nice). But yeah, we should leave the existing histogram alone (in this
 regard) and add a new name for this like adaptive_histogram or
 something. Then you can set about convincing matplotlib and friends to
 use it by default :-)

 -n

 --
 Nathaniel J. Smith -- http://vorpus.org


Re: [Numpy-discussion] Fastest way to compute summary statistics for a specific axis

2015-03-16 Thread Eric Moore
On Mon, Mar 16, 2015 at 11:53 AM, Dave Hirschfeld dave.hirschf...@gmail.com
 wrote:

 I have a number of large arrays for which I want to compute the mean and
 standard deviation over a particular axis - e.g. I want to compute the
 statistics for axis=1 as if the other axes were combined so that in the
 example below I get two values back

 In [1]: a = randn(30, 2, 1)

 For the mean this can be done easily like:

 In [2]: a.mean(0).mean(-1)
 Out[2]: array([ 0.0007, -0.0009])


 ...but this won't work for the std. Using some transformations we can
 come up with something which will work for either:

 In [3]: a.transpose(2,0,1).reshape(-1, 2).mean(axis=0)
 Out[3]: array([ 0.0007, -0.0009])

 In [4]: a.transpose(1,0,2).reshape(2, -1).mean(axis=-1)
 Out[4]: array([ 0.0007, -0.0009])


Specify all of the axes you want to reduce over as a tuple.

In [1]: import numpy as np

In [2]: a = np.random.randn(30, 2, 1)

In [3]: a.mean(axis=(0,-1))
Out[3]: array([-0.00224589,  0.00230759])

In [4]: a.std(axis=(0,-1))
Out[4]: array([ 1.00062771,  1.0001258 ])

-Eric


Re: [Numpy-discussion] argument handling by uniform

2015-03-13 Thread Eric Moore
`low` and `high` can be arrays, so you received one draw from
uniform(-0.5, 201) and one draw from uniform(0.5, 201).
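
In other words, the call was treated as (a quick demonstration):

import numpy as np

np.random.uniform((-0.5, 0.5), 201)    # shape (2,)
# equivalent to
# np.array([np.random.uniform(-0.5, 201), np.random.uniform(0.5, 201)])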

Eric

On Fri, Mar 13, 2015 at 11:57 AM, Alan G Isaac alan.is...@gmail.com wrote:

 Today I accidentally wrote `uni = np.random.uniform((-0.5,0.5),201)`,
 supplying a tuple instead of separate low and high values.  This gave
 me two draws (from [0..201] I think).  My question: how were the
 arguments interpreted?

 Thanks,
 Alan Isaac



Re: [Numpy-discussion] Access dtype kind from cython

2014-12-29 Thread Eric Moore
On Monday, December 29, 2014, Valentin Haenel valen...@haenel.co wrote:

 Hi,

 how do I access the kind of the data from cython, i.e. the single
 character string:

 'b' boolean
 'i' (signed) integer
 'u' unsigned integer
 'f' floating-point
 'c' complex-floating point
 'O' (Python) objects
 'S', 'a' (byte-)string
 'U' Unicode
 'V' raw data (void)

 In regular Python I can do:

 In [7]: d = np.dtype('S')

 In [8]: d.kind
 Out[8]: 'S'

 Looking at the definition of dtype that comes with cython, I see:

    ctypedef class numpy.dtype [object PyArray_Descr]:
        # Use PyDataType_* macros when possible, however there are no macros
        # for accessing some of the fields, so some are defined. Please
        # ask on cython-dev if you need more.
        cdef int type_num
        cdef int itemsize "elsize"
        cdef char byteorder
        cdef object fields
        cdef tuple names

 I.e. no kind.

 Also, I looked for an appropriate PyDataType_* macro but couldn't find one.

 Perhaps there is something simple I could use?

 best,

 V-

From C or Cython I'd just use the typenum. Compare against the appropriate
macros, e.g. NPY_DOUBLE.
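
From Python the same number is exposed as dtype.num, shown here only to
connect the two views (the C macros compare against exactly these values):

import numpy as np

d = np.dtype('f8')
d.num                               # 12, the value of NPY_DOUBLE in C
d.num == np.dtype(np.double).num    # True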

Eric


Re: [Numpy-discussion] FFTS for numpy's FFTs (was: Re: Choosing between NumPy and SciPy functions)

2014-12-11 Thread Eric Moore
On Thu, Dec 11, 2014 at 10:41 AM, Todd toddr...@gmail.com wrote:

 On Tue, Oct 28, 2014 at 5:28 AM, Nathaniel Smith n...@pobox.com wrote:

 On 28 Oct 2014 04:07, Matthew Brett matthew.br...@gmail.com wrote:
 
  Hi,
 
  On Mon, Oct 27, 2014 at 8:07 PM, Sturla Molden sturla.mol...@gmail.com
 wrote:
   Sturla Molden sturla.mol...@gmail.com wrote:
  
   If we really need a
   kick-ass fast FFT we need to go to libraries like FFTW, Intel MKL or
   Apple's Accelerate Framework,
  
   I should perhaps also mention FFTS here, which claim to be faster
 than FFTW
   and has a BSD licence:
  
   http://anthonix.com/ffts/index.html
 
  Nice.  And a funny New Zealand name too.
 
  Is this an option for us?  Aren't we a little behind the performance
  curve on FFT after we lost FFTW?

 It's definitely attractive. Some potential issues that might need dealing
 with, based on a quick skim:

 - seems to have a hard requirement for a processor supporting SSE, AVX,
 or NEON. No fallback for old CPUs or other architectures. (I'm not even
 sure whether it has x86-32 support.)

 -  no runtime CPU detection, e.g. SSE vs AVX appears to be a compile time
 decision

 - not sure if it can handle non-power-of-two problems at all, or at all
 efficiently. (FFTPACK isn't great here either but major regressions would
 be bad.)

 - not sure if it supports all the modes we care about (e.g. rfft)

 This stuff is all probably solveable though, so if someone has a
 hankering to make numpy (or scipy) fft dramatically faster then you should
 get in touch with the author and see what they think.

 -n


 I recently became aware of another C-library for doing FFTs (and other
 things):

 https://github.com/arrayfire/arrayfire

 They claim to have comparable FFT performance to MKL when run on a CPU
 (they also support running on the GPU but that is probably outside the
 scope of numpy or scipy).  It used to be proprietary but now it is under a
 BSD-3-Clause license.  It seems it supports non-power-of-2 FFT operations
 as well (although those are slower).  I don't know much beyond that, but it
 is probably worth looking in



AFAICT the CPU backend is an FFTW wrapper.

Eric


Re: [Numpy-discussion] Should ndarray be a context manager?

2014-12-10 Thread Eric Moore
The second argument is named `refcheck` rather than check_refs.
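
i.e. (a minimal demonstration):

import numpy as np

arr = np.ones((100, 100))
arr.resize((0,), refcheck=False)   # works; check_refs is not a real keyword
arr.shape                          # (0,)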

Eric

On Wed, Dec 10, 2014 at 2:36 PM, Chris Barker chris.bar...@noaa.gov wrote:

 On Tue, Dec 9, 2014 at 11:03 PM, Sturla Molden sturla.mol...@gmail.com
 wrote:

 Nathaniel Smith n...@pobox.com wrote:

  @contextmanager
  def tmp_zeros(*args, **kwargs):
  arr = np.zeros(*args, **kwargs)
  try:
  yield arr
  finally:
  arr.resize((0,), check_refs=False)

 That one is interesting. I have actually never used ndarray.resize(). It
 did not even occur to me that such an abomination existed :-)


  and I thought that it would only work if there were no other references
 to the array, in which case it gets garbage collected anyway, but I see the
 nifty check_refs keyword. However:

 In [32]: arr = np.ones((100,100))

 In [33]: arr.resize((0,), check_refs=False)
 ---------------------------------------------------------------------------
 TypeError                                 Traceback (most recent call last)
 <ipython-input-33-f0e634534904> in <module>()
 ----> 1 arr.resize((0,), check_refs=False)

 TypeError: 'check_refs' is an invalid keyword argument for this function


 In [34]: np.__version__
 Out[34]: '1.9.1'

 Was that just added (or removed?)

 -Chris


 --

 Christopher Barker, Ph.D.
 Oceanographer

 Emergency Response Division
 NOAA/NOS/ORR(206) 526-6959   voice
 7600 Sand Point Way NE   (206) 526-6329   fax
 Seattle, WA  98115   (206) 526-6317   main reception

 chris.bar...@noaa.gov



Re: [Numpy-discussion] Custom dtypes without C -- or, a standard ndarray-like type

2014-09-23 Thread Eric Moore
On Tuesday, September 23, 2014, Todd toddr...@gmail.com wrote:

 On Mon, Sep 22, 2014 at 5:31 AM, Nathaniel Smith n...@pobox.com wrote:

 On Sun, Sep 21, 2014 at 7:50 PM, Stephan Hoyer sho...@gmail.com wrote:
  My feeling though is that in most of the cases you mention,
 implementing a new array-like type is huge overkill. ndarray's
 interface is vast and reimplementing even 90% of it is a huge effort.
 For most of the cases that people seem to run into in practice, the
 solution is to enhance numpy's dtype interface so that it's possible
 for mere mortals to implement new dtypes, e.g. by just subclassing
 np.dtype. This is totally doable and would enable a ton of
 awesomeness, but it requires someone with the time to sit down and
 work on it, and no-one has volunteered yet. Unfortunately it does
 require hacking on C code though.


 I'm unclear about the last sentence.  Do you mean improving the dtype
 system will require hacking on C code or even if we improve the dtype
 system dtypes will still have to be written in C?


What ends up making this hard is that every place numpy does anything with a
dtype needs to be at least audited and probably changed. All of that is in C
right now, and most of it would likely still be after the fact, simply
because the rest of numpy is in C. Improving the dtype system requires
working on C code.

Eric


Re: [Numpy-discussion] Is this a bug?

2014-09-16 Thread Eric Moore
On Tuesday, September 16, 2014, Jaime Fernández del Río 
jaime.f...@gmail.com wrote:

 On Tue, Sep 16, 2014 at 3:26 PM, Charles R Harris
 charlesr.har...@gmail.com wrote:



  On Tue, Sep 16, 2014 at 2:51 PM, Nathaniel Smith n...@pobox.com wrote:

  On Tue, Sep 16, 2014 at 4:31 PM, Jaime Fernández del Río
  jaime.f...@gmail.com wrote:
  If it is a bug, it is an extended one, because it is the same behavior
 of
  einsum:
 
  np.einsum('i,i', [1,1,1], [1])
  3
  np.einsum('i,i', [1,1,1], [1,1])
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  ValueError: operands could not be broadcast together with remapped shapes
  [original->remapped]: (3,)->(3,) (2,)->(2,)
 
  And I think it is a conscious design decision, there is a comment about
  broadcasting missing core dimensions here:
 
 
 https://github.com/numpy/numpy/blob/master/numpy/core/src/umath/ufunc_object.c#L1940

 intentional and sensible are not always the same thing :-). That
 said, it isn't totally obvious to me what the correct behaviour for
 einsum is in this case.

  and the code makes it very explicit that input argument dimensions
 with the
  same label are broadcast to a common shape, see here:
 
 
 https://github.com/numpy/numpy/blob/master/numpy/core/src/umath/ufunc_object.c#L1956
 
  I kind of expect numpy to broadcast whenever possible, so this doesn't
 feel
  wrong to me.

 The case Chuck is talking about is like if we allowed matrix
 multiplication between an array with shape (n, 1) with an array with
 (k, m), because (n, 1) can be broadcast to (n, k). This feels VERY
 wrong to me, will certainly hide many bugs, and is definitely not how
 it works right now (for np.dot, anyway; apparently it does work that
 way for the brand-new gufunc np.linalg.matrix_multiply, but this must
 be an accident).

  That said, it is hard to come up with convincing examples of how this
  behavior would be useful in any practical context. But changing
 something
  that has been working like that for so long seems like a risky thing.
 And I
  cannot come with a convincing example of why it would be harmful
 either.

 gufuncs are very new.


 Or at least newly used. They've been sitting around for years with little
 use and less testing. This is probably (easily?) fixable as the shape of
 the operands is available.

 In [22]: [d.shape for d in nditer([[1,1,1], [[1,1,1]]*3]).operands]
 Out[22]: [(3,), (3, 3)]

 In [23]: [d.shape for d in nditer([[[1,1,1]], [[1,1,1]]*3]).operands]
 Out[23]: [(1, 3), (3, 3)]


 If we agree that it is broken, which I still am not fully sure of, then
 yes, it is very easy to fix. I have been looking into that code quite a bit
 lately, so I could patch something up pretty quick.

 Are we OK with the appending of size 1 dimensions to complete the core
 dimensions? That is, should matrix_multiply([1,1,1], [[1],[1],[1]]) work,
 or should it complain about the first argument having less dimensions than
 the core dimensions in the signature?

 Lastly, there is an interesting side effect of the way this broadcasting
 is handled: if a gufunc specifies a core dimension in an output argument
 only, and an `out` kwarg is not passed in, then the output array will have
 that core dimension set to be of size 1, e.g. if the signature of `f` is
 '(),()-(a)', then f(1, 2).shape is (1,). This has always felt funny to me,
 and I think that an unspecified dimension in an output array should either
 be specified by a passed out array, or raise an error about an unspecified
 core dimension or something like that. Does this sound right?

 Jaime

 --
 (\__/)
 ( O.o)
 (  ) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes
 de dominación mundial.


Given this and the earlier discussion about improvements to this code, I
wonder if it wouldn't be worth implementing the logic in Python first. This
way there is something to test against, and something to play with while all
of the cases are sorted out.


Re: [Numpy-discussion] Test error with ATLAS, Windows 64 bit

2014-04-14 Thread Eric Moore
On Monday, April 14, 2014, Matthew Brett matthew.br...@gmail.com wrote:

 Hi,

 On Mon, Apr 14, 2014 at 12:12 PM, Warren Weckesser
 warren.weckes...@gmail.com wrote:
 
  On Mon, Apr 14, 2014 at 2:59 PM, Matthew Brett matthew.br...@gmail.com
  wrote:
 
  Hi,
 
  With Carl Kleffner, I am trying to build a numpy 1.8.1 wheel for
  Windows 64-bit, and latest stable ATLAS.
 
  It works fine, apart from the following test failure:
 
  ======================================================================
  FAIL: test_special (test_umath.TestExpm1)
  ----------------------------------------------------------------------
  Traceback (most recent call last):
    File "C:\Python27\lib\site-packages\numpy\core\tests\test_umath.py",
  line 329, in test_special
      assert_equal(ncu.expm1(-0.), -0.)
    File "C:\Python27\lib\site-packages\numpy\testing\utils.py", line
  311, in assert_equal
      raise AssertionError(msg)
  AssertionError:
  Items are not equal:
   ACTUAL: 0.0
   DESIRED: -0.0
 
  Has anyone seen this?  Is it in fact necessary that expm1(-0.) return
  -0 instead of 0?
 
 
 
  What a cowinky dink.  This moring I ran into this issue in a scipy pull
  request (https://github.com/scipy/scipy/pull/3547), and I asked about
 this
  comparison failing on the mailing list a few hours ago.  In the pull
  request, the modified function returns -0.0 where it used to return 0.0,
 and
  the test for the value 0.0 failed.  My work-around was to use
  `assert_array_equal` instead of `assert_equal`.  The array comparison
  functions treat the values -0.0 and 0.0 as equal.  `assert_equal` has
 code
  that checks for signed zeros, and fails if they are not the same sign.

 Yes, sorry, I should have mentioned that I saw your post too.  I'd
 live to use that workaround, but it looks like the teste against -0 is
 intentional, and I was hoping for help understanding.

 The relevant two lines of the test are:

 assert_equal(ncu.expm1(0.), 0.)
 assert_equal(ncu.expm1(-0.), -0.)

 Julian - I think this one is for you - as the author of these lines:

 f53ab41a numpy/core/tests/test_umath.py (Julian Taylor
 2014-03-02 02:55:30 +0100  329) assert_equal(ncu.expm1(-0.),
 -0.)

 Can you say why you wanted -0 specifically?  Any clue as to why ATLAS
 64 bit may give 0 instead?

 Cheers,

 Matthew


I think this is a real bug in the version of expm1 in
core/src/npymath/npy_math.c.src that's being used, since your Windows build
couldn't find a system version of expm1 to use.

Eric


Re: [Numpy-discussion] It looks like Py 3.5 will include a dedicated infix matrix multiply operator

2014-03-16 Thread Eric Moore
On Sunday, March 16, 2014, josef.p...@gmail.com wrote:




 On Sun, Mar 16, 2014 at 10:54 AM, Nathaniel Smith n...@pobox.com wrote:

 On Sun, Mar 16, 2014 at 2:39 PM, Eelco Hoogendoorn
 hoogendoorn.ee...@gmail.com wrote:
  Note that I am not opposed to extra operators in python, and only mildly
  opposed to a matrix multiplication operator in numpy; but let me lay
 out the
  case against, for your consideration.
 
  First of all, the use of matrix semantics relative to arrays semantics
 is
  extremely rare; even in linear algebra heavy code, arrays semantics
 often
  dominate. As such, the default of array semantics for numpy has been a
 great
  choice. Ive never looked back at MATLAB semantics.

 Different people work on different code and have different experiences
 here -- yours may or may be typical yours. Pauli did some quick checks
 on scikit-learn  nipy  scipy, and found that in their test suites,
 uses of np.dot and uses of elementwise-multiplication are ~equally
 common: https://github.com/numpy/numpy/pull/4351#issuecomment-37717330h

  Secondly, I feel the urge to conform to a historical mathematical
 notation
  is misguided, especially for the problem domain of linear algebra.
 Perhaps
  in the world of mathematics your operation is associative or commutes,
 but
  on your computer, the order of operations will influence both outcomes
 and
  performance. Even for products, we usually care not only about the
 outcome,
  but also how that outcome is arrived at. And along the same lines, I
 don't
  suppose I need to explain how I feel about A@@-1 and the likes. Sure,
 it
  isn't to hard to learn or infer this implies a matrix inverse, but why
 on
  earth would I want to pretend the rich complexity of numerical matrix
  inversion can be mangled into one symbol? Id much rather write inv or
 pinv,
  or whatever particular algorithm happens to be called for given the
  situation. Considering this isn't the num-lisp discussion group, I
 suppose I
  am hardly the only one who feels so.
 

 My impression from the other thread is that @@ probably won't end up
 existing, so you're safe here ;-).

  On the whole, I feel the @ operator is mostly superfluous. I prefer to
 be
  explicit about where I place my brackets. I prefer to be explicit about
 the
  data layout and axes that go into a (multi)linear product, rather than
 rely
  on obtuse row/column conventions which are not transparent across
 function
  calls. When I do linear algebra, it is almost always vectorized over
  additional axes; how does a special operator which is only well defined
 for
  a few special cases of 2d and 1d tensors help me with that?

 Einstein notation is coming up on its 100th birthday and is just as
 blackboard-friendly as matrix product notation. Yet there's still a
 huge number of domains where the matrix notation dominates. It's cool
 if you aren't one of the people who find it useful, but I don't think
 it's going anywhere soon.

  Note that I don't think there is much harm in an @ operator; but I
 don't see
  myself using it either. Aside from making textbook examples like a
  gram-schmidt orthogonalization more compact to write, I don't see it
 having
  much of an impact in the real world.

 The analysis in the PEP found ~780 calls to np.dot, just in the two
 projects I happened to look at. @ will get tons of use in the real
 world. Maybe all those people who will be using it would be happier if
 they were using einsum instead, I dunno, but it's an argument you'll
 have to convince them of, not me :-).


 Just as example

 I just read for the first time two journal articles in econometrics that
 use einsum notation.
 I have no idea what their formulas are supposed to mean, no sum signs and
 no matrix algebra.
 I need to have a strong incentive to stare at those formulas again.

 (statsmodels search finds 1520 dot, including sandbox and examples)

 Josef
 TODO: learn how to use einsums



 -n

 --
 Nathaniel J. Smith
 Postdoctoral researcher - Informatics - University of Edinburgh
 http://vorpus.org


An important distinction between calling dot or @ is that matrix
multiplication is a domain where enormous effort has already been spent on
algorithms and on building fast, scalable libraries. Yes, einsum can call
these for some subset of calls, but it's also trivial to set up a case where
it can't. This is a huge pitfall because it hides this complexity.
Matrix-matrix and matrix-vector products are the fundamental operations;
generalized multilinear products etc. are not.

Einsum, despite the brevity that it can provide, is too general to make a
basic building block. There isn't a good way to reason 

Re: [Numpy-discussion] GSoC project: draft of proposal

2014-03-14 Thread Eric Moore
On Friday, March 14, 2014, Gregor Thalhammer gregor.thalham...@gmail.com
wrote:


 On 13.03.2014 at 18:35, Leo Mao lmao20...@gmail.com wrote:

  Hi,
 
  Thanks a lot for your advice, Chuck.
  Following your advice, I have modified my draft of proposal. (attachment)
  I think it still needs more comments so that I can make it better.
 
  And I found that maybe I can also make some functions related to linalg
 (like dot, svd or something else) faster by integrating a proper library
 into numpy.
 
  Regards,
  Leo Mao
 
 Dear Leo,

 large parts of your proposal are covered by the uvml package
 https://github.com/geggo/uvml
 In my opinion you should also consider Intel's VML (part of MKL) as a
 candidate. (Yes, I know, it is not free.) To the best of my knowledge it
 provides many more vectorized functions than the open source alternatives.
 Concerning your timetable, once you have implemented support for one
 function, adding more functions is very easy.

 Gregor


I'm not sure that your week-old project is enough to discourage this GSoC
project. In particular, it would be nice to be able to ship this directly
as part of numpy, and that won't really be possible with MKL.

Eric




Re: [Numpy-discussion] Proposal: Chaining np.dot with mdot helper function

2014-02-20 Thread Eric Moore
On Thursday, February 20, 2014, Eelco Hoogendoorn 
hoogendoorn.ee...@gmail.com wrote:

 If the standard semantics are not affected, and the most common
 two-argument scenario does not take more than a single if-statement
 overhead, I don't see why it couldn't be a replacement for the existing
 np.dot; but others mileage may vary.


  On Thu, Feb 20, 2014 at 11:34 AM, Stefan Otte stefan.o...@gmail.com
  wrote:

 Hey,

 so I propose the following.  I'll implement a new function `mdot`.
 Incorporating the changes in `dot` are unlikely. Later, one can still
 include
 the features in `dot` if desired.

 `mdot` will have a default parameter `optimize`.  If `optimize==True` the
 reordering of the multiplication is done.  Otherwise it simply chains the
 multiplications.

 I'll test and benchmark my implementation and create a pull request.

 Cheers,
  Stefan


Another consideration here is that we need a better way to work with
stacked matrices such as np.linalg handles now.  I.e. I want to compute the
matrix product of two (k, n, n) arrays producing a (k, n, n) result.  Near
as I can tell there isn't a way to do this right now that doesn't involve
an explicit loop, since dot will return a (k, n, k, n) result. Yes, this
output contains what I want, but it also computes a lot of things that I
don't want too.
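
For concreteness, shapes only (a toy illustration):

import numpy as np

a = np.ones((4, 3, 3))
b = np.ones((4, 3, 3))

np.dot(a, b).shape   # (4, 3, 4, 3): every block of a times every block of b,
                     # when all I want is the k matching products, (4, 3, 3)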

It would also be nice to be able to do a matrix product reduction,
(k, n, n) -> (n, n), in a single line too.

Eric


Re: [Numpy-discussion] C99 compatible complex number tests fail

2014-01-04 Thread Eric Moore
On Saturday, January 4, 2014, Ralf Gommers wrote:




 On Mon, Dec 23, 2013 at 12:14 AM, Matti Picus matti.pi...@gmail.com
 wrote:

 Hi. I started to port the stdlib cmath C99 compatible complex number
 tests to numpy, after noticing that numpy seems to have different
 complex number routines than cmath. The work is available on a
 retest_complex branch of numpy
 https://github.com/mattip/numpy/tree/retest_complex
 The tests can be run by pulling the branch (no need to rebuild numpy)
 and running

  python path-to-branch/numpy/core/tests/test_umath_complex.py > test.log 2>&1

 So far it is just a couple of  commits that run the tests on numpy, I
 did not dive into modifying the math routines. If I did the work
 correctly, failures point to some differences, most due to edge cases
 with inf and nan, but there are a number of failures due to different
 finite values (for some small definition of different).
 I guess my first question is did I do the tests properly.


 They work fine, however you did it in a nonstandard way which makes the
 output hard to read. Some comments:
 - the assert_* functions expect actual as first input and desired
 next, while you have them reversed.
 - it would be good to split those tests into multiple cases, for example
 one per function to be tested.
 - you shouldn't print anything, just let it fail. If you want to see each
 individual failure, use generator tests.
 - the cmathtestcases.txt is a little nonstandard but should be OK to keep
 it like that.

 Assuming I did, the next question is are the inconsistencies
 intentional i.e. are they that way in order to be compatible with
 Matlab  or some other non-C99 conformant library?


 The implementation should conform to IEEE 754.


 For instance, a comparison between the implementation of cmath's sqrt
 and numpy's sqrt shows that numpy does not check for subnormals.


 I suspect no handling for denormals was done on purpose, since that should
 have a significant performance penalty. I'm not sure about other
 differences, probably just following a different reference.

 And I am probably mistaken since I am new to the generator methods of
 numpy,
 but could it be that trigonometric functions like acos and acosh are
 generated in umath/funcs.inc.src, using a very different algorithm than
 cmathmodule.c?


 You're not mistaken.


 Would there be interest in a pull request that changed the routines to
 be more compatible with results from cmath?


 I don't think compatibility with cmath should be a goal, but if you find
 differences where cmath has a more accurate or faster implementation, then
 a PR to adopt the cmath algorithm would be very welcome.

 Ralf


Have you seen https://github.com/numpy/numpy/pull/3010?  This adds C99
compatible complex functions and tests, with build-time checking of whether
the system-provided functions can pass our tests.

I should have some time to get back to it soon, but some more eyes and tests
and input would be good, especially since it's not clear to me if all of
the changes will be accepted.

Eric


Re: [Numpy-discussion] Relative speed

2013-08-29 Thread Eric Moore

 African or European?


 Why on earth would you ask that?



It's a Monty Python and the Holy Grail reference.

Eric


Re: [Numpy-discussion] ndarray: How to create and initialize with a value other than ones or zeros?

2013-06-06 Thread Eric Moore
James Adams wrote:
 I would like to create an array object and initialize the array's
 values with an arbitrary fill value, like you can do using the ones()
 and zeros() creation routines to create and initialize arrays with
 ones or zeros.  Is there an easy way to do this?  If this isn't
 possible then what is the most efficient way to initialize a numpy
 array with an arbitrary fill value?

 In order to provide such an array creation routine I can imagine that
 it'd be as simple as taking the code for ones() and/or zeros() and
 modifying that code so that it provides an additional fill value
 argument and then within the section which does the initialization of
 the array it could use that fill value instead of 1 or 0.  Is this a
 naive assumption?

 Thanks in advance for your help with this issue.

 --James

There's also https://github.com/numpy/numpy/pull/2875 in the queue.
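
In the meantime the usual idiom is np.empty plus fill, which is, as I
understand it, essentially what the linked PR wraps up in a single creation
function:

import numpy as np

a = np.empty((3, 4))
a.fill(7.0)          # or equivalently: a[...] = 7.0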

Eric


[Numpy-discussion] Assignment function with a signature similar to take?

2012-11-17 Thread Eric Moore
Is there a function that operates like 'take' but does assignment?
Specifically, one that takes indices and an axis?  As far as I can tell no
such function exists.  Is there any particular reason?

One can fake such a thing by doing (code untested):

s = len(a.shape) * [np.s_[:]]
s[axis] = np.s_[1::2]
a[tuple(s)] = b.take(np.arange(1, b.shape[axis], 2), axis)

Or by using np.rollaxis:

a = np.rollaxis(a, axis, len(a.shape))
a[..., 1::2] = b[..., 1::2]
a = np.rollaxis(a, len(a.shape)-1, axis)

But I don't really think that either of these is particularly clear;
I'd probably prefer the rollaxis solution.

Also, while I'm here, what about having take also be able to use a slice 
object in lieu of a collection of indices?

-Eric