[Numpy-discussion] Why does asarray() create an intermediate memoryview?

2016-03-27 Thread Alexander Belopolsky
In the following session a numpy array is created from an stdlib array:

In [1]: import array

In [2]: base = array.array('i', [1, 2])

In [3]: a = np.asarray(base)

In [4]: a.base
Out[4]: <memory at 0x...>

In [5]: a.base.obj
Out[5]: array('i', [1, 2])

In [6]: a.base.obj is base
Out[6]: True

Why can't a.base be base?  What is the need for the intermediate memoryview
object?
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Making datetime64 timezone naive

2015-10-18 Thread Alexander Belopolsky
On Sat, Oct 17, 2015 at 6:59 PM, Chris Barker  wrote:

> If anyone decides to actually get around to leap seconds support in numpy
> datetime, s/he can decide ...


This attitude is the reason why we will probably never have bug-free
software when it comes to civil time reckoning.  Even though ANSI C has
the difftime(time_t time1, time_t time0) function, which in theory may not
reduce to time1 - time0, in practice it is only useful to avoid overflows
in integer-to-float conversions in cross-platform code and cannot account
for the fact that some days are longer than others.

Similarly, the current numpy.datetime64 design ties arithmetic to encoding.
This makes arithmetic easier, but in the long run may preclude designs that
better match the problem domain.

Note how the development of PEP 495 has highlighted the fact that allowing
binary operations (subtraction, comparison etc.) between times in different
timezones was a design mistake.  It will be wise to learn from such
mistakes when redesigning numpy.datetime64.

If you ever plan to support civil time in some form, you should think about
it now.  In Python 3.6, datetime.now() will return different values in the
first and the second repeated hour in the "fall-back fold."   If you allow
datetime.datetime to numpy.datetime64 conversion, you should decide what
you do with that difference.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Making datetime64 timezone naive

2015-10-18 Thread Alexander Belopolsky
On Sat, Oct 17, 2015 at 6:59 PM, Chris Barker  wrote:

> Off the top of my head, I think allowing a 60th second makes more sense --
> just like we do leap years.


Yet we don't implement DST by allowing the 24th hour.  Even the countries
that adjust the clocks at midnight don't do that.

In some sense leap seconds are more similar to timezone changes (DST or
political) because they are irregular and unpredictable.

Furthermore, the notion of "fold" is not tied to a particular 24/60/60
system of encoding times and is thus more applicable to numpy, where
times are encoded as binary integers.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Making datetime64 timezone naive

2015-10-16 Thread Alexander Belopolsky
On Tue, Oct 13, 2015 at 6:48 PM, Chris Barker  wrote:

> And because we probably want fast absolute delta computation, when we add
> timezones, we'll probably want to store the datetime in UTC, and apply the
> timezone on I/O.
>
> Alexander: Am I right that we don't need the "fold" bit in this case?
> You'd still need it when specifying a time in a timezone with folds.. --
> but again, only on I/O


Since Guido hates leap seconds, PEP 495 is silent on this issue, but
strictly speaking UTC leap seconds are "folds."   AFAICT, a strictly POSIX
system must repeat the same value of time_t when a leap second is
inserted.  While datetime will never extend the second field to allow
second=60, with PEP 495 it is now possible to represent 23:59:60 as
23:59:59/fold=1.
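
A minimal sketch of that representation (not from the original message;
assumes Python 3.6+, where PEP 495's fold attribute landed), using the
2015-06-30 leap second:

from datetime import datetime, timezone

# 2015-06-30T23:59:60Z spelled as 23:59:59 with fold=1
leap = datetime(2015, 6, 30, 23, 59, 59, fold=1, tzinfo=timezone.utc)
print(leap.fold)   # 1 -- distinguishes the repeated second from the first one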

Apart from leap seconds, there is no need to use "fold" on datetimes that
represent time in UTC or any timezone at a fixed offset from utc.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Making datetime64 timezone naive

2015-10-12 Thread Alexander Belopolsky
On Mon, Oct 12, 2015 at 3:10 AM, Stephan Hoyer  wrote:

> The tentative consensus from last year's discussion was that we should
> make datetime64 timezone naive, like the standard library's
> datetime.datetime



If you are going to make datetime64 more like datetime.datetime, please
consider adding the "fold" bit.  See PEP 495. [1]

[1]: https://www.python.org/dev/peps/pep-0495/
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] matmul needs some clarification.

2015-06-04 Thread Alexander Belopolsky
On Wed, Jun 3, 2015 at 5:12 PM, Charles R Harris charlesr.har...@gmail.com
wrote:

 but is as good as dot right now except it doesn't handle object arrays.



This is a fairly low standard. :-(
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] matmul needs some clarification.

2015-06-03 Thread Alexander Belopolsky
On Sat, May 30, 2015 at 6:23 PM, Charles R Harris charlesr.har...@gmail.com
 wrote:

 The problem arises when multiplying a stack of matrices times a vector.
 PEP465 defines this as appending a '1' to the dimensions of the vector and
 doing the defined stacked matrix multiply, then removing the last dimension
 from the result. Note that in the middle step we have a stack of matrices
 and after removing the last dimension we will still have a stack of
 matrices. What we want is a stack of vectors, but we can't have those with
 our conventions. This makes the result somewhat unexpected. How should we
 resolve this?


I think that before tackling the @ operator, we should implement the pure
dot of stacks of matrices and dot of stacks of vectors generalized ufuncs.
The first will have a 2d core and the second a 1d core.  Let's tentatively
call them matmul and vecmul.  Hopefully the matrix-vector product can be
reduced to vecmul, but I have not fully figured this out.  If not, we may
need a third ufunc.

Once we have these ufuncs, we can decide what @ operator should do in terms
of them and possibly some axes manipulation.
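
A hedged sketch of what such gufuncs would compute, written here with
np.einsum (the names matmul/vecmul are only the tentative names from this
message, not existing functions):

import numpy as np

A = np.random.rand(5, 3, 3)   # stack of matrices
B = np.random.rand(5, 3, 3)   # stack of matrices
v = np.random.rand(5, 3)      # stack of vectors

matmul_result = np.einsum('...ij,...jk->...ik', A, B)   # 2d core
vecmul_result = np.einsum('...i,...i->...', v, v)       # 1d core
matvec_result = np.einsum('...ij,...j->...i', A, v)     # the possible third ufunc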
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Two questions about PEP 465 dot product

2015-05-22 Thread Alexander Belopolsky
On Fri, May 22, 2015 at 4:58 PM, Nathaniel Smith n...@pobox.com wrote:

 For higher dimension inputs like (i, j, n, m) it acts like any other
 gufunc (e.g., everything in np.linalg)


Unfortunately, not everything in linalg acts the same way.  For example,
matrix_rank and lstsq don't.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Two questions about PEP 465 dot product

2015-05-22 Thread Alexander Belopolsky
On Thu, May 21, 2015 at 9:37 PM, Nathaniel Smith n...@pobox.com wrote:

 .. there's been some discussion of the possibility of
 adding specialized gufuncs for broadcasted vector-vector,
 vector-matrix, matrix-vector multiplication, which wouldn't do the
 magic vector promotion that dot and @ do.


This would be nice.  What I would like to see is some consistency between
multi-matrix
support in linalg methods and dot.

For example, when A is a matrix and b is a vector and

a = linalg.solve(A, b)

then

dot(A, a) returns b, but if either or both A and b are stacks, this
invariant does not hold.  I would like
to see a function (say xdot) that I can use instead of dot and have xdot(A,
a) return b whenever a = linalg.solve(A, b).

Similarly, if w,v =  linalg.eig(A), then dot(A,v) returns w * v, but only
if A is 2d.
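
A hedged illustration of the broadcasting gap described above (a sketch
only; xdot does not exist, and np.einsum stands in for it here):

import numpy as np

A = np.random.rand(4, 3, 3)     # stack of matrices
b = np.random.rand(4, 3, 1)     # matching stack of column vectors
a = np.linalg.solve(A, b)       # solve broadcasts over the leading axis

np.dot(A, a).shape                                     # (4, 3, 4, 1) -- not a stack of b-shaped results
np.allclose(np.einsum('...ij,...jk->...ik', A, a), b)  # True -- what xdot(A, a) would return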
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Two questions about PEP 465 dot product

2015-05-21 Thread Alexander Belopolsky
1. Is there a simple expression using existing numpy functions that
implements PEP 465 semantics for @?

2. Suppose I have a function that takes two vectors x and y, and a matrix M
and returns x.dot(M.dot(y)).  I would like to vectorize this function so
that it works with x and y of any ndim >= 1 and M of any ndim >= 2 treating
multi-dimensional x and y as arrays of vectors and M as an array of
matrices (broadcasting as necessary).  The result should be an array of xMy
products.  How would I achieve that using  PEP 465's @?
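
Not an answer from the thread, just one possible sketch for question 2
using existing functions (np.einsum); the helper name xmy is hypothetical:

import numpy as np

def xmy(x, M, y):
    # broadcasted x . (M . y) over the leading dimensions
    return np.einsum('...i,...ij,...j->...', x, M, y)

x = np.random.rand(7, 3)
y = np.random.rand(7, 4)
M = np.random.rand(7, 3, 4)
xmy(x, M, y).shape   # (7,) -- one xMy product per stacked vector/matrix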
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] ANN: numexpr 2.4.3 released

2015-04-27 Thread Alexander Belopolsky
On Mon, Apr 27, 2015 at 7:14 PM, Nathaniel Smith n...@pobox.com wrote:

 There's no way to access the ast reliably at runtime in python -- it gets
 thrown away during compilation.


The meta package supports bytecode to ast translation.  See 
http://meta.readthedocs.org/en/latest/api/decompile.html.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] 3D array and the right hand rule

2015-01-29 Thread Alexander Belopolsky
On Mon, Jan 26, 2015 at 6:06 AM, Dieter Van Eessen 
dieter.van.ees...@gmail.com wrote:

 I've read that numpy.array isn't arranged according to the
 'right-hand-rule' (right-hand-rule = thumb = +x; index finger = +y, bend
 middle finger = +z). This is also confirmed by an old message I dug up from
 the mailing list archives. (see message below)


Dieter,

It looks like you are confusing dimensionality of the array with the
dimensionality of a vector that it might store.  If you are interested in
using numpy for 3D modeling, you will likely only encounter 1-dimensional
arrays (vectors) of size 3 and 2-dimensional arrays  (matrices) of size 9
or shape (3, 3).

A 3-dimensional array is a stack of matrices and the 'right-hand-rule' does
not really apply.  The notion of C/F-contiguous deals with the order of
axes (e.g. width first or depth first) while the right-hand-rule is about
the direction of the axes (if you flip the middle finger, a right hand
becomes a left hand).  In the case of arrays this would probably correspond
to little-endian vs. big-endian: is a[0] stored at a higher or lower address
than a[1]?  However, whatever the answer to this question is for a
particular system, it is the same for all axes in the array, so the
right-hand vs. left-hand distinction does not apply.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Equality of dtypes does not imply equality of type kinds

2015-01-12 Thread Alexander Belopolsky
On Mon, Jan 12, 2015 at 8:48 PM, Charles R Harris charlesr.har...@gmail.com
wrote:

 That is to say, in this case C long has the same precision as C long
long. That varies depending on the platform, which is one reason the
precision nomenclature came in. It can be confusing, and I've often
fantasized getting rid of the long type altogether ;) So it isn't exactly
intended, but there is a reason...


It is also confusing that numpy has two constructors that produce 32-bit
integers on 32-bit platforms and 64-bit integers on 64-bit platforms, but
neither of these constructors is called long.  Instead, they are called
numpy.int_ and numpy.intp.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Characteristic of a Matrix.

2015-01-06 Thread Alexander Belopolsky
On Tue, Jan 6, 2015 at 8:20 PM, Nathaniel Smith n...@pobox.com wrote:

  Since matrices are now part of some high school curricula, I urge that
 they
  be treated appropriately in Numpy.  Further, I suggest that
 consideration be
  given to establishing V and VT sub-classes, to cover vectors and
 transposed
  vectors.

 The numpy devs don't really have the interest or the skills to create
 a great library for pedagogical use in high schools. If you're
 interested in an interface like this, then I'd suggest creating a new
 package focused specifically on that (which might use numpy
 internally). There's really no advantage in glomming this into numpy
 proper.


Sorry for taking this further off-topic, but I recently discovered an
excellent SAGE package, http://www.sagemath.org/.  While its target audience
includes math graduate students and research mathematicians, parts
of it are accessible to schoolchildren.  SAGE is written in Python and
integrates a number of packages including numpy.

I would highly recommend to anyone interested in using Python for education
to take a look at SAGE.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] The future of ndarray.diagonal()

2015-01-01 Thread Alexander Belopolsky
A discussion [1] is currently underway at GitHub which will benefit from a
larger forum.

In version 1.9, the diagonal() method was changed to return a read-only
(non-contiguous) view into the original array instead of a plain copy.
Also, it has been announced [2] that in 1.10 the view will become
read/write.

A concern has now been raised [3] that this change breaks backward
compatibility too much.

Consider the following code:

x = numpy.eye(2)
d = x.diagonal()
d[0] = 2

In 1.8, this code runs without errors and results in [2, 1] stored in array
d.  In 1.9, this is an error.  With the current plan, in 1.10 this will
become valid again, but the result will be different: x[0,0] will be 2
while it is 1 in 1.8.

Two alternatives  are suggested for discussion:

1. Add copy=True flag to diagonal() method.
2. Roll back 1.9 change to diagonal() and introduce an additional
diagonal_view() method to return a view.



[1] https://github.com/numpy/numpy/pull/5409
[2] http://docs.scipy.org/doc/numpy/reference/generated/numpy.diagonal.html
[3] http://khinsen.wordpress.com/2014/09/12/the-state-of-numpy/
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Clarifications in numpy.ma module

2014-12-30 Thread Alexander Belopolsky
On Tue, Dec 30, 2014 at 1:45 PM, Benjamin Root ben.r...@ou.edu wrote:

 What do you mean that the mean function doesn't take care of the case
 where the array is empty? In the example you provided, they both end up
 being NaN, which is exactly correct.


Operations on masked arrays should not produce NaNs.  They should produce
ma.masked.  For example,

>>> np.ma.array(0)/0
masked

The fact that the user sees runtime warnings also suggests that the edge
case was not thought out.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Clarifications in numpy.ma module

2014-12-30 Thread Alexander Belopolsky
On Tue, Dec 30, 2014 at 2:49 PM, Benjamin Root ben.r...@ou.edu wrote:

 Where does it say that operations on masked arrays should not produce NaNs?


Masked arrays were invented with the specific goal to avoid carrying NaNs
in computations.  Back in the days, NaNs were not available on some
platforms and had significant performance issues on others.  These days NaN
support for floating point types is nearly universal, but numpy types are
not limited by floating point.

 Having np.mean([]) return the same thing as np.ma.mean([]) makes complete
sense.

Does the following make sense as well?

>>> import numpy
>>> numpy.ma.masked_values([0, 0], 0).mean()
masked
>>> numpy.ma.masked_values([0], 0).mean()
masked
>>> numpy.ma.masked_values([], 0).mean()
* Two warnings *
masked_array(data = nan,
             mask = False,
       fill_value = 0.0)
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Finding values in an array

2014-11-27 Thread Alexander Belopolsky
I probably miss something very basic, but how, given two arrays a and b, can
I find positions in a where elements of b are located?  If a were sorted, I
could use searchsorted, but I don't want to get valid positions for
elements that are not in a.  In my case, a has unique elements, but in the
general case I would accept the first match.  In other words, I am looking
for an array analog of list.index() method.
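
One common recipe (not from the original message): sort a once and map the
positions back through the sorting permutation.

import numpy as np

a = np.array([30, 10, 20, 40])
b = np.array([20, 40, 10])

order = np.argsort(a)
pos = order[np.searchsorted(a, b, sorter=order)]
# pos -> array([2, 3, 1]); a[pos] == b, valid only for elements actually present in a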
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Add `nrows` to `genfromtxt`

2014-11-02 Thread Alexander Belopolsky
On Sun, Nov 2, 2014 at 1:56 PM, Warren Weckesser warren.weckes...@gmail.com
 wrote:

 Or you could just call genfromtxt() once with `max_rows=1` to skip a row.
 (I'm assuming that the first argument to genfromtxt is the open file
 object--or some other iterator--and not the filename.)


That's hackish.  If I have to resort to something like this, I would just
call next() on the open file object or iterator.

Still, the case of dtype=None, name=None is problematic.   Suppose I want
genfromtxt()  to detect the column names from the 1-st row and data types
from the 3-rd.  How would you do that?
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Add `nrows` to `genfromtxt`

2014-11-02 Thread Alexander Belopolsky
Sorry, I meant names=True, not name=None.

On Sun, Nov 2, 2014 at 2:18 PM, Alexander Belopolsky ndar...@mac.com
wrote:


 On Sun, Nov 2, 2014 at 1:56 PM, Warren Weckesser 
 warren.weckes...@gmail.com wrote:

 Or you could just call genfromtxt() once with `max_rows=1` to skip a
 row.  (I'm assuming that the first argument to genfromtxt is the open file
 object--or some other iterator--and not the filename.)


 That's hackish.  If I have to resort to something like this, I would just
 call next() on the open file object or iterator.

 Still, the case of dtype=None, name=None is problematic.   Suppose I want
 genfromtxt()  to detect the column names from the 1-st row and data types
 from the 3-rd.  How would you do that?

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Add `nrows` to `genfromtxt`

2014-11-02 Thread Alexander Belopolsky
On Sun, Nov 2, 2014 at 2:32 PM, Warren Weckesser warren.weckes...@gmail.com
 wrote:


 Still, the case of dtype=None, name=None is problematic.   Suppose I want
 genfromtxt()  to detect the column names from the 1-st row and data types
 from the 3-rd.  How would you do that?



 This may sound like a cop out, but at some point, I stop trying to make
 genfromtxt() handle every possible case, and instead I would write a custom
 header reader to handle this.


In the abstract, I would agree with you.  It is often the case that 2-3
lines of clear Python code are better than a terse function call with half a
dozen non-obvious options.  Specifically, I would be against the proposed
slice_rows because it is either equivalent to genfromtxt(islice(..), ..)
or hard to specify.

On the other hand, skip_rows is different for two reasons:

1. It is not a new option.  It is currently a deprecated alias to
skip_header, so a change is expected - either removal or redefinition.
2. The intended use-case - inferring column names and type information from
a file where the data is separated from the column names - is hard to code
explicitly.  (Try it!  A sketch using islice follows below.)
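
A hedged sketch of that use-case with the islice workaround (hypothetical
file layout: names on line 1, units on line 2, data from line 3; assumes a
NumPy recent enough to accept text-mode input):

from itertools import islice
import numpy as np

lines = iter(["a b c", "m m s", "1 2 3", "4 5 6"])
names = next(lines).split()                   # read the column names ourselves
data = np.genfromtxt(islice(lines, 1, None),  # skip the units row
                     dtype=None, names=names)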
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Add `nrows` to `genfromtxt`

2014-11-01 Thread Alexander Belopolsky
On Sat, Nov 1, 2014 at 3:15 PM, Warren Weckesser warren.weckes...@gmail.com
 wrote:

 Is there wider interest in such an argument to `genfromtxt`?  For my
 use-cases, `max_rows` is sufficient.  I can't recall ever needing the full
 generality of a slice for pulling apart a text file.  Does anyone have
 compelling use-cases that are not handled by `max_rows`?


It is occasionally useful to be able to skip rows after the header.  Maybe
we should de-deprecate skip_rows and give it a meaning different from
skip_header in the case of names=None?  For example,

genfromtxt(fname, skip_header=3, skip_rows=1, max_rows=100)

would mean: skip 3 lines, read column names from the 4th, skip the 5th, and
process up to 100 more lines.  This may be useful if the file contains some
meta-data about the columns below the header line.  For example, it is
common to put units of measurement below the column names.

Another application could be processing a large text file in chunks, which
again can be covered nicely by  skip_rows/max_rows.

I cannot think of a situation where I would need more generality such as
reading every 3rd row or rows with the given numbers.  Such processing is
normally done after the text data is loaded into an array.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Why ndarray provides four ways to flatten?

2014-10-29 Thread Alexander Belopolsky
On Tue, Oct 28, 2014 at 10:11 PM, Nathaniel Smith n...@pobox.com wrote:

  I don't think so - I think all the heavy lifting is already done in
 flatiter.  The missing parts are mostly trivial things like .size or .shape
 or can be fudged by coercing to true ndarray using existing
 flatiter.__array__ method.

 Now try .resize()...


Simple:

def resize(self, shape):
    if self.shape == shape:
        return
    else:
        raise ValueError


From ndarray.resize documentation:

Raises
--
ValueError
If `a` does not own its own data or references or views to it exist,
and the data memory must be changed.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Why ndarray provides four ways to flatten?

2014-10-29 Thread Alexander Belopolsky
On Tue, Oct 28, 2014 at 10:11 PM, Nathaniel Smith n...@pobox.com wrote:

 .diagonal has no magic, it just turns out that the diagonal of any strided
 array is also expressible as a strided array. (Specifically, new_strides =
 (sum(old_strides),).)



This is genius!  Once you mentioned this, it is obvious how the new
diagonal() works and one can only wonder why it took over 20 years to get
this feature in NumPy.
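
An illustration of the quoted observation (not from the original message;
as_strided bypasses bounds checking, so this is a sketch only):

import numpy as np
from numpy.lib.stride_tricks import as_strided

x = np.arange(16).reshape(4, 4)
diag_view = as_strided(x, shape=(4,), strides=(sum(x.strides),))
# diag_view -> array([ 0,  5, 10, 15]), a strided view into x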
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] help using np.einsum for stacked matrix multiplication

2014-10-29 Thread Alexander Belopolsky
On Wed, Oct 29, 2014 at 5:39 AM, Andrew Nelson andyf...@gmail.com wrote:

 I have a 4D array, A, that has the shape (NX, NY, 2, 2).  I wish to
 perform matrix multiplication of the 'NY' 2x2 matrices, resulting in the
 matrix B.  B would have shape (NX, 2, 2).


What you are looking for is dot.reduce and NumPy does not implement that.
You can save an explicit loop by doing reduce(np.dot, matrices).  For
example


In [6]: A
Out[6]:
array([[[ 1.,  0.],
        [ 0.,  1.]],

       [[ 2.,  0.],
        [ 0.,  2.]],

       [[ 3.,  0.],
        [ 0.,  3.]]])

In [7]: reduce(np.dot, A)
Out[7]:
array([[ 6.,  0.],
       [ 0.,  6.]])
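
(A side note, not from the original message: on Python 3, reduce has to be
imported from functools.)

from functools import reduce
import numpy as np

A = np.array([k * np.eye(2) for k in (1.0, 2.0, 3.0)])
reduce(np.dot, A)   # array([[ 6.,  0.], [ 0.,  6.]])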
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Why ndarray provides four ways to flatten?

2014-10-28 Thread Alexander Belopolsky
On Mon, Oct 27, 2014 at 9:41 PM, Yuxiang Wang yw...@virginia.edu wrote:

 In my opinion - because they don't do the same thing, especially when
 you think in terms in lower-level.

 ndarray.flat returns an iterator; ndarray.flatten() returns a copy;
 ndarray.ravel() only makes copies when necessary; ndarray.reshape() is
 more general purpose, even though you can use it to flatten arrays.


Out of the four ways, I find x.flat the most confusing.  Unfortunately, it
is also the most obvious name for the operation  (and ravel is the least,
but it is the fault of the English language where to ravel means to
unravel.).  What x.flat returns, is not really an iterator.  It is some
hybrid between a view and an iterator.  Consider this:

>>> x = numpy.arange(6).reshape((2,3))
>>> i = x.flat
>>> i.next()
0
>>> i.next()
1
>>> i.next()
2

So far no surprises, but what should i[0] return now?  If you think of i as
a C pointer you would expect 3, but

>>> i[0]
0

What is worse, the above resets the index and now

>>> i.index
0

OK, so now I expect that i[5] will reset the index to 5, but no

>>> i[5]
5
>>> i.index
0

When would you prefer to use x.flat over x.ravel()?

Is x.reshape(-1) always equivalent to x.ravel()?

What is x.flat.copy()?  Is it the same as x.flatten()?  Why does flatiter
even have a .copy() method?  Isn't  i.copy() the same as i.base.flatten(),
only slower?

And  with all these methods, I still don't have the one that would flatten
any array including a nested array like this:

>>> x = np.array([np.arange(2), np.arange(3), np.arange(4)])

I need yet another function here, for example

>>> np.hstack(x)
array([0, 1, 0, 1, 2, 0, 1, 2, 3])

and what if I want to flatten a higher dimensional nested array, say

>>> y = np.array([x[:1],x[:2],x])

can I do better than

>>> np.hstack(np.hstack(y))
array([0, 1, 0, 1, 0, 1, 2, 0, 1, 0, 1, 2, 0, 1, 2, 3])

?
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Why ndarray provides four ways to flatten?

2014-10-28 Thread Alexander Belopolsky
On Tue, Oct 28, 2014 at 1:42 PM, Stephan Hoyer sho...@gmail.com wrote:

 .flat lets you iterate over all elements of a N-dimensional array as if it
 was 1D, without ever needing to make a copy of the array. In contrast,
 ravel() and reshape(-1) cannot always avoid a copy, because they need to
 return another ndarray.


In some cases ravel() returns a copy where a view can be easily constructed.
For example,

>>> x = np.arange(10)
>>> y = x[::2]
>>> y.ravel().flags['OWNDATA']
True

Interestingly, in the same case reshape(-1) returns a view:

>>> y.reshape(-1).flags['OWNDATA']
False

(This suggests at least a documentation bug - numpy.ravel documentation
says that it is equivalent to reshape(-1).)

It is only in situations like this

>>> a = np.arange(16).reshape((4,4))
>>> a[1::2,1::2].ravel()
array([ 5,  7, 13, 15])

where flat view cannot be an ndarray, but .flat can still return something
that is at least duck-typing compatible with ndarray (if not an ndarray
subclass) and behaves as a view into original data.

My preferred design would be for x.flat to return a flat view into x.  This
would be consistent with the way .T and .real attributes are defined and
close enough to .imag.  An obvious way to obtain a flat copy would be
x.flat.copy().  Once we have this, ravel() and flatten() can be deprecated
and reshape(-1) discouraged.

I think this would be backward compatible except for rather questionable
situations like this:

>>> i = x.flat
>>> list(i)
[0, 1, 2, 3, 4, 0, 6, 7, 8, 9]
>>> list(i)
[]
>>> np.array(i)
array([0, 1, 2, 3, 4, 0, 6, 7, 8, 9])
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Why ndarray provides four ways to flatten?

2014-10-28 Thread Alexander Belopolsky
On Tue, Oct 28, 2014 at 9:23 PM, Nathaniel Smith n...@pobox.com wrote:

 OTOH trying to make .flat into a full duck-compatible ndarray-like
 type is a non-starter; it would take a tremendous amount of work for
 no clear gain.


I don't think so - I think all the heavy lifting is already done in
flatiter.  The missing parts are mostly trivial things like .size or .shape
or can be fudged by coercing to true ndarray using existing
flatiter.__array__ method.

It would be more interesting however if we could always return a true
ndarray view.  How is ndarray.diagonal() view implemented in 1.9?  Can
something similar be used to create a flat view?
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Why ndarray provides four ways to flatten?

2014-10-27 Thread Alexander Belopolsky
Given an n-dim array x, I can do

1. x.flat
2. x.flatten()
3. x.ravel()
4. x.reshape(-1)

Each of these expressions returns a flat version of x with some
variations.  Why does NumPy implement four different ways to do essentially
the same thing?
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Short-hand array creation in `numpy.mat` style

2014-07-15 Thread Alexander Belopolsky
Also, the use of strings will confuse most syntax highlighters.  Compare
the two options in this screenshot:

[image: Inline image 2]
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] String type again.

2014-07-14 Thread Alexander Belopolsky
On Sat, Jul 12, 2014 at 8:02 PM, Nathaniel Smith n...@pobox.com wrote:

 I feel like for most purposes, what we *really* want is a variable length
 string dtype (I.e., where each element can be a different length.).



I've been toying with the idea of creating an array type for interned
strings.  In many applications dealing with large arrays of variable size
strings, the strings come from a relatively short set of names.  Arrays of
interned strings can be manipulated very efficiently because in many
respects they are just like arrays of integers.
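
A hedged sketch of the interning idea using only existing functions
(np.unique with return_inverse), rather than an actual new array type:

import numpy as np

names = np.array(['AAPL', 'MSFT', 'AAPL', 'IBM', 'MSFT'])
table, codes = np.unique(names, return_inverse=True)
# codes is a small integer array; table[codes] reconstructs the strings,
# and comparisons or group-bys can work on codes as if they were integers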
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Short-hand array creation in `numpy.mat` style

2014-07-14 Thread Alexander Belopolsky
On Fri, Jul 11, 2014 at 4:30 PM, Daniel da Silva var.mail.dan...@gmail.com
wrote:

 If leading a presentation on scientific computing in Python to beginners,
 which would look better on a bullet in a slide?

-

np.build('.2 .7 .1; .3 .5 .2; .1 .1 .9'))

-

np.array([[.2, .7, .1], [.3, .5, .2], [.1, .1, .9]])


 np.array([[.2, .7, .1],
  [.3, .5, .2],
  [.1, .1, .9]])
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Short-hand array creation in `numpy.mat` style

2014-07-06 Thread Alexander Belopolsky
On Sun, Jul 6, 2014 at 6:06 PM, Eric Firing efir...@hawaii.edu wrote:

  (I'm not entirely convinced
 np.arr() is a good idea at all; but if it is, it must be kept simple.)


If you are going to introduce this functionality, please don't call it
np.arr.

Right now, np.a<tab> presents you with a whopping 53 completion choices.
Adding 'r' narrows that to 21, but np.arr<tab> completes to np.array
right away.  Please don't introduce another bump in this road.

Namespaces are one honking great idea -- let's do more of those!

I would suggest calling it something like np.array_simple or
np.array_from_string, but the best choice IMO, would be
np.ndarray.from_string (a static constructor method).
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Short-hand array creation in `numpy.mat` style

2014-07-06 Thread Alexander Belopolsky
On Sun, Jul 6, 2014 at 10:59 PM, Eric Firing efir...@hawaii.edu wrote:

  I would suggest calling it something like np.array_simple or
  np.array_from_string, but the best choice IMO, would be
  np.ndarray.from_string (a static constructor method).


 I think the problem is that this defeats the point: minimizing typing
 when doing an off-the-cuff demo or test.


You can always put np.arr = np.ndarray.from_string or even arr =
np.ndarray.from_string right next to the line where you define np.  (Which
makes me wonder if something like this belongs to ipython magic.)
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] percentile function for masked array?

2014-06-02 Thread Alexander Belopolsky

 It seems that there is not a percentile function for masked array in numpy
 or scipy?


Percentile is not  the only function missing in ma.   See for example

https://github.com/numpy/numpy/issues/4356
https://github.com/numpy/numpy/issues/4355

It seems to me that ma has been treated on par with np.matrix in recent
years while several attempts were made to replace it with something better.

I don't think any better alternative has materialized, so it is probably
time to declare that ma is the supported mechanism for dealing with missing
values in numpy and make an effort to keep the np and ma interfaces in
sync.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] percentile function for masked array?

2014-06-02 Thread Alexander Belopolsky
On Mon, Jun 2, 2014 at 11:48 AM, Charles R Harris charlesr.har...@gmail.com
 wrote:

 Masked arrays have no maintainer, and haven't for several years, nor do I
 see anyone coming along to take it up.


I was effectively a maintainer of ma after the Numeric -> numpy transition
and before it was rewritten to use inheritance from ndarray.

I cannot commit to implementing new features myself, but I will review the
patches that come along.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] percentile function for masked array?

2014-06-02 Thread Alexander Belopolsky
On Mon, Jun 2, 2014 at 12:25 PM, Charles R Harris charlesr.har...@gmail.com
 wrote:

 I think the masked array code is also due a cleanup/rationalization. Any
 comments you have along that line are welcome.


Here are a few thoughts:

1. Please avoid another major rewrite.
2. Stop pretending that instances of ma.MaskedArray and ndarray have an
is-a relationship.  Use of inheritance should become an implementation detail
and any method that is not explicitly overridden should raise an exception.
3. Add a mechanism to keep the numpy and numpy.ma APIs in sync.  At a minimum,
add a test comparing public functions and methods, and for pure Python
functions compare signatures (a minimal sketch of such a test follows this list).
4. Consider deprecating the ma.masked scalar.
5. Support duck-typing in MaskedArray constructors.  If the supplied data
object has a mask attribute, it should be used as the mask.  This will allow
interoperability with alternative missing-value implementations.  (ndarray
may itself grow a mask attribute one day which will be equivalent to isnan.
 Bit views, anyone?)
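
A minimal sketch of the parity test suggested in point 3 (an assumption of
what such a test could look like, not existing NumPy code):

import numpy as np
import numpy.ma as ma

def public_callables(mod):
    return {name for name in dir(mod)
            if not name.startswith('_') and callable(getattr(mod, name))}

missing = sorted(public_callables(np) - public_callables(ma))
print(len(missing), missing[:5])   # names present in numpy but absent from numpy.ma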
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Proposed new function for joining arrays: np.interleave

2014-04-26 Thread Alexander Belopolsky
On Mon, Apr 7, 2014 at 11:12 AM, Björn Dahlgren bjo...@gmail.com wrote:

 I think the code needed for the general n dimensional case with m number
 of arrays
 is non-trivial enough for it to be useful to provide such a function in
 numpy


As of version 1.8.1, I count 571 public names in the numpy namespace:

>>> len([x for x in dir(numpy) if not x.startswith('_')])
571

Rather than adding a 572nd name, we should investigate why it is non-trivial
to express this using existing functions.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Universal functions and introspection

2014-04-16 Thread Alexander Belopolsky
On Wed, Apr 16, 2014 at 3:50 PM, Fernando Perez fperez@gmail.com wrote:

 Does argument clinic work with python2 or is it python3 only?

 http://legacy.python.org/dev/peps/pep-0436/

It is python3 only, but it should not be hard to adapt it to generate 2/3
compatible code.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] boolean operations on boolean arrays

2014-04-12 Thread Alexander Belopolsky
On Sat, Apr 12, 2014 at 10:02 AM, Alan G Isaac alan.is...@gmail.com wrote:

 Are there any considerations besides convenience in choosing
 between:

 a&b      a*b       logical_and(a,b)
 a|b      a+b       logical_or(a,b)
 ~a       True-a    logical_not(a)


Boolean minus (-) is being deprecated:

https://github.com/numpy/numpy/pull/4105

The choice between &/| and */+ is best dictated by what is more natural in
your problem domain and how your functions should treat non-boolean types.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] index partition

2014-04-12 Thread Alexander Belopolsky
On Sat, Apr 12, 2014 at 4:47 PM, Alan G Isaac alan.is...@gmail.com wrote:

 As a simple example, suppose for array `a` I want
 np.flatnonzero(a>0) and np.flatnonzero(a<=0).
 Can I get them both in one go?


I don't think you can do better than

x = a > 0
p, q = np.flatnonzero(x), np.flatnonzero(~x)
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] index partition

2014-04-12 Thread Alexander Belopolsky
On Sat, Apr 12, 2014 at 5:03 PM, Sebastian Berg
sebast...@sipsolutions.net wrote:

  As a simple example, suppose for array `a` I want
  np.flatnonzero(a>0) and np.flatnonzero(a<=0).
  Can I get them both in one go?
 

 Might be missing something, but I don't think there is a way to do it in
 one go. The result is irregularly structured and there are few functions
 like nonzero which give something like that.


The set routines [1] are in this category and may help you deal with
partitions, but I would recommend using boolean arrays instead. If you
commonly deal with both a subset and a complement, set representation does
not give you a memory advantage over a boolean mask.

[1] http://docs.scipy.org/doc/numpy/reference/routines.set.html
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-04-11 Thread Alexander Belopolsky
On Fri, Apr 11, 2014 at 7:58 PM, Stephan Hoyer sho...@gmail.com wrote:

 print datetime(2010, 1, 1) > np.datetime64('2011-01-01') # raises exception


This is somewhat consistent with

>>> from datetime import *
>>> datetime(2010, 1, 1) > date(2010, 1, 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't compare datetime.datetime to datetime.date

but I would expect date(2010, 1, 1) > np.datetime64('2011-01-01') to return
False.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] PEP 465 has been accepted / volunteers needed

2014-04-08 Thread Alexander Belopolsky
Benjamin Peterson has posted a complete patch implementing the @ operator
for Python 3.5:

http://bugs.python.org/file34762/mat-mult5.patch

Now we should implement matmul in numpy:

https://github.com/numpy/numpy/issues/4464
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] PEP 465 has been accepted / volunteers needed

2014-04-07 Thread Alexander Belopolsky
I took the liberty of reposting this as an ENH issue on the Python bug
tracker.

http://bugs.python.org/issue21176


On Mon, Apr 7, 2014 at 7:23 PM, Nathaniel Smith n...@pobox.com wrote:

 Guido just formally accepted PEP 465:
   https://mail.python.org/pipermail/python-dev/2014-April/133819.html
   http://legacy.python.org/dev/peps/pep-0465/#implementation-details

 Yay.

 The next step is to implement it, in CPython and in numpy. I have time
 to advise on this, but not to do it myself, so, any volunteers? Ever
 wanted to hack on the interpreter itself, with BDFL guarantee your
 patch will be accepted (if correct)?

 The todo list for CPython is here:
 http://legacy.python.org/dev/peps/pep-0465/#implementation-details
 There's one open question which is where the type slots should be
 added. I'd just add them to PyNumberMethods and then if someone
 objects during patch review it can be changed.

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-04-01 Thread Alexander Belopolsky
On Tue, Apr 1, 2014 at 12:10 PM, Chris Barker chris.bar...@noaa.gov wrote:

 For a naive object, the %z and %Z format codes are replaced by empty
 strings.

  though I'm not entirely sure what that means -- probably only for writing.


That's right:

>>> from datetime import *
>>> datetime.now().strftime('%z')
''
>>> datetime.now(timezone.utc).strftime('%z')
'+0000'
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-04-01 Thread Alexander Belopolsky
On Tue, Apr 1, 2014 at 12:10 PM, Chris Barker chris.bar...@noaa.gov wrote:

 It seems this committee of two has come to a consensus on naive -- and
 you're probably right, raise an exception if there is a time zone specifier.


Count me as +1 on naive, but consider converting garbage (including strings
with trailing Z) to NaT.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-04-01 Thread Alexander Belopolsky
On Tue, Apr 1, 2014 at 1:12 PM, Nathaniel Smith n...@pobox.com wrote:

 In [6]: a[0] = "garbage"
 ValueError: could not convert string to float: garbage

 (Cf, Errors should never pass silently.) Any reason why datetime64
 should be different?


datetime64 is different because it has NaT support from the start.  NaN
support for floats seems to be an afterthought if not an accident of
implementation.

And it looks like some errors do pass silently:

>>> a[0] = "1"
# not a TypeError

But I withdraw my suggestion.  The closer datetime64 behavior is to numeric
types the better.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] why sort does not accept a key?

2014-03-24 Thread Alexander Belopolsky
On Mon, Mar 24, 2014 at 11:32 AM, Alan G Isaac alan.is...@gmail.com wrote:

 I'm wondering if `sort` intentionally does not accept a `key`
 or if this is just a missing feature?


It would be very inefficient to call a key function on every element
compared during the sort.   See np.argsort and np.lexsort for faster
alternatives.
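
For example (a sketch, not from the original message): compute the key once
as an array and sort through argsort instead of passing a key function.

import numpy as np

a = np.array([3, -1, -7, 2])
key = np.abs(a)         # vectorized key, evaluated once per element
a[np.argsort(key)]      # array([-1,  2,  3, -7])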
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Resolving the associativity/precedence debate for @

2014-03-22 Thread Alexander Belopolsky
On Sat, Mar 22, 2014 at 10:35 PM, Sturla Molden sturla.mol...@gmail.com
wrote:

 On the other hand, this

 vec.T @ Mat @ Mat

 would not need parentheses for optimisation when the associativity is
left.



Nor does it require .T if vec is 1d.


 By the way, the * operator for np.matrix and Matlab matrices are left
 associative as well.


This is a very strong argument, IMO.  If we want to win over the hearts
of np.matrix users, we should not tell them - BTW - treat @ as you do **,
not *.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-21 Thread Alexander Belopolsky
On Fri, Mar 21, 2014 at 5:31 PM, Chris Barker chris.bar...@noaa.gov wrote:

 But this brings up a good point -- having time zone handling fully
 compatible ith datetime.datetime would have its advantages.


I don't know if everyone is aware of this, but Python stdlib has support
for fixed-offset timezones since version 3.2:

http://docs.python.org/3.2/whatsnew/3.2.html#datetime-and-time

It took many years to bring in that feature, but now we can benefit from
not having to reinvent the wheel.
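
For example (a small illustration of the stdlib feature, not from the
original message):

from datetime import datetime, timedelta, timezone

est = timezone(timedelta(hours=-5), 'EST')   # fixed-offset timezone from the stdlib
datetime(2014, 3, 21, 12, 0, tzinfo=est).isoformat()   # '2014-03-21T12:00:00-05:00'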

I will try to write up some specific proposal this weekend.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] [help needed] associativity and precedence of '@'

2014-03-20 Thread Alexander Belopolsky
On Thu, Mar 20, 2014 at 9:10 AM, Andrew Dalke da...@dalkescientific.com wrote:

 In DSL space, that means @ could be used as the inverse of ** by those
 who want to discard any ties to its use in numerics. Considering it
 now, I agree this would indeed open up some design space.

 I don't see anything disastrously wrong for that in matrix/vector use,
 though my intuition on this is very limited. I believe this gives
 results like the strong right option, no?


It is not uncommon to have v**2 @ u in numerical code for a weighted sum of
u with weights from v-squared.  With @ in the same precedence group as **,
this will be interpreted as v ** (2 @ u) and will most likely be an error.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-20 Thread Alexander Belopolsky
On Thu, Mar 20, 2014 at 7:16 AM, Nathaniel Smith n...@pobox.com wrote:

 Your NEP suggests making all datetime64s be in UTC, and treating string
 representations from unknown timezones as UTC.


I recall that it was at some point suggested that epoch be part of dtype.
 I was not able to find the reasons for a rejection, but it would make
perfect sense to keep timezone offset in dtype and treat it effectively as
an alternative epoch.

The way I like to think about datetime is that YYYY-MM-DD hh:mm:ss.nnn is
just a fancy way to represent numbers which is more convoluted than decimal
notation, but conceptually not so different.  So different units, epochs or
timezones are just different ways to convert an abstract notion of a point
in time to a specific series of bits inside an array.  This is what dtype
is for - a description of how abstract numbers are stored in memory.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-20 Thread Alexander Belopolsky
On Thu, Mar 20, 2014 at 9:39 AM, Sankarshan Mudkavi
smudk...@uwaterloo.ca wrote:

 A naive datetime64 would be unable to handle this, and would either have
 to ignore the tzinfo or would have to throw up an exception.


This is not true.  Python's own datetime has no problem handling this:

>>> t1 = datetime(2000,1,1,12)
>>> t2 = datetime(2000,1,1,12,tzinfo=timezone.utc)
>>> print(t1)
2000-01-01 12:00:00
>>> print(t2)
2000-01-01 12:00:00+00:00
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-20 Thread Alexander Belopolsky
On Thu, Mar 20, 2014 at 7:27 PM, Chris Barker chris.bar...@noaa.gov wrote:

 On Thu, Mar 20, 2014 at 4:16 AM, Nathaniel Smith n...@pobox.com wrote:

 Your NEP suggests making all datetime64s be in UTC, and treating string
 representations from unknown timezones as UTC. How does this differ from,
 and why is it superior to, making all datetime64s be naive?

 This came up in the conversation before -- I think the fact is that a
 'naive' datetime and a UTC datetime are almost exactly the same. In essence
 you can use a UTC datetime and pretend it's naive in almost all cases.

 The difference comes down to I/O.


It is more than I/O.  It is also about interoperability with Python's
datetime module.

Here is the behavior that I don't like in the current implementation:

>>> d = array(['2001-01-01T12:00'], dtype='M8[ms]')
>>> d.item(0)
datetime.datetime(2001, 1, 1, 17, 0)

If I understand NEP correctly, the proposal is to make d.item(0) return

>>> d.item(0).replace(tzinfo=timezone.utc)
datetime.datetime(2001, 1, 1, 12, 0, tzinfo=datetime.timezone.utc)

instead.  But this is not what I would expect: I want

>>> d.item(0)
datetime.datetime(2001, 1, 1, 12, 0)

When I work with naive datetime objects I don't want to be exposed to
timezones at all.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] [help needed] associativity and precedence of '@'

2014-03-17 Thread Alexander Belopolsky
On Mon, Mar 17, 2014 at 11:48 AM, Nathaniel Smith n...@pobox.com wrote:

  One more question that I think should be answered by the PEP and may
  influence the associativity decision is what happens if in an A @ B @ C
  expression, each operand has its own type that defines __matmul__ and
  __rmatmul__?  For example, A can be an ndarray, B a sympy expression and
 C a
  pyoperator.

 The general rule in Python is that in a binary operation A # B, then
 first we try A.__special__, and if that doesn't exist or it returns
 NotImplemented, then we try B.__rspecial__. (The exception is that if
 B.__class__ is a proper subclass of A.__class__, then we do it in the
 reverse order.)


This is the simple case.  My question was: what happens if in an A @ B @ C
expression, each operand has its own type that defines __matmul__ and
__rmatmul__?

Are we going to recommend that other projects adopt
numpy's __array_priority__?

In mixed-type expressions, do you expect A @ B @ C to have type of A, B, or
C?

Does __matmul__ first then __rmatmul__ rule makes sense if @ becomes
right-associative or should the order be reversed?
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] [help needed] associativity and precedence of '@'

2014-03-17 Thread Alexander Belopolsky
On Mon, Mar 17, 2014 at 12:13 PM, Nathaniel Smith n...@pobox.com wrote:

 In practice all
 well-behaved classes have to make sure that they implement __special__
 methods in such a way that all the different variations work, no
 matter which class ends up actually handling the operation.


Well-behaved classes are hard to come by in practice.  The @ operator may
fix the situation with np.matrix, but take a look at MaskedArray with its
40-line __array_wrap__ and no end of bugs.

Requiring superclass __method__ to handle creation of subclass results
correctly is turning Liskov principle on its head.  With enough clever
tricks and tight control over the full class hierarchy you can make it work
in some cases, but it is not a good design.

I am afraid that making @ special among other binary operators that
implement mathematically associative operations will create a lot of
confusion.  (The pow operator is special because the corresponding
mathematical operation is non-associative.)

Imagine teaching someone that a % b % c = (a % b) % c, but a @ b @ c = a @
(b @ c).  What are the chances that they will correctly figure out what a
// b // c means after this?
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] [help needed] associativity and precedence of '@'

2014-03-17 Thread Alexander Belopolsky
On Mon, Mar 17, 2014 at 2:55 PM, josef.p...@gmail.com wrote:

 I'm again in favor of left, because it's the simplest to understand
 A.dot(B).dot(C)


+1

Note that for many years to come the best option for repeated matrix
product will be A.dot(B).dot(C) ...

People who convert their dot(dot(dot('s to more readable method call syntax
now should not be forced to change the order or add parentheses when they
switch to @.

(Full disclosure: I am one of those people having recently converted a
large Numeric-based project to NumPy.)
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] [help needed] associativity and precedence of '@'

2014-03-17 Thread Alexander Belopolsky
On Mon, Mar 17, 2014 at 6:33 PM, Christophe Bal projet...@gmail.com wrote:


 Defining *-product to have stronger priority than the @-product, and this
 last having stronger priority than +, will make the changes in the grammar
 easier.



The easiest is to give @ the same precedence as *.  This will only require
changing

term: factor (('*'|'/'|'%'|'//') factor)*

to

term: factor (('*'|'/'|'%'|'//'|'@') factor)*

Anything else will require an extra rule, but in any case implementation is
trivial.

I don't think we need to worry about implementation details at this point.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] [help needed] associativity and precedence of '@'

2014-03-17 Thread Alexander Belopolsky
On Mon, Mar 17, 2014 at 8:54 PM, Nathaniel Smith n...@pobox.com wrote:

 Currently Python has 3 different kinds of ops: left-associative (most
 of them), right-associative (**), and chaining. Chaining is used for
 comparison ops. Example:

a < b < c

 gets parsed to something like

do_comparison(args=[a, b, c], ops=[lt, lt])


The actual parse tree is more like Compare(a, [lt, lt], [b, c]), with the
first argument playing a distinct role:

>>> ast.dump(ast.parse('a<b<c'), annotate_fields=False)
Module([Expr(Compare(Name('a', Load()), [Lt(), Lt()], [Name('b', Load()),
Name('c', Load())]))])

Your idea is very interesting and, IMO, worth considering independently from
the @ operator.  I always wanted a vector "between" operator to be
available in numpy as low < x < high.

The only problem I see here is with mixed types, but we can follow the pow
precedent [1]: "Note that ternary pow() will not try calling __rpow__()
(the coercion rules would become too complicated)."


[1]
http://docs.python.org/3/reference/datamodel.html#emulating-numeric-types
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] [help needed] associativity and precedence of '@'

2014-03-15 Thread Alexander Belopolsky
On Fri, Mar 14, 2014 at 11:41 PM, Nathaniel Smith n...@pobox.com wrote:

 Here's the main blocker for adding a matrix multiply operator '@' to
 Python: we need to decide what we think its precedence and associativity
 should be.


I am not ready to form my own opinion, but I hope the following will help
shaping the discussion.

Currently [1], Python operator precedence is

    +, -                          Addition and subtraction
    *, /, //, %                   Multiplication, division, remainder [5]
    +x, -x, ~x                    Positive, negative, bitwise NOT
    **                            Exponentiation [6]
    x[index], x[index:index],
    x(arguments...), x.attribute  Subscription, slicing, call, attribute reference

    [5] http://docs.python.org/3/reference/expressions.html#id20
    [6] http://docs.python.org/3/reference/expressions.html#id21

We need to decide whether @ belongs to one of the existing rows or deserves
one of its own.

The associativity debate is one of those debates [2] where there is no
right answer.  Guido has very wisely left it for the numeric community to
decide.  I would start with surveying the prior art of using right
associativity and the reasons it was chosen and see if those reasons apply.
 (An example of a choice made for the wrong reasons is our decimal system.  We
write our numbers backwards - from high to low place value - only because
we took them from people who write text from right to left.  As a result,
computer parsers have to skip to the last digit or count the number of digits
before they can start evaluating the number.)

Here is the start:

1. APL uses right to left associativity for all operators and all operators
have the same precedence.
2. Exponentiation operator is right associative in most languages with
MATLAB being a notable exception.


[1] http://docs.python.org/3/reference/expressions.html#evaluation-order
[2] http://en.wikipedia.org/wiki/Lilliput_and_Blefuscu
[3] http://www.tcl.tk/cgi-bin/tct/tip/274.html
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] [help needed] associativity and precedence of '@'

2014-03-15 Thread Alexander Belopolsky
On Sat, Mar 15, 2014 at 2:25 PM, Alexander Belopolsky ndar...@mac.com wrote:

 On Fri, Mar 14, 2014 at 11:41 PM, Nathaniel Smith n...@pobox.com wrote:

 Here's the main blocker for adding a matrix multiply operator '@' to
 Python: we need to decide what we think its precedence and associativity
 should be.


 I am not ready to form my own opinion, but I hope the following will help
 shaping the discussion.


One more question that I think should be answered by the PEP, and that may
influence the associativity decision, is what happens if, in an A @ B @ C
expression, each operand has its own type that defines __matmul__ and
__rmatmul__.  For example, A can be an ndarray, B a sympy expression and C
a pyoperator.
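
For concreteness, here is a minimal sketch of how the dispatch would go under
left associativity, i.e. (A @ B) @ C.  The classes Arr, Sym and Op are made-up
stand-ins for the real ndarray/sympy/pyoperator types, not their actual
implementations:

class Arr:                                    # stands in for an ndarray
    def __matmul__(self, other):
        if isinstance(other, Op):             # pretend Arr does not know about Op
            return NotImplemented
        print('Arr.__matmul__ handles', type(other).__name__)
        return Arr()

class Sym:                                    # stands in for a sympy expression
    def __rmatmul__(self, other):
        print('Sym.__rmatmul__ handles', type(other).__name__)
        return Sym()

class Op:                                     # stands in for a pyoperator
    def __rmatmul__(self, other):
        print('Op.__rmatmul__ handles', type(other).__name__)
        return Op()

Arr() @ Sym() @ Op()
# prints:
#   Arr.__matmul__ handles Sym    (Arr @ Sym is handled by the left operand)
#   Op.__rmatmul__ handles Arr    (Arr.__matmul__ returned NotImplemented,
#                                  so Python falls back to Op.__rmatmul__)

Note that Sym.__rmatmul__ never runs: with left associativity the leftmost
type gets the first chance at every step unless it returns NotImplemented.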
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] [help needed] associativity and precedence of '@'

2014-03-15 Thread Alexander Belopolsky
On Sat, Mar 15, 2014 at 3:29 PM, Nathaniel Smith n...@pobox.com wrote:

  It would be nice if u@v@None, or some such, would evaluate as a dyad.
 Or else we will still need the concept of row and column 1-D matrices. I
 still think v.T should set a flag so that one can distinguish u@v.T (dyad)
 from u.T@v (inner product), where 1-D arrays are normally treated as column
 vectors.

 This sounds important but I have no idea what any of it means :-) (What's
 a dyadic matrix?) Can you elaborate?


I assume dyadic means 2d.

This discussion gave me an idea that is only tangentially relevant to the
discussion at hand.  It looks like numpy operators commonly need to make a
choice whether to treat an Nd array as a unit (atom) or as a list to
broadcast itself over.

APL-derived languages solve this problem by using operator modifiers.
 Applied to our case, given a dot-product operator @, each[@] operator
works on  2d arrays by dotting them pair-wise and returning a 1d array.
 Similarly, eachleft[@] would operate on 2d, 1d operands by broadcasting
itself over the left operand (incidentally reproducing the mat @ vec
behavior) and eachright[@] would treat its left operand atomically and
broadcast over the right operand.

My idea is inspired by Guido's suggestion to use a facade.  We can define an
ndarray.each(axes=(0,)) method that would return a light-weight proxy object
so that

a each[@] b  is spelled a.each() @ b.each()
a eachleft[@] b is spelled a.each() @ b
a eachright[@] b is spelled a @ b.each()
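
A rough sketch of such a proxy, assuming the simplest case (broadcasting @
over the leading axis) and a recent NumPy; Each and the module-level each()
helper are made-up names, not an actual NumPy API:

import numpy as np

class Each:
    # Make ndarray @ Each defer to Each.__rmatmul__ (NEP 13 mechanism).
    __array_ufunc__ = None

    def __init__(self, arr):
        self.arr = np.asarray(arr)

    def __matmul__(self, other):
        if isinstance(other, Each):   # a.each() @ b.each(): pair-wise dots
            return np.array([x.dot(y) for x, y in zip(self.arr, other.arr)])
        return np.array([x.dot(other) for x in self.arr])       # "eachleft"

    def __rmatmul__(self, other):     # a @ b.each(): "eachright"
        return np.array([np.dot(other, y) for y in self.arr])

def each(a):
    return Each(a)

a = np.array([[1., 2.], [3., 4.]])
b = np.array([[5., 6.], [7., 8.]])
v = np.array([1., 1.])

print(each(a) @ each(b))   # pair-wise row dots       -> [17. 53.]
print(each(a) @ v)         # broadcast over rows of a -> [ 3.  7.]
print(v @ each(b))         # v dotted with each row   -> [11. 15.]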
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] [help needed] associativity and precedence of '@'

2014-03-15 Thread Alexander Belopolsky
On Sat, Mar 15, 2014 at 4:00 PM, Charles R Harris charlesr.har...@gmail.com
 wrote:

 These days they are usually written as v*w.T, i.e., the outer product of
 two vectors and are a fairly common occurrence in matrix expressions. For
 instance, covariance matrices  are defined as E(v * v.T)


With the current numpy, we can do

>>> x = arange(1, 5)
>>> x[:,None].dot(x[None,:])
array([[ 1,  2,  3,  4],
       [ 2,  4,  6,  8],
       [ 3,  6,  9, 12],
       [ 4,  8, 12, 16]])

I assume once @ becomes available, we will have

>>> x[:,None] @ x[None,:]
array([[ 1,  2,  3,  4],
       [ 2,  4,  6,  8],
       [ 3,  6,  9, 12],
       [ 4,  8, 12, 16]])
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] ndarray is not a sequence

2014-03-01 Thread Alexander Belopolsky
On Fri, Feb 28, 2014 at 10:34 AM, Chris Barker - NOAA Federal 
chris.bar...@noaa.gov wrote:


 Whatever happened to duck typing?


http://legacy.python.org/dev/peps/pep-3119/#abcs-vs-duck-typing
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] cPickle.loads and Numeric

2014-02-25 Thread Alexander Belopolsky
On Tue, Feb 25, 2014 at 11:29 AM, Benjamin Root ben.r...@ou.edu wrote:

 I seem to recall reading somewhere that pickles are not intended to be
 long-term archives as there is no guarantee that a pickle made in one
 version of python would work in another version, much less between
 different versions of the same (or similar) packages.


That's not true about the Python core and stdlib.  Python developers strive
to maintain backward compatibility, and any instance of a newer Python
failing to read older pickles would be considered a bug.  This is even true
across the 2.x / 3.x line.

Your mileage with 3rd party packages, especially 10+ year old ones, may vary.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] except expression discussion on python-ideas

2014-02-18 Thread Alexander Belopolsky
I would like to invite numpy community to weigh in on the idea that is
getting momentum at

https://mail.python.org/pipermail/python-ideas/2014-February/025437.html

The main motivation is to provide a syntactic alternative to the
proliferation of default-value options, so that
x = getattr(u, 'answer', 42)

can be written as

x = u.answer except ... 42

For a dictionary d,

x = d.get('answer', 42)

can be written as

x = d['answer'] except ... 42

For a list L,

try:
x = L[i]
except IndexError:
x= 42

can be written as

x = L[i] except ... 42


The ellipsis in the above stands for the syntax being debated.

Effectively, Python is about to gain support for a new operator, and
operators are very precious for numpy.  So I think the numpy community has a
horse in that race.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bool value of dtype is False?

2014-02-14 Thread Alexander Belopolsky
On Fri, Feb 14, 2014 at 4:51 PM, Charles G. Waldman char...@crunch.io wrote:

 >>> d = numpy.dtype(int)
 >>> if d: print "OK"
 ... else: print "I'm surprised"
 ...
 I'm surprised
 ___


I think this is an artifact of regular dtypes having a length of zero:

>>> len(array(1.).dtype)
0

For record array dtypes you would get True:

>>> len(numpy.dtype([('x', int)]))
1
>>> bool(numpy.dtype([('x', int)]))
True
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] deprecate numpy.matrix

2014-02-10 Thread Alexander Belopolsky
On Sun, Feb 9, 2014 at 4:59 PM, alex argri...@ncsu.edu wrote:

 On the other hand, it really needs to be deprecated.


While numpy.matrix may have its problems, a NEP should list a better
rationale than the above to gain acceptance.

Personally, I decided not to use numpy.matrix in production code about 10
years ago and never looked back on that decision.  I have heard, however,
that some of the worst inheritance warts have been fixed over the years.  I
also resisted introducing inheritance in the implementation of masked arrays,
but I lost that argument.  For better or worse, inheritance from ndarray is
here to stay, and I would rather see numpy.matrix stay as a test-bed for
fixing inheritance issues than see it deprecated only to have the same
issues pop up in ma or elsewhere.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Inheriting from ndarray Was: deprecate numpy.matrix

2014-02-10 Thread Alexander Belopolsky
On Mon, Feb 10, 2014 at 11:31 AM, Nathaniel Smith n...@pobox.com wrote:

 And in the long run, I
 think the goal is to move people away from inheriting from np.ndarray.


This is music to my ears, but what is the future of numpy.ma?  I understand
that numpy.oldnumeric.ma (the older version written without inheritance)
has been deprecated and is slated for removal in 1.9.  I have also seen some
attempts to bring ma functionality into the core ndarray object, but those
have not been successful as far as I can tell.

In general, what is the future of inheriting from np.ndarray?
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Fast decrementation of indices

2014-02-02 Thread Alexander Belopolsky
On Sun, Feb 2, 2014 at 2:58 PM, Mads Ipsen mads.ip...@gmail.com wrote:

 Since atoms [1,2,3,7,8] have been
 deleted, the remaining atoms with indices larger than the deleted atoms
 must be decremented.


Let

>>> x
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

and

>>> i = [1, 0, 2]

Create a matrix with the shape of x that has 1's at (k, i[k]) and zeros elsewhere:
>>> b = zeros_like(x)
>>> b.put(i + arange(3)*4 + 1, 1)  # there must be a simpler way

>>> x - b.cumsum(1)
array([[ 0,  1,  1,  2],
       [ 4,  4,  5,  6],
       [ 8,  9, 10, 10]])

seems to be the result you want.
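
For the original renumbering problem as quoted above (shift every surviving
atom index down by the number of smaller deleted indices), a possibly simpler
alternative is np.searchsorted.  A sketch with made-up example data:

>>> import numpy as np
>>> deleted = np.array([1, 2, 3, 7, 8])          # must be sorted
>>> remaining = np.array([0, 4, 5, 6, 9, 10])    # indices that survived
>>> remaining - np.searchsorted(deleted, remaining)
array([0, 1, 2, 3, 4, 5])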
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Does NumPy support indirect memory views?

2013-12-14 Thread Alexander Belopolsky
PEP 3118 [1] allows exposing multi-dimensional data that is organized as
array of pointers.  It appears, however that NumPy cannot consume such
memory views.

Looking at the _array_from_buffer_3118() function [2], I don't see any
attempt to process suboffsets.  The documentation [3] is also silent on this
issue.

What is the status of indirect memory views/buffers support in NumPy?


[1] http://www.python.org/dev/peps/pep-3118/
[2]
https://github.com/numpy/numpy/blob/4050ac73af79ae8cc513648ff02e9a22041501c4/numpy/core/src/multiarray/ctors.c#L1253
[3] http://docs.scipy.org/doc/numpy/reference/arrays.interface.html
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Does NumPy support indirect memory views?

2013-12-14 Thread Alexander Belopolsky
On Sat, Dec 14, 2013 at 2:59 PM, David Cournapeau courn...@gmail.com wrote:

 There is indeed no support in NumPy for this. Unfortunately, fixing this
 would be a significant amount of work, as buffer management is not really
 abstracted in NumPy ATM.


While providing a full support for indirect buffers as a storage for NumPy
ndarrays does look like a daunting task, I think some partial support can
be implemented rather easily.

When the ndarray-from-object constructor encounters an object that can only
expose its memory as an indirect buffer, the constructor can gather the data
into a contiguous buffer.

At the very least, _array_from_buffer_3118() should detect non-null
suboffsets and bail out with a meaningful message rather than expose
pointers as data.
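
At the Python level, the check is easy to express.  A minimal sketch of the
"bail out with a meaningful message" behavior, using a hypothetical
safe_asarray() wrapper rather than any existing NumPy function:

import numpy as np

def safe_asarray(obj):
    view = memoryview(obj)
    # PEP 3118: a suboffset >= 0 means the buffer stores pointers that must
    # be dereferenced, which cannot be used as array data directly.
    if view.suboffsets and any(off >= 0 for off in view.suboffsets):
        raise TypeError("indirect (suboffset) buffers are not supported; "
                        "copy the data into a contiguous buffer first")
    return np.asarray(view)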
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-06 Thread Alexander Belopolsky
On Fri, Dec 6, 2013 at 11:13 AM, Alan G Isaac alan.is...@gmail.com wrote:

 On 12/5/2013 11:14 PM, Alexander Belopolsky wrote:
  did you find minus to be as useful?


 It is also a correct usage.


Can you provide a reference?



 I think a good approach to this is to first realize that
 there were good reasons for the current behavior.


Maybe there were, in which case the current behavior should be documented
somewhere.

What is the rationale for this:

>>> -array(True) + array(True)
True

?

I am not aware of any algebraic system where unary minus denotes anything
other than additive inverse.

Having bools form a semiring under + and * is a fine (yet somewhat unusual)
choice, but once you've made that choice you lose subtraction, because
True + x = True no longer has a unique solution.
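
To make that concrete, a short session (output shown as in the NumPy versions
under discussion; + acts as logical OR and * as logical AND on booleans):

>>> from numpy import array
>>> array(True) + array(False)     # logical OR
True
>>> array(True) * array(False)     # logical AND
False
>>> array(False) + array(True)     # indistinguishable from True + True
True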
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-06 Thread Alexander Belopolsky
On Fri, Dec 6, 2013 at 1:46 PM, Alan G Isaac alan.is...@gmail.com wrote:

 On 12/6/2013 1:35 PM, josef.p...@gmail.com wrote:
  unary versus binary minus

 Oh right; I consider binary `-` broken for
 Boolean arrays. (Sorry Alexander; I did not
 see your entire issue.)


  I'd rather write ~ than unary - if that's what it is.

 I agree.  So I have no objection to elimination
 of the `-`.


It looks like we are close to reaching a consensus on the following points:

1. * is well-defined on boolean arrays and may be used in preference to & in
code that is designed to handle 1s and 0s of any dtype in addition to
booleans.

2. + is defined consistently with *, and the only issue is the absence of an
additive inverse.  This is not a problem as long as the presence of - does
not suggest otherwise.

3. binary and unary minus should be deprecated because their use in
expressions where variables can be either boolean or numeric would lead to
subtle bugs.  For example, -x*y would produce different results from -(x*y)
depending on whether x is boolean or not.  In all situations, ^ is preferable
to binary - and ~ is preferable to unary -.

4. changing boolean arithmetic to auto-promotion to int is precluded by a
significant use-case of boolean matrices.
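
A short session illustrating the subtle bug in point 3 (behavior of the
boolean operators as currently defined, before any deprecation):

>>> from numpy import bool_
>>> x, y = bool_(True), bool_(False)
>>> -x*y, -(x*y)        # the two spellings disagree for booleans ...
(False, True)
>>> x, y = 1, 0
>>> -x*y, -(x*y)        # ... but agree for numbers
(0, 0)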
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-05 Thread Alexander Belopolsky
On Thu, Dec 5, 2013 at 5:37 PM, Sebastian Berg
sebast...@sipsolutions.net wrote:

 For the moment I saw one annoying change in
 numpy, and that is `abs(x - y)` being used for allclose and working
 nicely currently.


It would probably be an improvement if allclose returned all(x == y) unless
one of the arguments is inexact.  At the moment allclose() fails for char
arrays:

>>> allclose('abc', 'abc')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "numpy/core/numeric.py", line 2114, in allclose
    xinf = isinf(x)
TypeError: Not implemented for this type
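
A minimal sketch of the suggested behavior, written as a wrapper with a
made-up name rather than a change to numpy.allclose itself:

import numpy as np

def allclose_or_equal(x, y, **kwds):
    x, y = np.asanyarray(x), np.asanyarray(y)
    if not (np.issubdtype(x.dtype, np.inexact) or
            np.issubdtype(y.dtype, np.inexact)):
        # Neither argument is inexact: compare exactly, which also works
        # for strings, booleans and integers.
        return bool(np.all(x == y))
    return np.allclose(x, y, **kwds)

print(allclose_or_equal('abc', 'abc'))        # True
print(allclose_or_equal(1.0, 1.0 + 1e-12))    # True, defers to allclose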
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-05 Thread Alexander Belopolsky
On Thu, Dec 5, 2013 at 5:37 PM, Sebastian Berg sebast...@sipsolutions.net
wrote:
 there was a discussion that for numpy booleans math operators +,-,* (and
 the unary -), while defined, are not very helpful.

It has been suggested on GitHub that there is an area where it is useful to
have linear algebra operations like matrix multiplication defined over a
semiring:

http://en.wikipedia.org/wiki/Logical_matrix
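
For example (np.dot already computes the boolean matrix product over the
(OR, AND) semiring; output shown as in the NumPy versions of the time):

>>> import numpy as np
>>> A = np.array([[True, False],
...               [True, True]])    # adjacency matrix of a tiny graph
>>> np.dot(A, A)                    # entry (i, j): OR over k of A[i,k] AND A[k,j]
array([[ True, False],
       [ True,  True]], dtype=bool)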

This still does not justify having unary or binary -, so I suggest that we
first discuss deprecation of those.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-05 Thread Alexander Belopolsky
On Thu, Dec 5, 2013 at 10:35 PM, josef.p...@gmail.com wrote:

 what about np.dot, np.dot(mask, x) which is the same as (mask *
 x).sum(0) ?


I am not sure which way your argument goes, but I don't think you would
find the following natural:

>>> x = array([True, True])
>>> dot(x,x)
True
>>> (x*x).sum()
2
>>> (x*x).sum(0)
2
>>> (x*x).sum(False)
2
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Deprecate boolean math operators?

2013-12-05 Thread Alexander Belopolsky
On Thu, Dec 5, 2013 at 11:05 PM, Alan G Isaac alan.is...@gmail.com wrote:

 For + and * (and thus `dot`), this will fix something that is not broken.


+ and * are not broken - just redundant given | and &.

What is really broken is -, both unary and binary:

>>> int(np.bool_(0) - np.bool_(1))
1
>>> int(-np.bool_(0))
1

 I'm sure I cannot be the only one who has for years taught students
 about Boolean matrices using NumPy

(I would not be so sure :-)

In that experience, did you find minus to be as useful?
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion