Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread Stéfan van der Walt
On 7 April 2016 at 11:17, Chris Barker  wrote:
> np.col_vector(arr)
>
> which would be a synonym for np.reshape(arr, (-1,1))
>
> would that make anyone happy?

I'm curious to see use cases where this doesn't solve the problem.

The most common operations that I run into:

colvec = lambda x: np.c_[x]

x = np.array([1, 2, 3])
A = np.arange(9).reshape((3, 3))


1) x @ x   (equivalent to x @ colvec(x))
2) A @ x  (equivalent to A @ colvec(x), apart from the shape)
3) x @ A
4) colvec(x) @ x  -- gives an error, but perhaps this should work and
be equivalent to np.dot(colvec(x), rowvec(x)) ?

If (4) were changed, 1D arrays would mostly* be interpreted as row
vectors, and there would be no need for a rowvec function.  And we
already do that kind of magic for (2).

Stéfan

* not for special case (1)
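
For reference, here is a minimal sketch of how these combinations behave on a recent NumPy with Python 3.5+ (for `@`). `colvec` is the lambda from the message; the variable names are only for illustration. Note that `x @ colvec(x)` returns a shape-(1,) array, while `colvec(x) @ x` is the combination that raises.

```python
import numpy as np

# Helper from the message above: np.c_ turns a 1-D input into an (n, 1) column.
colvec = lambda x: np.c_[x]

x = np.array([1, 2, 3])
A = np.arange(9).reshape((3, 3))

inner = x @ x                  # 1-D @ 1-D: inner product, 0-D result
inner_col = x @ colvec(x)      # (3,) @ (3, 1): x acts as a row, result shape (1,)
Ax = A @ x                     # (3, 3) @ (3,): x acts as a column, result shape (3,)
Ax_col = A @ colvec(x)         # same values, but result shape (3, 1)
xA = x @ A                     # (3,) @ (3, 3): x acts as a row, result shape (3,)

# (3, 1) @ (3,): the 1-D operand is treated as a column, so shapes mismatch.
try:
    colvec(x) @ x
    col_times_1d_raises = False
except ValueError:
    col_times_1d_raises = True

# The outer product needs an explicit row on the right-hand side.
outer = colvec(x) @ x[np.newaxis, :]   # (3, 1) @ (1, 3) -> (3, 3)
```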
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread Chris Barker
On Thu, Apr 7, 2016 at 11:31 AM,  wrote:

> maybe a warning?
>>
>
> AFAIR, there is a lot of code that works correctly with .T being a noop
> for 1D
> e.g. covariance matrix/inner product x.T dot y as mentioned before.
>

oh well, then no warning, either.


> write unit tests with non square 2d arrays and the exception / test error
> shows up fast.
>

Guido wrote a note to python-ideas about the conflict between the use cases
of "scripting" and "large system development" -- he urged both camps to
respect and listen to each other.

I think this is very much a "scripters" issue -- so no unit tests, etc.

For my part, I STILL have to kick myself once in a while for using square
arrays in testing/exploration!

-CHB


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread Nathaniel Smith
On Thu, Apr 7, 2016 at 10:00 AM, Ian Henriksen
 wrote:
>
> Here's another example that I've seen catch people now and again.
>
> A = np.random.rand(100, 100)
> b =  np.random.rand(10)
> A * b.T
>
> In this case the user pretty clearly meant to be broadcasting along the rows
> of A
> rather than along the columns, but the code fails silently. When an issue
> like this
> gets mixed into a larger series of broadcasting operations, the error
> becomes
> difficult to find.

I feel like this is an argument for named axes, and broadcasting rules
that respect those names, as in xarray? There's been some speculative
discussion about adding something along these lines to numpy, though
nothing that's even reached the half-baked stage.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread josef.pktd
On Thu, Apr 7, 2016 at 4:07 PM, Ian Henriksen <
insertinterestingnameh...@gmail.com> wrote:

> On Thu, Apr 7, 2016 at 1:53 PM  wrote:
>
>> On Thu, Apr 7, 2016 at 3:26 PM, Ian Henriksen <
>> insertinterestingnameh...@gmail.com> wrote:
>>
>>> On Thu, Apr 7, 2016 at 12:31 PM  wrote:
>>>
>>>> write unit tests with non square 2d arrays and the exception / test
>>>> error shows up fast.
>>>>
>>>> Josef
>>>>
>>>>
>>> Absolutely, but good programming practices don't totally obviate helpful
>>> error
>>> messages.
>>>
>>
>> The current behavior is perfectly well defined, and I don't want a lot of
>> warnings showing up because .T works suddenly only for ndim != 1.
>> I make lots of mistakes during programming. But shape mismatch are
>> usually very fast to catch.
>>
>> If you want safe programming, then force everyone to use only 2-D like in
>> matlab. It would have prevented me from making many mistakes.
>>
>> >>> np.array(1).T
>> array(1)
>>
>> another noop. Why doesn't it convert it to 2d?
>>
>> Josef
>>
>>
> I think we've misunderstood each other. Sorry if I was unclear. As I've
> understood the discussion thus far, "raising an error" refers to raising an
> error when a 1D array is used with the syntax a.T2 (for swapping the last
> two dimensions?). As far as whether or not a.T should raise an error for 1D
> arrays, that ship has definitely already sailed. I'm making the case that
> there's value in having an abbreviated syntax that helps prevent errors from
> accidentally using a 1D array, not that we should change the existing
> semantics.
>

Sorry, I misunderstood.

I'm not sure which case CHB initially meant.

Josef



>
> Cheers,
>
> -Ian
>


Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread Ian Henriksen
On Thu, Apr 7, 2016 at 1:53 PM  wrote:

> On Thu, Apr 7, 2016 at 3:26 PM, Ian Henriksen <
> insertinterestingnameh...@gmail.com> wrote:
>
>> On Thu, Apr 7, 2016 at 12:31 PM  wrote:
>>
>>> write unit tests with non square 2d arrays and the exception / test
>>> error shows up fast.
>>>
>>> Josef
>>>
>>>
>> Absolutely, but good programming practices don't totally obviate helpful
>> error
>> messages.
>>
>
> The current behavior is perfectly well defined, and I don't want a lot of
> warnings showing up because .T works suddenly only for ndim != 1.
> I make lots of mistakes during programming. But shape mismatch are usually
> very fast to catch.
>
> If you want safe programming, then force everyone to use only 2-D like in
> matlab. It would have prevented me from making many mistakes.
>
> >>> np.array(1).T
> array(1)
>
> another noop. Why doesn't it convert it to 2d?
>
> Josef
>
>
I think we've misunderstood each other. Sorry if I was unclear. As I've
understood the discussion thus far, "raising an error" refers to raising an
error when a 1D array is used with the syntax a.T2 (for swapping the last two
dimensions?). As far as whether or not a.T should raise an error for 1D
arrays, that ship has definitely already sailed. I'm making the case that
there's value in having an abbreviated syntax that helps prevent errors from
accidentally using a 1D array, not that we should change the existing
semantics.
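
To make the proposal concrete, here is a rough sketch of the erroring variant as a plain function. The name `t2` and the error message are hypothetical; no such attribute or function exists in NumPy, and this is only one reading of what `a.T2` might do.

```python
import numpy as np

def t2(a):
    """Hypothetical stand-in for the proposed ``a.T2``.

    Swaps the last two axes and refuses inputs with fewer than two
    dimensions instead of silently returning them unchanged.
    """
    a = np.asarray(a)
    if a.ndim < 2:
        raise ValueError(
            "t2 requires at least 2 dimensions, got %d" % a.ndim
        )
    return np.swapaxes(a, -1, -2)

# 2-D input behaves like .T; stacked matrices are transposed one by one.
matrix_shape = t2(np.zeros((2, 3))).shape
stack_shape = t2(np.zeros((4, 2, 3))).shape

# 1-D input raises instead of being a silent no-op.
try:
    t2(np.arange(3))
    one_d_raises = False
except ValueError:
    one_d_raises = True
```

The existing a.T is untouched in this sketch, which matches the point that its semantics would stay as they are.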

Cheers,

-Ian


Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread josef.pktd
On Thu, Apr 7, 2016 at 3:26 PM, Ian Henriksen <
insertinterestingnameh...@gmail.com> wrote:

> On Thu, Apr 7, 2016 at 12:31 PM  wrote:
>
>> write unit tests with non square 2d arrays and the exception / test error
>> shows up fast.
>>
>> Josef
>>
>>
> Absolutely, but good programming practices don't totally obviate helpful
> error
> messages.
>

The current behavior is perfectly well defined, and I don't want a lot of
warnings showing up because .T suddenly works only for ndim != 1.
I make lots of mistakes during programming. But shape mismatches are usually
very fast to catch.

If you want safe programming, then force everyone to use only 2-D like in
MATLAB. It would have prevented me from making many mistakes.

>>> np.array(1).T
array(1)

another noop. Why doesn't it convert it to 2d?

Josef
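
A short sketch of the no-op behaviour being discussed, plus one existing way to get a 2-D result when that is what is wanted. `np.atleast_2d` is a real NumPy function; the variable names are only illustrative.

```python
import numpy as np

scalar = np.array(1)               # 0-D
vec = np.arange(3)                 # 1-D
mat = np.arange(6).reshape(2, 3)   # 2-D

# .T reverses the axes, so with fewer than two axes there is nothing to do.
scalar_t_shape = scalar.T.shape    # ()
vec_t_shape = vec.T.shape          # (3,)
mat_t_shape = mat.T.shape          # (3, 2)

# Promoting explicitly: atleast_2d gives a (1, n) row, and its .T a column.
row = np.atleast_2d(vec)           # shape (1, 3)
col = np.atleast_2d(vec).T         # shape (3, 1)
```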




>
> Best,
> -Ian
>


Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread Ian Henriksen
On Thu, Apr 7, 2016 at 12:31 PM  wrote:

> write unit tests with non square 2d arrays and the exception / test error
> shows up fast.
>
> Josef
>
>
Absolutely, but good programming practices don't totally obviate helpful
error messages.

Best,
-Ian


Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread Ian Henriksen
On Thu, Apr 7, 2016 at 12:18 PM Chris Barker  wrote:

> On Thu, Apr 7, 2016 at 10:00 AM, Ian Henriksen <
> insertinterestingnameh...@gmail.com> wrote:
>
>> Here's another example that I've seen catch people now and again.
>>
>> A = np.random.rand(100, 100)
>> b =  np.random.rand(10)
>> A * b.T
>>
>
> typo? that was supposed to be
>
> b =  np.random.rand(100). yes?
>

Hahaha, thanks, yes, in describing a common typo I demonstrated another
one. At least this one doesn't fail silently.
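
A small sketch of both situations, using the corrected length of 100. The names `scaled_cols` and `scaled_rows` are only for illustration.

```python
import numpy as np

A = np.random.rand(100, 100)
b = np.random.rand(100)

# b.T is the same 1-D array, so this is plain A * b: A[i, j] * b[j].
scaled_cols = A * b.T
same_as_no_transpose = np.array_equal(scaled_cols, A * b)

# An explicit column makes the other pairing happen: A[i, j] * b[i].
scaled_rows = A * b[:, np.newaxis]

# With the length-10 array from the original post, broadcasting fails loudly.
try:
    A * np.random.rand(10).T
    short_raises = False
except ValueError:
    short_raises = True
```

Because A is square, both products have shape (100, 100), so nothing about the result's shape hints at which one was computed.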


>
> This is exactly what someone else referred to as the expectations of
> someone that comes from MATLAB, and doesn't yet "get" that 1D arrays are 1D
> arrays.
>
> All of this is EXACTLY the motivation for the matrix class -- which never
> took off, and was never complete (it needed a row and column vector
> implementation, if you ask me). But I think the reason it didn't take off is
> that it really isn't that useful, but is different enough from regular
> arrays to be a greater source of confusion. And it was decided that all
> people REALLY wanted was an obvious way to get matrix multiply, which we
> now have with @.
>

Most of the cases where I've seen this error have come from people unfamiliar
with MATLAB who, like I said, weren't tracking dimensions quite as carefully
as they should have. That said, it's just anecdotal evidence. I wouldn't be
at all surprised if this were an issue for MATLAB users as well.

As far as the matrix class goes, we really shouldn't be telling anyone to use
that anymore.


>
> So this discussion brings up that we also need an easy and obvious way to
> make a column vector --
>
> maybe:
>
> np.col_vector(arr)
>
> which would be a synonym for np.reshape(arr, (-1,1))
>
> would that make anyone happy?
>
> NOTE: having the transpose of a 1D array raise an exception would help
> remove a lot of the confusion, but it may be too late for that
>

>
> In this case the user pretty clearly meant to be broadcasting along the
>> rows of A
>> rather than along the columns, but the code fails silently.
>>
>
> hence the exception idea
>
>
Yep. An exception may be the best way forward here. My biggest objection is
that the current semantics make it easy for people to silently get unintended
behavior.


> maybe a warning?
>
> -CHB
>
>
-Ian Henriksen


Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread Irvin Probst

On Thu, 7 Apr 2016 14:31:17 -0400, josef.p...@gmail.com wrote:

> So this discussion brings up that we also need an easy and obvious
> way to make a column vector --
>
> maybe:
>
> np.col_vector(arr)



FWIW I would give a +1e42 to something like np.colvect and np.rowvect 
(or whatever variant of these names). This is human readable and does 
not break anything, it's just an explicit shortcut to 
reshape/atleast_2d/etc.
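
As a rough sketch of what such shortcuts could look like if written as plain functions: the names `colvec` and `rowvec` are hypothetical (they are suggestions from this thread, not NumPy functions), and the reshape-based definition follows the np.reshape(arr, (-1, 1)) synonym proposed earlier.

```python
import numpy as np

def colvec(arr):
    """Hypothetical helper: return the data as an (n, 1) column."""
    return np.reshape(arr, (-1, 1))

def rowvec(arr):
    """Hypothetical helper: return the data as a (1, n) row."""
    return np.reshape(arr, (1, -1))

x = np.array([1, 2, 3])

col = colvec(x)                    # shape (3, 1)
row = rowvec(x)                    # shape (1, 3)
outer = colvec(x) @ rowvec(x)      # shape (3, 3)
inner = rowvec(x) @ colvec(x)      # shape (1, 1)
```

One design question this leaves open is what to do with input that is already 2-D: with the reshape definition a (2, 3) array is flattened into a (6, 1) column rather than rejected.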


Regards.


Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread josef.pktd
On Thu, Apr 7, 2016 at 2:17 PM, Chris Barker  wrote:

> On Thu, Apr 7, 2016 at 10:00 AM, Ian Henriksen <
> insertinterestingnameh...@gmail.com> wrote:
>
>> Here's another example that I've seen catch people now and again.
>>
>> A = np.random.rand(100, 100)
>> b =  np.random.rand(10)
>> A * b.T
>>
>
> typo? that was supposed to be
>
> b =  np.random.rand(100). yes?
>
> This is exactly what someone else referred to as the expectations of
> someone that comes from MATLAB, and doesn't yet "get" that 1D arrays are 1D
> arrays.
>
> All of this is EXACTLY the motivation for the matrix class -- which never
> took off, and was never complete (it needed a row and column vector
> implementation, if you ask me). But I think the reason it didn't take off is
> that it really isn't that useful, but is different enough from regular
> arrays to be a greater source of confusion. And it was decided that all
> people REALLY wanted was an obvious way to get matrix multiply, which we
> now have with @.
>
> So this discussion brings up that we also need an easy and obvious way to
> make a column vector --
>
> maybe:
>
> np.col_vector(arr)
>
> which would be a synonym for np.reshape(arr, (-1,1))
>
> would that make anyone happy?
>
> NOTE: having the transpose of a 1D array raise an exception would help
> remove a lot of the confusion, but it may be too late for that
>
>
> In this case the user pretty clearly meant to be broadcasting along the
>> rows of A
>> rather than along the columns, but the code fails silently.
>>
>
> hence the exception idea
>
> maybe a warning?
>

AFAIR, there is a lot of code that works correctly with .T being a noop for
1D, e.g. covariance matrix/inner product x.T dot y as mentioned before.

write unit tests with non square 2d arrays and the exception / test error
shows up fast.
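
A minimal sketch of both points. The functions `gram_correct` and `gram_buggy` are made up for illustration; they stand in for any code where the wrong operand gets transposed.

```python
import numpy as np

def gram_correct(X):
    # X has shape (n_obs, n_vars); the result is (n_vars, n_vars).
    return X.T @ X

def gram_buggy(X):
    # Transposes the wrong operand; the result is (n_obs, n_obs).
    return X @ X.T

square = np.arange(16.0).reshape(4, 4)
tall = np.arange(15.0).reshape(5, 3)

# With a square input both results are (4, 4), so a shape check passes.
square_shapes_match = gram_buggy(square).shape == gram_correct(square).shape

# With a non-square input the shapes differ and the bug is visible at once.
tall_correct_shape = gram_correct(tall).shape   # (3, 3)
tall_buggy_shape = gram_buggy(tall).shape       # (5, 5)

# Existing 1-D code that relies on .T being a no-op keeps working.
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])
inner = x.T @ y
```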

Josef



>
> -CHB
>
>


Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread Matthew Brett
On Thu, Apr 7, 2016 at 11:17 AM, Chris Barker  wrote:
> On Thu, Apr 7, 2016 at 10:00 AM, Ian Henriksen
>  wrote:
>>
>> Here's another example that I've seen catch people now and again.
>>
>> A = np.random.rand(100, 100)
>> b =  np.random.rand(10)
>> A * b.T
>
>
> typo? that was supposed to be
>
> b =  np.random.rand(100). yes?
>
> This is exactly what someone else referred to as the expectations of someone
> that comes from MATLAB, and doesn't yet "get" that 1D arrays are 1D arrays.
>
> All of this is EXACTLY the motivation for the matrix class -- which never
> took off, and was never complete (it needed a row and column vector
> implementation, if you ask me). But I think the reason it didn't take off is
> that it really isn't that useful, but is different enough from regular
> arrays to be a greater source of confusion. And it was decided that all
> people REALLY wanted was an obvious way to get matrix multiply, which we now
> have with @.
>
> So this discussion brings up that we also need an easy and obvious way to
> make a column vector --
>
> maybe:
>
> np.col_vector(arr)
>
> which would be a synonym for np.reshape(arr, (-1,1))

Yes, I was going to suggest `colvec` and `rowvec`.

Matthew


Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread Chris Barker
On Thu, Apr 7, 2016 at 10:00 AM, Ian Henriksen <
insertinterestingnameh...@gmail.com> wrote:

> Here's another example that I've seen catch people now and again.
>
> A = np.random.rand(100, 100)
> b =  np.random.rand(10)
> A * b.T
>

typo? that was supposed to be

b =  np.random.rand(100). yes?

This is exactly what someone else referred to as the expectations of
someone that comes from MATLAB, and doesn't yet "get" that 1D arrays are 1D
arrays.

All of this is EXACTLY the motivation for the matrix class -- which never
took off, and was never complete (it needed a row and column vector
implementation, if you ask me). But I think the reason it didn't take off is
that it really isn't that useful, but is different enough from regular
arrays to be a greater source of confusion. And it was decided that all
people REALLY wanted was an obvious way to get matrix multiply, which we
now have with @.

So this discussion brings up that we also need an easy and obvious way to
make a column vector --

maybe:

np.col_vector(arr)

which would be a synonym for np.reshape(arr, (-1,1))

would that make anyone happy?

NOTE: having the transpose of a 1D array raise an exception would help
remove a lot of the confusion, but it may be too late for that


> In this case the user pretty clearly meant to be broadcasting along the
> rows of A
> rather than along the columns, but the code fails silently.
>

hence the exception idea

maybe a warning?

-CHB


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread josef.pktd
On Thu, Apr 7, 2016 at 1:35 PM, Sebastian Berg 
wrote:

> On Do, 2016-04-07 at 13:29 -0400, josef.p...@gmail.com wrote:
> >
> >
> > On Thu, Apr 7, 2016 at 1:20 PM, Sebastian Berg <
> > sebast...@sipsolutions.net> wrote:
> > > On Do, 2016-04-07 at 11:56 -0400, josef.p...@gmail.com wrote:
> > > >
> > > >
> > >
> > > 
> > >
> > > >
> > > > I don't think numpy treats 1d arrays as row vectors. numpy has C
> > > > -order for axis preference which coincides in many cases with row
> > > > vector behavior.
> > > >
> > >
> > > Well, broadcasting rules, are that (n,) should typically behave
> > > similar
> > > to (1, n). However, for dot/matmul and @ the rules are stretched to
> > > mean "the one dimensional thing that gives an inner product" (using
> > > matmul since my python has no @ yet):
> > >
> > > In [12]: a = np.arange(20)
> > > In [13]: b = np.arange(20)
> > >
> > > In [14]: np.matmul(a, b)
> > > Out[14]: 2470
> > >
> > > In [15]: np.matmul(a, b[:, None])
> > > Out[15]: array([2470])
> > >
> > > In [16]: np.matmul(a[None, :], b)
> > > Out[16]: array([2470])
> > >
> > > In [17]: np.matmul(a[None, :], b[:, None])
> > > Out[17]: array([[2470]])
> > >
> > > which indeed gives us a fun thing, because if you look at the last
> > > line, the outer product equivalent would be:
> > >
> > > outer = np.matmul(a[None, :].T, b[:, None].T)
> > >
> > > Now if I go back to the earlier example:
> > >
> > > a.T @ b
> > >
> > > Does not achieve the outer product at all with using T2, since
> > >
> > > a.T2 @ b.T2  # only correct for a, but not for b
> > > a.T2 @ b  # b attempts to be "inner", so does not work
> > >
> > > It almost seems to me that the example is a counter example,
> > > because on
> > > first sight the `T2` attribute would still leave you with no
> > > shorthand
> > > for `b`.
> > a.T2 @ b.T2.T
> >
>
> Actually, better would be:
>
>   a.T2 @ b.T2.T2  # Aha?
>
> And true enough, that works, but is it still reasonably easy to find
> and understand?
> Or is it just frickeling around, the same as you would try `a[:, None]`
> before finding `a[None, :]`, maybe worse?
>

I had thought about it earlier, but it's "too cute" for my taste (and I
think I would complain during code review if I saw this).

Josef



>
> - Sebastian
>
> >
> > (T2 as shortcut for creating a[:, None] that's neat, except if a is
> > already 2D)
> >
> > Josef
> >
> > >
> > > I understand the pain of having to write (and parse get into the
> > > depth
> > > of) things like `arr[:, np.newaxis]` or reshape. I also understand
> > > the
> > > idea of a shorthand for vectorized matrix operations. That is, an
> > > argument for a T2 attribute which errors on 1D arrays (not sure I
> > > like
> > > it, but that is a different issue).
> > >
> > > However, it seems that implicit adding of an axis which only works
> > > half
> > > the time does not help too much? I have to admit I don't write
> > > these
> > > things too much, but I wonder if it would not help more if we just
> > > provided some better information/link to longer examples in the
> > > "dimension mismatch" error message?
> > >
> > > In the end it is quite simple, as Nathaniel, I think I would like
> > > to
> > > see some example code, where the code obviously looks easier then
> > > before? With the `@` operator that was the case, with the
> > > "dimension
> > > adding logic" I am not so sure, plus it seems it may add other
> > > pitfalls.
> > >
> > > - Sebastian
> > >
> > >
> > >
> > >
> > > > >>> np.concatenate(([[1,2,3]], [4,5,6]))
> > > > Traceback (most recent call last):
> > > >   File "", line 1, in 
> > > > np.concatenate(([[1,2,3]], [4,5,6]))
> > > > ValueError: arrays must have same number of dimensions
> > > >
> > > > It's not an uncommon exception for me.
> > > >
> > > > Josef
> > > >


Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread Sebastian Berg
On Thu, 2016-04-07 at 13:29 -0400, josef.p...@gmail.com wrote:
> 
> 
> On Thu, Apr 7, 2016 at 1:20 PM, Sebastian Berg <
> sebast...@sipsolutions.net> wrote:
> > On Do, 2016-04-07 at 11:56 -0400, josef.p...@gmail.com wrote:
> > >
> > >
> > 
> > 
> > 
> > >
> > > I don't think numpy treats 1d arrays as row vectors. numpy has C
> > > -order for axis preference which coincides in many cases with row
> > > vector behavior.
> > >
> > 
> > Well, broadcasting rules, are that (n,) should typically behave
> > similar
> > to (1, n). However, for dot/matmul and @ the rules are stretched to
> > mean "the one dimensional thing that gives an inner product" (using
> > matmul since my python has no @ yet):
> > 
> > In [12]: a = np.arange(20)
> > In [13]: b = np.arange(20)
> > 
> > In [14]: np.matmul(a, b)
> > Out[14]: 2470
> > 
> > In [15]: np.matmul(a, b[:, None])
> > Out[15]: array([2470])
> > 
> > In [16]: np.matmul(a[None, :], b)
> > Out[16]: array([2470])
> > 
> > In [17]: np.matmul(a[None, :], b[:, None])
> > Out[17]: array([[2470]])
> > 
> > which indeed gives us a fun thing, because if you look at the last
> > line, the outer product equivalent would be:
> > 
> > outer = np.matmul(a[None, :].T, b[:, None].T)
> > 
> > Now if I go back to the earlier example:
> > 
> > a.T @ b
> > 
> > Does not achieve the outer product at all with using T2, since
> > 
> > a.T2 @ b.T2  # only correct for a, but not for b
> > a.T2 @ b  # b attempts to be "inner", so does not work
> >  
> > It almost seems to me that the example is a counter example,
> > because on
> > first sight the `T2` attribute would still leave you with no
> > shorthand
> > for `b`.
> a.T2 @ b.T2.T
> 

Actually, better would be:

  a.T2 @ b.T2.T2  # Aha?

And true enough, that works, but is it still reasonably easy to find
and understand?
Or is it just fiddling around, the same as you would try `a[:, None]`
before finding `a[None, :]`, maybe worse?

- Sebastian

> 
> (T2 as shortcut for creating a[:, None] that's neat, except if a is
> already 2D)
> 
> Josef
>  
> >  
> > I understand the pain of having to write (and parse get into the
> > depth
> > of) things like `arr[:, np.newaxis]` or reshape. I also understand
> > the
> > idea of a shorthand for vectorized matrix operations. That is, an
> > argument for a T2 attribute which errors on 1D arrays (not sure I
> > like
> > it, but that is a different issue).
> > 
> > However, it seems that implicit adding of an axis which only works
> > half
> > the time does not help too much? I have to admit I don't write
> > these
> > things too much, but I wonder if it would not help more if we just
> > provided some better information/link to longer examples in the
> > "dimension mismatch" error message?
> > 
> > In the end it is quite simple, as Nathaniel, I think I would like
> > to
> > see some example code, where the code obviously looks easier then
> > before? With the `@` operator that was the case, with the
> > "dimension
> > adding logic" I am not so sure, plus it seems it may add other
> > pitfalls.
> > 
> > - Sebastian
> > 
> > 
> > 
> > 
> > > >>> np.concatenate(([[1,2,3]], [4,5,6]))
> > > Traceback (most recent call last):
> > >   File "", line 1, in 
> > > np.concatenate(([[1,2,3]], [4,5,6]))
> > > ValueError: arrays must have same number of dimensions
> > >
> > > It's not an uncommon exception for me.
> > >
> > > Josef
> > >


Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread josef.pktd
On Thu, Apr 7, 2016 at 1:20 PM, Sebastian Berg 
wrote:

> On Do, 2016-04-07 at 11:56 -0400, josef.p...@gmail.com wrote:
> >
> >
>
> 
>
> >
> > I don't think numpy treats 1d arrays as row vectors. numpy has C
> > -order for axis preference which coincides in many cases with row
> > vector behavior.
> >
>
> Well, broadcasting rules, are that (n,) should typically behave similar
> to (1, n). However, for dot/matmul and @ the rules are stretched to
> mean "the one dimensional thing that gives an inner product" (using
> matmul since my python has no @ yet):
>
> In [12]: a = np.arange(20)
> In [13]: b = np.arange(20)
>
> In [14]: np.matmul(a, b)
> Out[14]: 2470
>
> In [15]: np.matmul(a, b[:, None])
> Out[15]: array([2470])
>
> In [16]: np.matmul(a[None, :], b)
> Out[16]: array([2470])
>
> In [17]: np.matmul(a[None, :], b[:, None])
> Out[17]: array([[2470]])
>
> which indeed gives us a fun thing, because if you look at the last
> line, the outer product equivalent would be:
>
> outer = np.matmul(a[None, :].T, b[:, None].T)
>
> Now if I go back to the earlier example:
>
> a.T @ b
>
> Does not achieve the outer product at all with using T2, since
>
> a.T2 @ b.T2  # only correct for a, but not for b
> a.T2 @ b  # b attempts to be "inner", so does not work
>


> It almost seems to me that the example is a counter example, because on
> first sight the `T2` attribute would still leave you with no shorthand
> for `b`.
>

a.T2 @ b.T2.T


(T2 as shortcut for creating a[:, None] that's neat, except if a is already
2D)

Josef


>
> I understand the pain of having to write (and parse get into the depth
> of) things like `arr[:, np.newaxis]` or reshape. I also understand the
> idea of a shorthand for vectorized matrix operations. That is, an
> argument for a T2 attribute which errors on 1D arrays (not sure I like
> it, but that is a different issue).
>
> However, it seems that implicit adding of an axis which only works half
> the time does not help too much? I have to admit I don't write these
> things too much, but I wonder if it would not help more if we just
> provided some better information/link to longer examples in the
> "dimension mismatch" error message?
>
> In the end it is quite simple, as Nathaniel, I think I would like to
> see some example code, where the code obviously looks easier then
> before? With the `@` operator that was the case, with the "dimension
> adding logic" I am not so sure, plus it seems it may add other
> pitfalls.
>
> - Sebastian
>
>
>
>
> > >>> np.concatenate(([[1,2,3]], [4,5,6]))
> > Traceback (most recent call last):
> >   File "", line 1, in 
> > np.concatenate(([[1,2,3]], [4,5,6]))
> > ValueError: arrays must have same number of dimensions
> >
> > It's not an uncommon exception for me.
> >
> > Josef
> >


Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread Sebastian Berg
On Thu, 2016-04-07 at 11:56 -0400, josef.p...@gmail.com wrote:
> 
> 



> 
> I don't think numpy treats 1d arrays as row vectors. numpy has C
> -order for axis preference which coincides in many cases with row
> vector behavior.
> 

Well, the broadcasting rules are that (n,) should typically behave similarly
to (1, n). However, for dot/matmul and @ the rules are stretched to
mean "the one-dimensional thing that gives an inner product" (using
matmul since my Python has no @ yet):

In [12]: a = np.arange(20)
In [13]: b = np.arange(20)

In [14]: np.matmul(a, b)
Out[14]: 2470

In [15]: np.matmul(a, b[:, None])
Out[15]: array([2470])

In [16]: np.matmul(a[None, :], b)
Out[16]: array([2470])

In [17]: np.matmul(a[None, :], b[:, None])
Out[17]: array([[2470]])

which indeed gives us a fun thing, because if you look at the last
line, the outer product equivalent would be:

outer = np.matmul(a[None, :].T, b[:, None].T)

Now if I go back to the earlier example:

a.T @ b

This does not achieve the outer product at all using T2, since

a.T2 @ b.T2  # only correct for a, but not for b
a.T2 @ b  # b attempts to be "inner", so does not work

It almost seems to me that the example is a counterexample, because at
first sight the `T2` attribute would still leave you with no shorthand
for `b`.
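
For comparison, here is a sketch of outer-product spellings that work today without any new attribute. `np.outer` and `np.matmul` are existing NumPy functions; the variable names are only illustrative, and `T2` itself is not used because it does not exist.

```python
import numpy as np

a = np.arange(20)
b = np.arange(20)

# 1-D with 1-D: matmul gives the inner product.
inner = np.matmul(a, b)

# Outer product: an explicit (n, 1) column times an explicit (1, n) row.
outer_explicit = np.matmul(a[:, None], b[None, :])

# The spelling from the message above: transposing a row gives the column,
# and transposing a column gives the row.
outer_transposed = np.matmul(a[None, :].T, b[:, None].T)

# Reference result from the dedicated function.
outer_reference = np.outer(a, b)
```

In each matmul spelling the right-hand operand has to be made a row explicitly, which is the step the discussion says T2 gives no shorthand for.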

I understand the pain of having to write (and parse, or get into the depths
of) things like `arr[:, np.newaxis]` or reshape. I also understand the
idea of a shorthand for vectorized matrix operations. That is, an
argument for a T2 attribute which errors on 1D arrays (not sure I like
it, but that is a different issue).

However, it seems that implicitly adding an axis, which only works half
the time, does not help too much? I have to admit I don't write these
things too much, but I wonder if it would not help more if we just
provided some better information/link to longer examples in the
"dimension mismatch" error message?

In the end it is quite simple: like Nathaniel, I would like to
see some example code where the code obviously looks easier than
before. With the `@` operator that was the case; with the "dimension
adding logic" I am not so sure, plus it seems it may add other
pitfalls.

- Sebastian




> >>> np.concatenate(([[1,2,3]], [4,5,6]))
> Traceback (most recent call last):
>   File "", line 1, in 
> np.concatenate(([[1,2,3]], [4,5,6]))
> ValueError: arrays must have same number of dimensions
> 
> It's not an uncommon exception for me.
> 
> Josef
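
One way to reproduce and avoid this exception, sketched under the assumption that the 1D input is meant as an additional row:

```python
import numpy as np

rows = [[1, 2, 3]]   # 2D: shape (1, 3)
extra = [4, 5, 6]    # 1D: shape (3,)

try:
    np.concatenate((rows, extra))
except ValueError as exc:
    print("ValueError:", exc)

# Promoting the 1D input to 2D first makes the intent explicit.
stacked = np.concatenate((rows, np.atleast_2d(extra)))
print(stacked.shape)  # (2, 3)
```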
> 


Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread Chris Barker
On Thu, Apr 7, 2016 at 8:13 AM, Todd  wrote:

> First you need to turn a into a 2D array.  I can think of 10 ways to do
> this off the top of my head, and there may be more:
>
> snip

> Basically, my argument here is the same as the argument from pep465 for the
> inclusion of the @ operator:
>
> https://www.python.org/dev/peps/pep-0465/#transparent-syntax-is-especially-crucial-for-non-expert-programmers
>

I think this is all a good argument for a clean and obvious way to make a
column vector, but I don't think overloading transpose is the way to do
that.
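
For illustration only, such a helper could be as small as the sketch below. The name `col_vector` is hypothetical (it is not an existing NumPy function), and the reshape-based body is just one possible choice:

```python
import numpy as np

def col_vector(arr):
    """Hypothetical helper: view the input as a single column of shape (n, 1)."""
    return np.reshape(arr, (-1, 1))

x = np.array([1, 2, 3])
A = np.arange(9).reshape((3, 3))

print(col_vector(x).shape)        # (3, 1)
print((A @ col_vector(x)).shape)  # (3, 1)
```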

-CHB




Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread Ian Henriksen
On Wed, Apr 6, 2016 at 3:21 PM Nathaniel Smith  wrote:

> Can you elaborate on what you're doing that you find verbose and
> confusing, maybe paste an example? I've never had any trouble like
> this doing linear algebra with @ or dot (which have similar semantics
> for 1d arrays), which is probably just because I've had different use
> cases, but it's much easier to talk about these things with a concrete
> example in front of us to put everyone on the same page.
>
> -n
>

Here's another example that I've seen catch people now and again.

A = np.random.rand(100, 100)
b = np.random.rand(100)
A * b.T

In this case the user pretty clearly meant to be broadcasting along the
rows of A rather than along the columns, but the code fails silently. When
an issue like this gets mixed into a larger series of broadcasting
operations, the error becomes difficult to find. This error isn't
necessarily unique to beginners either. It's a common typo that catches
intermediate users who know about broadcasting semantics but weren't
keeping close enough track of the dimensionality of the different
intermediate expressions in their code.
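
A small sketch of why this goes unnoticed, assuming `b` has as many elements as `A` has columns (100 here). Since `.T` is a no-op on a 1D array, `A * b.T` is the same as `A * b`, and an explicit column is needed to scale along the other axis:

```python
import numpy as np

rng = np.random.RandomState(0)
A = rng.rand(100, 100)
b = rng.rand(100)

same = np.array_equal(A * b.T, A * b)   # .T changed nothing
by_column = A * b                       # b[j] scales column j
by_row = A * b[:, None]                 # b[i] scales row i

print(b.T.shape)                        # (100,)
print(same)
print(np.allclose(by_column, by_row))
```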

Best,

-Ian Henriksen


Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread josef.pktd
On Thu, Apr 7, 2016 at 11:42 AM, Todd  wrote:
> On Thu, Apr 7, 2016 at 11:35 AM,  wrote:
>>
>> On Thu, Apr 7, 2016 at 11:13 AM, Todd  wrote:
>> > On Wed, Apr 6, 2016 at 5:20 PM, Nathaniel Smith  wrote:
>> >>
>> >> On Wed, Apr 6, 2016 at 10:43 AM, Todd  wrote:
>> >> >
>> >> > My intention was to make linear algebra operations easier in numpy.
>> >> > With the @ operator available, it is now very easy to do basic linear
>> >> > algebra on arrays without needing the matrix class.  But getting an
>> >> > array into a state where you can use the @ operator effectively is
>> >> > currently pretty verbose and confusing.  I was trying to find a way to
>> >> > make the @ operator more useful.
>> >>
>> >> Can you elaborate on what you're doing that you find verbose and
>> >> confusing, maybe paste an example? I've never had any trouble like
>> >> this doing linear algebra with @ or dot (which have similar semantics
>> >> for 1d arrays), which is probably just because I've had different use
>> >> cases, but it's much easier to talk about these things with a concrete
>> >> example in front of us to put everyone on the same page.
>> >>
>> >
>> > Let's say you want to do a simple matrix multiplication example.  You
>> > create two example arrays like so:
>> >
>> >a = np.arange(20)
>> >b = np.arange(10, 50, 10)
>> >
>> > Now you want to do
>> >
>> > a.T @ b
>> >
>> > First you need to turn a into a 2D array.  I can think of 10 ways to do
>> > this off the top of my head, and there may be more:
>> >
>> > 1a) a[:, None]
>> > 1b) a[None]
>> > 1c) a[None, :]
>> > 2a) a.shape = (1, -1)
>> > 2b) a.shape = (-1, 1)
>> > 3a) a.reshape(1, -1)
>> > 3b) a.reshape(-1, 1)
>> > 4a) np.reshape(a, (1, -1))
>> > 4b) np.reshape(a, (-1, 1))
>> > 5) np.atleast_2d(a)
>> >
>> > 5 is pretty clear, and will work fine with any number of dimensions, but
>> > is also long to type out when trying to do a simple example.  The
>> > different variants of 1, 2, 3, and 4, however, will only work with 1D
>> > arrays (making them less useful for functions), are not immediately
>> > obvious to me what the result will be (I always need to try it to make
>> > sure the result is what I expect), and are easy to get mixed up in my
>> > opinion.  They also require people keep a mental list of lots of ways to
>> > do what should be a very simple task.
>> >
>> > Basically, my argument here is the same as the argument from pep465 for
>> > the inclusion of the @ operator:
>> >
>> > https://www.python.org/dev/peps/pep-0465/#transparent-syntax-is-especially-crucial-for-non-expert-programmers
>> >
>> > "A large proportion of scientific code is written by people who are
>> > experts in their domain, but are not experts in programming. And there
>> > are many university courses run each year with titles like "Data analysis
>> > for social scientists" which assume no programming background, and teach
>> > some combination of mathematical techniques, introduction to programming,
>> > and the use of programming to implement these mathematical techniques,
>> > all within a 10-15 week period. These courses are more and more often
>> > being taught in Python rather than special-purpose languages like R or
>> > Matlab.
>> >
>> > For these kinds of users, whose programming knowledge is fragile, the
>> > existence of a transparent mapping between formulas and code often means
>> > the difference between succeeding and failing to write that code at all."
>>
>> This doesn't work because of the ambiguity between column and row vector.
>>
>> In most cases 1d vectors in statistics/econometrics are column
>> vectors. Sometime it takes me a long time to figure out whether an
>> author uses row or column vector for transpose.
>>
>> i.e. I often need x.T dot y   which works for 1d and 2d to produce
>> inner product.
>> but the outer product would require most of the time a column vector
>> so it's defined as x dot x.T.
>>
>> I think keeping around explicitly 2d arrays if necessary is less error
>> prone and confusing.
>>
>> But I wouldn't mind a shortcut for atleast_2d   (although more often I
>> need atleast_2dcol to translate formulas)
>>
>
> At least from what I have seen, in all cases in numpy where a 1D array is
> treated as a 2D array, it is always treated as a row vector, the examples I
> can think of being atleast_2d, hstack, vstack, and dstack. So using this
> convention would be in line with how it is used elsewhere in numpy.

AFAIK, linear algebra works differently, 1-D is special

>>> xx = np.arange(20).reshape(4,5)
>>> yy = np.arange(4)
>>> xx.dot(yy)
Traceback (most recent call last):
  File "", line 1, in 
xx.dot(yy)
ValueError: objects are not aligned

>>> yy = np.arange(5)
>>> xx.dot(yy)
array([ 30,  
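
To make the special-casing of 1D operands concrete, here is a small sketch using the same `xx` as above. It assumes a reasonably recent NumPy and only checks shapes and whether the call succeeds:

```python
import numpy as np

xx = np.arange(20).reshape(4, 5)
yy = np.arange(5)

# A 1D right-hand operand is contracted against the last axis of xx.
print(xx.dot(yy).shape)            # (4,)

# An explicit column keeps the result 2D.
print(xx.dot(yy[:, None]).shape)   # (4, 1)

# The "row vector" view from atleast_2d does not line up with xx.
try:
    xx.dot(np.atleast_2d(yy))
    row_view_works = True
except ValueError as exc:
    row_view_works = False
    print("ValueError:", exc)
```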

Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread Todd
On Thu, Apr 7, 2016 at 11:35 AM,  wrote:

> On Thu, Apr 7, 2016 at 11:13 AM, Todd  wrote:
> > On Wed, Apr 6, 2016 at 5:20 PM, Nathaniel Smith  wrote:
> >>
> >> On Wed, Apr 6, 2016 at 10:43 AM, Todd  wrote:
> >> >
> >> > My intention was to make linear algebra operations easier in numpy.
> >> > With the @ operator available, it is now very easy to do basic linear
> >> > algebra on arrays without needing the matrix class.  But getting an
> >> > array into a state where you can use the @ operator effectively is
> >> > currently pretty verbose and confusing.  I was trying to find a way to
> >> > make the @ operator more useful.
> >>
> >> Can you elaborate on what you're doing that you find verbose and
> >> confusing, maybe paste an example? I've never had any trouble like
> >> this doing linear algebra with @ or dot (which have similar semantics
> >> for 1d arrays), which is probably just because I've had different use
> >> cases, but it's much easier to talk about these things with a concrete
> >> example in front of us to put everyone on the same page.
> >>
> >
> > Let's say you want to do a simple matrix multiplication example.  You
> > create two example arrays like so:
> >
> >a = np.arange(20)
> >b = np.arange(10, 50, 10)
> >
> > Now you want to do
> >
> > a.T @ b
> >
> > First you need to turn a into a 2D array.  I can think of 10 ways to do
> > this off the top of my head, and there may be more:
> >
> > 1a) a[:, None]
> > 1b) a[None]
> > 1c) a[None, :]
> > 2a) a.shape = (1, -1)
> > 2b) a.shape = (-1, 1)
> > 3a) a.reshape(1, -1)
> > 3b) a.reshape(-1, 1)
> > 4a) np.reshape(a, (1, -1))
> > 4b) np.reshape(a, (-1, 1))
> > 5) np.atleast_2d(a)
> >
> > 5 is pretty clear, and will work fine with any number of dimensions, but
> > is also long to type out when trying to do a simple example.  The
> > different variants of 1, 2, 3, and 4, however, will only work with 1D
> > arrays (making them less useful for functions), are not immediately
> > obvious to me what the result will be (I always need to try it to make
> > sure the result is what I expect), and are easy to get mixed up in my
> > opinion.  They also require people keep a mental list of lots of ways to
> > do what should be a very simple task.
> >
> > Basically, my argument here is the same as the argument from pep465 for
> > the inclusion of the @ operator:
> >
> > https://www.python.org/dev/peps/pep-0465/#transparent-syntax-is-especially-crucial-for-non-expert-programmers
> >
> > "A large proportion of scientific code is written by people who are
> > experts in their domain, but are not experts in programming. And there
> > are many university courses run each year with titles like "Data analysis
> > for social scientists" which assume no programming background, and teach
> > some combination of mathematical techniques, introduction to programming,
> > and the use of programming to implement these mathematical techniques,
> > all within a 10-15 week period. These courses are more and more often
> > being taught in Python rather than special-purpose languages like R or
> > Matlab.
> >
> > For these kinds of users, whose programming knowledge is fragile, the
> > existence of a transparent mapping between formulas and code often means
> > the difference between succeeding and failing to write that code at all."
>
> This doesn't work because of the ambiguity between column and row vector.
>
> In most cases 1d vectors in statistics/econometrics are column
> vectors. Sometime it takes me a long time to figure out whether an
> author uses row or column vector for transpose.
>
> i.e. I often need x.T dot y   which works for 1d and 2d to produce
> inner product.
> but the outer product would require most of the time a column vector
> so it's defined as x dot x.T.
>
> I think keeping around explicitly 2d arrays if necessary is less error
> prone and confusing.
>
> But I wouldn't mind a shortcut for atleast_2d   (although more often I
> need atleast_2dcol to translate formulas)
>
>
At least from what I have seen, in all cases in numpy where a 1D array is
treated as a 2D array, it is always treated as a row vector, the examples I
can think of being atleast_2d, hstack, vstack, and dstack. So using this
convention would be in line with how it is used elsewhere in numpy.


Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread josef.pktd
On Thu, Apr 7, 2016 at 11:13 AM, Todd  wrote:
> On Wed, Apr 6, 2016 at 5:20 PM, Nathaniel Smith  wrote:
>>
>> On Wed, Apr 6, 2016 at 10:43 AM, Todd  wrote:
>> >
>> > My intention was to make linear algebra operations easier in numpy.
>> > With
>> > the @ operator available, it is now very easy to do basic linear algebra
>> > on
>> > arrays without needing the matrix class.  But getting an array into a
>> > state
>> > where you can use the @ operator effectively is currently pretty verbose
>> > and
>> > confusing.  I was trying to find a way to make the @ operator more
>> > useful.
>>
>> Can you elaborate on what you're doing that you find verbose and
>> confusing, maybe paste an example? I've never had any trouble like
>> this doing linear algebra with @ or dot (which have similar semantics
>> for 1d arrays), which is probably just because I've had different use
>> cases, but it's much easier to talk about these things with a concrete
>> example in front of us to put everyone on the same page.
>>
>
> Let's say you want to do a simple matrix multiplication example.  You create
> two example arrays like so:
>
>a = np.arange(20)
>b = np.arange(10, 50, 10)
>
> Now you want to do
>
> a.T @ b
>
> First you need to turn a into a 2D array.  I can think of 10 ways to do this
> off the top of my head, and there may be more:
>
> 1a) a[:, None]
> 1b) a[None]
> 1c) a[None, :]
> 2a) a.shape = (1, -1)
> 2b) a.shape = (-1, 1)
> 3a) a.reshape(1, -1)
> 3b) a.reshape(-1, 1)
> 4a) np.reshape(a, (1, -1))
> 4b) np.reshape(a, (-1, 1))
> 5) np.atleast_2d(a)
>
> 5 is pretty clear, and will work fine with any number of dimensions, but is
> also long to type out when trying to do a simple example.  The different
> variants of 1, 2, 3, and 4, however, will only work with 1D arrays (making
> them less useful for functions), are not immediately obvious to me what the
> result will be (I always need to try it to make sure the result is what I
> expect), and are easy to get mixed up in my opinion.  They also require
> people keep a mental list of lots of ways to do what should be a very simple
> task.
>
> Basically, my argument here is the same as the argument from pep465 for the
> inclusion of the @ operator:
> https://www.python.org/dev/peps/pep-0465/#transparent-syntax-is-especially-crucial-for-non-expert-programmers
>
> "A large proportion of scientific code is written by people who are experts
> in their domain, but are not experts in programming. And there are many
> university courses run each year with titles like "Data analysis for social
> scientists" which assume no programming background, and teach some
> combination of mathematical techniques, introduction to programming, and the
> use of programming to implement these mathematical techniques, all within a
> 10-15 week period. These courses are more and more often being taught in
> Python rather than special-purpose languages like R or Matlab.
>
> For these kinds of users, whose programming knowledge is fragile, the
> existence of a transparent mapping between formulas and code often means the
> difference between succeeding and failing to write that code at all."

This doesn't work because of the ambiguity between column and row vector.

In most cases 1d vectors in statistics/econometrics are column
vectors. Sometimes it takes me a long time to figure out whether an
author uses row or column vectors for transpose.

I.e., I often need x.T dot y, which works for 1d and 2d to produce the
inner product, but the outer product would most of the time require a
column vector, so it's defined as x dot x.T.
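
As a rough illustration of the difference, the sketch below compares 1D arrays with explicit column vectors; the variable names are arbitrary:

```python
import numpy as np

x = np.arange(3)
y = np.arange(3, 6)

# With 1D arrays, .T is a no-op and dot always gives the inner product.
print(x.T.dot(y))          # scalar inner product
print(x.dot(x.T))          # also a scalar, not an outer product

# With explicit (n, 1) columns the two formulas behave differently.
xc = x[:, None]
yc = y[:, None]
print(xc.T.dot(yc).shape)  # (1, 1)
print(xc.dot(xc.T).shape)  # (3, 3)
```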

I think keeping around explicitly 2d arrays if necessary is less
error-prone and less confusing.

But I wouldn't mind a shortcut for atleast_2d   (although more often I
need atleast_2dcol to translate formulas)

Josef



Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread Todd
On Thu, Apr 7, 2016 at 3:39 AM, Irvin Probst  wrote:

> On 06/04/2016 04:11, Todd wrote:
>
> When you try to transpose a 1D array, it does nothing.  This is the
> correct behavior, since transposing a 1D array is meaningless.  However,
> this can often lead to unexpected errors since this is rarely what you
> want.  You can convert the array to 2D, using `np.atleast_2d` or
> `arr[None]`, but this makes simple linear algebra computations more
> difficult.
>
> I propose adding an argument to transpose, perhaps called `expand` or
> `expanddim`, which if `True` (it is `False` by default) will force the
> array to be at least 2D.  A shortcut property, `ndarray.T2`, would be the
> same as `ndarray.transpose(True)`
>
> Hello,
> My two cents here, I've seen hundreds of people (literally hundreds)
> stumbling on this .T trick with 1D vectors when they were trying to do some
> linear algebra with numpy so at first I had the same feeling as you. But
> the real issue was that *all* these people were coming from matlab and
> expected numpy to behave the same way. Once the logic behind 1D vectors was
> explained it made sense to most of them and there were no more problems.
>
>
The problem isn't necessarily understanding, although that is a problem.
The bigger problem is having to jump through hoops to do basic matrix math.


> And by the way I don't see any way to tell apart a 1D "row vector" from a
> 1D "column vector", think of a code mixing a Rn=>R jacobian matrix and some
> data supposed to be used as measurements in a linear system, so we have
> J=np.array([1,2,3,4]) and B=np.array([5,6,7,8]), what would the output of
> J.T2 and B.T2 be ?
>
>
As I said elsewhere, we already have a convention for this established by
`np.atleast_2d`.  1D arrays are treated as row vectors.  `np.hstack` and
`np.vstack` also treat 1D arrays as row vectors.  So `arr.T2` will follow
this convention, being equivalent to `np.atleast_2d(arr).T`.
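
A quick sketch of that convention, assuming `T2` is emulated by the hypothetical helper `t2` (the attribute itself does not exist in NumPy):

```python
import numpy as np

def t2(arr):
    """Hypothetical emulation of the proposed `arr.T2`."""
    return np.atleast_2d(arr).T

arr = np.array([1, 2, 3, 4])

print(np.atleast_2d(arr).shape)      # (1, 4): treated as a row
print(np.vstack([arr, arr]).shape)   # (2, 4): each 1D input becomes a row
print(t2(arr).shape)                 # (4, 1): transpose of that row
print(t2(np.ones((2, 3))).shape)     # (3, 2): 2D input is just transposed
```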


> I think it's much better to get used to writing
> J=np.array([1,2,3,4]).reshape(1,4) and B=np.array([5,6,7,8]).reshape(4,1),
> then you can use .T and @ without any verbosity and at least it forces
> users (read "my students" here) to think twice before writing some linear
> algebra nonsense.
>
>
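
The explicit-shape style described in the quoted paragraph can be sketched as follows, using the same `J` and `B`:

```python
import numpy as np

J = np.array([1, 2, 3, 4]).reshape(1, 4)   # explicit row
B = np.array([5, 6, 7, 8]).reshape(4, 1)   # explicit column

print((J @ B).shape)      # (1, 1)
print((B @ J).shape)      # (4, 4)
print((J.T @ B.T).shape)  # (4, 4): .T now really swaps the axes
```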
That works okay when you know beforehand what the shape of the array is
(although it may very well be the difference between a simple, 1-line piece
of code and a 3-line piece of code).  But what if you try to turn this into
a general-purpose function?  Then any function that has linear algebra
needs to call `atleast_2d` on every value used in that linear algebra, or
use `if` tests.  And if you forget, it may not be obvious until much later
depending on what you initially use the function for and what you use it
for later.
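
A sketch of the kind of general-purpose function meant here. The function `gram` is a made-up example; the point is only that the `np.atleast_2d` call is what keeps it working for both 1D and 2D input:

```python
import numpy as np

def gram(x):
    """Return x^T x, treating a 1D input as a single row (hypothetical example)."""
    x2 = np.atleast_2d(x)   # (n,) -> (1, n); 2D input is unchanged
    return x2.T @ x2

v = np.arange(3)
M = np.arange(6).reshape(2, 3)

print(gram(v).shape)       # (3, 3)
print(gram(M).shape)       # (3, 3)
print(np.ndim(v.T @ v))    # 0: without atleast_2d a 1D input collapses to a scalar
```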


Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread Todd
On Thu, Apr 7, 2016 at 4:59 AM, Joseph Martinot-Lagarde <
contreba...@gmail.com> wrote:

> Alan Isaac writes:
>
> > But underlying the proposal is apparently the
> > idea that there be an attribute equivalent to
> > `atleast_2d`.  Then call it `d2p`.
> > You can now have `a.d2p.T` which is a lot
> > more explicit and general than say `a.T2`,
> > while requiring only 3 more keystrokes.
>
>
> How about a.T2d or a.T2D?
>
>
I thought of that, but I wanted to keep things as short as possible (but
not shorter).


Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread Todd
On Wed, Apr 6, 2016 at 5:20 PM, Nathaniel Smith  wrote:

> On Wed, Apr 6, 2016 at 10:43 AM, Todd  wrote:
> >
> > My intention was to make linear algebra operations easier in numpy.  With
> > the @ operator available, it is now very easy to do basic linear algebra
> > on arrays without needing the matrix class.  But getting an array into a
> > state where you can use the @ operator effectively is currently pretty
> > verbose and confusing.  I was trying to find a way to make the @ operator
> > more useful.
>
> Can you elaborate on what you're doing that you find verbose and
> confusing, maybe paste an example? I've never had any trouble like
> this doing linear algebra with @ or dot (which have similar semantics
> for 1d arrays), which is probably just because I've had different use
> cases, but it's much easier to talk about these things with a concrete
> example in front of us to put everyone on the same page.
>
>
Let's say you want to do a simple matrix multiplication example.  You
create two example arrays like so:

   a = np.arange(20)
   b = np.arange(10, 50, 10)

Now you want to do

a.T @ b

First you need to turn a into a 2D array.  I can think of 10 ways to do
this off the top of my head, and there may be more:

1a) a[:, None]
1b) a[None]
1c) a[None, :]
2a) a.shape = (1, -1)
2b) a.shape = (-1, 1)
3a) a.reshape(1, -1)
3b) a.reshape(-1, 1)
4a) np.reshape(a, (1, -1))
4b) np.reshape(a, (-1, 1))
5) np.atleast_2d(a)

5 is pretty clear, and will work fine with any number of dimensions, but is
also long to type out when trying to do a simple example.  The different
variants of 1, 2, 3, and 4, however, will only work with 1D arrays (making
them less useful for functions), are not immediately obvious to me what the
result will be (I always need to try it to make sure the result is what I
expect), and are easy to get mixed up in my opinion.  They also require
people keep a mental list of lots of ways to do what should be a very
simple task.
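
For reference, the shapes these variants produce can be checked with a short sketch. The in-place `a.shape = ...` forms (2a, 2b) are left out so that `a` itself stays 1D:

```python
import numpy as np

a = np.arange(20)

# Map each expression from the list above to the shape it produces.
shapes = {
    "a[:, None]": a[:, None].shape,
    "a[None]": a[None].shape,
    "a[None, :]": a[None, :].shape,
    "a.reshape(1, -1)": a.reshape(1, -1).shape,
    "a.reshape(-1, 1)": a.reshape(-1, 1).shape,
    "np.reshape(a, (1, -1))": np.reshape(a, (1, -1)).shape,
    "np.reshape(a, (-1, 1))": np.reshape(a, (-1, 1)).shape,
    "np.atleast_2d(a)": np.atleast_2d(a).shape,
}

for expr, shape in shapes.items():
    print(expr, "->", shape)
```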

Basically, my argument here is the same as the argument from pep465 for the
inclusion of the @ operator:
https://www.python.org/dev/peps/pep-0465/#transparent-syntax-is-especially-crucial-for-non-expert-programmers

"A large proportion of scientific code is written by people who are experts
in their domain, but are not experts in programming. And there are many
university courses run each year with titles like "Data analysis for social
scientists" which assume no programming background, and teach some
combination of mathematical techniques, introduction to programming, and
the use of programming to implement these mathematical techniques, all
within a 10-15 week period. These courses are more and more often being
taught in Python rather than special-purpose languages like R or Matlab.
For these kinds of users, whose programming knowledge is fragile, the
existence of a transparent mapping between formulas and code often means
the difference between succeeding and failing to write that code at all."


[Numpy-discussion] ANN: bcolz 1.0.0 (final) released

2016-04-07 Thread Francesc Alted
============================
Announcing bcolz 1.0.0 final
============================

What's new
==========

Yeah, 1.0.0 is finally here.  We are not introducing any exciting new
feature (just some optimizations and bug fixes), but bcolz is already 6
years old and it implements most of the capabilities that it was
designed for, so I decided to release a 1.0.0 meaning that the format is
declared stable and that people can be assured that future bcolz
releases will be able to read bcolz 1.0 data files (and probably much
earlier ones too) for a long while.  Such a format is fully described
at:

https://github.com/Blosc/bcolz/blob/master/DISK_FORMAT_v1.rst

Also, a 1.0.0 release means that bcolz 1.x series will be based on
C-Blosc 1.x series (https://github.com/Blosc/c-blosc).  Once C-Blosc
2.x (https://github.com/Blosc/c-blosc2) is out, a new bcolz 2.x is
expected to take advantage of shiny new features of C-Blosc2 (more
compressors, more filters, native variable length support and the
concept of super-chunks), which should be very beneficial for the next
bcolz generation.

Important: this is a final release and there are no important known bugs
in it, so it is recommended for use in production.  Enjoy!

For a more detailed change log, see:

https://github.com/Blosc/bcolz/blob/master/RELEASE_NOTES.rst

For some comparison between bcolz and other compressed data containers,
see:

https://github.com/FrancescAlted/DataContainersTutorials

especially chapters 3 (in-memory containers) and 4 (on-disk containers).

Also, if it happens that you are in Madrid during this weekend, you can
drop by my tutorial and talk:

http://pydata.org/madrid2016/schedule/

See you!


What it is
==========

*bcolz* provides columnar and compressed data containers that can live
either on-disk or in-memory.  Column storage allows for efficiently
querying tables with a large number of columns.  It also allows for
cheap addition and removal of columns.  In addition, bcolz objects are
compressed by default for reducing memory/disk I/O needs. The
compression process is carried out internally by Blosc, an
extremely fast meta-compressor that is optimized for binary data. Lastly,
high-performance iterators (like ``iter()``, ``where()``) for querying
the objects are provided.

bcolz can use numexpr internally so as to accelerate many vector and
query operations (although it can use pure NumPy for doing so too).
numexpr optimizes the memory usage and uses several cores for doing the
computations, so it is blazing fast.  Moreover, since the carray/ctable
containers can be disk-based, it is possible to use them for
seamlessly performing out-of-memory computations.

bcolz has minimal dependencies (NumPy), comes with an exhaustive test
suite and fully supports both 32-bit and 64-bit platforms.  Also, it is
typically tested on both UNIX and Windows operating systems.

Together, bcolz and the Blosc compressor are finally fulfilling the
promise of accelerating memory I/O, at least for some real scenarios:

http://nbviewer.ipython.org/github/Blosc/movielens-bench/blob/master/querying-ep14.ipynb#Plots

Other users of bcolz are Visualfabriq (http://www.visualfabriq.com/),
Quantopian (https://www.quantopian.com/) and Scikit-Allel
(https://github.com/cggh/scikit-allel), which you can read more about by
pointing your browser at the links below.

* Visualfabriq:

  * *bquery*, A query and aggregation framework for Bcolz:
  * https://github.com/visualfabriq/bquery

* Quantopian:

  * Using compressed data containers for faster backtesting at scale:
  * https://quantopian.github.io/talks/NeedForSpeed/slides.html

* Scikit-Allel

  * Provides an alternative backend to work with compressed arrays
  * https://scikit-allel.readthedocs.org/en/latest/model/bcolz.html


Resources
=========

Visit the main bcolz repository at:
http://github.com/Blosc/bcolz

Manual:
http://bcolz.blosc.org

Home of Blosc compressor:
http://blosc.org

User's mail list:
bc...@googlegroups.com
http://groups.google.com/group/bcolz

License is the new BSD:
https://github.com/Blosc/bcolz/blob/master/LICENSES/BCOLZ.txt

Release notes can be found in the Git repository:
https://github.com/Blosc/bcolz/blob/master/RELEASE_NOTES.rst



  **Enjoy data!**

-- 
Francesc Alted


[Numpy-discussion] ANN: python-blosc 1.3.1

2016-04-07 Thread Francesc Alted
=============================
Announcing python-blosc 1.3.1
=============================

What is new?
============

This is an important release in terms of stability.  Now, the -O1 flag
is used for compiling the included C-Blosc sources on Linux.  This means
slower performance, but fixes the nasty issue #110.  In case maximum
speed is needed, please `compile python-blosc with an external C-Blosc
library
<https://github.com/Blosc/python-blosc#compiling-with-an-installed-blosc-library-recommended>`_.

Also, symbols like BLOSC_MAX_BUFFERSIZE have been replaced for allowing
backward compatibility with python-blosc 1.2.x series.

For whetting your appetite, look at some benchmarks here:

https://github.com/Blosc/python-blosc#benchmarking

For more info, you can have a look at the release notes in:

https://github.com/Blosc/python-blosc/blob/master/RELEASE_NOTES.rst

More docs and examples are available in the documentation site:

http://python-blosc.blosc.org


What is it?
===========

Blosc (http://www.blosc.org) is a high performance compressor optimized
for binary data.  It has been designed to transmit data to the processor
cache faster than the traditional, non-compressed, direct memory fetch
approach via a memcpy() OS call.  Blosc works well for compressing
numerical arrays that contain data with relatively low entropy, like
sparse data, time series, grids with regular-spaced values, etc.

python-blosc (http://python-blosc.blosc.org/) is the Python wrapper for
the Blosc compression library, with added functions (`compress_ptr()`
and `pack_array()`) for efficiently compressing NumPy arrays, minimizing
the number of memory copies during the process.  python-blosc can be
used to compress in-memory data buffers for transmission to other
machines, persistence or just as a compressed cache.

There is also a handy tool built on top of python-blosc called Bloscpack
(https://github.com/Blosc/bloscpack). It features a command line
interface that allows you to compress large binary datafiles on-disk.
It also comes with a Python API that has built-in support for
serializing and deserializing Numpy arrays both on-disk and in-memory at
speeds that are competitive with regular Pickle/cPickle machinery.


Sources repository
==================

The sources and documentation are managed through github services at:

http://github.com/Blosc/python-blosc




  **Enjoy data!**

-- 
Francesc Alted


[Numpy-discussion] ANN: numexpr 2.5.2 released

2016-04-07 Thread Francesc Alted
==========================
 Announcing Numexpr 2.5.2
==========================

Numexpr is a fast numerical expression evaluator for NumPy.  With it,
expressions that operate on arrays (like "3*a+4*b") are accelerated
and use less memory than doing the same calculation in Python.

It has multi-threaded capabilities, as well as support for Intel's
MKL (Math Kernel Library), which allows an extremely fast evaluation
of transcendental functions (sin, cos, tan, exp, log...) while
squeezing the last drop of performance out of your multi-core
processors.  Look here for some benchmarks of numexpr using MKL:

https://github.com/pydata/numexpr/wiki/NumexprMKL

Its only dependency is NumPy (MKL is optional), so it works well as an
easy-to-deploy, easy-to-use, computational engine for projects that
don't want to adopt other solutions requiring heavier dependencies.

What's new
==========

This is a maintenance release shaking out some remaining problems with VML
(it is nice to see how Anaconda's VML support helps raise hidden
issues).  Now conj() and abs() are actually added as VML-powered
functions, preventing the same problems as with log10() before (PR #212);
thanks to Tom Kooij.  Upgrading to this release is highly recommended.

In case you want to know more in detail what has changed in this
version, see:

https://github.com/pydata/numexpr/blob/master/RELEASE_NOTES.rst

Where can I find Numexpr?
=========================

The project is hosted at GitHub in:

https://github.com/pydata/numexpr

You can get the packages from PyPI as well (but not for RC releases):

http://pypi.python.org/pypi/numexpr

Share your experience
=====================

Let us know of any bugs, suggestions, gripes, kudos, etc. you may
have.


Enjoy data!

-- 
Francesc Alted


Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread Joseph Martinot-Lagarde
> > For a 1D array a of shape (N,), I expect a.T2 to be of shape (N, 1),
> 
> Why not (1,N)? -- it is not well defined, though I suppose it's not so
> bad to establish a convention that a 1-D array is a "row vector"
> rather than a "column vector".
I like Todd's simple proposal: a.T2 should be equivalent to np.atleast_2d(a).T

> BTW, if transposing a (N,) array gives you a (N,1) array, what does
> transposing a (N,1) array give you?
> 
> (1,N) or (N,) ?
The proposal changes nothing for dims > 1, so (1,N). That means that a.T2.T2
doesn't have the same shape as a.
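
To make the shapes concrete, here is a small sketch that uses np.atleast_2d(a).T as a stand-in for the proposed a.T2 (the attribute itself does not exist in NumPy, so nothing below calls it):

```python
import numpy as np

a = np.arange(3)                    # shape (3,)

once = np.atleast_2d(a).T           # stand-in for a.T2
twice = np.atleast_2d(once).T       # stand-in for a.T2.T2

print(a.shape)      # (3,)
print(once.shape)   # (3, 1)
print(twice.shape)  # (1, 3)
```

Applying the operation twice gives (1, 3) rather than the original (3,), which is the asymmetry described above.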

It boils down to practicality vs purity, as is often the case!




[Numpy-discussion] [ANN] Summer School "Advanced Scientific Programming in Python" in Reading, UK, September 5—11, 2016

2016-04-07 Thread Tiziano Zito
Advanced Scientific Programming in Python
=========================================
a Summer School by the G-Node, and the Centre for Integrative Neuroscience and 
Neurodynamics, School of Psychology and Clinical Language Sciences, University 
of Reading, UK

Scientists spend more and more time writing, maintaining, and debugging 
software. While techniques for doing this efficiently have evolved, only few 
scientists have been trained to use them. As a result, instead of doing their 
research, they spend far too much time writing deficient code and reinventing 
the wheel. In this course we will present a selection of advanced programming 
techniques and best practices which are standard in the industry, but 
especially tailored to the needs of a programming scientist. Lectures are 
devised to be interactive and to give the students enough time to acquire 
direct hands-on experience with the materials. Students will work in pairs 
throughout the school and will team up to practice the newly learned skills in 
a real programming project — an entertaining computer game.

We use the Python programming language for the entire course. Python works as a 
simple programming language for beginners, but more importantly, it also works 
great in scientific simulations and data analysis. We show how clean language 
design, ease of extensibility, and the great wealth of open source libraries 
for scientific computing and data visualization are driving Python to become a 
standard tool for the programming scientist.

This school is targeted at Master or PhD students and Post-docs from all areas 
of science. Competence in Python or in another language such as Java, C/C++, 
MATLAB, or Mathematica is absolutely required. Basic knowledge of Python and of 
a version control system such as git, subversion, mercurial, or bazaar is 
assumed. Participants without any prior experience with Python and/or git 
should work through the proposed introductory material before the course.

We are striving hard to get a pool of students which is international and 
gender-balanced.

You can apply online: https://python.g-node.org
Application deadline: 23:59 UTC, May 15, 2016.
Be sure to read the FAQ before applying. 

Participation is free, i.e. no fee is charged! Participants, however, should 
cover travel, living, and accommodation expenses themselves. Travel 
grants may be available.

Date & Location
===============
September 5—11, 2016. Reading, UK

Program
=======
- Best Programming Practices
  • Best practices for scientific programming
  • Version control with git and how to contribute to open source projects with 
GitHub
  • Best practices in data visualization

- Software Carpentry
  • Test-driven development
  • Debugging with a debugger
  • Profiling code

- Scientific Tools for Python
  • Advanced NumPy

- Advanced Python
  • Decorators
  • Context managers
  • Generators

- The Quest for Speed
  • Writing parallel applications
  • Interfacing to C with Cython
  • Memory-bound problems and memory profiling
  • Data containers: storage and fast access to large data

- Practical Software Development
  • Group project

Preliminary Faculty
===================

• Francesc Alted, freelance consultant, author of PyTables, Spain
• Pietro Berkes, Enthought Inc., Cambridge, UK
• Zbigniew Jędrzejewski-Szmek, Krasnow Institute, George Mason University, 
Fairfax, VA, USA
• Eilif Muller, Blue Brain Project, École Polytechnique Fédérale de Lausanne, 
Switzerland
• Juan Nunez-Iglesias, Victorian Life Sciences Computation Initiative, 
University of Melbourne, Australia
• Rike-Benjamin Schuppner, Institute for Theoretical Biology, 
Humboldt-Universität zu Berlin, Germany
• Bartosz Teleńczuk, European Institute for Theoretical Neuroscience, CNRS, 
Paris, France
• Stéfan van der Walt, Berkeley Institute for Data Science, UC Berkeley, CA, USA
• Nelle Varoquaux, Centre for Computational Biology Mines ParisTech, Institut 
Curie, U900 INSERM, Paris, France
• Tiziano Zito, freelance consultant, Germany

Organizers
==========
For the German Neuroinformatics Node of the INCF (G-Node) Germany:
• Tiziano Zito, freelance consultant, Germany
• Zbigniew Jędrzejewski-Szmek, Krasnow Institute, George Mason University, 
Fairfax, USA
• Jakob Jordan, Institute of Neuroscience and Medicine (INM-6), 
Forschungszentrum Jülich GmbH, Germany

For the Centre for Integrative Neuroscience and Neurodynamics, School of 
Psychology and Clinical Language Sciences, University of Reading UK:
• Etienne Roesch, Centre for Integrative Neuroscience and Neurodynamics, 
University of Reading, UK

Website: https://python.g-node.org
Contact: python-i...@g-node.org


Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread Joseph Martinot-Lagarde
Alan Isaac writes:

> But underlying the proposal is apparently the
> idea that there be an attribute equivalent to
> `atleast_2d`.  Then call it `d2p`.
> You can now have `a.d2p.T` which is a lot
> more explicit and general than say `a.T2`,
> while requiring only 3 more keystrokes.


How about a.T2d or a.T2D?
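
For readers trying to picture the attribute being named, here is a minimal sketch of the quoted `d2p` idea as a property on an ndarray subclass. The class name `Arr` is made up for illustration, and NumPy itself has no `d2p`, `T2`, `T2d` or `T2D` attribute:

```python
import numpy as np

class Arr(np.ndarray):
    """Hypothetical subclass, only to illustrate the quoted `d2p` idea."""

    @property
    def d2p(self):
        # Promote to at least 2D; arrays that are already 2D or more are unchanged.
        return np.atleast_2d(self)

a = np.arange(3).view(Arr)
m = np.arange(6).reshape(2, 3).view(Arr)

print(a.d2p.shape)    # (1, 3)
print(a.d2p.T.shape)  # (3, 1)
print(m.d2p.T.shape)  # (3, 2)
```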



Re: [Numpy-discussion] ndarray.T2 for 2D transpose

2016-04-07 Thread Irvin Probst

On 06/04/2016 04:11, Todd wrote:


When you try to transpose a 1D array, it does nothing.  This is the 
correct behavior, since transposing a 1D array is meaningless.  
However, this can often lead to unexpected errors since this is rarely 
what you want.  You can convert the array to 2D, using `np.atleast_2d` 
or `arr[None]`, but this makes simple linear algebra computations more 
difficult.
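
The behaviour described above can be checked with existing NumPy calls only; the array `x` is just an example:

```python
import numpy as np

x = np.array([1, 2, 3])

# Transposing a 1D array is a no-op: the shape stays (3,).
print(x.T.shape)                 # (3,)

# Two existing ways to get a 2D "row" first.
print(np.atleast_2d(x).shape)    # (1, 3)
print(x[None].shape)             # (1, 3)

# Only then does .T produce a "column".
print(np.atleast_2d(x).T.shape)  # (3, 1)
```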


I propose adding an argument to transpose, perhaps called `expand` or 
`expanddim`, which if `True` (it is `False` by default) will force the 
array to be at least 2D.  A shortcut property, `ndarray.T2`, would be 
the same as `ndarray.transpose(True)`
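
A minimal sketch of how the proposed flag might behave, written as a free function because ndarray.transpose has no such argument. The name `transpose_expand` and its `expand` parameter are hypothetical and follow the wording of the proposal only:

```python
import numpy as np

def transpose_expand(arr, expand=False):
    """Hypothetical helper mimicking the proposed `expand` argument."""
    arr = np.asarray(arr)
    if expand:
        # Force at least 2D before transposing, as the proposal describes.
        arr = np.atleast_2d(arr)
    return arr.T

x = np.array([1, 2, 3])
m = np.arange(6).reshape(2, 3)

print(transpose_expand(x).shape)               # (3,)
print(transpose_expand(x, expand=True).shape)  # (3, 1)
print(transpose_expand(m, expand=True).shape)  # (3, 2)
```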



Hello,
My two cents here: I've seen hundreds of people (literally hundreds) 
stumble on this .T trick with 1D vectors when they were trying to do 
some linear algebra with numpy, so at first I had the same feeling as 
you. But the real issue was that *all* these people were coming from 
MATLAB and expected numpy to behave the same way. Once the logic behind 
1D vectors was explained, it made sense to most of them and there were no 
more problems.


And by the way, I don't see any way to tell a 1D "row vector" apart from 
a 1D "column vector". Think of code mixing an Rn=>R Jacobian matrix and 
some data meant to be used as measurements in a linear system, so we 
have J=np.array([1,2,3,4]) and B=np.array([5,6,7,8]). What would the 
output of J.T2 and B.T2 be?


I think it's much better to get used to writing 
J=np.array([1,2,3,4]).reshape(1,4) and 
B=np.array([5,6,7,8]).reshape(4,1). Then you can use .T and @ without 
any verbosity, and at least it forces users (read "my students" here) to 
think twice before writing some linear algebra nonsense.
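
Spelled out with the J and B from above, the explicit-shape approach looks roughly like this (the @ operator assumes Python 3.5+ and NumPy 1.10+):

```python
import numpy as np

J = np.array([1, 2, 3, 4]).reshape(1, 4)   # explicit row, shape (1, 4)
B = np.array([5, 6, 7, 8]).reshape(4, 1)   # explicit column, shape (4, 1)

print((J @ B).shape)   # (1, 1)
print(J @ B)           # [[70]]
print((B @ J).shape)   # (4, 4)
print(J.T.shape)       # (4, 1)

# A shape mistake now fails loudly instead of silently "working".
try:
    J @ J
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)  # True
```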


Regards.