Re: [Numpy-discussion] GSoC project: draft of proposal

2014-03-14 Thread Leo Mao
On Fri, Mar 14, 2014 at 1:43 AM, alex argri...@ncsu.edu wrote:

 I think everyone who wants fast numpy linalg already connects to
 something like OpenBLAS or MKL.  When these are not available, numpy
 uses its own lapack-lite which is way slower.  I don't think you are
 going to beat OpenBLAS, so are you suggesting to speed up the slow
 default lapack-lite, or are you proposing something else?


I think most CPUs nowadays support instructions like SSE2, AVX etc.,
so maybe numpy can use OpenBLAS (or something else) by default?
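
As a quick way to see which backend a given numpy install actually picked up, `numpy.__config__` records the build-time BLAS/LAPACK information (a sketch; note this reports what numpy was compiled against, not runtime CPU features):

```python
import numpy as np

# An optimized build lists openblas/mkl entries with library paths;
# when nothing is listed, numpy fell back to the slow bundled lapack_lite.
np.__config__.show()
```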
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] python array

2014-03-14 Thread Sudheer Joseph
Thank you, Olsen,
My objective was to find out how many values
fall under different ranges, i.e., find RMS < .5, then RMS between .5
and .8, etc. If there is a specific python way of handling masks and making
boolean operations without any doubt, that is what I was looking for. The data I am
using has a mask, and I wanted to tell python not to consider the masked
values and masked grid points while doing the calculation (a percentage is
calculated afterwards using the number of grid points). I will try in detail
the example you sent and see how python handles this.

with best regards,
Sudheer

***
Sudheer Joseph 
Indian National Centre for Ocean Information Services
Ministry of Earth Sciences, Govt. of India
POST BOX NO: 21, IDA Jeedeemetla P.O.
Via Pragathi Nagar,Kukatpally, Hyderabad; Pin:5000 55
Tel:+91-40-23886047(O),Fax:+91-40-23895011(O),
Tel:+91-40-23044600(R),Tel:+91-40-9440832534(Mobile)
E-mail:sjo.in...@gmail.com;sudheer.jos...@yahoo.com
Web- http://oppamthadathil.tripod.com
***


On Fri, 14/3/14, Brett Olsen brett.ol...@gmail.com wrote:

 Subject: Re: [Numpy-discussion] python array
 To: Discussion of Numerical Python numpy-discussion@scipy.org
 Date: Friday, 14 March, 2014, 2:07 AM
 
 The difference appears
 to be that the boolean selection pulls out all data values
 <= 0.5 whether or not they are masked, and then carries
 over the appropriate masks to the new array.  So r2010 and
 bt contain identical unmasked values but different numbers
 of masked values.  Because the initial fill value for your
 masked values was a large negative number, in r2010 those
 masked values are carried over.  In bt, you've taken
 the absolute value of the data array, so those fill values
 are now positive and they are no longer carried over into
 the indexed array.
 
 Because the final arrays are still masked, you
 are observing no difference in the statistical properties of
 the arrays, only their sizes, because one contains many more
 masked values than the other.  I don't think this
 should be a problem for your computations. If you're
 concerned, you could always explicitly demask them before
 your computations.  See the example problem below.
 
 ~Brett
 In [61]: import numpy as np

 In [62]: import numpy.ma as ma

 In [65]: a = np.arange(-8, 8).reshape((4, 4))

 In [66]: a
 Out[66]:
 array([[-8, -7, -6, -5],
        [-4, -3, -2, -1],
        [ 0,  1,  2,  3],
        [ 4,  5,  6,  7]])

 In [68]: b = ma.masked_array(a, mask=a < 0)

 In [69]: b
 Out[69]:
 masked_array(data =
  [[-- -- -- --]
  [-- -- -- --]
  [0 1 2 3]
  [4 5 6 7]],
              mask =
  [[ True  True  True  True]
  [ True  True  True  True]
  [False False False False]
  [False False False False]],
        fill_value = 999999)

 In [70]: b.data
 Out[70]:
 array([[-8, -7, -6, -5],
        [-4, -3, -2, -1],
        [ 0,  1,  2,  3],
        [ 4,  5,  6,  7]])

 In [71]: c = abs(b)

 In [72]: c[c <= 4].shape
 Out[72]: (9L,)

 In [73]: b[b <= 4].shape
 Out[73]: (13L,)

 In [74]: b[b <= 4]
 Out[74]:
 masked_array(data = [-- -- -- -- -- -- -- -- 0 1 2 3 4],
              mask = [ True  True  True  True  True  True  True  True
  False False False False False],
        fill_value = 999999)

 In [75]: c[c <= 4]
 Out[75]:
 masked_array(data = [-- -- -- -- 0 1 2 3 4],
              mask = [ True  True  True  True False False False False False],
        fill_value = 999999)
 
 
 On Thu, Mar 13, 2014
 at 8:14 PM, Sudheer Joseph sudheer.jos...@yahoo.com
 wrote:
 
 Sorry,
 
            The solution below, which I thought was working, was not
 working but was just giving the array size.
 
 
 
 
 
 On Fri, 14/3/14, Sudheer Joseph
 sudheer.jos...@yahoo.com
 wrote:
 
 
 
  Subject: Re: [Numpy-discussion] python array
 
  To: Discussion of Numerical Python numpy-discussion@scipy.org
 
  Date: Friday, 14 March, 2014, 1:09 AM
 
 
 
  Thank you very much Nicolas and
  Chris,

  The hint was helpful and from that I tried the steps below (a crude
  way, I would say) and I am getting the same result now.

  I have been using the abs available by default and it is the
  same as numpy.absolute (I checked).

  nr = ((r2010 > r2010.min()) & (r2010 < r2010.max()))
  nr[nr < .5].shape
  Out[25]: (33868,)
  anr = numpy.absolute(nr)
  anr[anr < .5].shape
  Out[27]: (33868,)

  The way I used may have problems when the mask used has
  values which can affect the min/max operation.

  So I would like to know if there is a standard, formal
  (python/numpy) way to handle masked arrays when they need
  to be subjected to boolean operations.

  with best regards,
  Sudheer
 
 
 
 
 

Re: [Numpy-discussion] python array

2014-03-14 Thread Sudheer Joseph
Dear Olsen,
 
I had a detailed look at the example you sent, and the points I got are below:

a = np.arange(-8, 8).reshape((4, 4))
b = ma.masked_array(a, mask=a < 0)

In [33]: b[b < 4]
Out[33]:
masked_array(data = [-- -- -- -- -- -- -- -- 0 1 2 3],
 mask = [ True  True  True  True  True  True  True  True False 
False False False],
   fill_value = 999999)
In [34]: b[b < 4].shape
Out[34]: (12,)
In [35]: b[b < 4].data
Out[35]: array([-8, -7, -6, -5, -4, -3, -2, -1,  0,  1,  2,  3])

This shows that while numpy can do the boolean operation and list the data meeting 
the criteria (by masking the data further), it does not actually let us get the 
count of data that meets the criteria. I was interested in the count, because my 
objective was to find out how many numbers in the grid fall under different 
categories (<= 4, between 4 and 8, between 8 and 10, etc.) and find their 
percentages.

 Is there a way to get the counts correctly? That is my concern now!

with best regards,
Sudheer
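
One detail worth noting here (an editorial addition, not from the thread): the masked selection itself can already report how many of its entries are unmasked, via `MaskedArray.count()`:

```python
import numpy as np
import numpy.ma as ma

a = np.arange(-8, 8).reshape((4, 4))
b = ma.masked_array(a, mask=a < 0)

# The boolean selection keeps masked entries, so .shape counts them too,
# while .count() reports only the unmasked values meeting the criterion.
sel = b[b < 4]
print(sel.shape)    # (12,)
print(sel.count())  # 4  -> the unmasked values 0, 1, 2, 3
```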










Re: [Numpy-discussion] python array

2014-03-14 Thread Eric Firing
On 2014/03/13 9:09 PM, Sudheer Joseph wrote:

   Is there a way to get the counts correctly? That is my concern now!

Certainly.  If all you need are statistics of the type you describe, 
where you are working with a 1-D array, then extract the unmasked values 
into an ordinary ndarray, and work with that:

a = np.random.randn(100)
am = np.ma.masked_less(a, -0.2)
print am.count()  # number of unmasked values
a_nomask = am.compressed()
print type(a_nomask)
print a_nomask.shape

# number of points with value less than 0.5:
print (a_nomask < 0.5).sum()
# (Boolean True is 1)

# Or if you want the actual array of values, not just the count:
a_nomask[a_nomask < 0.5]

Eric
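
Extending Eric's approach to the per-range percentages asked about above, the compressed values can be binned in one call with `np.histogram` (a sketch; the bin edges and the fixed seed are editorial choices, not from the thread):

```python
import numpy as np

np.random.seed(0)                    # fixed seed so the example is repeatable
a = np.random.randn(100)
am = np.ma.masked_less(a, -0.2)      # mask everything below -0.2
vals = am.compressed()               # plain ndarray of the unmasked values

# Count unmasked values per range, then express each bin as a
# percentage of the unmasked total.
counts, edges = np.histogram(vals, bins=[-0.2, 0.5, 0.8, 10.0])
percent = 100.0 * counts / vals.size
print(counts, percent)
```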




 with best regards,
 Sudheer



Re: [Numpy-discussion] python array

2014-03-14 Thread Sudheer Joseph
Thank you, Eric.
  compressed() is the option which gets the correct 
numbers.
   
a = np.arange(-8, 8).reshape((4, 4))
In [67]: b = ma.masked_array(a, mask=a < 0)
In [68]: bb = b.compressed()
In [69]: b[b < 4].size
Out[69]: 12
In [70]: bb = b.compressed()
In [71]: bb[bb <= 4].size
Out[71]: 5

with best regards,
Sudheer

 




Re: [Numpy-discussion] GSoC project: draft of proposal

2014-03-14 Thread Gregor Thalhammer

On 13.03.2014 at 18:35, Leo Mao lmao20...@gmail.com wrote:

 Hi,
 
 Thanks a lot for your advice, Chuck.
 Following your advice, I have modified my draft of proposal. (attachment)
 I think it still needs more comments so that I can make it better.
 
 And I found that maybe I can also make some functions related to linalg (like 
 dot, svd or something else) faster by integrating a proper library into numpy.
 
 Regards,
 Leo Mao
 
Dear Leo,

large parts of your proposal are covered by the uvml package 
https://github.com/geggo/uvml
In my opinion you should also consider Intel's VML (part of MKL) as a candidate. 
(Yes, I know, it is not free.) To the best of my knowledge it provides many more 
vectorized functions than the open source alternatives. 
Concerning your time table: once you have implemented support for one function, 
adding more functions is very easy. 

Gregor
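
To get a feel for the headroom such a library targets, one can time a transcendental ufunc on a large array; numpy's generic element-wise loop is exactly what a VML/Yeppp-style backend would replace (a sketch; absolute numbers depend entirely on the machine and backend):

```python
import timeit
import numpy as np

x = np.random.rand(1000000)

# np.exp goes through numpy's generic per-element loop; vectorized math
# libraries batch the same computation with SIMD across the whole array.
t = timeit.timeit(lambda: np.exp(x), number=20)
print("np.exp on 1e6 doubles: %.2f ms per call" % (1000.0 * t / 20))
```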



Re: [Numpy-discussion] GSoC project: draft of proposal

2014-03-14 Thread Eric Moore
On Friday, March 14, 2014, Gregor Thalhammer gregor.thalham...@gmail.com
wrote:




I'm not sure that your week-old project is enough to discourage this GSoC
project. In particular, it would be nice to be able to ship this directly
as part of numpy, and that won't really be possible with MKL.

Eric




Re: [Numpy-discussion] GSoC project: draft of proposal

2014-03-14 Thread Gregor Thalhammer

On 14.03.2014 at 11:00, Eric Moore e...@redtetrahedron.org wrote:

 
 

Hi,

it's not at all my intention to discourage this project. I hope Leo Mao can use 
the uvml package as a starting point for further improvements. Since most 
vectorized math libraries share a very similar interface, I think the actual 
choice of the library could be made a configurable option. Adapting uvml to use 
e.g. Yeppp instead of MKL should be straightforward. Similar to numpy or scipy 
built with MKL LAPACK and distributed by Enthought or Christoph Gohlke, using 
MKL should not be ruled out completely.

Gregor






Re: [Numpy-discussion] GSoC project: draft of proposal

2014-03-14 Thread Frédéric Bastien
Just a comment: supporting a library that is BSD 3-clause could greatly
reduce the compilation problems like the ones we have with BLAS.
We could just include it in numpy/download it automatically, or
whatever makes the install trivial, and then we could assume all
users have it. Dealing with BLAS is already not fun; if a new
dependency were trivial to link to, it would be great.

Fred



[Numpy-discussion] ANN: HDF5 for Python 2.3.0 BETA

2014-03-14 Thread Andrew Collette
Announcing HDF5 for Python (h5py) 2.3.0 BETA


The h5py team is happy to announce the availability of h5py 2.3.0 beta. This
beta release will be available for approximately two weeks.

What's h5py?


The h5py package is a Pythonic interface to the HDF5 binary data format.

It lets you store huge amounts of numerical data, and easily manipulate
that data from NumPy. For example, you can slice into multi-terabyte
datasets stored on disk, as if they were real NumPy arrays. Thousands of
datasets can be stored in a single file, categorized and tagged however
you want.

Changes
---

This release introduces some important new features, including:

* Support for arbitrary vlen data
* Improved exception messages
* Improved setuptools support
* Multiple additions to the low-level API
* Improved support for MPI features
* Single-step build for HDF5 on Windows

A complete description of changes is available online:

http://docs.h5py.org/en/latest/whatsnew/2.3.html

Where to get it
---

Downloads, documentation, and more are available at the h5py website:

http://www.h5py.org

Acknowledgements


The h5py package relies on third-party testing and contributions.  For the
2.3 release, thanks especially to:

* Martin Teichmann
* Florian Rathgerber
* Pierre de Buyl
* Thomas Caswell
* Andy Salnikov
* Darren Dale
* Robert David Grant
* Toon Verstraelen
* Many others who contributed bug reports and testing


Re: [Numpy-discussion] GSoC project: draft of proposal

2014-03-14 Thread Leo Mao
Hi everyone,

Thanks for your replies!
I think Gregor's uvml package is really a good starting point for me.

 I think the actual choice of the library could be made a configurable
 option.


Sounds like a good idea. If the library interfaces are very similar, maybe I
can implement bindings for multiple libraries?
A potential issue is that some libraries may lack some functions.
For example, Yeppp is a good candidate, since it provides pre-built
libraries on many platforms and its API is pretty clear.
But Yeppp lacks some functions, like the inverse trigonometric functions.
Intel's VML provides many more functions, but sadly it is not free.

I found another library called Vc, which looks like a potential candidate
for this project:
http://code.compeng.uni-frankfurt.de/projects/vc
I haven't dug into it yet, so I'm not sure if it provides what we want.

 supporting a library that is bsd 3 clauses could help
 to higly reduce the compilation problem like what we have with blas.


Yeppp is BSD 3-clause, so I think Yeppp is really a good choice.
Is there a list of licenses which can be added into numpy without pain?
(how about LGPL3?)

Regards,
Leo Mao




Re: [Numpy-discussion] GSoC project: draft of proposal

2014-03-14 Thread Robert Kern
On Fri, Mar 14, 2014 at 4:33 PM, Leo Mao lmao20...@gmail.com wrote:

 Yeppp is BSD 3-clause, so I think Yeppp is really a good choice.
 Is there a list of licenses which can be added into numpy without pain? (how
 about LGPL3?)

No, just BSD and its rough equivalents like the Expat license.

-- 
Robert Kern


[Numpy-discussion] It looks like Py 3.5 will include a dedicated infix matrix multiply operator

2014-03-14 Thread Nathaniel Smith
Well, that was fast. Guido says he'll accept the addition of '@' as an
infix operator for matrix multiplication, once some details are ironed
out:
  https://mail.python.org/pipermail/python-ideas/2014-March/027109.html
  http://legacy.python.org/dev/peps/pep-0465/

Specifically, we need to figure out whether we want to make an
argument for a matrix power operator (@@), and what
precedence/associativity we want '@' to have. I'll post two separate
threads to get feedback on those in an organized way -- this is just a
heads-up.

-n

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
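
For the record, the semantics being proposed: `a @ b` will dispatch to `a.__matmul__(b)`, which for 2-D numpy arrays is expected to give the same result as `np.dot` (a sketch of the intended behaviour, runnable once a PEP 465 Python ships):

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# PEP 465: the new infix operator dispatches to __matmul__; for 2-D
# ndarrays the result is the familiar matrix product.
c = a @ b
assert (c == np.dot(a, b)).all()  # values [[19, 22], [43, 50]]
```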


Re: [Numpy-discussion] It looks like Py 3.5 will include a dedicated infix matrix multiply operator

2014-03-14 Thread Aron Ahmadia
That's the best news I've had all week.

Thanks for all your work on this, Nathaniel.

-A




Re: [Numpy-discussion] It looks like Py 3.5 will include a dedicated infix matrix multiply operator

2014-03-14 Thread Frédéric Bastien
This is great news. Excellent work Nathaniel and all others!

Frédéric



Re: [Numpy-discussion] It looks like Py 3.5 will include a dedicated infix matrix multiply operator

2014-03-14 Thread Christophe Bal
This is good for Numpyists, but it is also another operator that could
help in other contexts.

As a math user, I was very skeptical at first, but in the end this is good
news for non-Numpyists too.

Christophe BAL
On 15 March 2014 at 02:01, Frédéric Bastien no...@nouiz.org wrote:

 This is great news. Excellent work Nathaniel and all others!

 Frédéric


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] It looks like Py 3.5 will include a dedicated infix matrix multiply operator

2014-03-14 Thread Chris Laumann
That’s great. 

Does this mean that, in the not-so-distant future, the matrix class will go the 
way of the dodos? I have had more subtle-to-fix bugs sneak into code b/c 
something returns a matrix instead of an array than almost any other single 
source I can think of. Having two almost indistinguishable types for 2d arrays 
with slightly different semantics for a small subset of operations is terrible.

Best, C


-- 
Chris Laumann
Sent with Airmail

On March 14, 2014 at 7:16:24 PM, Christophe Bal (projet...@gmail.com) wrote:

This is good for Numpyists, but it is also another operator that could
help in other contexts.

As a math user, I was very skeptical at first, but in the end this is good
news for non-Numpyists too.

Christophe BAL

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] [help needed] associativity and precedence of '@'

2014-03-14 Thread Nathaniel Smith
Hi all,

Here's the main blocker for adding a matrix multiply operator '@' to
Python: we need to decide what we think its precedence and associativity
should be. I'll explain what that means so we're on the same page, and what
the choices are, and then we can all argue about it. But even better would
be if we could get some data to guide our decision, and this would be a lot
easier if some of you all can help; I'll suggest some ways you might be
able to do that.

So! Precedence and left- versus right-associativity. If you already know
what these are you can skim down until you see CAPITAL LETTERS.

We all know what precedence is. Code like this:
  a + b * c
gets evaluated as:
  a + (b * c)
because * has higher precedence than +. It binds more tightly, as they
say. Python's complete precedence table is here:
  http://docs.python.org/3/reference/expressions.html#operator-precedence

Associativity, in the parsing sense, is less well known, though it's just
as important. It's about deciding how to evaluate code like this:
  a * b * c
Do we use
  a * (b * c)    # * is right associative
or
  (a * b) * c    # * is left associative
? Here all the operators have the same precedence (because, uh... they're
the same operator), so precedence doesn't help. And mostly we can ignore
this in day-to-day life, because both versions give the same answer, so who
cares. But a programming language has to pick one (consider what happens if
one of those objects has a non-default __mul__ implementation). And of
course it matters a lot for non-associative operations like
  a - b - c
or
  a / b / c
So when figuring out order of evaluations, what you do first is check the
precedence, and then if you have multiple operators next to each other with
the same precedence, you check their associativity. Notice that this means
that if you have different operators that share the same precedence level
(like + and -, or * and /), then they have to all have the same
associativity. All else being equal, it's generally considered nice to have
fewer precedence levels, because these have to be memorized by users.

Right now in Python, every precedence level is left-associative, except for
'**'. If you write these formulas without any parentheses, then what the
interpreter will actually execute is:
  (a * b) * c
  (a - b) - c
  (a / b) / c
but
  a ** (b ** c)
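As a quick sanity check, these rules are easy to verify empirically; here's a
sketch (the class and the log list are made up for illustration) that records
which operator call Python performs first:

```python
# Sketch: log which operator call Python performs first.
# The class "Op" and the `log` list are illustrative only.
log = []

class Op:
    def __init__(self, name):
        self.name = name

    def __mul__(self, other):
        log.append("(%s * %s)" % (self.name, other.name))
        return Op("(%s*%s)" % (self.name, other.name))

    def __pow__(self, other):
        log.append("(%s ** %s)" % (self.name, other.name))
        return Op("(%s**%s)" % (self.name, other.name))

a, b, c = Op("a"), Op("b"), Op("c")

log.clear()
a * b * c           # left associative: a.__mul__(b) runs first
assert log[0] == "(a * b)"

log.clear()
a ** b ** c         # right associative: b.__pow__(c) runs first
assert log[0] == "(b ** c)"
```

The same trick works for - and /, which is why a - b - c means (a - b) - c.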

Okay, that's the background. Here's the question. We need to decide on
precedence and associativity for '@'. In particular, there are three
different options that are interesting:

OPTION 1 FOR @:
Precedence: same as *
Associativity: left
My shorthand name for it: same-left (yes, very creative)

This means that if you don't use parentheses, you get:
   a @ b @ c  ->  (a @ b) @ c
   a * b @ c  ->  (a * b) @ c
   a @ b * c  ->  (a @ b) * c

OPTION 2 FOR @:
Precedence: more-weakly-binding than *
Associativity: right
My shorthand name for it: weak-right

This means that if you don't use parentheses, you get:
   a @ b @ c  ->  a @ (b @ c)
   a * b @ c  ->  (a * b) @ c
   a @ b * c  ->  a @ (b * c)

OPTION 3 FOR @:
Precedence: more-tightly-binding than *
Associativity: right
My shorthand name for it: tight-right

This means that if you don't use parentheses, you get:
   a @ b @ c  ->  a @ (b @ c)
   a * b @ c  ->  a * (b @ c)
   a @ b * c  ->  (a @ b) * c

We need to pick which of these options we think is best, based on whatever
reasons we can think of, ideally more than "hmm, weak-right gives me warm
fuzzy feelings" ;-). (In principle the other 2 possible options are
tight-left and weak-left, but there doesn't seem to be any argument in
favor of either, so we'll leave them out of the discussion.)
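The three candidate parses can be checked concretely on any Python that has @
(3.5+, which in the end adopted the same-left option); a small instrumented
class (names made up for illustration) records which call runs first:

```python
# Record which operator call runs first under Python's actual rules for
# @ (same precedence as *, left associative -- the "same-left" option).
calls = []

class T:
    def __init__(self, name):
        self.name = name

    def _bin(self, op, other):
        calls.append("(%s %s %s)" % (self.name, op, other.name))
        return T("(%s%s%s)" % (self.name, op, other.name))

    def __mul__(self, other):
        return self._bin("*", other)

    def __matmul__(self, other):
        return self._bin("@", other)

a, b, c = T("a"), T("b"), T("c")

calls.clear(); a @ b @ c
assert calls[0] == "(a @ b)"    # a @ b @ c  ->  (a @ b) @ c

calls.clear(); a * b @ c
assert calls[0] == "(a * b)"    # a * b @ c  ->  (a * b) @ c

calls.clear(); a @ b * c
assert calls[0] == "(a @ b)"    # a @ b * c  ->  (a @ b) * c
```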

Some things to consider:

* and @ are actually not associative (in the math sense) with respect to
each other, i.e., (a * b) @ c and a * (b @ c) in general give different
results when 'a' is not a scalar. So considering the two expressions 'a * b
@ c' and 'a @ b * c', we can see that each of these three options
produces different results in some cases.

Same-left is the easiest to explain and remember, because it's just "@
acts like * and /". So we already have to know the rule in order to
understand other non-associative expressions like a / b / c or a - b - c,
and it'd be nice if the same rule applied to things like a * b @ c so we
only had to memorize *one* rule. (Of course there's ** which uses the
opposite rule, but I guess everyone internalized that one in secondary
school; that's not true for * versus @.) This is definitely the default we
should choose unless we have a good reason to do otherwise.

BUT: there might indeed be a good reason to do otherwise, which is the
whole reason this has come up. Consider:
Mat1 @ Mat2 @ vec
Obviously this will execute much more quickly if we do
Mat1 @ (Mat2 @ vec)
because that results in two cheap matrix-vector multiplies, while
(Mat1 @ Mat2) @ vec
starts out by doing an expensive matrix-matrix multiply. So: maybe @ should
be right associative, so that we 
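The cost asymmetry behind this argument is easy to quantify with a rough
multiply count (the size below is made up for illustration):

```python
# Rough cost model for the Mat1 @ Mat2 @ vec example: an (m, n) @ (n, p)
# product costs about m*n*p scalar multiplications.
def matmul_cost(m, n, p):
    return m * n * p

n = 1000  # illustrative square-matrix size

# (Mat1 @ Mat2) @ vec: matrix-matrix product first, then matrix-vector
left = matmul_cost(n, n, n) + matmul_cost(n, n, 1)

# Mat1 @ (Mat2 @ vec): two cheap matrix-vector products
right = matmul_cost(n, n, 1) + matmul_cost(n, n, 1)

assert left == 1_001_000_000
assert right == 2_000_000
assert left // right == 500   # roughly n/2 times more work grouping left
```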

Re: [Numpy-discussion] It looks like Py 3.5 will include a dedicated infix matrix multiply operator

2014-03-14 Thread Nathaniel Smith
On Sat, Mar 15, 2014 at 3:18 AM, Chris Laumann chris.laum...@gmail.com wrote:

 That’s great.

 Does this mean that, in the not-so-distant future, the matrix class will go 
 the way of the dodos? I have had more subtle-to-fix bugs sneak into code b/c 
 something returns a matrix instead of an array than almost any other single 
 source I can think of. Having two almost indistinguishable types for 2d 
 arrays with slightly different semantics for a small subset of operations is 
 terrible.

Well, it depends on what your definition of distant is :-). Py 3.5
won't be out for some time (3.*4* is coming out this week). And we'll
still need to sit down and figure out if there's any bits of matrix we
want to save (e.g., maybe create an ndarray version of the parser used
for np.matrix("1 2; 3 4")), come up with a transition plan, have a
long mailing list argument about it, etc. But the goal (IMO) is
definitely to get rid of np.matrix as soon as reasonable given these
considerations, and similarly to find a way to switch scipy.sparse
matrices to a more ndarray-like API. So it'll be a few years at least,
but I think we'll get there.

-n

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] [help needed] associativity and precedence of '@'

2014-03-14 Thread Chris Laumann
Hi all,

Let me preface my two cents by saying that I think the best part of @ being 
accepted is the potential for deprecating the matrix class — the syntactic 
beauty of infix for matrix multiply is a nice side effect IMHO :) This may be 
why my basic attitude is:

I don’t think it matters very much but I would vote (weakly) for weak-right. 
Where there is ambiguity, I suspect most practitioners will just put in 
parentheses anyway — especially with combinations of * and @, where I don’t 
think there is a natural intuitive precedence relationship. At least, 
element-wise multiplication is very rare in math/physics texts as an explicitly 
defined elementary operation so I’d be surprised if anybody had a strong 
intuition about the precedence of the ‘*’ operator. And the binding order 
doesn’t matter if it is scalar multiplication.

I have quite a bit of code with large matrices where the order of matrix-vector 
multiplies is an important optimization and I would certainly have a few 
simpler looking expressions for op @ op @ vec, hence the weak preference for 
right-associativity. That said, I routinely come across situations where the 
optimal matrix multiplication order is more complicated than can be expressed 
as left-right or right-left (because some matrices might be diagonal, CSR or 
CSC), which is why the preference is only weak. I don’t see a down-side in the 
use-case that it is actually associative (as in matrix-matrix-vector). 

Best, Chris



-- 
Chris Laumann
Sent with Airmail


Re: [Numpy-discussion] It looks like Py 3.5 will include a dedicated infix matrix multiply operator

2014-03-14 Thread Travis Oliphant
Congratulations Nathaniel!

This is great news!

Well done on starting the process and taking things forward.

Travis
On Mar 14, 2014 7:51 PM, Nathaniel Smith n...@pobox.com wrote:

 Well, that was fast. Guido says he'll accept the addition of '@' as an
 infix operator for matrix multiplication, once some details are ironed
 out:
   https://mail.python.org/pipermail/python-ideas/2014-March/027109.html
   http://legacy.python.org/dev/peps/pep-0465/

 Specifically, we need to figure out whether we want to make an
 argument for a matrix power operator (@@), and what
 precedence/associativity we want '@' to have. I'll post two separate
 threads to get feedback on those in an organized way -- this is just a
 heads-up.

 -n

 --
 Nathaniel J. Smith
 Postdoctoral researcher - Informatics - University of Edinburgh
 http://vorpus.org

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] [RFC] should we argue for a matrix power operator, @@?

2014-03-14 Thread Nathaniel Smith
Hi all,

Here's the second thread for discussion about Guido's concerns about
PEP 465. The issue here is that PEP 465 as currently written proposes
two new operators, @ for matrix multiplication and @@ for matrix power
(analogous to * and **):
  http://legacy.python.org/dev/peps/pep-0465/

The main thing we care about of course is @; I pushed for including @@
because I thought it was nicer to have than not, and I thought the
analogy between * and ** might make the overall package more appealing
to Guido's aesthetic sense.

It turns out I was wrong :-). Guido is -0 on @@, but willing to be
swayed if we think it's worth the trouble to make a solid case.

Note that the question now is *not* how @@ will affect the reception of
@. @ itself is AFAICT a done deal, regardless of what happens with @@.
For this discussion let's assume @ can be taken for granted, and that
we can freely choose to either add @@ or not add @@ to the language.
The question is: which do we think makes Python a better language (for
us and in general)?

Some thoughts to start us off:

Here are the interesting use cases for @@ that I can think of:
- 'vector @@ 2' gives the squared Euclidean length (because it's the
same as vector @ vector). Kind of handy.
- 'matrix @@ n' of course gives the matrix power, which is of marginal
use but does come in handy sometimes, e.g., when looking at graph
connectivity.
- 'matrix @@ -1' provides a very transparent notation for translating
textbook formulas (with all their inverses) into code. It's a bit
unhelpful in practice, because (a) usually you should use solve(), and
(b) 'matrix @@ -1' is actually more characters than 'inv(matrix)'. But
sometimes transparent notation may be important. (And in some cases,
like using numba or theano or whatever, 'matrix @@ -1 @ foo' could be
compiled into a call to solve() anyway.)

(Did I miss any?)
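For comparison, here is how each use case is spelled today with existing NumPy
calls (np.dot standing in for @, and matrix_power / inv / solve standing in
for @@; the arrays are made up for illustration):

```python
import numpy as np

# 'vector @@ 2' -- squared Euclidean length, same as vector @ vector
vec = np.array([3.0, 4.0])
assert np.dot(vec, vec) == 25.0

# 'matrix @@ n' -- e.g. counting length-2 walks in a graph
adj = np.array([[0, 1],
                [1, 0]])
walks2 = np.linalg.matrix_power(adj, 2)
assert walks2[0, 0] == 1   # one length-2 walk from node 0 back to itself

# 'matrix @@ -1 @ b' -- usually better spelled as solve()
a = np.array([[2.0, 0.0],
              [0.0, 4.0]])
b = np.array([1.0, 1.0])
x_inv = np.dot(np.linalg.inv(a), b)
x_solve = np.linalg.solve(a, b)
assert np.allclose(x_inv, x_solve)
assert np.allclose(np.dot(a, x_solve), b)
```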

In practice it seems to me that the last use case is the one that's
might matter a lot practice, but then again, it might not -- I'm not
sure. For example, does anyone who teaches programming with numpy have
a feeling about whether the existence of '@@ -1' would make a big
difference to you and your students? (Alan? I know you were worried
about losing the .I attribute on matrices if switching to ndarrays for
teaching -- given that ndarray will probably not get a .I attribute,
how much would the existence of @@ -1 affect you?)

On a more technical level, Guido is worried about how @@'s precedence
should work (and this is somewhat related to the other thread about
@'s precedence and associativity, because he feels that if we end up
giving @ and * different precedence, then that makes it much less
clear what to do with @@, and reduces the strength of the */**/@/@@
analogy). In particular, if we want to argue for @@ then we'll need to
figure out what expressions like
   a @@ b @@ c
and
   a ** b @@ c
and
   a @@ b ** c
should do.

A related question is what @@ should do if given an array as its right
argument. In the current PEP, only integers are accepted, which rules
out a bunch of the more complicated cases like a @@ b @@ c (at least
assuming @@ is right-associative, like **, and I can't see why you'd
want anything else). OTOH, in the brave new gufunc world, it
technically would make sense to define @@ as being a gufunc with
signature (m,m),()->(m,m), and the way gufuncs work this *would* allow
the power to be an array -- for example, we'd have:

   mat = randn(m, m)
   pow = range(n)
   result = gufunc_matrix_power(mat, pow)
   assert result.shape == (n, m, m)
   for i in xrange(n):
   assert np.all(result[i, :, :] == mat ** i)

In this case, a @@ b @@ c would at least be a meaningful expression to
write. OTOH it would be incredibly bizarre and useless, so probably
no-one would ever write it.
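For concreteness, the hypothetical gufunc's behaviour can be emulated with
existing NumPy pieces (gufunc_matrix_power itself does not exist;
np.linalg.matrix_power supplies the per-exponent matrix power):

```python
import numpy as np

def stacked_matrix_power(mat, pows):
    # Emulate the hypothetical (m,m),()->(m,m) gufunc: one matrix power
    # per entry of `pows`, stacked along a new leading axis.
    return np.stack([np.linalg.matrix_power(mat, int(p)) for p in pows])

rng = np.random.default_rng(0)
m, n = 3, 4
mat = rng.standard_normal((m, m))
result = stacked_matrix_power(mat, range(n))

assert result.shape == (n, m, m)
assert np.allclose(result[0], np.eye(m))   # power 0 gives the identity
assert np.allclose(result[2], mat @ mat)   # power 2 gives mat @ mat
```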

As far as these technical issues go, my guess is that the correct rule
is that @@ should just have the same precedence and the same (right)
associativity as **, and in practice no-one will ever write stuff like
a @@ b @@ c. But if we want to argue for @@ we need to come to some
consensus or another here.

It's also possible the answer is ugh, these issues are too
complicated, we should defer this until later when we have more
experience with @ and gufuncs and stuff. After all, I doubt anyone
else will swoop in and steal @@ to mean something else! OTOH, if e.g.
there's a strong feeling that '@@ -1' will make a big difference in
pedagogical contexts, then putting that off for years might be a
mistake.

-n

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] [RFC] should we argue for a matrix power operator, @@?

2014-03-14 Thread Jaime Fernández del Río
On Fri, Mar 14, 2014 at 9:32 PM, Nathaniel Smith n...@pobox.com wrote:


 Here are the interesting use cases for @@ that I can think of:
 - 'vector @@ 2' gives the squared Euclidean length (because it's the
 same as vector @ vector). Kind of handy.
 - 'matrix @@ n' of course gives the matrix power, which is of marginal
 use but does come in handy sometimes, e.g., when looking at graph
 connectivity.
 - 'matrix @@ -1' provides a very transparent notation for translating
 textbook formulas (with all their inverses) into code. It's a bit
 unhelpful in practice, because (a) usually you should use solve(), and
 (b) 'matrix @@ -1' is actually more characters than 'inv(matrix)'. But
 sometimes transparent notation may be important. (And in some cases,
 like using numba or theano or whatever, 'matrix @@ -1 @ foo' could be
 compiled into a call to solve() anyway.)

 (Did I miss any?)


I'm not really arguing for it, and I am not sure how, or even if, it fits
in the general scheme. But for completeness sake, 'e @@ Matrix' is used in
some treatments of linear systems of differential equations, where:

dvector/dt = matrix @ vector

would have solution

vector = e @@ (matrix * t) @ vector_0

I don't think it makes any sense to use it as such in the context of numpy,
as I think it would make broadcasting undecidable. But there may be
parallel universes where having 'n @@ matrix' and 'matrix @@ n' both with
well-defined, yet different, meanings may make sense. It is my impression
that in this entirely made up scenario you would want e @@ A @@ 3 to be
evaluated as (e @@ A) @@ 3. Which probably has more to do with the fact
that the two @@ mean different things, than with the associativity that
repeated calls to the same @@ should have.

Personally I couldn't care less, and if I had a vote I would let @@ rest
for now, until we see how @ plays out.

Jaime
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] [help needed] associativity and precedence of '@'

2014-03-14 Thread Jaime Fernández del Río
On Fri, Mar 14, 2014 at 9:15 PM, Chris Laumann chris.laum...@gmail.comwrote:

 Hi all,

 Let me preface my two cents by saying that I think the best part of @
 being accepted is the potential for deprecating the matrix class -- the
 syntactic beauty of infix for matrix multiply is a nice side effect IMHO :)
 This may be why my basic attitude is:

 I don't think it matters very much but I would vote (weakly) for
 weak-right. Where there is ambiguity, I suspect most practitioners will
 just put in parentheses anyway -- especially with combinations of * and @,
 where I don't think there is a natural intuitive precedence relationship.

 At least, element-wise multiplication is very rare in math/physics texts as
 an explicitly defined elementary operation so I'd be surprised if anybody
 had a strong intuition about the precedence of the '*' operator.


My take on this is that if you mix * and @, you are probably using * to
build the matrices you want to __matmul__ with @. So weak-right would be
the way to go from that point of view.

 Jaime
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion