Re: [Numpy-discussion] Matrix Class
On 14-Feb-15 11:35 AM, josef.p...@gmail.com wrote: On Wed, Feb 11, 2015 at 4:18 PM, Ryan Nelson rnelsonc...@gmail.com wrote: Colin, I currently use Py3.4 and Numpy 1.9.1. However, I built a quick test conda environment with Python2.7 and Numpy 1.7.0, and I get the same: Python 2.7.9 |Continuum Analytics, Inc.| (default, Dec 18 2014, 16:57:52) [MSC v .1500 64 bit (AMD64)] Type copyright, credits or license for more information. IPython 2.3.1 -- An enhanced Interactive Python. Anaconda is brought to you by Continuum Analytics. Please check out: http://continuum.io/thanks and https://binstar.org ? - Introduction and overview of IPython's features. %quickref - Quick reference. help - Python's own help system. object? - Details about 'object', use 'object??' for extra details. In [1]: import numpy as np In [2]: np.__version__ Out[2]: '1.7.0' In [3]: np.mat([4,'5',6]) Out[3]: matrix([['4', '5', '6']], dtype='|S1') In [4]: np.mat([4,'5',6], dtype=int) Out[4]: matrix([[4, 5, 6]]) ### As to your comment about coordinating with Statsmodels, you should see the links in the thread that Alan posted: http://permalink.gmane.org/gmane.comp.python.numeric.general/56516 http://permalink.gmane.org/gmane.comp.python.numeric.general/56517 Josef's comments at the time seem to echo the issues the devs (and others) have with the matrix class. Maybe things have changed with Statsmodels. Not changed, we have a strict policy against using np.matrix. generic efficient versions for linear operators, kronecker or sparse block matrix styly operations would be useful, but I would use array semantics, similar to using dot or linalg functions on ndarrays. Josef (long reply canceled because I'm writing too much that might only be of tangential interest or has been in some of the matrix discussion before.) Josef, Many thanks. I have gained the impression that there is some antipathy to np.matrix, perhaps this is because, as others have suggested, the array doesn't provide an appropriate framework. 
Where are such policy decisions documented? Numpy doesn't appear to have a BDFL. I had read Alan's links back in February and now have note of them. Colin W. I know I mentioned Sage and SageMathCloud before. I'll just point out that there are folks that use this for real research problems, not just as a pedagogical tool. They have a Matrix/vector/column_matrix class that do what you were expecting from your problems posted above. Indeed below is a (truncated) cut and past from a Sage Worksheet. (See http://www.sagemath.org/doc/tutorial/tour_linalg.html) ## In : Matrix([1,'2',3]) Error in lines 1-1 Traceback (most recent call last): TypeError: unable to find a common ring for all elements In : Matrix([[1,2,3],[4,5]]) ValueError: List of rows is not valid (rows are wrong types or lengths) In : vector([1,2,3]) (1, 2, 3) In : column_matrix([1,2,3]) [1] [2] [3] ## Large portions of the custom code and wrappers in Sage are written in Python. I don't think their Matrix object is a subclass of ndarray, so perhaps you could strip out the Matrix stuff from here to make a separate project with just the Matrix stuff, if you don't want to go through the Sage interface. On Wed, Feb 11, 2015 at 11:54 AM, cjw c...@ncf.ca wrote: On 11-Feb-15 10:21 AM, Ryan Nelson wrote: So: In [2]: np.mat([4,'5',6]) Out[2]: matrix([['4', '5', '6']], dtype='U11') In [3]: np.mat([4,'5',6], dtype=int) Out[3]: matrix([[4, 5, 6]]) Thanks Ryan, We are not singing from the same hymn book. Using PyScripter, I get: *** Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit (AMD64)] on win32. *** import numpy as np print('Numpy version: ', np.__version__) ('Numpy version: ', '1.9.0') Could you say which version you are using please? Colin W On Tue, Feb 10, 2015 at 5:07 PM, cjw c...@ncf.ca wrote: It seems to be agreed that there are weaknesses in the existing Numpy Matrix Class. Some problems are illustrated below. I'll try to put some suggestions over the coming weeks and would appreciate comments. 
Colin W.

Test Script:

    from numpy import mat

    if __name__ == '__main__':
        a = mat([4, 5, 6])                # Good
        print('a: ', a)
        b = mat([4, '5', 6])              # Not the expected result
        print('b: ', b)
        c = mat([[4, 5, 6], [7, 8]])      # Wrongly accepted as rectangular
        print('c: ', c)
        d = mat([[1, 2, 3]])
        try:
            d[0, 1] = 'b'                 # Correctly flagged, not numeric
        except ValueError:
            print("d[0, 1]= 'b'  # Correctly flagged, not numeric", ' ValueError')
        print('d: ', d)

Result:

    *** Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit (AMD64)] on win32. ***
    a: [[4 5 6]]
    b: [['4' '5' '6']]
    c: [[[4, 5, 6] [7, 8]]]
    d[0, 1]= 'b' # Correctly flagged, not numeric ValueError
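The behaviour Colin expected (rejecting `'5'` and rejecting ragged rows) can be sketched as a small validating constructor. This is only an illustration: `strict_matrix` is a hypothetical helper, not part of NumPy, and it assumes that "numeric" means any `numbers.Number` and that the result should be a float ndarray.

```python
import numbers

import numpy as np


def strict_matrix(rows):
    """Build a 2-D float array, rejecting ragged rows and non-numeric entries.

    Hypothetical helper for illustration only; it is not a NumPy function.
    """
    if len(rows) == 0:
        raise ValueError("at least one row is required")
    # Promote a flat sequence such as [4, 5, 6] to a single row.
    if not isinstance(rows[0], (list, tuple)):
        rows = [rows]
    width = len(rows[0])
    for r, row in enumerate(rows):
        if len(row) != width:
            raise ValueError(
                "row %d has length %d, expected %d" % (r, len(row), width))
        for item in row:
            # Strings such as '5' are not numbers.Number instances.
            if not isinstance(item, numbers.Number):
                raise TypeError("non-numeric entry %r in row %d" % (item, r))
    return np.array(rows, dtype=float)
```

Doing the checks in plain Python before calling `np.array` keeps the behaviour independent of how a particular NumPy version treats mixed or ragged input.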
Re: [Numpy-discussion] Matrix Class
On Sat, Feb 14, 2015 at 12:05 PM, cjw c...@ncf.ca wrote: On 14-Feb-15 11:35 AM, josef.p...@gmail.com wrote: On Wed, Feb 11, 2015 at 4:18 PM, Ryan Nelson rnelsonc...@gmail.com wrote: Colin, I currently use Py3.4 and Numpy 1.9.1. However, I built a quick test conda environment with Python2.7 and Numpy 1.7.0, and I get the same: Python 2.7.9 |Continuum Analytics, Inc.| (default, Dec 18 2014, 16:57:52) [MSC v .1500 64 bit (AMD64)] Type copyright, credits or license for more information. IPython 2.3.1 -- An enhanced Interactive Python. Anaconda is brought to you by Continuum Analytics. Please check out: http://continuum.io/thanks and https://binstar.org ? - Introduction and overview of IPython's features. %quickref - Quick reference. help - Python's own help system. object? - Details about 'object', use 'object??' for extra details. In [1]: import numpy as np In [2]: np.__version__ Out[2]: '1.7.0' In [3]: np.mat([4,'5',6]) Out[3]: matrix([['4', '5', '6']], dtype='|S1') In [4]: np.mat([4,'5',6], dtype=int) Out[4]: matrix([[4, 5, 6]]) ### As to your comment about coordinating with Statsmodels, you should see the links in the thread that Alan posted: http://permalink.gmane.org/gmane.comp.python.numeric.general/56516 http://permalink.gmane.org/gmane.comp.python.numeric.general/56517 Josef's comments at the time seem to echo the issues the devs (and others) have with the matrix class. Maybe things have changed with Statsmodels. Not changed, we have a strict policy against using np.matrix. generic efficient versions for linear operators, kronecker or sparse block matrix styly operations would be useful, but I would use array semantics, similar to using dot or linalg functions on ndarrays. Josef (long reply canceled because I'm writing too much that might only be of tangential interest or has been in some of the matrix discussion before.) Josef, Many thanks. 
I have gained the impression that there is some antipathy to np.matrix, perhaps this is because, as others have suggested, the array doesn't provide an appropriate framework. It's not directly antipathy, it's cost-benefit analysis. np.matrix has few advantages, but makes reading and maintaining code much more difficult. Having to watch out for multiplication `*` is a lot of extra work. Checking shapes and fixing bugs with unexpected dtypes is also a lot of work, but we have large benefits. For a long time the policy in statsmodels was to keep pandas out of the core of functions (i.e. out of the actual calculations) and restrict it to inputs and returns. However, pandas is becoming more popular and can do some things much better than plain numpy, so it is slowly moving inside some of our core calculations. It's still an easy source of bugs, but we do gain something. Benefits like these don't exist for np.matrix. Where are such policy decisions documented? Numpy doesn't appear to have a BDFL. In general it's a mix of mailing list discussions and discussion in issues and PRs. I'm not directly involved in numpy and don't subscribe to the numpy's github notifications. For scipy (and partially for statsmodels): I think large parts of policies for code and workflow are not explicitly specified, but are more an understanding of maintainers and developers that can slowly change over time, build up through spread out discussion as temporary consensus (or without strong objections). scipy has a hacking text file to describe some of it, but I haven't read it in ages. (long term changes compared to 6 years ago: required code review and required test coverage.) Josef I had read Alan's links back in February and now have note of them. Colin W. I know I mentioned Sage and SageMathCloud before. I'll just point out that there are folks that use this for real research problems, not just as a pedagogical tool. 
They have a Matrix/vector/column_matrix class that do what you were expecting from your problems posted above. Indeed below is a (truncated) cut and past from a Sage Worksheet. (See http://www.sagemath.org/doc/tutorial/tour_linalg.html) ## In : Matrix([1,'2',3]) Error in lines 1-1 Traceback (most recent call last): TypeError: unable to find a common ring for all elements In : Matrix([[1,2,3],[4,5]]) ValueError: List of rows is not valid (rows are wrong types or lengths) In : vector([1,2,3]) (1, 2, 3) In : column_matrix([1,2,3]) [1] [2] [3] ## Large portions of the custom code and wrappers in Sage are written in Python. I don't think their Matrix object is a subclass of ndarray, so perhaps you could strip out the Matrix stuff from here to make a separate project with just the Matrix stuff, if you don't want to go through the Sage interface. On Wed, Feb 11, 2015 at 11:54 AM, cjw c...@ncf.ca wrote: On 11-Feb-15 10:21 AM, Ryan Nelson wrote: So: In [2]: np.mat([4,'5',6]) Out[2]:
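Josef's remark about having to watch out for `*` can be illustrated with a short sketch. The array values are made up for the example; `np.asmatrix` is used only to show the contrast, and the warning filter is there because recent NumPy versions emit a PendingDeprecationWarning for `np.matrix`.

```python
import warnings

import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[0, 1], [1, 0]])

# With ndarrays, `*` is elementwise and the matrix product is explicit.
elementwise = a * b
matrix_product = a.dot(b)

with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    # The same data viewed as np.matrix: here `*` is the matrix product.
    ma, mb = np.asmatrix(a), np.asmatrix(b)
    matrix_star = ma * mb

print(elementwise)
print(matrix_product)
print(matrix_star)
```

A reader of `x * y` therefore has to know which of the two types `x` and `y` are before knowing what the line computes, which is the maintenance cost described above.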
Re: [Numpy-discussion] Matrix Class
On Sat, Feb 14, 2015 at 12:36 PM, josef.p...@gmail.com wrote: On Sat, Feb 14, 2015 at 12:05 PM, cjw c...@ncf.ca wrote: On 14-Feb-15 11:35 AM, josef.p...@gmail.com wrote: On Wed, Feb 11, 2015 at 4:18 PM, Ryan Nelson rnelsonc...@gmail.com wrote: Colin, I currently use Py3.4 and Numpy 1.9.1. However, I built a quick test conda environment with Python2.7 and Numpy 1.7.0, and I get the same: Python 2.7.9 |Continuum Analytics, Inc.| (default, Dec 18 2014, 16:57:52) [MSC v .1500 64 bit (AMD64)] Type copyright, credits or license for more information. IPython 2.3.1 -- An enhanced Interactive Python. Anaconda is brought to you by Continuum Analytics. Please check out: http://continuum.io/thanks and https://binstar.org ? - Introduction and overview of IPython's features. %quickref - Quick reference. help - Python's own help system. object? - Details about 'object', use 'object??' for extra details. In [1]: import numpy as np In [2]: np.__version__ Out[2]: '1.7.0' In [3]: np.mat([4,'5',6]) Out[3]: matrix([['4', '5', '6']], dtype='|S1') In [4]: np.mat([4,'5',6], dtype=int) Out[4]: matrix([[4, 5, 6]]) ### As to your comment about coordinating with Statsmodels, you should see the links in the thread that Alan posted: http://permalink.gmane.org/gmane.comp.python.numeric.general/56516 http://permalink.gmane.org/gmane.comp.python.numeric.general/56517 Josef's comments at the time seem to echo the issues the devs (and others) have with the matrix class. Maybe things have changed with Statsmodels. Not changed, we have a strict policy against using np.matrix. generic efficient versions for linear operators, kronecker or sparse block matrix styly operations would be useful, but I would use array semantics, similar to using dot or linalg functions on ndarrays. Josef (long reply canceled because I'm writing too much that might only be of tangential interest or has been in some of the matrix discussion before.) Josef, Many thanks. 
I have gained the impression that there is some antipathy to np.matrix, perhaps this is because, as others have suggested, the array doesn't provide an appropriate framework. It's not directly antipathy, it's cost-benefit analysis. np.matrix has few advantages, but makes reading and maintaining code much more difficult. Having to watch out for multiplication `*` is a lot of extra work. Checking shapes and fixing bugs with unexpected dtypes is also a lot of work, but we have large benefits. For a long time the policy in statsmodels was to keep pandas out of the core of functions (i.e. out of the actual calculations) and restrict it to inputs and returns. However, pandas is becoming more popular and can do some things much better than plain numpy, so it is slowly moving inside some of our core calculations. It's still an easy source of bugs, but we do gain something. Any bits of Pandas that might be good for numpy/scipy to steal? snip Chuck ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Matrix Class
On Sat, Feb 14, 2015 at 4:27 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Sat, Feb 14, 2015 at 12:36 PM, josef.p...@gmail.com wrote: On Sat, Feb 14, 2015 at 12:05 PM, cjw c...@ncf.ca wrote: On 14-Feb-15 11:35 AM, josef.p...@gmail.com wrote: On Wed, Feb 11, 2015 at 4:18 PM, Ryan Nelson rnelsonc...@gmail.com wrote: Colin, I currently use Py3.4 and Numpy 1.9.1. However, I built a quick test conda environment with Python2.7 and Numpy 1.7.0, and I get the same: Python 2.7.9 |Continuum Analytics, Inc.| (default, Dec 18 2014, 16:57:52) [MSC v .1500 64 bit (AMD64)] Type copyright, credits or license for more information. IPython 2.3.1 -- An enhanced Interactive Python. Anaconda is brought to you by Continuum Analytics. Please check out: http://continuum.io/thanks and https://binstar.org ? - Introduction and overview of IPython's features. %quickref - Quick reference. help - Python's own help system. object? - Details about 'object', use 'object??' for extra details. In [1]: import numpy as np In [2]: np.__version__ Out[2]: '1.7.0' In [3]: np.mat([4,'5',6]) Out[3]: matrix([['4', '5', '6']], dtype='|S1') In [4]: np.mat([4,'5',6], dtype=int) Out[4]: matrix([[4, 5, 6]]) ### As to your comment about coordinating with Statsmodels, you should see the links in the thread that Alan posted: http://permalink.gmane.org/gmane.comp.python.numeric.general/56516 http://permalink.gmane.org/gmane.comp.python.numeric.general/56517 Josef's comments at the time seem to echo the issues the devs (and others) have with the matrix class. Maybe things have changed with Statsmodels. Not changed, we have a strict policy against using np.matrix. generic efficient versions for linear operators, kronecker or sparse block matrix styly operations would be useful, but I would use array semantics, similar to using dot or linalg functions on ndarrays. 
Josef (long reply canceled because I'm writing too much that might only be of tangential interest or has been in some of the matrix discussion before.)

Josef, Many thanks. I have gained the impression that there is some antipathy to np.matrix, perhaps this is because, as others have suggested, the array doesn't provide an appropriate framework.

It's not directly antipathy, it's cost-benefit analysis. np.matrix has few advantages, but makes reading and maintaining code much more difficult. Having to watch out for multiplication `*` is a lot of extra work. Checking shapes and fixing bugs with unexpected dtypes is also a lot of work, but we have large benefits. For a long time the policy in statsmodels was to keep pandas out of the core of functions (i.e. out of the actual calculations) and restrict it to inputs and returns. However, pandas is becoming more popular and can do some things much better than plain numpy, so it is slowly moving inside some of our core calculations. It's still an easy source of bugs, but we do gain something.

Any bits of Pandas that might be good for numpy/scipy to steal?

I'm not a Pandas expert. Some of it comes into statsmodels because we need the data handling also inside a function, e.g. keeping track of labels, indices, and so on. Another reason is that contributors are more familiar with pandas's way of solving a problem, even if I suspect numpy would be more efficient. However, a recent change replaces what I would have done with np.unique by pandas.factorize, which is supposed to be faster. https://github.com/statsmodels/statsmodels/pull/2213

Two or three years ago my numpy way of group handling (using np.unique, bincount and similar) was still faster than the pandas `apply` version; I'm not sure that's still true.

And to emphasize: all our heavy stuff, especially the big models, still only have numpy and scipy inside (with the exception of one model waiting in a PR).
Josef
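The "numpy way of group handling (using np.unique, bincount and similar)" mentioned in this message might look roughly like the following. It is a minimal sketch with made-up labels and values, not the statsmodels code.

```python
import numpy as np

labels = np.array(["b", "a", "b", "c", "a", "b"])
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# Sorted group names plus, for each observation, the index of its group.
groups, codes = np.unique(labels, return_inverse=True)

# bincount with weights sums the values that share a code.
sums = np.bincount(codes, weights=values)
counts = np.bincount(codes)
means = sums / counts

for name, total, mean in zip(groups, sums, means):
    print(name, total, mean)
```

Everything stays in plain ndarrays, so the result can be passed straight into linear algebra code without any label bookkeeping.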
Re: [Numpy-discussion] Matrix Class
On Sat, Feb 14, 2015 at 5:21 PM, josef.p...@gmail.com wrote: On Sat, Feb 14, 2015 at 4:27 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Sat, Feb 14, 2015 at 12:36 PM, josef.p...@gmail.com wrote: On Sat, Feb 14, 2015 at 12:05 PM, cjw c...@ncf.ca wrote: On 14-Feb-15 11:35 AM, josef.p...@gmail.com wrote: On Wed, Feb 11, 2015 at 4:18 PM, Ryan Nelson rnelsonc...@gmail.com wrote: Colin, I currently use Py3.4 and Numpy 1.9.1. However, I built a quick test conda environment with Python2.7 and Numpy 1.7.0, and I get the same: Python 2.7.9 |Continuum Analytics, Inc.| (default, Dec 18 2014, 16:57:52) [MSC v .1500 64 bit (AMD64)] Type copyright, credits or license for more information. IPython 2.3.1 -- An enhanced Interactive Python. Anaconda is brought to you by Continuum Analytics. Please check out: http://continuum.io/thanks and https://binstar.org ? - Introduction and overview of IPython's features. %quickref - Quick reference. help - Python's own help system. object? - Details about 'object', use 'object??' for extra details. In [1]: import numpy as np In [2]: np.__version__ Out[2]: '1.7.0' In [3]: np.mat([4,'5',6]) Out[3]: matrix([['4', '5', '6']], dtype='|S1') In [4]: np.mat([4,'5',6], dtype=int) Out[4]: matrix([[4, 5, 6]]) ### As to your comment about coordinating with Statsmodels, you should see the links in the thread that Alan posted: http://permalink.gmane.org/gmane.comp.python.numeric.general/56516 http://permalink.gmane.org/gmane.comp.python.numeric.general/56517 Josef's comments at the time seem to echo the issues the devs (and others) have with the matrix class. Maybe things have changed with Statsmodels. Not changed, we have a strict policy against using np.matrix. generic efficient versions for linear operators, kronecker or sparse block matrix styly operations would be useful, but I would use array semantics, similar to using dot or linalg functions on ndarrays. 
Josef (long reply canceled because I'm writing too much that might only be of tangential interest or has been in some of the matrix discussion before.) Josef, Many thanks. I have gained the impression that there is some antipathy to np.matrix, perhaps this is because, as others have suggested, the array doesn't provide an appropriate framework. It's not directly antipathy, it's cost-benefit analysis. np.matrix has few advantages, but makes reading and maintaining code much more difficult. Having to watch out for multiplication `*` is a lot of extra work. Checking shapes and fixing bugs with unexpected dtypes is also a lot of work, but we have large benefits. For a long time the policy in statsmodels was to keep pandas out of the core of functions (i.e. out of the actual calculations) and restrict it to inputs and returns. However, pandas is becoming more popular and can do some things much better than plain numpy, so it is slowly moving inside some of our core calculations. It's still an easy source of bugs, but we do gain something. Any bits of Pandas that might be good for numpy/scipy to steal? I'm not a Pandas expert. Some of it comes into statsmodels because we need the data handling also inside a function, e.g. keeping track of labels, indices, and so on. Another reason is that contributors are more familiar with pandas's way of solving a problems, even if I suspect numpy would be more efficient. However, a recent change, replaces where I would have used np.unique with pandas.factorize which is supposed to be faster. https://github.com/statsmodels/statsmodels/pull/2213 Numpy could use some form of hash table for its arraysetops, which is where pandas is getting its advantage from. It is a tricky thing though, see e.g. 
these timings:

    a = np.random.randint(10, size=1000)
    srs = pd.Series(a)

    %timeit np.unique(a)
    10 loops, best of 3: 13.2 µs per loop

    %timeit srs.unique()
    10 loops, best of 3: 15.6 µs per loop

    %timeit pd.factorize(a)
    1 loops, best of 3: 25.6 µs per loop

    %timeit np.unique(a, return_inverse=True)
    1 loops, best of 3: 82.5 µs per loop

These last timings are with numpy 1.9.0 and pandas 0.14.0, so numpy doesn't have https://github.com/numpy/numpy/pull/5012 yet, which makes the operation in which numpy is slower about 2x faster. And if you need your unique values sorted, then things are more even, especially if numpy runs 2x faster:

    %timeit pd.factorize(a, sort=True)
    1 loops, best of 3: 36.4 µs per loop

The algorithms scale differently though, so for sufficiently large data Pandas is almost certainly going to win. Not sure if they support all dtypes, nor how efficient their use of memory is. I did a toy implementation of a hash table, mimicking Python's dictionary, for numpy some time
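The difference between the sort-based `np.unique(..., return_inverse=True)` and a hash-table approach can be sketched with a plain Python dictionary. `hash_factorize` below is a hypothetical toy written for this illustration; it is not how pandas implements `factorize` and it makes no performance claim.

```python
import numpy as np


def hash_factorize(values):
    """Toy hash-based factorize: codes in order of first appearance.

    Hypothetical helper for illustration; relies on dicts preserving
    insertion order (Python 3.7+).
    """
    seen = {}
    codes = np.empty(len(values), dtype=np.intp)
    for i, v in enumerate(values):
        # A new value gets the next free code; a known value reuses its code.
        codes[i] = seen.setdefault(v, len(seen))
    uniques = np.array(list(seen))
    return codes, uniques


values = np.array([3, 1, 3, 2, 1])

codes, uniques = hash_factorize(values)                    # unsorted uniques
sorted_uniques, inverse = np.unique(values, return_inverse=True)

print(codes, uniques)
print(inverse, sorted_uniques)
```

Both pairs reconstruct the input by indexing the uniques with the codes; the hash version avoids sorting, which is why the two approaches scale differently as the data grows.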
Re: [Numpy-discussion] Matrix Class
Thanks Ryan. There are a number of good thoughts in your message. I'll try to keep track of them. Another respondent reported different results than mine. I'm in the process of re-installing to check. Colin W. On 11 February 2015 at 16:18, Ryan Nelson rnelsonc...@gmail.com wrote: Colin, I currently use Py3.4 and Numpy 1.9.1. However, I built a quick test conda environment with Python2.7 and Numpy 1.7.0, and I get the same: Python 2.7.9 |Continuum Analytics, Inc.| (default, Dec 18 2014, 16:57:52) [MSC v .1500 64 bit (AMD64)] Type copyright, credits or license for more information. IPython 2.3.1 -- An enhanced Interactive Python. Anaconda is brought to you by Continuum Analytics. Please check out: http://continuum.io/thanks and https://binstar.org ? - Introduction and overview of IPython's features. %quickref - Quick reference. help - Python's own help system. object? - Details about 'object', use 'object??' for extra details. In [1]: import numpy as np In [2]: np.__version__ Out[2]: '1.7.0' In [3]: np.mat([4,'5',6]) Out[3]: matrix([['4', '5', '6']], dtype='|S1') In [4]: np.mat([4,'5',6], dtype=int) Out[4]: matrix([[4, 5, 6]]) ### As to your comment about coordinating with Statsmodels, you should see the links in the thread that Alan posted: http://permalink.gmane.org/gmane.comp.python.numeric.general/56516 http://permalink.gmane.org/gmane.comp.python.numeric.general/56517 Josef's comments at the time seem to echo the issues the devs (and others) have with the matrix class. Maybe things have changed with Statsmodels. I know I mentioned Sage and SageMathCloud before. I'll just point out that there are folks that use this for real research problems, not just as a pedagogical tool. They have a Matrix/vector/column_matrix class that do what you were expecting from your problems posted above. Indeed below is a (truncated) cut and past from a Sage Worksheet. 
(See http://www.sagemath.org/doc/tutorial/tour_linalg.html) ## In : Matrix([1,'2',3]) Error in lines 1-1 Traceback (most recent call last): TypeError: unable to find a common ring for all elements In : Matrix([[1,2,3],[4,5]]) ValueError: List of rows is not valid (rows are wrong types or lengths) In : vector([1,2,3]) (1, 2, 3) In : column_matrix([1,2,3]) [1] [2] [3] ## Large portions of the custom code and wrappers in Sage are written in Python. I don't think their Matrix object is a subclass of ndarray, so perhaps you could strip out the Matrix stuff from here to make a separate project with just the Matrix stuff, if you don't want to go through the Sage interface. On Wed, Feb 11, 2015 at 11:54 AM, cjw c...@ncf.ca wrote: On 11-Feb-15 10:21 AM, Ryan Nelson wrote: So: In [2]: np.mat([4,'5',6]) Out[2]: matrix([['4', '5', '6']], dtype='U11') In [3]: np.mat([4,'5',6], dtype=int) Out[3]: matrix([[4, 5, 6]]) Thanks Ryan, We are not singing from the same hymn book. Using PyScripter, I get: *** Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit (AMD64)] on win32. *** import numpy as np print('Numpy version: ', np.__version__) ('Numpy version: ', '1.9.0') Could you say which version you are using please? Colin W On Tue, Feb 10, 2015 at 5:07 PM, cjw c...@ncf.ca c...@ncf.ca wrote: It seems to be agreed that there are weaknesses in the existing Numpy Matrix Class. Some problems are illustrated below. I'll try to put some suggestions over the coming weeks and would appreciate comments. Colin W. 
Test Script:

    from numpy import mat

    if __name__ == '__main__':
        a = mat([4, 5, 6])                # Good
        print('a: ', a)
        b = mat([4, '5', 6])              # Not the expected result
        print('b: ', b)
        c = mat([[4, 5, 6], [7, 8]])      # Wrongly accepted as rectangular
        print('c: ', c)
        d = mat([[1, 2, 3]])
        try:
            d[0, 1] = 'b'                 # Correctly flagged, not numeric
        except ValueError:
            print("d[0, 1]= 'b'  # Correctly flagged, not numeric", ' ValueError')
        print('d: ', d)

Result:

    *** Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit (AMD64)] on win32. ***
    a: [[4 5 6]]
    b: [['4' '5' '6']]
    c: [[[4, 5, 6] [7, 8]]]
    d[0, 1]= 'b' # Correctly flagged, not numeric ValueError
    d: [[1 2 3]]

-- View this message in context: http://numpy-discussion.10968.n7.nabble.com/Matrix-Class-tp39719.html Sent from the Numpy-discussion mailing list archive at Nabble.com.
Re: [Numpy-discussion] Matrix Class
On 11-Feb-15 10:47 AM, Sebastian Berg wrote:

On Di, 2015-02-10 at 15:07 -0700, cjw wrote:

It seems to be agreed that there are weaknesses in the existing Numpy Matrix Class. Some problems are illustrated below.

Not to delve deeply into a discussion, but unfortunately, there seem to be far more fundamental problems because of the always 2-D thing and the simple fact that matrix is more of a second class citizen in numpy (or in other words a lot of this is just the general fact that it is an ndarray subclass).

Thanks Sebastian, We'll have to see what comes out of the discussion. I would be grateful if you could expand on the "always 2D thing". Is there a need for a collection of matrices, where a function is applied to each component of the collection?

Colin W.

I think some of these issues were summarized in the discussion about the @ operator. I am not saying that a matrix class separate from numpy cannot solve these, but within numpy it seems hard.

I'll try to put some suggestions over the coming weeks and would appreciate comments.

Colin W.

Test Script:

    from numpy import mat

    if __name__ == '__main__':
        a = mat([4, 5, 6])                # Good
        print('a: ', a)
        b = mat([4, '5', 6])              # Not the expected result
        print('b: ', b)
        c = mat([[4, 5, 6], [7, 8]])      # Wrongly accepted as rectangular
        print('c: ', c)
        d = mat([[1, 2, 3]])
        try:
            d[0, 1] = 'b'                 # Correctly flagged, not numeric
        except ValueError:
            print("d[0, 1]= 'b'  # Correctly flagged, not numeric", ' ValueError')
        print('d: ', d)

Result:

    *** Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit (AMD64)] on win32. ***
    a: [[4 5 6]]
    b: [['4' '5' '6']]
    c: [[[4, 5, 6] [7, 8]]]
    d[0, 1]= 'b' # Correctly flagged, not numeric ValueError
    d: [[1 2 3]]
Re: [Numpy-discussion] Matrix Class
Just recalling the one-year-ago discussion: http://comments.gmane.org/gmane.comp.python.numeric.general/56494

Alan Isaac
Re: [Numpy-discussion] Matrix Class
On Mi, 2015-02-11 at 11:38 -0500, cjw wrote:

On 11-Feb-15 10:47 AM, Sebastian Berg wrote:

On Di, 2015-02-10 at 15:07 -0700, cjw wrote:

It seems to be agreed that there are weaknesses in the existing Numpy Matrix Class. Some problems are illustrated below.

Not to delve deeply into a discussion, but unfortunately, there seem to be far more fundamental problems because of the always 2-D thing and the simple fact that matrix is more of a second class citizen in numpy (or in other words a lot of this is just the general fact that it is an ndarray subclass).

Thanks Sebastian, We'll have to see what comes out of the discussion. I would be grateful if you could expand on the always 2D thing. Is there a need for a collection of matrices, where a function is applied to each component of the collection?

No, I just mean the fact that a matrix is always 2D. This makes some things, like some indexing operations, awkward, and some functions that expect a numpy array (but think they can handle subclasses fine) may just plain break. And then ndarray subclasses are just a bit problematic. In short, you cannot generally expect a function which works great with arrays to also work great with matrices, I believe. This is true for some things within numpy and certainly for third party libraries, I am sure.

- Sebastian

Colin W.

I think some of these issues were summarized in the discussion about the @ operator. I am not saying that a matrix class separate from numpy cannot solve these, but within numpy it seems hard.

I'll try to put some suggestions over the coming weeks and would appreciate comments.

Colin W.
Test Script:

    from numpy import mat

    if __name__ == '__main__':
        a = mat([4, 5, 6])                # Good
        print('a: ', a)
        b = mat([4, '5', 6])              # Not the expected result
        print('b: ', b)
        c = mat([[4, 5, 6], [7, 8]])      # Wrongly accepted as rectangular
        print('c: ', c)
        d = mat([[1, 2, 3]])
        try:
            d[0, 1] = 'b'                 # Correctly flagged, not numeric
        except ValueError:
            print("d[0, 1]= 'b'  # Correctly flagged, not numeric", ' ValueError')
        print('d: ', d)

Result:

    *** Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit (AMD64)] on win32. ***
    a: [[4 5 6]]
    b: [['4' '5' '6']]
    c: [[[4, 5, 6] [7, 8]]]
    d[0, 1]= 'b' # Correctly flagged, not numeric ValueError
    d: [[1 2 3]]
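Sebastian's "always 2D" point can be made concrete by comparing shapes. The sketch below uses an arbitrary 2x3 example; the warning filter is only there because recent NumPy versions emit a PendingDeprecationWarning when an `np.matrix` is created.

```python
import warnings

import numpy as np

a = np.arange(6).reshape(2, 3)
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    m = np.asmatrix(a)

# Indexing one row: the array drops a dimension, the matrix does not.
print(a[0].shape, m[0].shape)

# Reductions and flattening also stay 2-D for the matrix.
print(a.sum(axis=0).shape, m.sum(axis=0).shape)
print(a.ravel().shape, m.ravel().shape)


def first_row_length(x):
    """Written with ndarrays in mind: assumes x[0] is one-dimensional."""
    return len(x[0])


print(first_row_length(a), first_row_length(m))
```

The last function returns a different answer for the two inputs without raising any error, which is the kind of quiet breakage a function written for arrays can hit when handed a matrix.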
Re: [Numpy-discussion] Matrix Class
On 11-Feb-15 10:21 AM, Ryan Nelson wrote:

So:

    In [2]: np.mat([4,'5',6])
    Out[2]: matrix([['4', '5', '6']], dtype='<U11')

    In [3]: np.mat([4,'5',6], dtype=int)
    Out[3]: matrix([[4, 5, 6]])

Thanks Ryan, We are not singing from the same hymn book. Using PyScripter, I get:

    *** Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit (AMD64)] on win32. ***
    import numpy as np
    print('Numpy version: ', np.__version__)
    ('Numpy version: ', '1.9.0')

Could you say which version you are using please?

Colin W

On Tue, Feb 10, 2015 at 5:07 PM, cjw c...@ncf.ca wrote:

It seems to be agreed that there are weaknesses in the existing Numpy Matrix Class. Some problems are illustrated below. I'll try to put some suggestions over the coming weeks and would appreciate comments.

Colin W.

Test Script:

    from numpy import mat

    if __name__ == '__main__':
        a = mat([4, 5, 6])                # Good
        print('a: ', a)
        b = mat([4, '5', 6])              # Not the expected result
        print('b: ', b)
        c = mat([[4, 5, 6], [7, 8]])      # Wrongly accepted as rectangular
        print('c: ', c)
        d = mat([[1, 2, 3]])
        try:
            d[0, 1] = 'b'                 # Correctly flagged, not numeric
        except ValueError:
            print("d[0, 1]= 'b'  # Correctly flagged, not numeric", ' ValueError')
        print('d: ', d)

Result:

    *** Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit (AMD64)] on win32. ***
    a: [[4 5 6]]
    b: [['4' '5' '6']]
    c: [[[4, 5, 6] [7, 8]]]
    d[0, 1]= 'b' # Correctly flagged, not numeric ValueError
    d: [[1 2 3]]
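The string result that puzzled Colin comes from NumPy choosing one common dtype for the whole input. A minimal sketch with plain ndarrays follows; the exact string width in the dtype depends on the Python and NumPy versions, as the `'|S1'` and `U11` outputs in this thread already show, so the example only inspects the dtype kind.

```python
import numpy as np

mixed = [4, '5', 6]

# Without an explicit dtype, the integers are promoted to strings.
as_text = np.array(mixed)
print(as_text.dtype.kind)      # 'U' (unicode) or 'S' (bytes), not 'i'

# Converting the string array to int parses each element.
as_int = as_text.astype(int)
print(as_int)

# A caller who wants numbers only can check the kind before continuing.
is_numeric = as_text.dtype.kind in "biufc"
print(is_numeric)
```

Checking `dtype.kind` after construction is one inexpensive way to catch an accidental string element before it propagates into later calculations.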
Re: [Numpy-discussion] Matrix Class
On 11-Feb-15 12:13 PM, Alan G Isaac wrote:
> Just recalling the one-year-ago discussion:
> http://comments.gmane.org/gmane.comp.python.numeric.general/56494
>
> Alan Isaac

Thanks Alan, I've kept a pointer.

My interest is not oriented towards tuition but in exploring the possibility of making Matrix as efficient as possible. Others have suggested Sage Maths for tuition.

What methods should be included? You have suggested adding the Hermitian. I think of the matrix as a numeric object. What would the case be for having a Boolean matrix? The Hat matrix and SVD are suggested. Possible coordination with Statsmodels.

I'll try to post a first draft specification over the next few weeks.

Colin W.
Re: [Numpy-discussion] Matrix Class
Thanks Sebastian,

This would appear to make a case for considering not having Matrix as a sub-class of an np array. On the other hand, so much work has gone into np, and there is some commonality between the needs of Matrix and Array.

Colin W.

On 11-Feb-15 12:19 PM, Sebastian Berg wrote:
> On Wed, 2015-02-11 at 11:38 -0500, cjw wrote:
>> On 11-Feb-15 10:47 AM, Sebastian Berg wrote:
>>> On Tue, 2015-02-10 at 15:07 -0700, cjw wrote:
>>>> It seems to be agreed that there are weaknesses in the existing Numpy Matrix Class. Some problems are illustrated below.
>>>
>>> Not to delve deeply into a discussion, but unfortunately, there seem to be far more fundamental problems because of the always 2-D thing and the simple fact that matrix is more of a second class citizen in numpy (or in other words a lot of this is just the general fact that it is an ndarray subclass).
>>
>> Thanks Sebastian,
>>
>> We'll have to see what comes out of the discussion. I would be grateful if you could expand on the "always 2D thing". Is there a need for a collection of matrices, where a function is applied to each component of the collection?
>
> No, I just mean the fact that a matrix is always 2D. This makes some things, like some indexing operations, awkward, and some functions that expect a numpy array (but think they can handle subclasses fine) may just plain break. And then ndarray subclasses are just a bit problematic.
>
> In short, you cannot generally expect a function which works great with arrays to also work great with matrices, I believe. This is true for some things within numpy, and certainly for third-party libraries, I am sure.
>
> - Sebastian
>
>> Colin W.
>>
>>> I think some of these issues were summarized in the discussion about the @ operator. I am not saying that a matrix class separate from numpy cannot solve these, but within numpy it seems hard.
>>>
>>>> I'll try to put some suggestions over the coming weeks and would appreciate comments.
>>>>
>>>> Colin W.
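Sebastian's "always 2D" point can be illustrated with a short comparison of a plain array and a matrix holding the same data. This is only a sketch: it assumes a NumPy version in which `np.matrix` is still available, and the warning filter is there because newer releases may warn that the class is discouraged.

```python
import warnings

import numpy as np

with warnings.catch_warnings():
    warnings.simplefilter('ignore')  # newer NumPy may warn about np.matrix
    m = np.matrix([[1, 2, 3], [4, 5, 6]])

a = np.array([[1, 2, 3], [4, 5, 6]])

# Indexing one row: the array drops a dimension, the matrix does not.
print(a[0].shape)   # (3,)
print(m[0].shape)   # (1, 3)

# Code written for 1-D rows can therefore misbehave with a matrix row.
print(len(a[0]))    # 3, the number of elements in the row
print(len(m[0]))    # 1, the number of rows in the 1x3 result

# '*' also means different things for the two types.
print((a * a).tolist())     # elementwise product
print((m * m.T).tolist())   # matrix product
```

A function that loops over `x[0]` expecting scalars would get scalars from the array but a 1x3 matrix from the matrix, which is the kind of breakage described above.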
Re: [Numpy-discussion] Matrix Class
So:

    In [2]: np.mat([4,'5',6])
    Out[2]: matrix([['4', '5', '6']], dtype='U11')

    In [3]: np.mat([4,'5',6], dtype=int)
    Out[3]: matrix([[4, 5, 6]])

On Tue, Feb 10, 2015 at 5:07 PM, cjw c...@ncf.ca wrote:
> It seems to be agreed that there are weaknesses in the existing Numpy Matrix Class.
>
> Some problems are illustrated below.
>
> I'll try to put some suggestions over the coming weeks and would appreciate comments.
>
> Colin W.
>
> Test Script:
>
>     from numpy import mat
>
>     if __name__ == '__main__':
>         a = mat([4, 5, 6])  # Good
>         print('a: ', a)
>         b = mat([4, '5', 6])  # Not the expected result
>         print('b: ', b)
>         c = mat([[4, 5, 6], [7, 8]])  # Wrongly accepted as rectangular
>         print('c: ', c)
>         d = mat([[1, 2, 3]])
>         try:
>             d[0, 1] = 'b'  # Correctly flagged, not numeric
>         except ValueError:
>             print("d[0, 1]= 'b' # Correctly flagged, not numeric", ' ValueError')
>         print('d: ', d)
>
> Result:
>
>     *** Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit (AMD64)] on win32. ***
>     a: [[4 5 6]]
>     b: [['4' '5' '6']]
>     c: [[[4, 5, 6] [7, 8]]]
>     d[0, 1]= 'b' # Correctly flagged, not numeric ValueError
>     d: [[1 2 3]]
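The two outputs above differ only in whether a dtype was supplied. A minimal sketch of the same behaviour with plain arrays follows; it assumes that `np.array` infers and converts types the same way `np.mat` does, which matches the session shown but is not stated in the thread.

```python
import numpy as np

# Without a dtype, NumPy picks a common type for the mixed list: a string type.
loose = np.array([4, '5', 6])
print(loose.dtype.kind)     # 'U' (unicode string) on Python 3

# With dtype=int, every element is converted, so '5' becomes 5.
coerced = np.array([4, '5', 6], dtype=int)
print(coerced.tolist())     # [4, 5, 6]

# The string array can also be converted after the fact.
converted = loose.astype(int)
print(converted.tolist())   # [4, 5, 6]
```

So the string result is type inference at work rather than something specific to the matrix class; whether silent promotion to strings is desirable is the separate question raised in the original post.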
Re: [Numpy-discussion] Matrix Class
On Tue, 2015-02-10 at 15:07 -0700, cjw wrote:
> It seems to be agreed that there are weaknesses in the existing Numpy Matrix Class. Some problems are illustrated below.

Not to delve deeply into a discussion, but unfortunately, there seem to be far more fundamental problems because of the always 2-D thing and the simple fact that matrix is more of a second class citizen in numpy (or in other words a lot of this is just the general fact that it is an ndarray subclass).

I think some of these issues were summarized in the discussion about the @ operator. I am not saying that a matrix class separate from numpy cannot solve these, but within numpy it seems hard.

> I'll try to put some suggestions over the coming weeks and would appreciate comments.
>
> Colin W.
Re: [Numpy-discussion] Matrix Class
On 2/11/2015 2:25 PM, cjw wrote:
> I think of the matrix as a numeric object. What would the case be for having a Boolean matrix?

It's one of my primary uses: https://en.wikipedia.org/wiki/Adjacency_matrix

Numpy already provides SVD: http://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.svd.html

A lot of core linear algebra is in `numpy.linalg`, and SciPy has much more. Remember for matrix `M` you can always apply any numpy function to `M.A`.

I think gains could be in lazy evaluation structures (e.g., a KroneckerProduct object that never actually produces the product unless forced to).

Cheers,
Alan
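As a small illustration of the Boolean use case and of `numpy.linalg.svd`, here is a sketch with a three-node directed graph. The graph and the variable names are made up for the example, and plain arrays are used rather than the matrix class.

```python
import numpy as np

# Directed graph 0 -> 1 -> 2 stored as a Boolean adjacency matrix.
adj = np.array([[False, True, False],
                [False, False, True],
                [False, False, False]])

# Paths of length two: integer matrix product, then back to Boolean.
counts = adj.astype(int).dot(adj.astype(int))
two_step = counts > 0
print(two_step[0, 2])   # True, because 0 -> 1 -> 2

# numpy.linalg.svd works on ordinary arrays.
u, s, vt = np.linalg.svd(adj.astype(float))
rank = int(np.sum(s > 1e-12))
print(rank)             # 2
```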
Re: [Numpy-discussion] Matrix Class
11.02.2015, 21:57, Alan G Isaac wrote:
[clip]
> I think gains could be in lazy evaluation structures (e.g., a KroneckerProduct object that never actually produces the product unless forced to.)

This sounds like an abstract linear operator interface. Several attempts have been made in this direction in the Python world, but I think none of them has really gained traction so far. One is even in Scipy. Unfortunately, that one's design has grown organically, and it's mostly suited just for specifying inputs to sparse solvers etc. rather than abstract manipulations.

If there were a popular way to deal with these objects, it could become even more popular reasonably quickly.
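To make the lazy Kronecker idea concrete, here is a minimal sketch of such an object. `LazyKron`, `matvec` and `toarray` are hypothetical names chosen for this example; this is not the SciPy interface mentioned above, only an illustration of the identity (A ⊗ B) vec(X) = vec(A X Bᵀ) with vec taken in row-major order.

```python
import numpy as np


class LazyKron(object):
    """Hypothetical lazy Kronecker product of two 2-D arrays."""

    def __init__(self, a, b):
        self.a = np.asarray(a)
        self.b = np.asarray(b)
        self.shape = (self.a.shape[0] * self.b.shape[0],
                      self.a.shape[1] * self.b.shape[1])

    def matvec(self, x):
        """Apply the product to a vector without forming kron(a, b)."""
        n, q = self.a.shape[1], self.b.shape[1]
        x_mat = np.asarray(x).reshape(n, q)        # row-major un-vec
        return self.a.dot(x_mat).dot(self.b.T).ravel()

    def toarray(self):
        """Force evaluation of the full (possibly large) product."""
        return np.kron(self.a, self.b)


rng = np.random.RandomState(0)
op = LazyKron(rng.rand(2, 3), rng.rand(4, 5))
x = rng.rand(15)
print(np.allclose(op.matvec(x), op.toarray().dot(x)))
```

For an m x n `a` and a p x q `b`, `matvec` needs two small products instead of one with an (mp) x (nq) array, which is where the gain from never forming the product would come from.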
Re: [Numpy-discussion] Matrix Class
Colin,

I currently use Py3.4 and Numpy 1.9.1. However, I built a quick test conda environment with Python2.7 and Numpy 1.7.0, and I get the same:

    Python 2.7.9 |Continuum Analytics, Inc.| (default, Dec 18 2014, 16:57:52) [MSC v.1500 64 bit (AMD64)]
    Type copyright, credits or license for more information.

    IPython 2.3.1 -- An enhanced Interactive Python.
    Anaconda is brought to you by Continuum Analytics.
    Please check out: http://continuum.io/thanks and https://binstar.org
    ?         - Introduction and overview of IPython's features.
    %quickref - Quick reference.
    help      - Python's own help system.
    object?   - Details about 'object', use 'object??' for extra details.

    In [1]: import numpy as np

    In [2]: np.__version__
    Out[2]: '1.7.0'

    In [3]: np.mat([4,'5',6])
    Out[3]: matrix([['4', '5', '6']], dtype='|S1')

    In [4]: np.mat([4,'5',6], dtype=int)
    Out[4]: matrix([[4, 5, 6]])

###

As to your comment about coordinating with Statsmodels, you should see the links in the thread that Alan posted:

http://permalink.gmane.org/gmane.comp.python.numeric.general/56516
http://permalink.gmane.org/gmane.comp.python.numeric.general/56517

Josef's comments at the time seem to echo the issues the devs (and others) have with the matrix class. Maybe things have changed with Statsmodels.

I know I mentioned Sage and SageMathCloud before. I'll just point out that there are folks who use this for real research problems, not just as a pedagogical tool. They have Matrix/vector/column_matrix classes that do what you were expecting from your problems posted above. Indeed, below is a (truncated) cut and paste from a Sage Worksheet. (See http://www.sagemath.org/doc/tutorial/tour_linalg.html)

##

    In : Matrix([1,'2',3])
    Error in lines 1-1
    Traceback (most recent call last):
    TypeError: unable to find a common ring for all elements

    In : Matrix([[1,2,3],[4,5]])
    ValueError: List of rows is not valid (rows are wrong types or lengths)

    In : vector([1,2,3])
    (1, 2, 3)

    In : column_matrix([1,2,3])
    [1]
    [2]
    [3]

##

Large portions of the custom code and wrappers in Sage are written in Python. I don't think their Matrix object is a subclass of ndarray, so perhaps you could extract the Matrix code from Sage into a separate project, if you don't want to go through the Sage interface.

On Wed, Feb 11, 2015 at 11:54 AM, cjw c...@ncf.ca wrote:
> On 11-Feb-15 10:21 AM, Ryan Nelson wrote:
>> So:
>>
>>     In [2]: np.mat([4,'5',6])
>>     Out[2]: matrix([['4', '5', '6']], dtype='U11')
>>
>>     In [3]: np.mat([4,'5',6], dtype=int)
>>     Out[3]: matrix([[4, 5, 6]])
>
> Thanks Ryan,
>
> We are not singing from the same hymn book. Using PyScripter, I get:
>
>     *** Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit (AMD64)] on win32. ***
>     import numpy as np
>     print('Numpy version: ', np.__version__)
>     ('Numpy version: ', '1.9.0')
>
> Could you say which version you are using, please?
>
> Colin W
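The Sage behaviour quoted above (reject mixed types, reject ragged rows) can be approximated in front of NumPy with a small checking function. This is only a sketch: `strict_matrix` is a hypothetical helper, not part of NumPy or Sage, and the choice to accept only real numbers and return a float array is an assumption made for the example.

```python
import numbers

import numpy as np


def strict_matrix(rows):
    """Hypothetical constructor: return a 2-D float array or raise."""
    rows = [list(r) for r in rows]
    if not rows:
        raise ValueError('at least one row is required')
    width = len(rows[0])
    for r in rows:
        if len(r) != width:
            raise ValueError('rows have different lengths')
        for v in r:
            # bool is a Number subclass in Python, so exclude it explicitly.
            if isinstance(v, bool) or not isinstance(v, numbers.Real):
                raise TypeError('non-numeric element: %r' % (v,))
    return np.array(rows, dtype=float)


print(strict_matrix([[1, 2, 3], [4, 5, 6]]).shape)   # (2, 3)

for bad in ([[1, '2', 3]], [[1, 2, 3], [4, 5]]):
    try:
        strict_matrix(bad)
    except (TypeError, ValueError) as exc:
        print(type(exc).__name__, exc)
```

Checking element by element in Python is slow for large inputs, so a check like this trades speed for early errors.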
[Numpy-discussion] Matrix Class
It seems to be agreed that there are weaknesses in the existing Numpy Matrix Class.

Some problems are illustrated below.

I'll try to put some suggestions over the coming weeks and would appreciate comments.

Colin W.

Test Script:

    from numpy import mat

    if __name__ == '__main__':
        a = mat([4, 5, 6])  # Good
        print('a: ', a)
        b = mat([4, '5', 6])  # Not the expected result
        print('b: ', b)
        c = mat([[4, 5, 6], [7, 8]])  # Wrongly accepted as rectangular
        print('c: ', c)
        d = mat([[1, 2, 3]])
        try:
            d[0, 1] = 'b'  # Correctly flagged, not numeric
        except ValueError:
            print("d[0, 1]= 'b' # Correctly flagged, not numeric", ' ValueError')
        print('d: ', d)

Result:

    *** Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit (AMD64)] on win32. ***
    a: [[4 5 6]]
    b: [['4' '5' '6']]
    c: [[[4, 5, 6] [7, 8]]]
    d[0, 1]= 'b' # Correctly flagged, not numeric ValueError
    d: [[1 2 3]]
Re: [Numpy-discussion] Matrix Class [was numpy release]
On Apr 24, 2008, at 8:52 PM, Bill Spotz wrote:
> On Apr 24, 2008, at 5:45 PM, Timothy Hochberg wrote:
>> Bill Spotz wrote:
>>> I have generally thought about this in the context of, say, a Krylov-space iterative method, and what type of interface would lead to the most readable code.
>>
>> Can you whip up a small example, starting with the current implementation?
>
> This sounds like work. ;-)
>
> I'll shoot for putting something together this weekend. I think Tim is right; we've been talking around in circles, and we need something concrete.

I have posted my example (using the current interface) at http://www.scipy.org/ConjugateGradientExample . I have also added a link to it from Alan's http://www.scipy.org/MatrixIndexing so that they will be somewhat connected.

It is just one example, and provides one small area where the vector classes would be useful. I hope others will post similar examples, so that the design discussion can address concrete, agreed-upon situations.

Note that I started using row_vector and col_vector as the class names in my discussion. No reason to re-introduce camel-case to numpy, and I think "col" is a universally recognized abbreviation for column, and I like the symmetry of the three-letter row_ and col_ prefixes.

** Bill Spotz **
** Sandia National Laboratories  Voice: (505)845-0170 **
** P.O. Box 5800                 Fax: (505)284-0154 **
** Albuquerque, NM 87185-0370    Email: [EMAIL PROTECTED] **

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
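For readers without access to the linked page, the following is a generic conjugate gradient sketch written with plain 1-D and 2-D arrays. It is not Bill's posted example; the function name, tolerance and iteration cap are assumptions made for illustration.

```python
import numpy as np


def conjugate_gradient(a, b, tol=1e-10, max_iter=1000):
    """Solve a x = b for a symmetric positive definite 2-D array `a`."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    x = np.zeros_like(b)
    r = b - a.dot(x)          # residual
    p = r.copy()              # search direction
    rs_old = r.dot(r)
    for _ in range(max_iter):
        if np.sqrt(rs_old) < tol:
            break
        ap = a.dot(p)
        alpha = rs_old / p.dot(ap)
        x = x + alpha * p
        r = r - alpha * ap
        rs_new = r.dot(r)
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x


solution = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
print(solution)
```

Every product here is either matrix times vector (`a.dot(p)`) or vector times vector (`r.dot(r)`), so the readability question is mostly about how those two operations are spelled.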
Re: [Numpy-discussion] Matrix Class [was numpy release]
Chris.Barker wrote:
> Alan G Isaac wrote:
>> the cost of complexity should be justified by a gain in functionality.
>
> I don't think functionality is the right word here. the Matrix class(es) is all about clean, convenient API, i.e. style, not functionality -- we have all the functionality already, indeed we have it with plain old arrays, so I think that's really beside the point.

Not entirely; there's no good way to deal with arrays of matrices at present. This could be fixed by tweaking dot, but it could also be part of a reform of the matrix class.

[CHOP]

> Timothy Hochberg wrote:
>> 1. The matrices and arrays should become more alike if possible
>
> I'm not sure I agree -- the more alike they are, the less point there is to having them.

That's the best possible outcome. If some solution can be reached that naturally supports enough matrix operations on array, without significantly complexifying array, to satisfy the matrix users, then that would be great. Less stuff to maintain, less stuff to learn, etc., etc.

With that in mind, what is the minimum amount of stuff that matrix should support:

1. Indexing along a single axis should produce a row or column vector as appropriate.
2. '*' should mean matrix multiplication.
3. '**' should mean matrix exponentiation. I suspect that this is less crucial than 2, but no more difficult.

There's some other stuff that's less critical IMO (.H and .I for example), but feel free to yell at me if you think I'm mistaken.

There are some other properties that a fix should have as well, in my opinion and in some cases in others' opinions as well.

1. A single index into a matrix should yield a 1D object just as indexing an array does. This does not have to be inconsistent with #1 above; Chris Barker proposed one solution. I'm not sold on the details of that solution, but I approve of the direction that it points to. [In general A[i][j] should equal A[i,j]. I know that fancy indexing breaks this; that was a mistake IMO, but that's a discussion for another day.]
2. It should be possible to embed both vectors and matrices naturally into arrays. This would allow natural and efficient implementation of, for example, rotating 1000 3D vectors. One could spell that R * V, where R is a 3x3 matrix and V is a 1000x3 array, where the last axis is understood to be a vector.
3. I'm pretty certain there's a third thing I wanted to mention but it escapes me. It'll resurface at the most inopportune time.

Let's play with Chris Barker's RowVector and ColVector proposal for a moment. Let's suppose that we have four classes: Scalar, RowVector, ColVector and Matrix. They're all pretty much the same as today's array class except that:

1. They are displayed a little differently so that you can tell them apart.
2. __mul__ and __pow__ are treated differently.

Let's consider __mul__: when a RowVector is multiplied with a ColVector, the dot product is computed. If, for example, the arrays are both 2D, then they are treated as arrays of vectors and the dot product of each pair of vectors is computed in turn. I think broadcasting should work correctly, but I haven't thought that all the way through yet. Ignoring broadcasting for a moment, the rules are:

1. (n)D Scalar * (n)D Scalar = (n)D Scalar (elementwise)
2. (n)D RowVector * (n)D ColVector = (n-1)D Scalar (dot product)
3. (n+1)D Matrix * (n)D ColVector = (n)D ColVector (matrix-vector product)
4. (n)D Matrix * (n)D Matrix = (n)D Matrix (matrix product)

Other combinations would be an error. In principle you could do dyadic products, but I suspect we'd be better off using a separate function for that, since most of the time that would just indicate a mistake.

Note that in this example Scalar is playing the role of the present-day array, which I've assumed has magically vanished from the scene somehow.

This looks like it gets us most of the way towards where we want to be. There are some issues though. For example, all of a sudden you want two different transpose operators: one that transposes matrices, flips ColVectors to RowVectors and leaves Scalars alone, and another that manipulates the rest of the array structure. There is probably other stuff like that too; it doesn't look insurmountable to me right now, however.

Still, I'm not sold on the whole RowVector/ColVector/Matrix approach. I have a gut feeling that we'd be better off with a single array class that was somehow tagged to indicate its type. Also, I keep hoping to come up with an elegant generalization to this that turns arrays into quasi-tensor objects where all the matrix behavior falls out naturally. Sadly, attempts in this direction keep collapsing under their own weight, but I'm still thinking about it.

However, I do think that the RowVector/ColVector/Matrix approach is a step in the right direction and is certainly worth discussing to see where it leads.

-tim
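The multiplication rules proposed above can be tried out on plain arrays by letting the last one or two axes play the role of vector or matrix and treating the leading axes as "array of" dimensions. The sketch below uses `np.einsum`, which was added to NumPy after this thread, so it illustrates the semantics only and is not anything proposed in the discussion; the sizes and the rotation are made up.

```python
import numpy as np

rng = np.random.RandomState(0)

# Rule 2: stack of row vectors * stack of column vectors -> stack of dot products.
rows = rng.rand(1000, 3)
cols = rng.rand(1000, 3)
dots = np.einsum('...i,...i->...', rows, cols)         # shape (1000,)

# Rule 3 with one matrix: rotate 1000 3-D vectors about the z axis.
theta = np.pi / 2
rot = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                [np.sin(theta), np.cos(theta), 0.0],
                [0.0, 0.0, 1.0]])
rotated = np.einsum('ij,...j->...i', rot, cols)        # shape (1000, 3)

# Rule 4: stacks of matrices multiplied pairwise.
mats = rng.rand(1000, 3, 3)
prods = np.einsum('...ij,...jk->...ik', mats, mats)    # shape (1000, 3, 3)

print(dots.shape, rotated.shape, prods.shape)
```

The subscript strings make the row/column distinction explicit at each call, which is roughly the information the RowVector and ColVector classes would carry in the type instead.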
Re: [Numpy-discussion] Matrix Class [was numpy release]
On Apr 24, 2008, at 5:45 PM, Timothy Hochberg wrote:
> Bill Spotz wrote:
>> I have generally thought about this in the context of, say, a Krylov-space iterative method, and what type of interface would lead to the most readable code.
>
> Can you whip up a small example, starting with the current implementation?

This sounds like work. ;-)

I'll shoot for putting something together this weekend. I think Tim is right; we've been talking around in circles, and we need something concrete.

I also think that a conjugate gradient algorithm, say, is high-level and doesn't require iterating over rows (or columns). We should also look at (at least) one other algorithm, one that iterates over rows. I would suggest Gaussian elimination, unless this is too obviously a high-level functionality that should be part of the extension module.

** Bill Spotz **
** Sandia National Laboratories  Voice: (505)845-0170 **
** P.O. Box 5800                 Fax: (505)284-0154 **
** Albuquerque, NM 87185-0370    Email: [EMAIL PROTECTED] **
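As a concrete row-iterating algorithm of the kind suggested above, here is a sketch of Gaussian elimination with partial pivoting on plain arrays. The function name and the singularity check are choices made for this example, and in practice `numpy.linalg.solve` would normally be used instead.

```python
import numpy as np


def gaussian_solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    a = np.array(a, dtype=float)      # work on copies
    b = np.array(b, dtype=float)
    n = len(b)
    for k in range(n):
        # Swap the row with the largest pivot candidate into position k.
        pivot = k + int(np.argmax(np.abs(a[k:, k])))
        if a[pivot, k] == 0.0:
            raise ValueError('matrix is singular')
        if pivot != k:
            a[[k, pivot]] = a[[pivot, k]]
            b[[k, pivot]] = b[[pivot, k]]
        # Eliminate column k from every row below the pivot row.
        for i in range(k + 1, n):
            factor = a[i, k] / a[k, k]
            a[i, k:] -= factor * a[k, k:]
            b[i] -= factor * b[k]
    # Back substitution, again one row at a time.
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - a[i, i + 1:].dot(x[i + 1:])) / a[i, i]
    return x


x = gaussian_solve([[2, 1, -1], [-3, -1, 2], [-2, 1, 2]], [8, -11, -3])
print(x)
```

Each `a[i, k:]` is a 1-D slice of a row here; with a class whose rows stay 2-D, the same lines would need extra indexing, which is why an algorithm like this is a useful test case for the design.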