Re: [Numpy-discussion] Matrix Class

2015-02-14 Thread Jaime Fernández del Río
On Sat, Feb 14, 2015 at 5:21 PM,  wrote:

> On Sat, Feb 14, 2015 at 4:27 PM, Charles R Harris
>  wrote:
> >
> >
> > On Sat, Feb 14, 2015 at 12:36 PM,  wrote:
> >>
> >> On Sat, Feb 14, 2015 at 12:05 PM, cjw  wrote:
> >> >
> >> > On 14-Feb-15 11:35 AM, josef.p...@gmail.com wrote:
> >> >>
> >> >> On Wed, Feb 11, 2015 at 4:18 PM, Ryan Nelson 
> >> >> wrote:
> >> >>>
> >> >>> Colin,
> >> >>>
> >> >>> I currently use Py3.4 and Numpy 1.9.1. However, I built a quick test
> >> >>> conda
> >> >>> environment with Python2.7 and Numpy 1.7.0, and I get the same:
> >> >>>
> >> >>> 
> >> >>> Python 2.7.9 |Continuum Analytics, Inc.| (default, Dec 18 2014,
> >> >>> 16:57:52)
> >> >>> [MSC v.1500 64 bit (AMD64)]
> >> >>> Type "copyright", "credits" or "license" for more information.
> >> >>>
> >> >>> IPython 2.3.1 -- An enhanced Interactive Python.
> >> >>> Anaconda is brought to you by Continuum Analytics.
> >> >>> Please check out: http://continuum.io/thanks and
> https://binstar.org
> >> >>> ? -> Introduction and overview of IPython's features.
> >> >>> %quickref -> Quick reference.
> >> >>> help  -> Python's own help system.
> >> >>> object?   -> Details about 'object', use 'object??' for extra
> details.
> >> >>>
> >> >>> In [1]: import numpy as np
> >> >>>
> >> >>> In [2]: np.__version__
> >> >>> Out[2]: '1.7.0'
> >> >>>
> >> >>> In [3]: np.mat([4,'5',6])
> >> >>> Out[3]:
> >> >>> matrix([['4', '5', '6']],
> >> >>> dtype='|S1')
> >> >>>
> >> >>> In [4]: np.mat([4,'5',6], dtype=int)
> >> >>> Out[4]: matrix([[4, 5, 6]])
> >> >>> ###
> >> >>>
> >> >>> As to your comment about coordinating with Statsmodels, you should
> see
> >> >>> the
> >> >>> links in the thread that Alan posted:
> >> >>> http://permalink.gmane.org/gmane.comp.python.numeric.general/56516
> >> >>> http://permalink.gmane.org/gmane.comp.python.numeric.general/56517
> >> >>> Josef's comments at the time seem to echo the issues the devs (and
> >> >>> others)
> >> >>> have with the matrix class. Maybe things have changed with
> >> >>> Statsmodels.
> >> >>
> >> >> Not changed, we have a strict policy against using np.matrix.
> >> >>
> >> >> generic efficient versions for linear operators, kronecker or sparse
> >> >> block matrix style operations would be useful, but I would use array
> >> >> semantics, similar to using dot or linalg functions on ndarrays.
> >> >>
> >> >> Josef
> >> >> (long reply canceled because I'm writing too much that might only be
> >> >> of tangential interest or has been in some of the matrix discussion
> >> >> before.)
> >> >
> >> > Josef,
> >> >
> >> > Many thanks.  I have gained the impression that there is some
> antipathy
> >> > to
> >> > np.matrix, perhaps this is because, as others have suggested, the
> array
> >> > doesn't provide an appropriate framework.
> >>
> >> It's not directly antipathy, it's cost-benefit analysis.
> >>
> >> np.matrix has few advantages, but makes reading and maintaining code
> >> much more difficult.
> >> Having to watch out for multiplication `*` is a lot of extra work.
> >>
> >> Checking shapes and fixing bugs with unexpected dtypes is also a lot
> >> of work, but we have large benefits.
> >> For a long time the policy in statsmodels was to keep pandas out of
> >> the core of functions (i.e. out of the actual calculations) and
> >> restrict it to inputs and returns. However, pandas is becoming more
> >> popular and can do some things much better than plain numpy, so it is
> >> slowly moving inside some of our core calculations.
> >> It's still an easy source of bugs, but we do gain something.
> >
> >
> > Any bits of Pandas that might be good for numpy/scipy to steal?
>
> I'm not a Pandas expert.
> Some of it comes into statsmodels because we need the data handling
> also inside a function, e.g. keeping track of labels, indices, and so
> on. Another reason is that contributors are more familiar with
> pandas's way of solving problems, even if I suspect numpy would be
> more efficient.
>
> However, a recent change replaced what I would have written with np.unique
> with pandas.factorize, which is supposed to be faster.
> https://github.com/statsmodels/statsmodels/pull/2213


Numpy could use some form of hash table for its arraysetops, which is where
pandas is getting its advantage from. It is a tricky thing though, see e.g.
these timings:

a = np.random.randint(10, size=1000)
srs = pd.Series(a)

%timeit np.unique(a)

10 loops, best of 3: 13.2 µs per loop

%timeit srs.unique()

10 loops, best of 3: 15.6 µs per loop


%timeit pd.factorize(a)

1 loops, best of 3: 25.6 µs per loop

%timeit np.unique(a, return_inverse=True)

1 loops, best of 3: 82.5 µs per loop

These last timings are with numpy 1.9.0 and pandas 0.14.0, so numpy doesn't have
https://github.com/numpy/numpy/pull/5012 yet, which makes the operation in
which numpy is slower about 2x faster. And if you need your unique values
sorted, then things are more even, especially if numpy runs 2x faster:
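
To make the sorted-versus-unsorted point concrete, here is a minimal sketch (not the timing code; the small array is made up for illustration). It shows that `np.unique(..., return_inverse=True)` always returns sorted unique values, while `pd.factorize` keeps first-appearance order unless `sort=True` is passed:

```python
import numpy as np
import pandas as pd

# Small made-up input, chosen so that first-appearance order differs from sorted order.
a = np.array([3, 1, 3, 2, 1, 3])

# np.unique sorts the unique values; `inverse` indexes into that sorted array.
uniques_np, inverse = np.unique(a, return_inverse=True)

# pd.factorize keeps the order in which values first appear...
codes, uniques_pd = pd.factorize(a)

# ...unless sort=True is requested, which costs an extra sorting step.
codes_sorted, uniques_sorted = pd.factorize(a, sort=True)

# Every encoding reconstructs the original array.
assert np.array_equal(uniques_np[inverse], a)
assert np.array_equal(uniques_pd[codes], a)
assert np.array_equal(uniques_sorted[codes_sorted], a)
```

So a fair comparison depends on whether the caller needs the unique values in sorted order or only needs consistent integer codes.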

Re: [Numpy-discussion] Matrix Class

2015-02-14 Thread josef.pktd
On Sat, Feb 14, 2015 at 4:27 PM, Charles R Harris
 wrote:
>
>
> On Sat, Feb 14, 2015 at 12:36 PM,  wrote:
>>
>> On Sat, Feb 14, 2015 at 12:05 PM, cjw  wrote:
>> >
>> > On 14-Feb-15 11:35 AM, josef.p...@gmail.com wrote:
>> >>
>> >> On Wed, Feb 11, 2015 at 4:18 PM, Ryan Nelson 
>> >> wrote:
>> >>>
>> >>> Colin,
>> >>>
>> >>> I currently use Py3.4 and Numpy 1.9.1. However, I built a quick test
>> >>> conda
>> >>> environment with Python2.7 and Numpy 1.7.0, and I get the same:
>> >>>
>> >>> 
>> >>> Python 2.7.9 |Continuum Analytics, Inc.| (default, Dec 18 2014,
>> >>> 16:57:52)
>> >>> [MSC v.1500 64 bit (AMD64)]
>> >>> Type "copyright", "credits" or "license" for more information.
>> >>>
>> >>> IPython 2.3.1 -- An enhanced Interactive Python.
>> >>> Anaconda is brought to you by Continuum Analytics.
>> >>> Please check out: http://continuum.io/thanks and https://binstar.org
>> >>> ? -> Introduction and overview of IPython's features.
>> >>> %quickref -> Quick reference.
>> >>> help  -> Python's own help system.
>> >>> object?   -> Details about 'object', use 'object??' for extra details.
>> >>>
>> >>> In [1]: import numpy as np
>> >>>
>> >>> In [2]: np.__version__
>> >>> Out[2]: '1.7.0'
>> >>>
>> >>> In [3]: np.mat([4,'5',6])
>> >>> Out[3]:
>> >>> matrix([['4', '5', '6']],
>> >>> dtype='|S1')
>> >>>
>> >>> In [4]: np.mat([4,'5',6], dtype=int)
>> >>> Out[4]: matrix([[4, 5, 6]])
>> >>> ###
>> >>>
>> >>> As to your comment about coordinating with Statsmodels, you should see
>> >>> the
>> >>> links in the thread that Alan posted:
>> >>> http://permalink.gmane.org/gmane.comp.python.numeric.general/56516
>> >>> http://permalink.gmane.org/gmane.comp.python.numeric.general/56517
>> >>> Josef's comments at the time seem to echo the issues the devs (and
>> >>> others)
>> >>> have with the matrix class. Maybe things have changed with
>> >>> Statsmodels.
>> >>
>> >> Not changed, we have a strict policy against using np.matrix.
>> >>
>> >> generic efficient versions for linear operators, kronecker or sparse
>> >> block matrix style operations would be useful, but I would use array
>> >> semantics, similar to using dot or linalg functions on ndarrays.
>> >>
>> >> Josef
>> >> (long reply canceled because I'm writing too much that might only be
>> >> of tangential interest or has been in some of the matrix discussion
>> >> before.)
>> >
>> > Josef,
>> >
>> > Many thanks.  I have gained the impression that there is some antipathy
>> > to
>> > np.matrix, perhaps this is because, as others have suggested, the array
>> > doesn't provide an appropriate framework.
>>
>> It's not directly antipathy, it's cost-benefit analysis.
>>
>> np.matrix has few advantages, but makes reading and maintaining code
>> much more difficult.
>> Having to watch out for multiplication `*` is a lot of extra work.
>>
>> Checking shapes and fixing bugs with unexpected dtypes is also a lot
>> of work, but we have large benefits.
>> For a long time the policy in statsmodels was to keep pandas out of
>> the core of functions (i.e. out of the actual calculations) and
>> restrict it to inputs and returns. However, pandas is becoming more
>> popular and can do some things much better than plain numpy, so it is
>> slowly moving inside some of our core calculations.
>> It's still an easy source of bugs, but we do gain something.
>
>
> Any bits of Pandas that might be good for numpy/scipy to steal?

I'm not a Pandas expert.
Some of it comes into statsmodels because we need the data handling
also inside a function, e.g. keeping track of labels, indices, and so
on. Another reason is that contributors are more familiar with
pandas's way of solving problems, even if I suspect numpy would be
more efficient.

However, a recent change replaced what I would have written with np.unique
with pandas.factorize, which is supposed to be faster.
https://github.com/statsmodels/statsmodels/pull/2213

Two or three years ago my numpy way of group handling (using
np.unique, bincount and similar) was still faster than the pandas
`apply` version; I'm not sure that's still true.
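
As an illustration of the kind of numpy-only group handling meant here, a minimal sketch with made-up labels and values (this is not statsmodels code): `np.unique` turns the labels into integer codes, and `np.bincount` then aggregates per group.

```python
import numpy as np

# Made-up example data: one label and one value per observation.
labels = np.array(["b", "a", "b", "c", "a"])
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# Map each label to an integer code; `groups` holds the sorted unique labels.
groups, codes = np.unique(labels, return_inverse=True)

# bincount with weights sums the values that share a code (a group sum).
sums = np.bincount(codes, weights=values)

# bincount without weights counts the observations per group.
counts = np.bincount(codes)

# Group means follow by elementwise division.
means = sums / counts
```

Here `groups` is `['a', 'b', 'c']` and `means` lines up with it entry by entry, which is roughly what a pandas `groupby(...).mean()` would give for the same data.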


And to emphasize: all our heavy stuff, especially the big models, still
has only numpy and scipy inside (with the exception of one model
waiting in a PR).

Josef


>
> 
>
> Chuck
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>


Re: [Numpy-discussion] Matrix Class

2015-02-14 Thread Charles R Harris
On Sat, Feb 14, 2015 at 12:36 PM,  wrote:

> On Sat, Feb 14, 2015 at 12:05 PM, cjw  wrote:
> >
> > On 14-Feb-15 11:35 AM, josef.p...@gmail.com wrote:
> >>
> >> On Wed, Feb 11, 2015 at 4:18 PM, Ryan Nelson 
> >> wrote:
> >>>
> >>> Colin,
> >>>
> >>> I currently use Py3.4 and Numpy 1.9.1. However, I built a quick test
> >>> conda
> >>> environment with Python2.7 and Numpy 1.7.0, and I get the same:
> >>>
> >>> 
> >>> Python 2.7.9 |Continuum Analytics, Inc.| (default, Dec 18 2014,
> 16:57:52)
> >>> [MSC v.1500 64 bit (AMD64)]
> >>> Type "copyright", "credits" or "license" for more information.
> >>>
> >>> IPython 2.3.1 -- An enhanced Interactive Python.
> >>> Anaconda is brought to you by Continuum Analytics.
> >>> Please check out: http://continuum.io/thanks and https://binstar.org
> >>> ? -> Introduction and overview of IPython's features.
> >>> %quickref -> Quick reference.
> >>> help  -> Python's own help system.
> >>> object?   -> Details about 'object', use 'object??' for extra details.
> >>>
> >>> In [1]: import numpy as np
> >>>
> >>> In [2]: np.__version__
> >>> Out[2]: '1.7.0'
> >>>
> >>> In [3]: np.mat([4,'5',6])
> >>> Out[3]:
> >>> matrix([['4', '5', '6']],
> >>> dtype='|S1')
> >>>
> >>> In [4]: np.mat([4,'5',6], dtype=int)
> >>> Out[4]: matrix([[4, 5, 6]])
> >>> ###
> >>>
> >>> As to your comment about coordinating with Statsmodels, you should see
> >>> the
> >>> links in the thread that Alan posted:
> >>> http://permalink.gmane.org/gmane.comp.python.numeric.general/56516
> >>> http://permalink.gmane.org/gmane.comp.python.numeric.general/56517
> >>> Josef's comments at the time seem to echo the issues the devs (and
> >>> others)
> >>> have with the matrix class. Maybe things have changed with Statsmodels.
> >>
> >> Not changed, we have a strict policy against using np.matrix.
> >>
> >> generic efficient versions for linear operators, kronecker or sparse
> >> block matrix style operations would be useful, but I would use array
> >> semantics, similar to using dot or linalg functions on ndarrays.
> >>
> >> Josef
> >> (long reply canceled because I'm writing too much that might only be
> >> of tangential interest or has been in some of the matrix discussion
> >> before.)
> >
> > Josef,
> >
> > Many thanks.  I have gained the impression that there is some antipathy
> to
> > np.matrix, perhaps this is because, as others have suggested, the array
> > doesn't provide an appropriate framework.
>
> It's not directly antipathy, it's cost-benefit analysis.
>
> np.matrix has few advantages, but makes reading and maintaining code
> much more difficult.
> Having to watch out for multiplication `*` is a lot of extra work.
>
> Checking shapes and fixing bugs with unexpected dtypes is also a lot
> of work, but we have large benefits.
> For a long time the policy in statsmodels was to keep pandas out of
> the core of functions (i.e. out of the actual calculations) and
> restrict it to inputs and returns. However, pandas is becoming more
> popular and can do some things much better than plain numpy, so it is
> slowly moving inside some of our core calculations.
> It's still an easy source of bugs, but we do gain something.
>

Any bits of Pandas that might be good for numpy/scipy to steal?



Chuck


Re: [Numpy-discussion] Matrix Class

2015-02-14 Thread josef.pktd
On Sat, Feb 14, 2015 at 12:05 PM, cjw  wrote:
>
> On 14-Feb-15 11:35 AM, josef.p...@gmail.com wrote:
>>
>> On Wed, Feb 11, 2015 at 4:18 PM, Ryan Nelson 
>> wrote:
>>>
>>> Colin,
>>>
>>> I currently use Py3.4 and Numpy 1.9.1. However, I built a quick test
>>> conda
>>> environment with Python2.7 and Numpy 1.7.0, and I get the same:
>>>
>>> 
>>> Python 2.7.9 |Continuum Analytics, Inc.| (default, Dec 18 2014, 16:57:52)
>>> [MSC v.1500 64 bit (AMD64)]
>>> Type "copyright", "credits" or "license" for more information.
>>>
>>> IPython 2.3.1 -- An enhanced Interactive Python.
>>> Anaconda is brought to you by Continuum Analytics.
>>> Please check out: http://continuum.io/thanks and https://binstar.org
>>> ? -> Introduction and overview of IPython's features.
>>> %quickref -> Quick reference.
>>> help  -> Python's own help system.
>>> object?   -> Details about 'object', use 'object??' for extra details.
>>>
>>> In [1]: import numpy as np
>>>
>>> In [2]: np.__version__
>>> Out[2]: '1.7.0'
>>>
>>> In [3]: np.mat([4,'5',6])
>>> Out[3]:
>>> matrix([['4', '5', '6']],
>>> dtype='|S1')
>>>
>>> In [4]: np.mat([4,'5',6], dtype=int)
>>> Out[4]: matrix([[4, 5, 6]])
>>> ###
>>>
>>> As to your comment about coordinating with Statsmodels, you should see
>>> the
>>> links in the thread that Alan posted:
>>> http://permalink.gmane.org/gmane.comp.python.numeric.general/56516
>>> http://permalink.gmane.org/gmane.comp.python.numeric.general/56517
>>> Josef's comments at the time seem to echo the issues the devs (and
>>> others)
>>> have with the matrix class. Maybe things have changed with Statsmodels.
>>
>> Not changed, we have a strict policy against using np.matrix.
>>
>> generic efficient versions for linear operators, kronecker or sparse
>> block matrix style operations would be useful, but I would use array
>> semantics, similar to using dot or linalg functions on ndarrays.
>>
>> Josef
>> (long reply canceled because I'm writing too much that might only be
>> of tangential interest or has been in some of the matrix discussion
>> before.)
>
> Josef,
>
> Many thanks.  I have gained the impression that there is some antipathy to
> np.matrix, perhaps this is because, as others have suggested, the array
> doesn't provide an appropriate framework.

It's not directly antipathy, it's cost-benefit analysis.

np.matrix has few advantages, but makes reading and maintaining code
much more difficult.
Having to watch out for multiplication `*` is a lot of extra work.
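
A minimal sketch of the `*` pitfall, using a made-up 2x2 example: the same expression means elementwise multiplication for an ndarray and matrix multiplication for np.matrix, so the reader has to know the type of every operand.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])   # plain ndarray
M = np.matrix(A)                 # same data wrapped as np.matrix

# For ndarrays, `*` multiplies element by element.
elementwise = A * A

# The matrix product on ndarrays is spelled out explicitly.
matprod_arr = np.dot(A, A)

# For np.matrix, the very same `*` is the matrix product.
matprod_mat = M * M

# Same operator, different results depending on the operand type.
assert not np.array_equal(elementwise, np.asarray(matprod_mat))
assert np.array_equal(matprod_arr, np.asarray(matprod_mat))
```

With array semantics throughout, `np.dot(A, A)` reads the same no matter where `A` came from.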

Checking shapes and fixing bugs with unexpected dtypes is also a lot
of work, but we have large benefits.
For a long time the policy in statsmodels was to keep pandas out of
the core of functions (i.e. out of the actual calculations) and
restrict it to inputs and returns. However, pandas is becoming more
popular and can do some things much better than plain numpy, so it is
slowly moving inside some of our core calculations.
It's still an easy source of bugs, but we do gain something.

Benefits like these don't exist for np.matrix.

>
> Where are such policy decisions documented?  Numpy doesn't appear to have a
> BDFL.

In general it's a mix of mailing list discussions and discussion in
issues and PRs.
I'm not directly involved in numpy and don't subscribe to numpy's
GitHub notifications.

For scipy (and partially for statsmodels): I think large parts of the
policies for code and workflow are not explicitly specified. They are
more an understanding among maintainers and developers that can slowly
change over time, built up through spread-out discussion as temporary
consensus (or in the absence of strong objections).
scipy has a hacking text file to describe some of it, but I haven't
read it in ages.

(long term changes compared to 6 years ago: required code review and
required test coverage.)

Josef


>
> I had read Alan's links back in February and now have note of them.
>
> Colin W.
>
>>
>>
>>
>>> I know I mentioned Sage and SageMathCloud before. I'll just point out
>>> that
>>> there are folks that use this for real research problems, not just as a
>>> pedagogical tool. They have a Matrix/vector/column_matrix class that do
>>> what
>>> you were expecting from your problems posted above. Indeed below is a
>>> (truncated) cut and paste from a Sage Worksheet. (See
>>> http://www.sagemath.org/doc/tutorial/tour_linalg.html)
>>> ##
>>> In : Matrix([1,'2',3])
>>> Error in lines 1-1
>>> Traceback (most recent call last):
>>> TypeError: unable to find a common ring for all elements
>>>
>>> In : Matrix([[1,2,3],[4,5]])
>>> ValueError: List of rows is not valid (rows are wrong types or lengths)
>>>
>>> In : vector([1,2,3])
>>> (1, 2, 3)
>>>
>>> In : column_matrix([1,2,3])
>>> [1]
>>> [2]
>>> [3]
>>> ##
>>>
>>> Large portions of the custom code and wrappers in Sage are written in
>>> Python. I don't think their Matrix object is a subclass of ndarray, so
>>> perhaps you could strip out the Matrix stuff from here to make a separate
>>> pr

Re: [Numpy-discussion] Matrix Class

2015-02-14 Thread cjw

On 14-Feb-15 11:35 AM, josef.p...@gmail.com wrote:
> On Wed, Feb 11, 2015 at 4:18 PM, Ryan Nelson  wrote:
>> Colin,
>>
>> I currently use Py3.4 and Numpy 1.9.1. However, I built a quick test conda
>> environment with Python2.7 and Numpy 1.7.0, and I get the same:
>>
>> 
>> Python 2.7.9 |Continuum Analytics, Inc.| (default, Dec 18 2014, 16:57:52)
>> [MSC v.1500 64 bit (AMD64)]
>> Type "copyright", "credits" or "license" for more information.
>>
>> IPython 2.3.1 -- An enhanced Interactive Python.
>> Anaconda is brought to you by Continuum Analytics.
>> Please check out: http://continuum.io/thanks and https://binstar.org
>> ? -> Introduction and overview of IPython's features.
>> %quickref -> Quick reference.
>> help  -> Python's own help system.
>> object?   -> Details about 'object', use 'object??' for extra details.
>>
>> In [1]: import numpy as np
>>
>> In [2]: np.__version__
>> Out[2]: '1.7.0'
>>
>> In [3]: np.mat([4,'5',6])
>> Out[3]:
>> matrix([['4', '5', '6']],
>> dtype='|S1')
>>
>> In [4]: np.mat([4,'5',6], dtype=int)
>> Out[4]: matrix([[4, 5, 6]])
>> ###
>>
>> As to your comment about coordinating with Statsmodels, you should see the
>> links in the thread that Alan posted:
>> http://permalink.gmane.org/gmane.comp.python.numeric.general/56516
>> http://permalink.gmane.org/gmane.comp.python.numeric.general/56517
>> Josef's comments at the time seem to echo the issues the devs (and others)
>> have with the matrix class. Maybe things have changed with Statsmodels.
> Not changed, we have a strict policy against using np.matrix.
>
> generic efficient versions for linear operators, kronecker or sparse
> block matrix style operations would be useful, but I would use array
> semantics, similar to using dot or linalg functions on ndarrays.
>
> Josef
> (long reply canceled because I'm writing too much that might only be
> of tangential interest or has been in some of the matrix discussion
> before.)
Josef,

Many thanks.  I have gained the impression that there is some antipathy
to np.matrix; perhaps this is because, as others have suggested, the
array doesn't provide an appropriate framework.

Where are such policy decisions documented?  Numpy doesn't appear to 
have a BDFL.

I had read Alan's links back in February and have now made a note of them.

Colin W.
>
>
>
>> I know I mentioned Sage and SageMathCloud before. I'll just point out that
>> there are folks that use this for real research problems, not just as a
>> pedagogical tool. They have a Matrix/vector/column_matrix class that do what
>> you were expecting from your problems posted above. Indeed below is a
>> (truncated) cut and paste from a Sage Worksheet. (See
>> http://www.sagemath.org/doc/tutorial/tour_linalg.html)
>> ##
>> In : Matrix([1,'2',3])
>> Error in lines 1-1
>> Traceback (most recent call last):
>> TypeError: unable to find a common ring for all elements
>>
>> In : Matrix([[1,2,3],[4,5]])
>> ValueError: List of rows is not valid (rows are wrong types or lengths)
>>
>> In : vector([1,2,3])
>> (1, 2, 3)
>>
>> In : column_matrix([1,2,3])
>> [1]
>> [2]
>> [3]
>> ##
>>
>> Large portions of the custom code and wrappers in Sage are written in
>> Python. I don't think their Matrix object is a subclass of ndarray, so
>> perhaps you could strip out the Matrix stuff from here to make a separate
>> project with just the Matrix stuff, if you don't want to go through the Sage
>> interface.
>>
>>
>> On Wed, Feb 11, 2015 at 11:54 AM, cjw  wrote:
>>>
>>> On 11-Feb-15 10:21 AM, Ryan Nelson wrote:
>>>
>>> So:
>>>
>>> In [2]: np.mat([4,'5',6])
>>> Out[2]:
>>> matrix([['4', '5', '6']], dtype='<U1')
>>>
>>> In [3]: np.mat([4,'5',6], dtype=int)
>>> Out[3]: matrix([[4, 5, 6]])
>>>
>>> Thanks Ryan,
>>>
>>> We are not singing from the same hymn book.
>>>
>>> Using PyScripter, I get:
>>>
>>> *** Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit
>>> (AMD64)] on win32. ***
>>> >>> import numpy as np
>>> >>> print('Numpy version: ', np.__version__)
>>> ('Numpy version: ', '1.9.0')
>>> Could you say which version you are using please?
>>>
>>> Colin W
>>>
>>> On Tue, Feb 10, 2015 at 5:07 PM, cjw  wrote:
>>>
>>> It seems to be agreed that there are weaknesses in the existing Numpy
>>> Matrix
>>> Class.
>>>
>>> Some problems are illustrated below.
>>>
>>> I'll try to put some suggestions over the coming weeks and would
>>> appreciate
>>> comments.
>>>
>>> Colin W.
>>>
>>> Test Script:
>>>
>>> if __name__ == '__main__':
>>>  a= mat([4, 5, 6])   # Good
>>>  print('a: ', a)
>>>  b= mat([4, '5', 6]) # Not the expected result
>>>  print('b: ', b)
>>>  c= mat([[4, 5, 6], [7, 8]]) # Wrongly accepted as rectangular
>>>  print('c: ', c)
>>>  d= mat([[1, 2, 3]])
>>>  try:
>>>  d[0, 1]= 'b'# Correctly flagged, not numeric
>>>  except ValueError:
>>>  print("d[0, 1]= 'b' # Correctly flagged, not 

Re: [Numpy-discussion] Matrix Class

2015-02-14 Thread josef.pktd
On Wed, Feb 11, 2015 at 4:18 PM, Ryan Nelson  wrote:
> Colin,
>
> I currently use Py3.4 and Numpy 1.9.1. However, I built a quick test conda
> environment with Python2.7 and Numpy 1.7.0, and I get the same:
>
> 
> Python 2.7.9 |Continuum Analytics, Inc.| (default, Dec 18 2014, 16:57:52)
> [MSC v.1500 64 bit (AMD64)]
> Type "copyright", "credits" or "license" for more information.
>
> IPython 2.3.1 -- An enhanced Interactive Python.
> Anaconda is brought to you by Continuum Analytics.
> Please check out: http://continuum.io/thanks and https://binstar.org
> ? -> Introduction and overview of IPython's features.
> %quickref -> Quick reference.
> help  -> Python's own help system.
> object?   -> Details about 'object', use 'object??' for extra details.
>
> In [1]: import numpy as np
>
> In [2]: np.__version__
> Out[2]: '1.7.0'
>
> In [3]: np.mat([4,'5',6])
> Out[3]:
> matrix([['4', '5', '6']],
>        dtype='|S1')
>
> In [4]: np.mat([4,'5',6], dtype=int)
> Out[4]: matrix([[4, 5, 6]])
> ###
>
> As to your comment about coordinating with Statsmodels, you should see the
> links in the thread that Alan posted:
> http://permalink.gmane.org/gmane.comp.python.numeric.general/56516
> http://permalink.gmane.org/gmane.comp.python.numeric.general/56517
> Josef's comments at the time seem to echo the issues the devs (and others)
> have with the matrix class. Maybe things have changed with Statsmodels.

Not changed, we have a strict policy against using np.matrix.

Generic efficient versions for linear operators, Kronecker or sparse
block matrix style operations would be useful, but I would use array
semantics, similar to using dot or linalg functions on ndarrays.
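
A minimal sketch of what array semantics look like for a Kronecker-style operation, with made-up 2x2 inputs: everything stays a plain ndarray, and the linear algebra is done through `np.kron`, `np.dot` and `np.linalg` functions rather than through operators on a matrix subclass.

```python
import numpy as np

# Made-up small operands; both are invertible, so their Kronecker product is too.
A = np.array([[2.0, 0.0], [0.0, 3.0]])
B = np.array([[1.0, 1.0], [0.0, 1.0]])

# Kronecker product as a function call; the result is an ordinary 4x4 ndarray.
K = np.kron(A, B)

# Solve K x = b with a linalg function instead of forming an inverse.
b = np.ones(4)
x = np.linalg.solve(K, b)

# Check the solution with an explicit matrix-vector product.
assert np.allclose(np.dot(K, x), b)
assert type(K) is np.ndarray
```

This dense version is only meant to show the calling style; an efficient linear-operator or sparse block implementation would avoid building `K` explicitly.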

Josef
(long reply canceled because I'm writing too much that might only be
of tangential interest or has been in some of the matrix discussion
before.)




>
> I know I mentioned Sage and SageMathCloud before. I'll just point out that
> there are folks that use this for real research problems, not just as a
> pedagogical tool. They have a Matrix/vector/column_matrix class that do what
> you were expecting from your problems posted above. Indeed below is a
> (truncated) cut and paste from a Sage Worksheet. (See
> http://www.sagemath.org/doc/tutorial/tour_linalg.html)
> ##
> In : Matrix([1,'2',3])
> Error in lines 1-1
> Traceback (most recent call last):
> TypeError: unable to find a common ring for all elements
>
> In : Matrix([[1,2,3],[4,5]])
> ValueError: List of rows is not valid (rows are wrong types or lengths)
>
> In : vector([1,2,3])
> (1, 2, 3)
>
> In : column_matrix([1,2,3])
> [1]
> [2]
> [3]
> ##
>
> Large portions of the custom code and wrappers in Sage are written in
> Python. I don't think their Matrix object is a subclass of ndarray, so
> perhaps you could strip out the Matrix stuff from here to make a separate
> project with just the Matrix stuff, if you don't want to go through the Sage
> interface.
>
>
> On Wed, Feb 11, 2015 at 11:54 AM, cjw  wrote:
>>
>>
>> On 11-Feb-15 10:21 AM, Ryan Nelson wrote:
>>
>> So:
>>
>> In [2]: np.mat([4,'5',6])
>> Out[2]:
>> matrix([['4', '5', '6']], dtype='<U1')
>>
>> In [3]: np.mat([4,'5',6], dtype=int)
>> Out[3]: matrix([[4, 5, 6]])
>>
>> Thanks Ryan,
>>
>> We are not singing from the same hymn book.
>>
>> Using PyScripter, I get:
>>
>> *** Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit
>> (AMD64)] on win32. ***
>> >>> import numpy as np
>> >>> print('Numpy version: ', np.__version__)
>> ('Numpy version: ', '1.9.0')
>> >>>
>>
>> Could you say which version you are using please?
>>
>> Colin W
>>
>> On Tue, Feb 10, 2015 at 5:07 PM, cjw  wrote:
>>
>> It seems to be agreed that there are weaknesses in the existing Numpy
>> Matrix
>> Class.
>>
>> Some problems are illustrated below.
>>
>> I'll try to put some suggestions over the coming weeks and would
>> appreciate
>> comments.
>>
>> Colin W.
>>
>> Test Script:
>>
>> # The Result below implies print() was used as a function on Python 2.7.
>> from __future__ import print_function
>> from numpy import mat
>>
>> if __name__ == '__main__':
>>     a= mat([4, 5, 6])   # Good
>>     print('a: ', a)
>>     b= mat([4, '5', 6]) # Not the expected result
>>     print('b: ', b)
>>     c= mat([[4, 5, 6], [7, 8]]) # Wrongly accepted as rectangular
>>     print('c: ', c)
>>     d= mat([[1, 2, 3]])
>>     try:
>>         d[0, 1]= 'b'    # Correctly flagged, not numeric
>>     except ValueError:
>>         print("d[0, 1]= 'b' # Correctly flagged, not numeric",
>>               ' ValueError')
>>     print('d: ', d)
>>
>> Result:
>>
>> *** Python 2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit
>> (AMD64)] on win32. ***
>>
>> a:  [[4 5 6]]
>> b:  [['4' '5' '6']]
>> c:  [[[4, 5, 6] [7, 8]]]
>> d[0, 1]= 'b' # Correctly flagged, not numeric  ValueError
>> d:  [[1 2 3]]
>>
>>
>>
>>