Re: [Numpy-discussion] Adding keyword to asarray and asanyarray.

2015-03-06 Thread Benjamin Root
On Fri, Mar 6, 2015 at 7:59 AM, Charles R Harris 
wrote:

> Datetime64 seems to use the highest precision
>
> In [12]: result_type(ones(1, dtype='datetime64[D]'), 'datetime64[us]')
> Out[12]: dtype('
> In [13]: result_type(ones(1, dtype='datetime64[D]'), 'datetime64[Y]')
> Out[13]: dtype('


Ah, yes, that's what I'm looking for. +1 from me to have this in
asarray/asanyarray. Of course, there are always the usual caveats about
converting your datetime data in this manner, but this would be helpful in
many situations when writing functions that expect to deal with temporal data
at the resolution of minutes or some such.
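
For illustration only, here is a minimal sketch of the kind of function described above, one that accepts datetime64 data and guarantees at least minute resolution. The helper name as_datetime_at_least and its unit argument are hypothetical and are not part of NumPy or of the PR; it relies on np.result_type picking the finer of two datetime64 units.

```python
import numpy as np

def as_datetime_at_least(a, unit="m"):
    """Hypothetical helper: datetime64 array at `unit` resolution or finer."""
    arr = np.asarray(a)
    if arr.dtype.kind != "M":
        raise TypeError(f"expected datetime64 data, got {arr.dtype}")
    # Promotion between two datetime64 units picks the finer one.
    target = np.result_type(arr.dtype, np.dtype(f"datetime64[{unit}]"))
    # copy=False avoids a copy when the dtype already matches.
    return arr.astype(target, copy=False)

days = np.array(["2015-03-06"], dtype="datetime64[D]")
secs = np.array(["2015-03-06T07:59:00"], dtype="datetime64[s]")
```

With this sketch, days would be converted to minute resolution while secs, already finer than minutes, keeps its unit. The usual caveat applies: converting to a much finer unit can overflow the 64-bit range for dates far from the epoch.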

Cheers!
Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Adding keyword to asarray and asanyarray.

2015-03-06 Thread josef.pktd
On Fri, Mar 6, 2015 at 7:59 AM, Charles R Harris
 wrote:
>
>
> On Thu, Mar 5, 2015 at 10:02 PM,  wrote:
>>
>>
>>
>> How does this work for object arrays, or datetime?
>>
>> Can I specify at least float32 or float64, and it raises an exception
>> if it cannot be converted?
>>
>> The problem we have in statsmodels is that pandas frequently uses
>> object arrays and it messes up patsy or statsmodels if it's not
>> explicitly converted.
>
>
> Object arrays go to object arrays, datetime64 depends.
>
> In [10]: result_type(ones(1, dtype=object_), float32)
> Out[10]: dtype('O')
>
>
> Datetime64 seems to use the highest precision
>
> In [12]: result_type(ones(1, dtype='datetime64[D]'), 'datetime64[us]')
> Out[12]: dtype('
> In [13]: result_type(ones(1, dtype='datetime64[D]'), 'datetime64[Y]')
> Out[13]: dtype('
> but doesn't convert to float
>
> In [11]: result_type(ones(1, dtype='datetime64[D]'), float32)
> ---
> TypeError Traceback (most recent call last)
>  in ()
> > 1 result_type(ones(1, dtype='datetime64[D]'), float32)
>
> TypeError: invalid type promotion
>
> What would you like it to do?

Note: the dtype handling in statsmodels is still a mess, and we just
plugged some of the worst cases.


What we would need is asarray with at least a minimum precision (e.g.
float32) and raise an exception if it's not numeric, like string,
object, custom dtypes ...

However, we need custom dtype handling in statsmodels anyway, so the
enhancement to asarray with exceptions would mainly be a convenient way to
get something to work with, because pandas and numpy are now "object
array friendly".

I assume scipy also has insufficient checks for non-numeric dtypes, AFAIR.
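
A rough sketch of the behaviour asked for above, assuming it is written as a standalone wrapper rather than as a keyword on asarray. The name asarray_numeric and the min_dtype parameter are hypothetical. Note that boolean input is also rejected by this check, which may or may not be what a given caller wants.

```python
import numpy as np

def asarray_numeric(a, min_dtype=np.float32):
    """Hypothetical helper: at least `min_dtype`, rejecting non-numeric input."""
    arr = np.asarray(a)
    # Object, string and datetime64 dtypes are not subtypes of np.number.
    if not np.issubdtype(arr.dtype, np.number):
        raise TypeError(f"expected a numeric dtype, got {arr.dtype}")
    # Promote the input dtype with the requested minimum.
    target = np.result_type(arr.dtype, np.dtype(min_dtype))
    return np.asarray(arr, dtype=target)
```

Called on an object array, such as one pandas might hand over, this sketch raises TypeError instead of silently passing the object dtype along.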


Josef


>
> Chuck
>


Re: [Numpy-discussion] Adding keyword to asarray and asanyarray.

2015-03-06 Thread Charles R Harris
On Thu, Mar 5, 2015 at 10:02 PM,  wrote:

> How does this work for object arrays, or datetime?
>
> Can I specify at least float32 or float64, and it raises an exception
> if it cannot be converted?
>
> The problem we have in statsmodels is that pandas frequently uses
> object arrays and it messes up patsy or statsmodels if it's not
> explicitly converted.
>

Object arrays go to object arrays, datetime64 depends.

In [10]: result_type(ones(1, dtype=object_), float32)
Out[10]: dtype('O')


Datetime64 seems to use the highest precision

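In [12]: result_type(ones(1, dtype='datetime64[D]'), 'datetime64[us]')
Out[12]: dtype('

In [13]: result_type(ones(1, dtype='datetime64[D]'), 'datetime64[Y]')
Out[13]: dtype('

but doesn't convert to float

In [11]: result_type(ones(1, dtype='datetime64[D]'), float32)
---
TypeError Traceback (most recent call last)
 in ()
> 1 result_type(ones(1, dtype='datetime64[D]'), float32)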

TypeError: invalid type promotion

What would you like it to do?
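
The three cases discussed in this message can be reproduced with a short, self-contained script. This is a sketch only; the variable names are illustrative, and the exact exception class and message for the datetime/float case vary between NumPy versions, so the script only assumes it is a TypeError or a subclass of it.

```python
import numpy as np

days = np.array(["2015-03-06"], dtype="datetime64[D]")
objs = np.ones(1, dtype=object)

# Object arrays stay object arrays.
obj_result = np.result_type(objs, np.float32)

# Mixing datetime64 units yields the finer of the two units.
unit_us = np.datetime_data(np.result_type(days, np.dtype("datetime64[us]")))[0]
unit_y = np.datetime_data(np.result_type(days, np.dtype("datetime64[Y]")))[0]

# datetime64 and float32 have no common type, so promotion fails.
try:
    np.result_type(days, np.float32)
    promoted = True
except TypeError:
    promoted = False

print(obj_result, unit_us, unit_y, promoted)
```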

Chuck


Re: [Numpy-discussion] Adding keyword to asarray and asanyarray.

2015-03-05 Thread josef.pktd
On Thu, Mar 5, 2015 at 12:33 PM, Charles R Harris
 wrote:
>
>
> The main use that I want to cover is that float64 and complex128 have the
> same precision and it would be good if either is acceptable.  Also, one
> might just want either float32 or float64, not just one of the two. Another
> intent is to make the fewest possible copies. The determination of the
> resulting type is made using the result_type function.


How does this work for object arrays, or datetime?

Can I specify at least float32 or float64, and it raises an exception
if it cannot be converted?

The problem we have in statsmodels is that pandas frequently uses
object arrays and it messes up patsy or statsmodels if it's not
explicitly converted.

Josef




>
> Chuck
>
>


Re: [Numpy-discussion] Adding keyword to asarray and asanyarray.

2015-03-05 Thread Charles R Harris
On Thu, Mar 5, 2015 at 10:04 AM, Chris Barker  wrote:

> On Thu, Mar 5, 2015 at 8:42 AM, Benjamin Root  wrote:
>
>> dare I say... datetime64/timedelta64 support?
>>
>
> well, the precision of those is 64 bits, yes? so if you asked for less
> than that, you'd still get a dt64. If you asked for 64 bits, you'd get it,
> if you asked for datetime128  -- what would you get???
>
> a 128 bit integer? or an Exception, because there is no 128bit datetime
> dtype.
>
> But I think this is the same problem with any dtype -- if you ask for a
> precision that doesn't exist, you're going to get an error.
>
> Is there a more detailed description of the proposed feature anywhere? Do
> you specify a dtype as a precision? or just the precision, and let the
> dtype figure it out for itself, i.e.:
>
> precision=64
>
> would give you a float64 if the passed in array was a float type, but an
> int64 if the passed in array was an int type, or a uint64 if the passed in
> array was an unsigned int type, etc.
>
> But in the end,  I wonder about the use case. I generally use asarray one
> of two ways:
>
> Without a dtype -- simply to make sure I've got an ndarray of SOME dtype.
>
> or
>
> With a dtype - because I really care about the dtype -- usually because I
> need to pass it on to C code or something.
>
> I don't think I'd ever need at least some precision, but not care if I got
> more than that...
>

The main use that I want to cover is that float64 and complex128 have the
same precision and it would be good if either is acceptable.  Also, one
might just want either float32 or float64, not just one of the two. Another
intent is to make the fewest possible copies. The determination of the
resulting type is made using the result_type function.
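
As a rough illustration of the behaviour described above, and not the code in the PR, a minimum-precision conversion can be sketched on top of np.result_type. The helper name asarray_min_precision is hypothetical; the real proposal is a keyword on asarray and asanyarray.

```python
import numpy as np

def asarray_min_precision(a, precision=np.float32):
    """Hypothetical helper: convert to an array with at least `precision`."""
    arr = np.asarray(a)
    # result_type promotes the input dtype with the requested minimum.
    target = np.result_type(arr.dtype, np.dtype(precision))
    # asarray hands back `arr` itself when the dtype already matches.
    return np.asarray(arr, dtype=target)

x64 = np.ones(3, dtype=np.float64)
c128 = np.ones(3, dtype=np.complex128)
x16 = np.ones(3, dtype=np.float16)
```

In this sketch x64 and c128 come back unchanged and uncopied for a float32 minimum, while x16 is up-cast to float32.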

Chuck


Re: [Numpy-discussion] Adding keyword to asarray and asanyarray.

2015-03-05 Thread Benjamin Root
On Thu, Mar 5, 2015 at 12:04 PM, Chris Barker  wrote:

> well, the precision of those is 64 bits, yes? so if you asked for less
> than that, you'd still get a dt64. If you asked for 64 bits, you'd get it,
> if you asked for datetime128  -- what would you get???
>
> a 128 bit integer? or an Exception, because there is no 128bit datetime
> dtype.
>


I was more thinking of datetime64/timedelta64's ability to specify the time
units.

Ben Root


Re: [Numpy-discussion] Adding keyword to asarray and asanyarray.

2015-03-05 Thread Chris Barker
On Thu, Mar 5, 2015 at 8:42 AM, Benjamin Root  wrote:

> dare I say... datetime64/timedelta64 support?
>

well, the precision of those is 64 bits, yes? so if you asked for less than
that, you'd still get a dt64. If you asked for 64 bits, you'd get it, if
you asked for datetime128  -- what would you get???

a 128 bit integer? or an Exception, because there is no 128bit datetime
dtype.

But I think this is the same problem with any dtype -- if you ask for a
precision that doesn't exist, you're going to get an error.

Is there a more detailed description of the proposed feature anywhere? Do
you specify a dtype as a precision? or just the precision, and let the
dtype figure it out for itself, i.e.:

precision=64

would give you a float64 if the passed in array was a float type, but an
int64 if the passed in array was an int type, or a uint64 if the passed in
array was an unsigned int type, etc.
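
To make the "just the precision" reading concrete, here is a sketch of a bit-width lookup that keeps the kind of the input. The function dtype_for_bits is hypothetical and only illustrates the question being asked; it is not how the PR defines the keyword.

```python
import numpy as np

def dtype_for_bits(arr, bits):
    """Hypothetical mapping from a bit width to a dtype of the same kind."""
    kind = np.asarray(arr).dtype.kind  # 'f' float, 'i' int, 'u' unsigned int
    if kind not in "fiu":
        raise TypeError(f"no bit-width mapping for kind {kind!r}")
    # The dtype string is kind plus size in bytes, e.g. 'f8' is float64.
    return np.dtype(f"{kind}{bits // 8}")
```

With precision=64 this gives float64, int64 or uint64 depending on the input, which is the behaviour described in the paragraph above.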

But in the end,  I wonder about the use case. I generally use asarray one of
two ways:

Without a dtype -- simply to make sure I've got an ndarray of SOME dtype.

or

With a dtype - because I really care about the dtype -- usually because I
need to pass it on to C code or something.

I don't think I'd ever need at least some precision, but not care if I got
more than that

-Chris


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Numpy-discussion] Adding keyword to asarray and asanyarray.

2015-03-05 Thread Benjamin Root
dare I say... datetime64/timedelta64 support?

::ducks::

Ben Root



[Numpy-discussion] Adding keyword to asarray and asanyarray.

2015-03-05 Thread Charles R Harris
Hi All,

This is apropos gh-5634, a PR
adding a precision keyword to asarray and asanyarray. The PR description is

> The precision keyword differs from the current dtype keyword in the
> following way.
>
>    - It specifies a minimum precision. If the precision of the input is
>      greater than the specified precision, the input precision is preserved.
>    - Complex types are preserved. A specified floating precision applies
>      to the dtype of the real and imaginary parts separately.
>
> For example, both complex128 and float64 dtypes have the
> same precision and an array of dtype float64 will be unchanged if the
> specified precision is float32.
>
> Ideally the precision keyword would be pushed down into the array
> constructor so that the resulting dtype could be determined before the
> array is constructed, but that would require adding new functions as the
> current constructors are part of the API and cannot have their
> signatures changed.

The name of the keyword is open to discussion, as well as its acceptable
values. And of course, anything else that might come to mind ;)

Thoughts?

Chuck