Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-04-28 Thread Andreas Hilboll
On 19.04.2014 09:03, Andreas Hilboll wrote:
 On 14.04.2014 20:59, Chris Barker wrote:
 On Fri, Apr 11, 2014 at 4:58 PM, Stephan Hoyer sho...@gmail.com wrote:

 On Fri, Apr 11, 2014 at 3:56 PM, Charles R Harris charlesr.har...@gmail.com wrote:

 Are we in a position to start looking at implementation? If so,
 it would be useful to have a collection of test cases, i.e.,
 typical uses with specified results. That should also cover
 conversion from/(to?) datetime.datetime.


 yup -- tests are always good! 

 Indeed, my personal wish-list for np.datetime64 is centered much
 more on robust conversion to/from native date objects, including
 comparison.


 A good use case. 
  

 Here are some of my particular points of frustration (apologies for
 the thread jacking!):
 - NaT should have similar behavior to NaN when used for comparisons
 (i.e., comparisons should always be False).


 makes sense.
  

 - You can't compare a datetime object to a datetime64 object.


 that would be nice to have.
  

 - datetime64 objects with high precision (e.g., ns) can't compare to
 datetime objects.


 That's a problem, but how do you think it should be handled? My thought
 is that it should round to microseconds, and then compare -- kind of
 like comparing float32 and float64...
  

 Pandas has a very nice wrapper around datetime64 arrays that solves
 most of these issues, but it would be nice to get much of that
 functionality in core numpy,


 yes -- it would -- but learning from pandas is certainly a good idea.


 import numpy as np
 from datetime import datetime

 print(np.datetime64('NaT') < np.datetime64('2011-01-01'))  # this should not be true
 print(datetime(2010, 1, 1) < np.datetime64('2011-01-01'))  # raises exception
 print(np.datetime64('2011-01-01T00:00', 'ns') > datetime(2010, 1, 1))  # another exception
 print(np.datetime64('2011-01-01T00:00') > datetime(2010, 1, 1))  # finally something works!


 now to get them into proper unit tests
 
 As one further suggestion, I think it would be nice if doing arithmetic
 using np.datetime64 and datetime.timedelta objects would work:
 
    np.datetime64('2011-01-01') + datetime.timedelta(1) ==
    np.datetime64('2011-01-02')
 
 And of course, but this is probably in the loop anyways,
 np.asarray([list_of_datetime.datetime_objects]) should work as expected.

One more wish / suggestion from my side (apologies if this isn't the
place to make wishes):

Array-wide access to the individual datetime components should work, i.e.,

   datetime64array.year

should yield an array of dtype int with the years.  That would allow
boolean indexing to filter data, like

   datetime64array[datetime64array.year == 2014]

would yield all entries from 2014.

Cheers,

-- Andreas.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-04-28 Thread Chris Barker
On Fri, Apr 25, 2014 at 4:57 AM, Andreas Hilboll li...@hilboll.de wrote:

 Array-wide access to the individual datetime components should work, i.e.,


datetime64array.year

 should yield an array of dtype int with the years.  That would allow
 boolean indexing to filter data, like

datetime64array[datetime64array.year == 2014]

 would yield all entries from 2014.


that would be nice, yes, but datetime64 doesn't support anything like that
at all -- i.e. access to the components, array-wide or not. In this case,
you could kludge it with:

In [19]: datetimearray
Out[19]:
array(['2014-02-03', '2013-03-08', '2012-03-07', '2014-04-06'],
      dtype='datetime64[D]')

In [20]: datetimearray[datetimearray.astype('datetime64[Y]') == np.datetime64('2014')]
Out[20]: array(['2014-02-03', '2014-04-06'], dtype='datetime64[D]')

but that wouldn't work for months, for instance.
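
The year-only kludge above can be stretched to months with integer arithmetic on the unit-cast values. A minimal sketch, assuming NumPy's convention that casting to `datetime64[Y]` or `datetime64[M]` and then to an integer gives a count of years or months since 1970; the helper names `years` and `months` are hypothetical, and NaT entries are not handled:

```python
import numpy as np

dates = np.array(['2014-02-03', '2013-03-08', '2012-03-07', '2014-04-06'],
                 dtype='datetime64[D]')

def years(a):
    """Calendar year of each entry (hypothetical helper)."""
    # datetime64[Y] counts whole years since 1970.
    return a.astype('datetime64[Y]').astype(np.int64) + 1970

def months(a):
    """Month number 1..12 of each entry (hypothetical helper)."""
    # datetime64[M] counts whole months since 1970-01.
    return a.astype('datetime64[M]').astype(np.int64) % 12 + 1

# Boolean indexing on the extracted components.
in_2014 = dates[years(dates) == 2014]
in_march = dates[months(dates) == 3]
```

The same pattern does not extend cleanly to day-of-month or weekday, which is part of why dedicated component access was being asked for.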

I think the current NEP should stick with simply fixing the timezone thing
-- no new functionality or consequence.

But:

Maybe it's time for a new NEP for what we want datetime64 to be in the
future -- maybe borrow from the blaze proposal cited earlier? Or wait and
see how that works out, then maybe port that code over to numpy?

In the meantime, a set of utilities that do the kind of things you're
looking for might make sense. You could do it as a ndarray subclass, and
add those sorts of methods, though ndarray subclasses do get messy

-Chris




-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-04-24 Thread Chris Barker - NOAA Federal
On Apr 23, 2014, at 8:23 PM, Sankarshan Mudkavi smudk...@uwaterloo.ca
wrote:

I've been quite busy for the past few weeks but I should be much freer
after next week and can pick up on this (fixing the code and actually
implementing things).


wonderful! Thanks.

Chris

Cheers,
Sankarshan

On Apr 23, 2014, at 5:58 PM, Chris Barker chris.bar...@noaa.gov wrote:

On Wed, Mar 19, 2014 at 7:07 PM, Sankarshan Mudkavi
smudk...@uwaterloo.ca wrote:


 I've written a rather rudimentary NEP, (lacking in technical details which
 I will hopefully add after some further discussion and receiving
 clarification/help on this thread).

 Please let me know how to proceed and what you think should be added to
 the current proposal (attached to this mail).

 Here is a rendered version of the same:

 https://github.com/Sankarshan-Mudkavi/numpy/blob/Enhance-datetime64/doc/neps/datetime-improvement-proposal.rst


I've done a bit of copy-editing, and added some more from this discussion.
See the pull request on GitHub.

There are a fair number of rough edges, but I think we have a consensus
among the small group of folks that participated in this discussion anyway,
so now all we need is someone to actually fix the code.

If someone steps up, then we should also go in and add a bunch of unit
tests, as discussed in this thread.

-CHB





-- 
Sankarshan Mudkavi
Undergraduate in Physics, University of Waterloo
www.smudkavi.com








Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-04-24 Thread Charles R Harris
On Thu, Apr 24, 2014 at 10:26 AM, Chris Barker - NOAA Federal 
chris.bar...@noaa.gov wrote:

 On Apr 23, 2014, at 8:23 PM, Sankarshan Mudkavi smudk...@uwaterloo.ca
 wrote:

 I've been quite busy for the past few weeks but I should be much freer
 after next week and can pick up on this (fixing the code and actually
 implementing things).


 wonderful! Thanks.


Might want to take a look at the datetime proposal for blaze:
https://github.com/ContinuumIO/blaze/blob/master/docs/design/blaze-datetime.md

snip

Chuck


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-04-24 Thread Chris Barker
On Thu, Apr 24, 2014 at 10:07 AM, Charles R Harris 
charlesr.har...@gmail.com wrote:

 Might want to take a look at the datetime proposal for blaze:
 https://github.com/ContinuumIO/blaze/blob/master/docs/design/blaze-datetime.md


oh man! not again!

Oh well, that is a decidedly different proposal -- maybe better, I don't
know.  But it's different enough that I think we should pretty much ignore
it for now, and still do a few fixes to make the current datetime64 usable.

Maybe as that gets mature, we could adopt it, or something like it, to
numpy. Or maybe we'll all be using Blaze then anyway ;-)

But thanks for the ping...

-CHB




 snip

 Chuck







Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-04-23 Thread Chris Barker
On Wed, Mar 19, 2014 at 7:07 PM, Sankarshan Mudkavi
smudk...@uwaterloo.ca wrote:


 I've written a rather rudimentary NEP, (lacking in technical details which
 I will hopefully add after some further discussion and receiving
 clarification/help on this thread).

 Please let me know how to proceed and what you think should be added to
 the current proposal (attached to this mail).

 Here is a rendered version of the same:

 https://github.com/Sankarshan-Mudkavi/numpy/blob/Enhance-datetime64/doc/neps/datetime-improvement-proposal.rst


I've done a bit of copy-editing, and added some more from this discussion.
See the pull request on GitHub.

There are a fair number of rough edges, but I think we have a consensus
among the small group of folks that participated in this discussion anyway,
so now all we need is someone to actually fix the code.

If someone steps up, then we should also go in and add a bunch of unit
tests, as discussed in this thread.

-CHB





Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-04-23 Thread Sankarshan Mudkavi
Thank you very much, I will incorporate it!

I've been quite busy for the past few weeks but I should be much freer after
next week and can pick up on this (fixing the code and actually implementing
things).

Cheers,
Sankarshan

On Apr 23, 2014, at 5:58 PM, Chris Barker chris.bar...@noaa.gov wrote:

 On Wed, Mar 19, 2014 at 7:07 PM, Sankarshan Mudkavi smudk...@uwaterloo.ca 
 wrote:
 
 I've written a rather rudimentary NEP, (lacking in technical details which I 
 will hopefully add after some further discussion and receiving 
 clarification/help on this thread).
 
 Please let me know how to proceed and what you think should be added to the 
 current proposal (attached to this mail).
 
 Here is a rendered version of the same:
 https://github.com/Sankarshan-Mudkavi/numpy/blob/Enhance-datetime64/doc/neps/datetime-improvement-proposal.rst
 
 I've done a bit of copy-editing, and added some more from this discussion. 
 See the pull request on GitHub.
 
 There are a fair number of rough edges, but I think we have a consensus among 
 the small group of folks that participated in this discussion anyway, so now 
 all we need is someone to actually fix the code.
 
 If someone steps up, then we should also go in and add a bunch of unit tests, 
 as discussed in this thread.
 
 -CHB
 
  
 


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-04-19 Thread Andreas Hilboll
On 14.04.2014 20:59, Chris Barker wrote:
 On Fri, Apr 11, 2014 at 4:58 PM, Stephan Hoyer sho...@gmail.com wrote:
 
 On Fri, Apr 11, 2014 at 3:56 PM, Charles R Harris charlesr.har...@gmail.com wrote:
 
 Are we in a position to start looking at implementation? If so,
 it would be useful to have a collection of test cases, i.e.,
 typical uses with specified results. That should also cover
 conversion from/(to?) datetime.datetime.
 
 
 yup -- tests are always good! 
 
 Indeed, my personal wish-list for np.datetime64 is centered much
 more on robust conversion to/from native date objects, including
 comparison.
 
 
 A good use case. 
  
 
 Here are some of my particular points of frustration (apologies for
 the thread jacking!):
 - NaT should have similar behavior to NaN when used for comparisons
 (i.e., comparisons should always be False).
 
 
 makes sense.
  
 
 - You can't compare a datetime object to a datetime64 object.
 
 
 that would be nice to have.
  
 
 - datetime64 objects with high precision (e.g., ns) can't compare to
 datetime objects.
 
 
 That's a problem, but how do you think it should be handled? My thought
 is that it should round to microseconds, and then compare -- kind of
 like comparing float32 and float64...
  
 
 Pandas has a very nice wrapper around datetime64 arrays that solves
 most of these issues, but it would be nice to get much of that
 functionality in core numpy,
 
 
 yes -- it would -- but learning from pandas is certainly a good idea.
 
 
 import numpy as np
 from datetime import datetime

 print(np.datetime64('NaT') < np.datetime64('2011-01-01'))  # this should not be true
 print(datetime(2010, 1, 1) < np.datetime64('2011-01-01'))  # raises exception
 print(np.datetime64('2011-01-01T00:00', 'ns') > datetime(2010, 1, 1))  # another exception
 print(np.datetime64('2011-01-01T00:00') > datetime(2010, 1, 1))  # finally something works!
 
 
 now to get them into proper unit tests

As one further suggestion, I think it would be nice if doing arithmetic
using np.datetime64 and datetime.timedelta objects would work:

   np.datetime64('2011-01-01') + datetime.timedelta(1) ==
   np.datetime64('2011-01-02')

And of course, but this is probably in the loop anyways,
np.asarray([list_of_datetime.datetime_objects]) should work as expected.
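
A sketch of how these two wishes could be written as test cases. It converts explicitly through `np.timedelta64` and an explicit `datetime64[us]` dtype rather than relying on implicit coercion, since how much NumPy coerces on its own depends on the release; the variable names are illustrative only:

```python
from datetime import datetime, timedelta
import numpy as np

# Arithmetic: convert the stdlib timedelta explicitly before adding.
one_day = np.timedelta64(timedelta(days=1))
next_day = np.datetime64('2011-01-01') + one_day

# Array creation: ask for a datetime64 dtype explicitly; without it,
# NumPy keeps the datetime objects in an object array.
stamps = [datetime(2011, 1, 1), datetime(2011, 1, 2, 12, 30)]
as_array = np.asarray(stamps, dtype='datetime64[us]')
```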

-- Andreas.


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-04-18 Thread Sankarshan Mudkavi
I think we'll be ready to start implementation once I get the conversion to 
datetime.datetime on the proposal with some decent examples. It would also be 
great to have opinions on what test cases should be used, so please speak up if 
you feel you have anything to say about that.

Cheers,
Sankarshan

On Apr 14, 2014, at 2:59 PM, Chris Barker chris.bar...@noaa.gov wrote:

 On Fri, Apr 11, 2014 at 4:58 PM, Stephan Hoyer sho...@gmail.com wrote:
 On Fri, Apr 11, 2014 at 3:56 PM, Charles R Harris charlesr.har...@gmail.com 
 wrote:
 Are we in a position to start looking at implementation? If so, it would be 
 useful to have a collection of test cases, i.e., typical uses with specified 
 results. That should also cover conversion from/(to?) datetime.datetime.
 
 yup -- tests are always good! 
 
 Indeed, my personal wish-list for np.datetime64 is centered much more on 
 robust conversion to/from native date objects, including comparison.
 
 A good use case. 
  
 Here are some of my particular points of frustration (apologies for the 
 thread jacking!):
 - NaT should have similar behavior to NaN when used for comparisons (i.e., 
 comparisons should always be False).
 
 makes sense.
  
 - You can't compare a datetime object to a datetime64 object.
 
 that would be nice to have.
  
 - datetime64 objects with high precision (e.g., ns) can't compare to datetime 
 objects.
 
 That's a problem, but how do you think it should be handled? My thought is 
 that it should round to microseconds, and then compare -- kind of like 
 comparing float32 and float64...
  
 Pandas has a very nice wrapper around datetime64 arrays that solves most of 
 these issues, but it would be nice to get much of that functionality in core 
 numpy,
 
 yes -- it would -- but learning from pandas is certainly a good idea.
 
 import numpy as np
 from datetime import datetime

 print(np.datetime64('NaT') < np.datetime64('2011-01-01'))  # this should not be true
 print(datetime(2010, 1, 1) < np.datetime64('2011-01-01'))  # raises exception
 print(np.datetime64('2011-01-01T00:00', 'ns') > datetime(2010, 1, 1))  # another exception
 print(np.datetime64('2011-01-01T00:00') > datetime(2010, 1, 1))  # finally something works!
 
 
 now to get them into proper unit tests
 
 -CHB
  
 


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-04-18 Thread Stephan Hoyer
On Mon, Apr 14, 2014 at 11:59 AM, Chris Barker chris.bar...@noaa.gov wrote:

 - datetime64 objects with high precision (e.g., ns) can't compare to
 datetime objects.


 That's a problem, but how do you think it should be handled? My thought is
 that it should round to microseconds, and then compare -- kind of like
 comparing float32 and float64...


I agree -- if the ns matter, you shouldn't be using datetime.datetime
objects.

Similarly, it's currently not possible to convert high precision datetime64
objects into datetimes. Worse, this doesn't even raise an error!

>>> from datetime import datetime
>>> import numpy as np
>>> np.datetime64('2000-01-01T00:00:00Z', 'us').astype(datetime)
datetime.datetime(2000, 1, 1, 0, 0)
>>> np.datetime64('2000-01-01T00:00:00Z', 'ns').astype(datetime)
946684800000000000L
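
One workaround for the silent integer result is to cast to microsecond precision first, since `datetime.datetime` cannot represent anything finer. A minimal sketch; `to_datetime` is a hypothetical helper name, the cast discards sub-microsecond digits, and the trailing `Z` is left off the input because the proposal in this thread is to stop accepting timezone suffixes:

```python
from datetime import datetime
import numpy as np

def to_datetime(value):
    """Convert a datetime64 scalar to datetime.datetime via microseconds."""
    # Casting to 'us' first means astype(datetime) yields a datetime
    # object rather than a raw integer count of nanoseconds.
    return value.astype('datetime64[us]').astype(datetime)

stamp_ns = np.datetime64('2000-01-01T00:00:00', 'ns')
converted = to_datetime(stamp_ns)
```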


Other inconsistent behavior:


>>> np.datetime64('2000', 'M')
numpy.datetime64('2000-01')
>>> np.datetime64('2000', 'D')
numpy.datetime64('2000-01-01')
>>> np.datetime64('2000', 's')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-67-bf5fc9a2985b> in <module>()
----> 1 np.datetime64('2000', 's')

TypeError: Cannot parse "2000" as unit 's' using casting rule 'same_kind'

More broadly, my recommendation is to look through the unit tests for
pandas' datetime handling:
https://github.com/pydata/pandas/tree/master/pandas/tseries/tests

Not everything is relevant but you might find some test cases you could
borrow wholesale. Pandas is BSD licensed, so you may even be able to copy
them directly into numpy.

Best,
Stephan


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-04-14 Thread Chris Barker
On Fri, Apr 11, 2014 at 4:58 PM, Stephan Hoyer sho...@gmail.com wrote:

 On Fri, Apr 11, 2014 at 3:56 PM, Charles R Harris 
 charlesr.har...@gmail.com wrote:

 Are we in a position to start looking at implementation? If so, it would
 be useful to have a collection of test cases, i.e., typical uses with
 specified results. That should also cover conversion from/(to?)
 datetime.datetime.


yup -- tests are always good!

Indeed, my personal wish-list for np.datetime64 is centered much more on
 robust conversion to/from native date objects, including comparison.


A good use case.


  Here are some of my particular points of frustration (apologies for the
 thread jacking!):
 - NaT should have similar behavior to NaN when used for comparisons (i.e.,
 comparisons should always be False).


makes sense.


  - You can't compare a datetime object to a datetime64 object.


that would be nice to have.


 - datetime64 objects with high precision (e.g., ns) can't compare to
 datetime objects.


That's a problem, but how do you think it should be handled? My thought is
that it should round to microseconds, and then compare -- kind of like
comparing float32 and float64...
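
The reduce-then-compare idea can be sketched without NumPy at all, treating the high-precision value as an integer count of nanoseconds since the epoch. This is only an illustration of the proposal: the function names are hypothetical, and truncating (rather than rounding to nearest) is an assumption made here for simplicity:

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # naive, interpreted as UTC

def ns_to_datetime(ns_since_epoch):
    """Drop sub-microsecond digits and build a naive datetime."""
    return EPOCH + timedelta(microseconds=ns_since_epoch // 1000)

def compare_ns_with_datetime(ns_since_epoch, when):
    """Return -1, 0 or 1, comparing at microsecond resolution."""
    left = ns_to_datetime(ns_since_epoch)
    return (left > when) - (left < when)

# 2011-01-01T00:00:00.000000999, expressed as nanoseconds since the epoch
ns_value = 1293840000 * 10**9 + 999
```

With this rule, two values that differ only below the microsecond compare as equal, which is the same trade-off as comparing a float32 against a float64.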


 Pandas has a very nice wrapper around datetime64 arrays that solves most
 of these issues, but it would be nice to get much of that functionality in
 core numpy,


yes -- it would -- but learning from pandas is certainly a good idea.


 import numpy as np
 from datetime import datetime

 print(np.datetime64('NaT') < np.datetime64('2011-01-01'))  # this should not be true
 print(datetime(2010, 1, 1) < np.datetime64('2011-01-01'))  # raises exception
 print(np.datetime64('2011-01-01T00:00', 'ns') > datetime(2010, 1, 1))  # another exception
 print(np.datetime64('2011-01-01T00:00') > datetime(2010, 1, 1))  # finally something works!


now to get them into proper unit tests

-CHB




Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-04-11 Thread Sankarshan Mudkavi
So is the consensus that we don't accept any tags at all (not even 
temporarily)? Would that break too much existing code?

Cheers,
Sankarshan

On Apr 1, 2014, at 2:50 PM, Alexander Belopolsky ndar...@mac.com wrote:

 
 On Tue, Apr 1, 2014 at 1:12 PM, Nathaniel Smith n...@pobox.com wrote:
 In [6]: a[0] = "garbage"
 ValueError: could not convert string to float: garbage
 
 (Cf, Errors should never pass silently.) Any reason why datetime64
 should be different?
 
 datetime64 is different because it has NaT support from the start.  NaN 
 support for floats seems to be an afterthought if not an accident of 
 implementation.
 
 And it looks like some errors do pass silently:
 
  a[0] = 1
 # not a TypeError
 
 But I withdraw my suggestion.  The closer datetime64 behavior is to numeric 
 types the better.
 


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-04-11 Thread Nathaniel Smith
On Fri, Apr 11, 2014 at 11:25 PM, Sankarshan Mudkavi
smudk...@uwaterloo.ca wrote:
 So is the consensus that we don't accept any tags at all (not even
 temporarily)? Would that break too much existing code?

Well, we don't know. If anyone has any ideas on how to figure it out
then they should speak up :-).

Barring any brilliant suggestions though, I suggest we just go ahead
with disallowing all timezone tags for now. We can always change our
mind as we get closer to the release and people start experimenting
with the new code.

-n

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-04-11 Thread Charles R Harris
On Fri, Apr 11, 2014 at 4:25 PM, Sankarshan Mudkavi
smudk...@uwaterloo.ca wrote:

 So is the consensus that we don't accept any tags at all (not even
 temporarily)? Would that break too much existing code?

 Cheers,
 Sankarshan

 On Apr 1, 2014, at 2:50 PM, Alexander Belopolsky ndar...@mac.com wrote:


 On Tue, Apr 1, 2014 at 1:12 PM, Nathaniel Smith n...@pobox.com wrote:

 In [6]: a[0] = "garbage"
 ValueError: could not convert string to float: garbage

 (Cf, Errors should never pass silently.) Any reason why datetime64
 should be different?


 datetime64 is different because it has NaT support from the start.  NaN
 support for floats seems to be an afterthought if not an accident of
 implementation.

 And it looks like some errors do pass silently:

  a[0] = 1
 # not a TypeError

 But I withdraw my suggestion.  The closer datetime64 behavior is to
 numeric types the better.




Are we in a position to start looking at implementation? If so, it would be
useful to have a collection of test cases, i.e., typical uses with
specified results. That should also cover conversion from/(to?)
datetime.datetime.

Chuck


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-04-11 Thread Stephan Hoyer
On Fri, Apr 11, 2014 at 3:56 PM, Charles R Harris charlesr.har...@gmail.com
 wrote:

 Are we in a position to start looking at implementation? If so, it would
 be useful to have a collection of test cases, i.e., typical uses with
 specified results. That should also cover conversion from/(to?)
 datetime.datetime.


Indeed, my personal wish-list for np.datetime64 is centered much more on
robust conversion to/from native date objects, including comparison.

Here are some of my particular points of frustration (apologies for the
thread jacking!):
- NaT should have similar behavior to NaN when used for comparisons (i.e.,
comparisons should always be False).
- You can't compare a datetime object to a datetime64 object.
- datetime64 objects with high precision (e.g., ns) can't compare to
datetime objects.

Pandas has a very nice wrapper around datetime64 arrays that solves most of
these issues, but it would be nice to get much of that functionality in
core numpy, since I don't always want to store my values in a 1-dimensional
array + hash-table (the pandas Index):
http://pandas.pydata.org/pandas-docs/stable/timeseries.html

Here's code which reproduces all of the above:

import numpy as np
from datetime import datetime

print(np.datetime64('NaT') < np.datetime64('2011-01-01'))  # this should not be true
print(datetime(2010, 1, 1) < np.datetime64('2011-01-01'))  # raises exception
print(np.datetime64('2011-01-01T00:00', 'ns') > datetime(2010, 1, 1))  # another exception
print(np.datetime64('2011-01-01T00:00') > datetime(2010, 1, 1))  # finally something works!
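
The first wish can be phrased as a handful of assertions. This is a sketch of the requested NaN-like semantics, not a description of the NumPy release discussed here; it assumes a NumPy version in which NaT comparisons already behave this way and in which `np.isnat` is available:

```python
import numpy as np

nat = np.datetime64('NaT')
day = np.datetime64('2011-01-01')

# Under NaN-like semantics every ordering or equality test with NaT is False...
checks = [nat < day, nat > day, nat == day, nat == nat]

# ...so missing values have to be detected explicitly.
missing = np.isnat(np.array([nat, day]))
```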


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-04-11 Thread Alexander Belopolsky
On Fri, Apr 11, 2014 at 7:58 PM, Stephan Hoyer sho...@gmail.com wrote:

 print(datetime(2010, 1, 1) < np.datetime64('2011-01-01'))  # raises exception


This is somewhat consistent with

>>> from datetime import *
>>> datetime(2010, 1, 1) < date(2010, 1, 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't compare datetime.datetime to datetime.date

but I would expect date(2010, 1, 1)  np.datetime64('2011-01-01') to return
False.
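
For reference, the standard-library behavior being compared against can be checked directly. A short sketch, assuming Python 3, where ordering a `datetime` against a plain `date` raises `TypeError` while equality quietly returns `False`; `ordering_raises` is a hypothetical helper:

```python
from datetime import date, datetime

def ordering_raises(left, right):
    """Return True if evaluating `left < right` raises TypeError."""
    try:
        left < right
    except TypeError:
        return True
    return False

mixed_order = ordering_raises(datetime(2010, 1, 1), date(2010, 1, 1))
mixed_equal = datetime(2010, 1, 1) == date(2010, 1, 1)
```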


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-04-01 Thread Chris Barker
On Mon, Mar 31, 2014 at 7:19 PM, Nathaniel Smith n...@pobox.com wrote:

  The difference is that datetime.datetime doesn't provide any iso string
 parsing.

 Sure it does. datetime.strptime, with the %z modifier in particular.


that's not ISO parsing, that's parsing according to a user-defined format
string, which can be used for ISO parsing, but the user is in control of
how that's done. And I see this:

"For a naive object, the %z and %Z format codes are replaced by empty
strings."

though I'm not entirely sure what that means -- probably only for writing.

 The use case I'm imagining is for folks with ISO strings with a Z on the
 end -- they'll need to deal with pre-parsing the strings to strip off the
 Z, when it wouldn't change the result.
 
  Maybe this is an argument for UTC always rather than naive?

 Probably it is, but that approach seems a lot harder to extend to proper
 tz support later, plus being more likely to cause trouble for pandas's
 proper tz support now.

I was originally advocating for naive to begin with ;-) Someone else pushed
for UTC -- I thought it was you! (but I guess not)

It seems this committee of two has come to a consensus on naive -- and
you're probably right, raise an exception if there is a time zone specifier.
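
A minimal sketch of that decision in pure Python: parse ISO-style strings as naive and raise if a time zone specifier is present. The function name `parse_naive`, the regular expression, and the two accepted layouts are illustrative assumptions, not the NumPy parser:

```python
import re
from datetime import datetime

# A trailing "Z" or a numeric offset such as +01:00, +0100 or -05.
_TZ_SUFFIX = re.compile(r'(Z|[+-]\d{2}(:?\d{2})?)$')

def parse_naive(text):
    """Parse 'YYYY-MM-DD' or 'YYYY-MM-DDTHH:MM:SS' as a naive datetime."""
    _, sep, time_part = text.partition('T')
    # Only the time part is checked, so the hyphens in the date are not
    # mistaken for a negative offset.
    if sep and _TZ_SUFFIX.search(time_part):
        raise ValueError('time zone specifier not accepted: %r' % text)
    fmt = '%Y-%m-%dT%H:%M:%S' if sep else '%Y-%m-%d'
    return datetime.strptime(text, fmt)
```

Under this rule a caller holding strings with a trailing `Z` has to strip it before parsing, which is exactly the inconvenience weighed in the message above.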

-CHB







  -n







Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-04-01 Thread Alexander Belopolsky
On Tue, Apr 1, 2014 at 12:10 PM, Chris Barker chris.bar...@noaa.gov wrote:

 For a naive object, the %z and %Z format codes are replaced by empty
 strings.

  though I'm not entirely sure what that means -- probably only for writing.


That's right:

>>> from datetime import *
>>> datetime.now().strftime('%z')
''
>>> datetime.now(timezone.utc).strftime('%z')
'+0000'
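
The same distinction shows up when parsing. A small sketch, assuming Python 3, where `strptime` accepts `%z`: the format string decides whether an offset is expected, and the result is timezone-aware only if one was parsed:

```python
from datetime import datetime, timezone

naive = datetime(2014, 4, 1, 12, 0)
aware = naive.replace(tzinfo=timezone.utc)

naive_offset = naive.strftime('%z')   # empty string for a naive object
aware_offset = aware.strftime('%z')   # '+0000' for UTC

# Parsing: the caller chooses the format, including whether %z is present.
parsed = datetime.strptime('2014-04-01T12:00:00+0000', '%Y-%m-%dT%H:%M:%S%z')
```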


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-04-01 Thread Alexander Belopolsky
On Tue, Apr 1, 2014 at 12:10 PM, Chris Barker chris.bar...@noaa.gov wrote:

 It seems this committee of two has come to a consensus on naive -- and
 you're probably right, raise an exception if there is a time zone specifier.


Count me as +1 on naive, but consider converting garbage (including strings
with trailing Z) to NaT.


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-04-01 Thread Nathaniel Smith
On Tue, Apr 1, 2014 at 5:22 PM, Alexander Belopolsky ndar...@mac.com wrote:

 On Tue, Apr 1, 2014 at 12:10 PM, Chris Barker chris.bar...@noaa.gov wrote:

 It seems this committee of two has come to a consensus on naive -- and
 you're probably right, raise an exception if there is a time zone specifier.


 Count me as +1 on naive, but consider converting garbage (including strings
 with trailing Z) to NaT.

That's not how we handle other types, e.g.:

In [5]: a = np.zeros(1, dtype=float)

In [6]: a[0] = "garbage"
ValueError: could not convert string to float: garbage

(Cf, Errors should never pass silently.) Any reason why datetime64
should be different?
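
The analogy can be checked side by side. A sketch assuming a NumPy release in which an unparseable datetime string raises `ValueError` on assignment, as it does for floats; `assignment_error` is a hypothetical helper:

```python
import numpy as np

def assignment_error(dtype, value):
    """Return the exception type raised by a[0] = value, or None."""
    a = np.zeros(1, dtype=dtype)
    try:
        a[0] = value
    except (ValueError, TypeError) as exc:
        return type(exc)
    return None

float_error = assignment_error(float, "garbage")
date_error = assignment_error('datetime64[D]', "garbage")
```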

-n

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-04-01 Thread Sankarshan Mudkavi
I agree with that interpretation of naive as well. I'll change the proposal to 
reflect that. So any modifier should raise an error then? (At the risk of 
breaking people's code.)

The only question is, should we consider accepting the modifier and disregard 
it with a warning, letting the user know that this is only for temporary 
compatibility purposes?


As of now, it's not clear to me which of those options is better.

Cheers,
Sankarshan

On Apr 1, 2014, at 1:12 PM, Nathaniel Smith n...@pobox.com wrote:

 On Tue, Apr 1, 2014 at 5:22 PM, Alexander Belopolsky ndar...@mac.com wrote:
 
 On Tue, Apr 1, 2014 at 12:10 PM, Chris Barker chris.bar...@noaa.gov wrote:
 
 It seems this committee of two has come to a consensus on naive -- and
 you're probably right, raise an exception if there is a time zone specifier.
 
 
 Count me as +1 on naive, but consider converting garbage (including strings
 with trailing Z) to NaT.
 
 That's not how we handle other types, e.g.:
 
 In [5]: a = np.zeros(1, dtype=float)
 
 In [6]: a[0] = 'garbage'
 ValueError: could not convert string to float: garbage
 
 (Cf, Errors should never pass silently.) Any reason why datetime64
 should be different?
 
 -n
 
 -- 
 Nathaniel J. Smith
 Postdoctoral researcher - Informatics - University of Edinburgh
 http://vorpus.org

-- 
Sankarshan Mudkavi
Undergraduate in Physics, University of Waterloo
www.smudkavi.com










Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-04-01 Thread Alexander Belopolsky
On Tue, Apr 1, 2014 at 1:12 PM, Nathaniel Smith n...@pobox.com wrote:

 In [6]: a[0] = 'garbage'
 ValueError: could not convert string to float: garbage

 (Cf, Errors should never pass silently.) Any reason why datetime64
 should be different?


datetime64 is different because it has NaT support from the start.  NaN
support for floats seems to be an afterthought if not an accident of
implementation.

And it looks like some errors do pass silently:

>>> a[0] = '1'
# not a TypeError

But I withdraw my suggestion.  The closer datetime64 behavior is to numeric
types the better.
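
For illustration, here is a minimal sketch of the behaviour being compared in this exchange. It is not from the thread and assumes a recent NumPy; the 2014-era release under discussion may differ in details. The variable names are made up for the example.

```python
import numpy as np

a = np.zeros(1, dtype=float)

# A numeric string is parsed on assignment rather than rejected with a TypeError.
a[0] = "1"

# A non-numeric string raises ValueError instead of silently becoming NaN.
try:
    a[0] = "garbage"
    float_error = None
except ValueError as exc:
    float_error = exc

# Constructing a datetime64 from an unparseable string also raises ValueError ...
try:
    np.datetime64("garbage")
    datetime_error = None
except ValueError as exc:
    datetime_error = exc

# ... while NaT has to be asked for explicitly.
nat = np.datetime64("NaT")
print(a[0], type(float_error).__name__, type(datetime_error).__name__, np.isnat(nat))
```

The failed assignment leaves the previously stored value in place, so the error is loud but not destructive.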


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-31 Thread Chris Barker
On Sat, Mar 29, 2014 at 3:08 PM, Nathaniel Smith n...@pobox.com wrote:

 On 29 Mar 2014 20:57, Chris Barker chris.bar...@noaa.gov wrote:
  I think this is somewhat open for discussion -- yes, it's odd, but in
 the spirit of practicality beats purity, it seems OK. We could allow any TZ
 specifier for that matter -- that's kind of how naive or local timezone
 (non) handling works -- it's up to the user to make sure that all DTs are
 in the same timezone.

 That isn't how naive timezone handling works in datetime.datetime, though.
 If you try to mix a timezone (even a Zulu timezone) datetime with a naive
 datetime, you get an exception.

fair enough.

The difference is that datetime.datetime doesn't provide any iso string
parsing. The use case I'm imagining is for folks with ISO strings with a Z
on the end -- they'll need to deal with pre-parsing the strings to strip
off the Z, when it wouldn't change the result.

Maybe this is an argument for UTC always rather than naive?
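
As an illustration of the pre-parsing step described above, a minimal sketch follows. `parse_naive` is a hypothetical helper, not a NumPy or stdlib function, and the fixed minute-resolution format string is an assumption made to keep the example short.

```python
from datetime import datetime

def parse_naive(text):
    """Hypothetical helper: drop a trailing 'Z' and parse the rest as a naive datetime."""
    if text.endswith("Z"):
        text = text[:-1]
    return datetime.strptime(text, "%Y-%m-%dT%H:%M")

with_z = parse_naive("2005-02-25T03:00Z")
without_z = parse_naive("2005-02-25T03:00")

# Both inputs give the same naive value; the 'Z' carried no extra information here.
print(with_z == without_z, with_z.tzinfo)
```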

-Chris



-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-31 Thread Nathaniel Smith
On 31 Mar 2014 19:47, Chris Barker chris.bar...@noaa.gov wrote:

 On Sat, Mar 29, 2014 at 3:08 PM, Nathaniel Smith n...@pobox.com wrote:

 On 29 Mar 2014 20:57, Chris Barker chris.bar...@noaa.gov wrote:
  I think this is somewhat open for discussion -- yes, it's odd, but in
the spirit of practicality beats purity, it seems OK. We could allow any TZ
specifier for that matter -- that's kind of how naive or local timezone
(non) handling works -- it's up to the user to make sure that all DTs are
in the same timezone.

 That isn't how naive timezone handling works in datetime.datetime,
though. If you try to mix a timezone (even a Zulu timezone) datetime with a
naive datetime, you get an exception.

 fair enough.

 The difference is that datetime.datetime doesn't provide any iso string
parsing.

Sure it does. datetime.strptime, with the %z modifier in particular.
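
A short sketch of what parsing with %z looks like, assuming Python 3 (where strptime supports %z); the example strings are illustrative only.

```python
from datetime import datetime, timedelta

# %z consumes a numeric UTC offset and yields a timezone-aware datetime.
aware = datetime.strptime("2005-02-25T03:00+0000", "%Y-%m-%dT%H:%M%z")

# Without %z in the format string the result is naive (tzinfo is None).
naive = datetime.strptime("2005-02-25T03:00", "%Y-%m-%dT%H:%M")

print(aware.utcoffset() == timedelta(0), naive.tzinfo)
```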

 The use case I'm imagining is for folks with ISO strings with a Z on the
end -- they'll need to deal with pre-parsing the strings to strip off the
Z, when it wouldn't change the result.

 Maybe this is an argument for UTC always rather than naive?

Probably it is, but that approach seems a lot harder to extend to proper tz
support later, plus being more likely to cause trouble for pandas's proper
tz support now.

-n


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-29 Thread Nathaniel Smith
On Fri, Mar 28, 2014 at 9:30 PM, Sankarshan Mudkavi
smudk...@uwaterloo.ca wrote:

 Hi Nathaniel,

 1- You give as an example of naive datetime handling:

 np.datetime64('2005-02-25T03:00Z')
 np.datetime64('2005-02-25T03:00')

 This IIUC is incorrect. The Z modifier is a timezone offset, and for normal
 naive datetimes would cause an error.


 If what I understand from reading:
 http://thread.gmane.org/gmane.comp.python.numeric.general/53805

 It looks like anything other than Z, 00:00 or UTC that has a TZ adjustment
 would raise an error, and those specific conditions would not (I'm guessing
 this is because we assume it's UTC (or the same timezone) internally,
 anything that explicitly tells us it is UTC is acceptable, although that may
 be just my misreading of it.)

If we assume it's UTC, then that's proposal 2, I think :-).

My point is just that naive datetime already has a specific meaning
in Python, and as I understand that meaning, it says that trying to
pass a Z timezone to a naive datetime should be an error.

As a separate issue, we might decide that we want to continue to allow
Z modifiers (or all offset modifiers) temporarily in numpy, to avoid
breaking code without warning. Just if we do, then we shouldn't say
that this is because we are implementing naive datetimes and this is
how naive datetimes work. Instead we should either say that we're not
implementing naive datetimes, or else say that we're implementing
naive datetimes but have some temporary compatibility hacks on top of
that (and probably issue a DeprecationWarning if anyone passes a
timezone).
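
One way to picture the "naive plus temporary compatibility hack" option is sketched below. `parse_with_compat` is a hypothetical function, not NumPy's parser; it assumes only a trailing Z is tolerated (with a DeprecationWarning) and that every other offset is rejected.

```python
import warnings
from datetime import datetime

def parse_with_compat(text):
    """Hypothetical parser: naive semantics, plus a deprecated allowance for 'Z'."""
    if text.endswith("Z"):
        warnings.warn(
            "timezone specifiers are ignored and will become an error",
            DeprecationWarning,
            stacklevel=2,
        )
        text = text[:-1]
    # Any remaining offset (e.g. +0100) is unconverted data, so strptime raises ValueError.
    return datetime.strptime(text, "%Y-%m-%dT%H:%M")

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    plain = parse_with_compat("2005-02-25T03:00")
    zulu = parse_with_compat("2005-02-25T03:00Z")

try:
    parse_with_compat("2005-02-25T03:00+0100")
    offset_rejected = False
except ValueError:
    offset_rejected = True

print(plain == zulu, len(caught), offset_rejected)
```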

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-29 Thread Chris Barker
On Sat, Mar 29, 2014 at 1:04 PM, Nathaniel Smith n...@pobox.com wrote:

  1- You give as an example of naive datetime handling:
 
  np.datetime64('2005-02-25T03:00Z')
  np.datetime64('2005-02-25T03:00')
 
  This IIUC is incorrect. The Z modifier is a timezone offset, and for
 normal
  naive datetimes would cause an error.


I think this is somewhat open for discussion -- yes, it's odd, but in the
spirit of practicality beats purity, it seems OK. We could allow any TZ
specifier for that matter -- that's kind of how naive or local timezone
(non) handling works -- it's up to the user to make sure that all DTs are
in the same timezone. All it would be doing is tossing out some additional
information that was in the ISO string.

If we are explicitly calling it UTC-always, then anything other than Z or
00:00 (or nothing) would need to be converted.
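
To make the difference between the two readings concrete, here is a small stdlib sketch (Python 3 assumed; the input string is made up). It shows what "convert to UTC" and "toss the offset" would each store for the same ISO string.

```python
from datetime import datetime, timezone

# An ISO 8601 string with a non-zero offset: 03:00 at UTC-08:00.
local = datetime.strptime("2005-02-25T03:00-0800", "%Y-%m-%dT%H:%M%z")

# UTC-always reading: shift the value to UTC before storing it.
as_utc = local.astimezone(timezone.utc)

# "Toss the extra information" reading: keep the wall-clock fields unchanged.
offset_dropped = local.replace(tzinfo=None)

print(as_utc.isoformat(), offset_dropped.isoformat())
```

The two stored values differ by eight hours, which is the user-beware compromise in question.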

I think when it comes down to it, anything other than proper timezone
handling will require these user-beware compromises.


As a separate issue, we might decide that we want to continue to allow
 Z modifiers (or all offset modifiers) temporarily in numpy, to avoid
 breaking code without warning.


Maybe the best tactic -- though it's broken enough now that I'm not sure it
matters. A clear direction from here may be a better bet.

-Chris


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-29 Thread Nathaniel Smith
On 29 Mar 2014 20:57, Chris Barker chris.bar...@noaa.gov wrote:
 I think this is somewhat open for discussion -- yes, it's odd, but in the
spirit of practicality beats purity, it seems OK. We could allow any TZ
specifier for that matter -- that's kind of how naive or local timezone
(non) handling works -- it's up to the user to make sure that all DTs are
in the same timezone.

That isn't how naive timezone handling works in datetime.datetime, though.
If you try to mix a timezone (even a Zulu timezone) datetime with a naive
datetime, you get an exception. I agree this is open for discussion, but
IMO deviating from the stdlib behavior this much would require some more
justification. Don't let errors pass silently, etc.
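
For reference, a minimal sketch of the stdlib behaviour referred to here, assuming Python 3. `mixing_raises` is a helper made up for the example.

```python
from datetime import datetime, timezone

naive = datetime(2014, 3, 29, 20, 57)
aware = datetime(2014, 3, 29, 20, 57, tzinfo=timezone.utc)

def mixing_raises(op):
    """Return True if applying op to (naive, aware) raises TypeError."""
    try:
        op(naive, aware)
    except TypeError:
        return True
    return False

# Ordering comparisons and subtraction both refuse to mix naive and aware values.
ordering_fails = mixing_raises(lambda a, b: a < b)
subtraction_fails = mixing_raises(lambda a, b: a - b)
print(ordering_fails, subtraction_fails)
```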

-n


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-28 Thread Nathaniel Smith
On 28 Mar 2014 05:00, Sankarshan Mudkavi smudk...@uwaterloo.ca wrote:

 Hi all,

 Apologies for the delay in following up, here is an expanded version of
the proposal, which hopefully clears up most of the details. I have not
included specific implementation details for the code, such as which
functions to modify etc. since I think those are not traditionally included
in NEPs?

The format seems fine to me. Really the point is just to have a document
that we can use as reference when deciding on behaviour, and this does that
:-).

Three quick comments:

1- You give as an example of naive datetime handling:

 np.datetime64('2005-02-25T03:00Z')
np.datetime64('2005-02-25T03:00')

This IIUC is incorrect. The Z modifier is a timezone offset, and for normal
naive datetimes would cause an error.

2- It would be good to include explicitly examples of conversion to and
from datetimes alongside the examples of conversions to and from strings.

3- It would be good to (eventually) include some discussion of the impact
of the preferred proposal on existing code. E.g., will this break a lot of
people's pipelines? (Are people currently *always* adding timezones to
their numpy input to avoid the problem, and now will have to switch to the
opposite behaviour depending on numpy version?) And we'll want to make sure
to get feedback from the pydata@ (pandas) list explicitly, though that can
wait until people here have had a chance to respond to the first draft.

Thanks for pushing this forward!
-n


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-28 Thread Sankarshan Mudkavi

Hi Nathaniel,

 1- You give as an example of naive datetime handling:
 
  np.datetime64('2005-02-25T03:00Z')
 np.datetime64('2005-02-25T03:00')
 
 This IIUC is incorrect. The Z modifier is a timezone offset, and for normal 
 naive datetimes would cause an error.
 


If what I understand from reading:
http://thread.gmane.org/gmane.comp.python.numeric.general/53805

It looks like anything other than Z, 00:00 or UTC that has a TZ adjustment 
would raise an error, and those specific conditions would not (I'm guessing 
this is because we assume it's UTC (or the same timezone) internally, anything 
that explicitly tells us it is UTC is acceptable, although that may be just my 
misreading of it.)

However on output we don't use the Z modifier (which is why it's different from 
the UTC datetime64).

I will change it to return an error if what I thought is incorrect and also 
include examples of conversion from datetimes as you requested.

Please let me know if there are any more changes that are required! I look 
forward to further comments/questions.

Cheers,
Sankarshan

 On Fri, Mar 28, 2014 at 5:17 AM, Nathaniel Smith n...@pobox.com wrote:
 On 28 Mar 2014 05:00, Sankarshan Mudkavi smudk...@uwaterloo.ca wrote:
 
  Hi all,
 
  Apologies for the delay in following up, here is an expanded version of the 
  proposal, which hopefully clears up most of the details. I have not 
  included specific implementation details for the code, such as which 
  functions to modify etc. since I think those are not traditionally included 
  in NEPs?
 
 The format seems fine to me. Really the point is just to have a document that 
 we can use as reference when deciding on behaviour, and this does that :-).
 
 Three quick comments:
 
 1- You give as an example of naive datetime handling:
 
  np.datetime64('2005-02-25T03:00Z')
 np.datetime64('2005-02-25T03:00')
 
 This IIUC is incorrect. The Z modifier is a timezone offset, and for normal 
 naive datetimes would cause an error.
 
 2- It would be good to include explicitly examples of conversion to and from 
 datetimes alongside the examples of conversions to and from strings.
 
 3- It would be good to (eventually) include some discussion of the impact of 
 the preferred proposal on existing code. E.g., will this break a lot of 
 people's pipelines? (Are people currently *always* adding timezones to their 
 numpy input to avoid the problem, and now will have to switch to the opposite 
 behaviour depending on numpy version?) And we'll want to make sure to get 
 feedback from the pydata@ (pandas) list explicitly, though that can wait 
 until people here have had a chance to respond to the first draft.
 
 Thanks for pushing this forward!
 -n
 
 Hi all,
 
 Apologies for the delay in following up, here is an expanded version of the 
 proposal, which hopefully clears up most of the details. I have not included 
 specific implementation details for the code, such as which functions to 
 modify etc. since I think those are not traditionally included in NEPs?
 
 Please find attached the expanded proposal, and the rendered version is 
 available here:
 https://github.com/Sankarshan-Mudkavi/numpy/blob/Enhance-datetime64/doc/neps/datetime-improvement-proposal.rst
 
 datetime-improvement-proposal.rst
 
 I look forward to comments, agreements/disagreements with this (and 
 clarification if this needs even further expansion).
 
 
 Please find attached the 
 On Mar 24, 2014, at 12:39 AM, Chris Barker chris.bar...@noaa.gov wrote:
 
 On Fri, Mar 21, 2014 at 3:43 PM, Nathaniel Smith n...@pobox.com wrote:
 On Thu, Mar 20, 2014 at 11:27 PM, Chris Barker chris.bar...@noaa.gov 
 wrote:
  * I think there are more or less three options:
 1)  a) don't have any timezone handling at all -- all datetime64s are 
  UTC. Always
   b) don't have any timezone handling at all -- all datetime64s 
  are naive
   (the only difference between these two is I/O of strings, 
  and maybe I/O of datetime objects with a time zone)
  2) Have a time zone associated with the array -- defaulting to either 
  UTC or None, but don't provide any implementation other than the tagging, 
  with the ability to add in TZ handler if you want (can this be done 
  efficiently?)
  3) Full on proper TZ handling.
 
  I think (3) is off the table for now.
 
 I think the first goal is to define what a plain vanilla datetime64
 does, without any extra attributes. This is for two practical reasons:
 First, our overriding #1 goal is to fix the nasty I/O problems that
 default datetime64's show, so until that's done any other bells and
 whistles are a distraction. And second, adding parameters to dtypes
 right now is technically messy.
 
 This rules out (2) and (3).
 
 yup -- though I'm not sure I agree that we need to do this, if we are going 
 to do something more later anyway. But you have a key point - maybe the 
 dtype system simply isn't ready to do it right, and then it may be better 
 not to try. 
 
 In which 

Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-28 Thread Jeff Reback
FYI

Here are the pandas docs on timezone handling

wesm worked thru the various issues w.r.t. conversion, localization, and
ambiguous zone crossing.

http://pandas.pydata.org/pandas-docs/stable/timeseries.html#time-zone-handling

implementation is largely in here:

 (underlying impl is a datetime64[ns] dtype with a pytz as the timezone)

https://github.com/pydata/pandas/blob/master/pandas/tseries/index.py



On Fri, Mar 28, 2014 at 4:30 PM, Sankarshan Mudkavi
smudk...@uwaterloo.ca wrote:


 Hi Nathaniel,

 1- You give as an example of naive datetime handling:

  np.datetime64('2005-02-25T03:00Z')
 np.datetime64('2005-02-25T03:00')

 This IIUC is incorrect. The Z modifier is a timezone offset, and for
 normal naive datetimes would cause an error.


 If what I understand from reading:
 http://thread.gmane.org/gmane.comp.python.numeric.general/53805

 It looks like anything other than Z, 00:00 or UTC that has a TZ adjustment
 would raise an error, and those specific conditions would not (I'm guessing
 this is because we assume it's UTC (or the same timezone) internally,
 anything that explicitly tells us it is UTC is acceptable, although that
 may be just my misreading of it.)

 However on output we don't use the Z modifier (which is why it's different
 from the UTC datetime64).

 I will change it to return an error if what I thought is incorrect and
 also include examples of conversion from datetimes as you requested.

 Please let me know if there are any more changes that are required! I look
 forward to further comments/questions.

 Cheers,
 Sankarshan

 On Fri, Mar 28, 2014 at 5:17 AM, Nathaniel Smith n...@pobox.com wrote:

 On 28 Mar 2014 05:00, Sankarshan Mudkavi smudk...@uwaterloo.ca wrote:
 
  Hi all,
 
  Apologies for the delay in following up, here is an expanded version of
 the proposal, which hopefully clears up most of the details. I have not
 included specific implementation details for the code, such as which
 functions to modify etc. since I think those are not traditionally included
 in NEPs?

 The format seems fine to me. Really the point is just to have a document
 that we can use as reference when deciding on behaviour, and this does that
 :-).

 Three quick comments:

 1- You give as an example of naive datetime handling:

  np.datetime64('2005-02-25T03:00Z')
 np.datetime64('2005-02-25T03:00')

 This IIUC is incorrect. The Z modifier is a timezone offset, and for
 normal naive datetimes would cause an error.

 2- It would be good to include explicitly examples of conversion to and
 from datetimes alongside the examples of conversions to and from strings.

 3- It would be good to (eventually) include some discussion of the impact
 of the preferred proposal on existing code. E.g., will this break a lot of
 people's pipelines? (Are people currently *always* adding timezones to
 their numpy input to avoid the problem, and now will have to switch to the
 opposite behaviour depending on numpy version?) And we'll want to make sure
 to get feedback from the pydata@ (pandas) list explicitly, though that
 can wait until people here have had a chance to respond to the first draft.

 Thanks for pushing this forward!
 -n

 Hi all,

 Apologies for the delay in following up, here is an expanded version of
 the proposal, which hopefully clears up most of the details. I have not
 included specific implementation details for the code, such as which
 functions to modify etc. since I think those are not traditionally included
 in NEPs?

 Please find attached the expanded proposal, and the rendered version is
 available here:

 https://github.com/Sankarshan-Mudkavi/numpy/blob/Enhance-datetime64/doc/neps/datetime-improvement-proposal.rst

 datetime-improvement-proposal.rst

 I look forward to comments, agreements/disagreements with this (and
 clarification if this needs even further expansion).


 Please find attached the
 On Mar 24, 2014, at 12:39 AM, Chris Barker chris.bar...@noaa.gov wrote:

 On Fri, Mar 21, 2014 at 3:43 PM, Nathaniel Smith n...@pobox.com wrote:

 On Thu, Mar 20, 2014 at 11:27 PM, Chris Barker chris.bar...@noaa.gov
 wrote:
  * I think there are more or less three options:
 1)  a) don't have any timezone handling at all -- all datetime64s
 are UTC. Always
   b) don't have any timezone handling at all -- all datetime64s
 are naive
   (the only difference between these two is I/O of strings,
 and maybe I/O of datetime objects with a time zone)
  2) Have a time zone associated with the array -- defaulting to
 either UTC or None, but don't provide any implementation other than the
 tagging, with the ability to add in TZ handler if you want (can this be
 done efficiently?)
  3) Full on proper TZ handling.
 
  I think (3) is off the table for now.

 I think the first goal is to define what a plain vanilla datetime64
 does, without any extra attributes. This is for two practical reasons:
 First, our overriding #1 goal is to fix the nasty I/O problems that
 default 

Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-27 Thread Sankarshan Mudkavi
Hi all,

Apologies for the delay in following up, here is an expanded version of the proposal, which hopefully clears up most of the details. I have not included specific implementation details for the code, such as which functions to modify etc. since I think those are not traditionally included in NEPs?

Please find attached the expanded proposal, and the rendered version is available here:
https://github.com/Sankarshan-Mudkavi/numpy/blob/Enhance-datetime64/doc/neps/datetime-improvement-proposal.rst

datetime-improvement-proposal.rst
Description: Binary data
I look forward to comments, agreements/disagreements with this (and clarification if this needs even further expansion).

Please find attached the

On Mar 24, 2014, at 12:39 AM, Chris Barker chris.bar...@noaa.gov wrote:

On Fri, Mar 21, 2014 at 3:43 PM, Nathaniel Smith n...@pobox.com wrote:

On Thu, Mar 20, 2014 at 11:27 PM, Chris Barker chris.bar...@noaa.gov wrote:
 * I think there are more or less three options:
  1) a) don't have any timezone handling at all -- all datetime64s are UTC. Always
 b) don't have any timezone handling at all -- all datetime64s are naive
   (the only difference between these two is I/O of strings, and maybe I/O of datetime objects with a time zone)
   2) Have a time zone associated with the array -- defaulting to either UTC or None, but don't provide any implementation other than the tagging, with the ability to add in TZ handler if you want (can this be done efficiently?)


   3) Full on proper TZ handling.

 I think (3) is off the table for now.

I think the first goal is to define what a plain vanilla datetime64
does, without any extra attributes. This is for two practical reasons:
First, our overriding #1 goal is to fix the nasty I/O problems that
default datetime64's show, so until that's done any other bells and
whistles are a distraction. And second, adding parameters to dtypes
right now is technically messy.

This rules out (2) and (3).

yup -- though I'm not sure I agree that we need to do this, if we are going to do something more later anyway. But you have a key point - maybe the dtype system simply isn't ready to do it right, and then it may be better not to try.

In which case, we are down to naive or always UTC -- and again, those really aren't very different. Though I prefer naive -- always UTC adds some complication if you don't actually want UTC, and I'm not sure it actually buys us anything. And maybe it's just me, but all my code would need to use naive, so I'd be doing a bit of working around to use a UTC-always system.


If we additionally want to keep the option of adding a timezone
parameter later, and have the result end up looking like stdlib
datetime, then I think 1(b) is the obvious choice. My guess is that
this is also what's most compatible with pandas, which is currently
keeping its own timezone object outside of the dtype.

Good point, all else being equal, compatibility with Pandas would be a good thing.


Any downsides? I guess this would mean that we start raising an error
on ISO 8601's with offsets attached, which might annoy some people?

yes, but errors are better than incorrect values...

Writing this made me think of a third option -- tracking, but no real manipulation, of TZ. This would be analogous to what ISO 8601 does -- all it does is note an offset. A given DateTime64 array would have a given offset assigned to it, and the appropriate addition and subtraction would happen at I/O. Offset of 0.00 would be UTC, and there would be a None option for naive.



Please no! An integer offset is a terrible way to represent timezones,

well, it would solve the being able to read ISO strings problem, and being able to perform operations with datetimes in multiple time zones. though I guess you could get most of that with UTC-always.

and hardcoding this would just get in the way of a proper solution.

well, that's a point -- if we think there is any hope of a proper solution down the road, then yes, it would be better not to make that harder.

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov

-- 
Sankarshan Mudkavi
Undergraduate in Physics, University of Waterloo
www.smudkavi.com





Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-23 Thread Chris Barker
On Fri, Mar 21, 2014 at 3:43 PM, Nathaniel Smith n...@pobox.com wrote:

 On Thu, Mar 20, 2014 at 11:27 PM, Chris Barker chris.bar...@noaa.gov
 wrote:
  * I think there are more or less three options:
 1)  a) don't have any timezone handling at all -- all datetime64s are
 UTC. Always
   b) don't have any timezone handling at all -- all datetime64s
 are naive
   (the only difference between these two is I/O of strings,
 and maybe I/O of datetime objects with a time zone)
  2) Have a time zone associated with the array -- defaulting to
 either UTC or None, but don't provide any implementation other than the
 tagging, with the ability to add in TZ handler if you want (can this be
 done efficiently?)
  3) Full on proper TZ handling.
 
  I think (3) is off the table for now.

 I think the first goal is to define what a plain vanilla datetime64
 does, without any extra attributes. This is for two practical reasons:
 First, our overriding #1 goal is to fix the nasty I/O problems that
 default datetime64's show, so until that's done any other bells and
 whistles are a distraction. And second, adding parameters to dtypes
 right now is technically messy.

 This rules out (2) and (3).


yup -- though I'm not sure I agree that we need to do this, if we are going
to do something more later anyway. But you have a key point - maybe the
dtype system simply isn't ready to do it right, and then it may be better
not to try.

In which case, we are down to naive or always UTC -- and again, those
really aren't very different. Though I prefer naive -- always UTC adds some
complication if you don't actually want UTC, and I'm not sure it actually
buys us anything. And maybe it's just me, but all my code would need to use
naive, so I'd be doing a bit of working around to use a UTC-always system.


 If we additionally want to keep the option of adding a timezone
 parameter later, and have the result end up looking like stdlib
 datetime, then I think 1(b) is the obvious choice. My guess is that
 this is also what's most compatible with pandas, which is currently
 keeping its own timezone object outside of the dtype.


Good point, all else being equal, compatibility with Pandas would be a good
thing.

Any downsides? I guess this would mean that we start raising an error
 on ISO 8601's with offsets attached, which might annoy some people?


yes, but errors are better than incorrect values...

 Writing this made me think of a third option -- tracking, but no real
manipulation, of TZ. This would be analogous to what ISO 8601 does -- all it
does is note an offset. A given DateTime64 array would have a given offset
assigned to it, and the appropriate addition and subtraction would happen
at I/O. Offset of 0.00 would be UTC, and there would be a None option for
naive.

Please no! An integer offset is a terrible way to represent timezones,


well, it would solve the being able to read ISO strings problem, and being
able to perform operations with datetimes in multiple time zones. though I
guess you could get most of that with UTC-always.


 and hardcoding this would just get in the way of a proper solution.


well, that's a point -- if we think there is any hope of a proper solution
down the road, then yes, it would be better not to make that harder.
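
To make the offset-tagging idea discussed above concrete, here is a rough sketch. `OffsetTaggedTimes` is a hypothetical container, not anything in NumPy; it assumes whole-second storage and a single fixed offset per container, applied only on input and output.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
FORMAT = "%Y-%m-%dT%H:%M"

class OffsetTaggedTimes:
    """Hypothetical container: UTC counts plus one fixed offset applied only at I/O."""

    def __init__(self, iso_strings, offset_minutes):
        self.offset = timedelta(minutes=offset_minutes)
        # Input: subtract the offset so the stored integers are UTC seconds.
        self.utc_seconds = [
            int((datetime.strptime(s, FORMAT) - self.offset - EPOCH).total_seconds())
            for s in iso_strings
        ]

    def to_strings(self):
        # Output: add the offset back to recover the local wall-clock strings.
        return [
            (EPOCH + timedelta(seconds=n) + self.offset).strftime(FORMAT)
            for n in self.utc_seconds
        ]

times = OffsetTaggedTimes(["2014-03-23T09:00"], offset_minutes=-420)
print(times.utc_seconds, times.to_strings())
```

A single fixed offset cannot follow daylight-saving changes, which is one reason an integer offset falls short of a real timezone.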

-Chris

-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-21 Thread Chris Barker
On Thu, Mar 20, 2014 at 4:55 PM, Sankarshan Mudkavi
smudk...@uwaterloo.ca wrote:


 Yes 2) is indeed what I was suggesting. My apologies for being unclear, I
 was unsure of how much detail and technical information I should include in
 the proposal.


well, you need to put enough in that it's clear what it means. I think
examples are critical -- at least that's how I learn things.


  I'm not sure how much of a hit the performance would take if we were to
 take of the Z handler. Do you have any major concerns as of now regarding
 that, or do you want to wait till I provide more specific details?


more detail would be good.

My comment about performance is that if numpy needs to call a Python object
to do the time zone handling for each value in an array, that is going to
be pretty slow -- but maybe better than not having it at all. And
there shouldn't be any reason not to have a fast path for when the array is
naive or you are working with two arrays that are in the same TZ -- the
really common case that we care about performance for. So it probably comes
down to one extra field...
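
A rough sketch of that fast-path idea, in plain Python rather than NumPy internals. The function `difference`, the `to_utc` callback, and the use of integer offsets as stand-in timezone tags are all assumptions made for the illustration.

```python
def difference(a_values, a_tz, b_values, b_tz, to_utc):
    """Hypothetical elementwise a - b for two timezone-tagged sequences of integer ticks."""
    if a_tz == b_tz:
        # Fast path: naive (tag is None) or identical tags need no per-element conversion.
        return [a - b for a, b in zip(a_values, b_values)]
    # Slow path: call the (possibly Python-level) converter once per element.
    return [to_utc(a, a_tz) - to_utc(b, b_tz) for a, b in zip(a_values, b_values)]

calls = []

def to_utc(value, tz_offset_seconds):
    """Stand-in converter that records each call so the slow path is visible."""
    calls.append(value)
    return value - tz_offset_seconds

same = difference([3600, 7200], None, [0, 3600], None, to_utc)
calls_after_fast_path = len(calls)
mixed = difference([3600], 3600, [0], 0, to_utc)
print(same, calls_after_fast_path, mixed, len(calls))
```

The converter is never invoked when the tags match, so the common case pays only for the tag comparison.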

It also looks like the last option you mentioned seems quite reasonable
 too. To only do what ISO 8601 does. Perhaps, it would be better to
 implement that first and then look for an improvement later on? Do you have
 a preference for this or the option 2) ?


I'm liking that one:

It seems pretty easy to allow a tag for TZ offset, and not much extra math
when converting. And this could be pretty useful. But I'm not writing the
code...
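
To make the "tag for TZ offset" idea concrete, here is a minimal stdlib-only sketch. It assumes the array stores integer seconds since the epoch in UTC plus one offset for the whole array; the class name OffsetTaggedTimes and its methods are hypothetical and are not NumPy API.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1)  # naive; stored counts are interpreted as UTC


class OffsetTaggedTimes:
    """Hypothetical container: integer seconds since the epoch plus one offset tag."""

    def __init__(self, iso_strings, offset_hours=None):
        # offset_hours=None means "naive": no shifting on input or output.
        self.offset = None if offset_hours is None else timedelta(hours=offset_hours)
        shift = self.offset or timedelta(0)
        # Input strings are wall-clock times in the tagged offset; store UTC.
        self.counts = [
            int((datetime.fromisoformat(s) - shift - EPOCH).total_seconds())
            for s in iso_strings
        ]

    def to_datetimes(self):
        """Convert back at output time, re-applying the single offset."""
        if self.offset is None:
            return [EPOCH + timedelta(seconds=c) for c in self.counts]
        tz = timezone(self.offset)
        return [
            (EPOCH + timedelta(seconds=c)).replace(tzinfo=timezone.utc).astimezone(tz)
            for c in self.counts
        ]


times = OffsetTaggedTimes(["2001-01-01T12:00"], offset_hours=-5)
print(times.counts)                          # seconds since the epoch, in UTC
print(times.to_datetimes()[0].isoformat())   # wall-clock time with the offset attached
```

The only extra math is one addition on input and one on output, which is the point being made above; everything in between works on plain integers.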

-Chris




Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-21 Thread Chris Barker
On Thu, Mar 20, 2014 at 5:53 PM, Alexander Belopolsky ndar...@mac.com wrote:

 I recall that it was at some point suggested that epoch be part of dtype.
  I was not able to find the reasons for a rejection,


I don't think it was rejected, it just wasn't adopted by anyone to write a
NEP and write the code...

I actually think it's silly to allow changing the units without changing
the epoch. But the pre-defined epoch works fine for all my use cases, so I'm
not going to push that. I also did think it was a separate issue from
timezones, and thus shouldn't clutter up the NEP (though once someone is
opening the code, it would be a good time to do it..)

but it would make perfect sense to keep timezone offset in dtype and treat
 it effectively as an alternative epoch.


Hmm -- good point -- if we had a dynamic epoch you could just shift that to
account for the time zone offset. Though I think that's
an implementation issue.
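
A small sketch of the "shift the epoch" idea, using only the stdlib. It assumes counts are stored in hours and that the hypothetical array is tagged UTC-5; the decode helper and the sample count are made up for illustration.

```python
from datetime import datetime, timedelta

UTC_EPOCH = datetime(1970, 1, 1)


def decode(count_hours, epoch):
    """Interpret an integer count of hours relative to the given epoch."""
    return epoch + timedelta(hours=count_hours)


# Hypothetical: an array tagged as UTC-5 uses an epoch shifted by the offset.
offset = timedelta(hours=-5)
shifted_epoch = UTC_EPOCH + offset

count = 271769  # an example stored value
utc_wall = decode(count, UTC_EPOCH)        # 2001-01-01 17:00
local_wall = decode(count, shifted_epoch)  # 2001-01-01 12:00
print(utc_wall, local_wall)
```

The stored integer is identical in both cases; only the epoch used to interpret it differs, which is why this looks like an implementation detail rather than a separate feature.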

 The way I like to think about datetime is that YYYY-MM-DD hh:mm:ss.nnn is
 just a fancy way to represent numbers which is more convoluted than decimal
 notation, but conceptually not so different.  So different units, epochs or
 timezones are just different ways to convert an abstract notion of a point
 in time to a specific series of bits inside an array.  This is what dtype
 is for - a description of how abstract numbers are stored in memory.


yes -- and also how to convert to/from other types -- which is where the
trick is here.

-Chris




Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-21 Thread Chris Barker
On Thu, Mar 20, 2014 at 6:32 PM, Alexander Belopolsky ndar...@mac.com wrote:


 The difference comes down to I/O.

 It is more than I/O.  It is also about interoperability with Python's
 datetime module.


Sorry -- I was using I/O to mean converting to/from datetime64 and other
types. So that included datetime.datetime.

Here is the behavior that I don't like in the current implementation:

  d = array(['2001-01-01T12:00'], dtype='M8[ms]')
  d.item(0)
 datetime.datetime(2001, 1, 1, 17, 0)


yup, it converted to UTC using your locale setting -- really not good!
Then tossed that out when creating a datetime.datetime. This really is
quite broken.

But this brings up a good point -- having time zone handling fully
compatible with datetime.datetime would have its advantages. So use the same
tzinfo API.

If I understand NEP correctly, the proposal is to make d.item(0) return

  d.item(0).replace(tzinfo=timezone.utc)
 datetime.datetime(2001, 1, 1, 12, 0, tzinfo=datetime.timezone.utc)

 instead.  But this is not what I would expect: I want

   d.item(0)
 datetime.datetime(2001, 1, 1, 12, 0)

 When I work with naive datetime objects I don't want to be exposed to
 timezones at all.


right -- naive datetimes really would be good. The problem now with the
current code and your example is that in:

 d = array(['2001-01-01T12:00'], dtype='M8[ms]')

'2001-01-01T12:00' is interpreted as being in the machine's locale time
zone. Combine that with the UTC assumption, and you have trouble. The
workaround for what you want now is to add TZ info to the string:

In [56]: d = np.array(['2001-01-01T12:00Z'], dtype='M8[ms]')

In [57]: d.item(0)
Out[57]: datetime.datetime(2001, 1, 1, 12, 0)

or:
In [60]: d = np.array(['2001-01-01T12:00-00:00'], dtype='M8[ms]')

In [61]: d.item(0)
Out[61]: datetime.datetime(2001, 1, 1, 12, 0)

I _think_ that's what you want.

This is what I mean that naive and UTC are almost the same.
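
For comparison, the same "almost the same" relationship can be seen with the stdlib datetime type. This is only an illustration with Python's own objects, not a statement about what datetime64 does.

```python
from datetime import datetime, timezone

naive = datetime(2001, 1, 1, 12, 0)
aware = naive.replace(tzinfo=timezone.utc)

# Same calendar fields once the tzinfo tag is dropped.
same_fields = naive == aware.replace(tzinfo=None)

# The difference shows up in string output...
naive_text = naive.isoformat()   # no TZ indicator
aware_text = aware.isoformat()   # carries +00:00
print(naive_text, aware_text)

# ...and when mixing the two kinds in an ordering comparison.
try:
    naive < aware
    mixed_ok = True
except TypeError:
    mixed_ok = False
print(same_fields, mixed_ok)
```

So the two differ at the boundaries (string I/O and interop with aware objects), while the stored fields and arithmetic are the same.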

-Chris



Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-21 Thread Alexander Belopolsky
On Fri, Mar 21, 2014 at 5:31 PM, Chris Barker chris.bar...@noaa.gov wrote:

 But this brings up a good point -- having time zone handling fully
 compatible with datetime.datetime would have its advantages.


I don't know if everyone is aware of this, but Python stdlib has support
for fixed-offset timezones since version 3.2:

http://docs.python.org/3.2/whatsnew/3.2.html#datetime-and-time

It took many years to bring in that feature, but now we can benefit from
not having to reinvent the wheel.
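
For anyone who has not used it, a short example of the stdlib fixed-offset support (datetime.timezone, Python 3.2+). The -5 hour offset and the "EST" label are just example values.

```python
from datetime import datetime, timedelta, timezone

est = timezone(timedelta(hours=-5), "EST")  # fixed offset, no DST rules
local = datetime(2001, 1, 1, 12, 0, tzinfo=est)
as_utc = local.astimezone(timezone.utc)

print(local.isoformat())    # 2001-01-01T12:00:00-05:00
print(as_utc.isoformat())   # 2001-01-01T17:00:00+00:00
```

Note that 12:00 at UTC-5 is 17:00 UTC, which is consistent with the 17:00 value seen in the d.item(0) example earlier in the thread.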

I will try to write up some specific proposal this weekend.


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-21 Thread Nathaniel Smith
On Thu, Mar 20, 2014 at 11:27 PM, Chris Barker chris.bar...@noaa.gov wrote:
 * I think there are more or less three options:
   1) a) don't have any timezone handling at all -- all datetime64s are UTC.
         Always
      b) don't have any timezone handling at all -- all datetime64s are naive
      (the only difference between these two is I/O of strings, and
      maybe I/O of datetime objects with a time zone)
   2) Have a time zone associated with the array -- defaulting to either UTC
      or None, but don't provide any implementation other than the tagging,
      with the ability to add in TZ handler if you want (can this be done
      efficiently?)
   3) Full on proper TZ handling.

 I think (3) is off the table for now.

 I think (2) is what the NEP proposes, but I'd need more details, examples to 
 know.

 I prefer 1(b), but 1(a) is close enough that I'd be happy with that, too.

I think the first goal is to define what a plain vanilla datetime64
does, without any extra attributes. This is for two practical reasons:
First, our overriding #1 goal is to fix the nasty I/O problems that
default datetime64's show, so until that's done any other bells and
whistles are a distraction. And second, adding parameters to dtypes
right now is technically messy.

This rules out (2) and (3).

If we additionally want to keep the option of adding a timezone
parameter later, and have the result end up looking like stdlib
datetime, then I think 1(b) is the obvious choice. My guess is that
this is also what's most compatible with pandas, which is currently
keeping its own timezone object outside of the dtype.

Any downsides? I guess this would mean that we start raising an error
on ISO 8601's with offsets attached, which might annoy some people?

 Writing this made me think of a third option -- tracking, but no real
 manipulation, of TZ. This would be analogous to what ISO 8601 does -- all it
 does is note an offset. A given DateTime64 array would have a given offset
 assigned to it, and the appropriate addition and subtraction would happen at
 I/O. Offset of 0.00 would be UTC, and there would be a None option for naive.

Please no! An integer offset is a terrible way to represent timezones,
and hardcoding this would just get in the way of a proper solution.
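
One way to see the objection, sketched with the stdlib only: a real time zone's offset changes during the year, so a single offset per array mislabels part of the data. The UK clock-change instants for 2014 are hand-coded below for illustration, and uk_offset is a hypothetical helper, not a library function.

```python
from datetime import datetime, timedelta

# UK clock changes in 2014, expressed in UTC (hand-coded for illustration).
BST_START = datetime(2014, 3, 30, 1)
BST_END = datetime(2014, 10, 26, 1)


def uk_offset(utc_dt):
    """UTC offset in force in the UK at a given naive-UTC instant in 2014."""
    return timedelta(hours=1) if BST_START <= utc_dt < BST_END else timedelta(0)


utc_times = [datetime(2014, 1, 15, 12), datetime(2014, 7, 15, 12)]

# One offset for the whole array, fixed from the first element.
array_offset = uk_offset(utc_times[0])
with_single_offset = [t + array_offset for t in utc_times]

# Per-instant offsets, as a real time zone requires.
with_real_rules = [t + uk_offset(t) for t in utc_times]

print(with_single_offset[1], with_real_rules[1])  # the July value differs by an hour
```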

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-20 Thread Nathaniel Smith
On 20 Mar 2014 02:07, Sankarshan Mudkavi smudk...@uwaterloo.ca wrote:
 I've written a rather rudimentary NEP, (lacking in technical details
which I will hopefully add after some further discussion and receiving
clarification/help on this thread).

 Please let me know how to proceed and what you think should be added to
the current proposal (attached to this mail).

 Here is a rendered version of the same:

https://github.com/Sankarshan-Mudkavi/numpy/blob/Enhance-datetime64/doc/neps/datetime-improvement-proposal.rst

Your NEP suggests making all datetime64s be in UTC, and treating string
representations from unknown timezones as UTC. How does this differ from,
and why is it superior to, making all datetime64s be naive?

-n


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-20 Thread Sankarshan Mudkavi
Hi Nathaniel,

It differs by allowing time zone info to be preserved if supplied. A naive
datetime64 would be unable to handle this, and would either have to ignore the
tzinfo or raise an exception. The current suggestion is very similar to a
naive datetime64 and only differs in being able to handle the given tzinfo,
rather than ignoring it or telling the user that the current implementation
cannot handle it.

This would be superior to a naive datetime64 for use cases that have the tzinfo
available, and would save users from having to work around NumPy's inability to
handle it when provided.

A big thanks to Chris Barker for the write up linked in the proposal, it makes 
it very clear what the various possibilities are for improvement.

Cheers,
Sankarshan


-- 
Sankarshan Mudkavi
Undergraduate in Physics, University of Waterloo
www.smudkavi.com


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-20 Thread Chris Barker
On Thu, Mar 20, 2014 at 4:16 AM, Nathaniel Smith n...@pobox.com wrote:

 Your NEP suggests making all datetime64s be in UTC, and treating string
 representations from unknown timezones as UTC. How does this differ from,
 and why is it superior to, making all datetime64s be naive?

 This came up in the conversation before -- I think the fact is that a
'naive' datetime and a UTC datetime are almost exactly the same. In essence
you can use a UTC datetime and pretend it's naive in almost all cases.

The difference comes down to I/O. If it's UTC, then an ISO 8601 string
created from it would include a Z on the end (or a +0.00, I think),
whereas naive datetime should have no TZ indicator.

On input, the question is what do you do with an ISO string with a TZ
indicator:
   1) translate to UTC -- makes sense if we have the always-UTC definition
   2) raise an exception -- makes sense if we have the naive definition
   3) ignore it -- which would make some sense if we were naive, but perhaps
      a little too prone to error.
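
The three input policies can be sketched with the stdlib as follows. The function parse_iso and its policy names are hypothetical; datetime.fromisoformat needs Python 3.7+, and the sketch sticks to explicit offsets like -05:00 because a trailing Z is only accepted by newer Python versions.

```python
from datetime import datetime, timezone


def parse_iso(text, policy):
    """Parse an ISO 8601 string into a naive datetime under one of three policies.

    policy: "utc" (translate to UTC), "raise" (reject offsets), "ignore" (drop them).
    """
    dt = datetime.fromisoformat(text)
    if dt.tzinfo is None:
        return dt  # no TZ indicator: nothing to decide
    if policy == "utc":
        return dt.astimezone(timezone.utc).replace(tzinfo=None)
    if policy == "raise":
        raise ValueError(f"TZ indicator not allowed for naive datetimes: {text!r}")
    if policy == "ignore":
        return dt.replace(tzinfo=None)
    raise ValueError(f"unknown policy: {policy!r}")


stamp = "2001-01-01T12:00-05:00"
print(parse_iso(stamp, "utc"))     # shifted to UTC
print(parse_iso(stamp, "ignore"))  # offset silently dropped
```

With "ignore", two strings that name different instants can parse to the same value, which is the error-proneness mentioned in option 3.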


But the real issue with the current implementation is how an ISO string
with no TZ indicator is handled -- it currently assumes that means use the
locale TZ, which is more often than not wrong, and clearly subject to errors.

Also, it time-shifts to the locale TZ when creating an ISO string, with no way
to specify otherwise.

So:

* I'm not sure what the new NEP is suggesting at all, actually; we need a
full description, with examples of what various input / output would give.

* I think there are more or less three options:
   1) a) don't have any timezone handling at all -- all datetime64s are
         UTC. Always
      b) don't have any timezone handling at all -- all datetime64s are
         naive
      (the only difference between these two is I/O of strings, and
      maybe I/O of datetime objects with a time zone)
   2) Have a time zone associated with the array -- defaulting to either
      UTC or None, but don't provide any implementation other than the
      tagging, with the ability to add in TZ handler if you want (can this
      be done efficiently?)
   3) Full on proper TZ handling.

I think (3) is off the table for now.

I think (2) is what the NEP proposes, but I'd need more details, examples
to know.

I prefer 1(b), but 1(a) is close enough that I'd be happy with that, too.

Writing this made me think of a third option -- tracking, but no
real manipulation, of TZ. This would be analogous to what ISO 8601 does --
all it does is note an offset. A given DateTime64 array would have a given
offset assigned to it, and the appropriate addition and subtraction would
happen at I/O. Offset of 0.00 would be UTC, and there would be a None
option for naive.

I haven't thought that out for the inevitable complications, though.

-CHB


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-20 Thread Sankarshan Mudkavi
Hi Chris,

  I think there are more or less three options:
1)  a) don't have any timezone handling at all -- all datetime64s are UTC. 
 Always
  b) don't have any timezone handling at all -- all datetime64s are 
 naive
  (the only difference between these two is I/O of strings, and 
 maybe I/O of datetime objects with a time zone)
 2) Have a time zone associated with the array -- defaulting to either UTC 
 or None, but don't provide any implementation other than the tagging, with 
 the ability to add in TZ handler if you want (can this be done efficiently?)
 3) Full on proper TZ handling.
 
 I think (3) is off the table for now.
 
 I think (2) is what the NEP proposes, but I'd need more details, examples to 
 know.
 
 I prefer 1(b), but 1(a) is close enough that I'd be happy with that, too.

Yes 2) is indeed what I was suggesting. My apologies for being unclear, I was 
unsure of how much detail and technical information I should include in the 
proposal. I will update it and add more examples etc. to actually specify what 
I mean. I'm not sure how much of a hit the performance would take if we were to 
take of the Z handler. Do you have any major concerns as of now regarding that, 
or do you want to wait till I provide more specific details?

It also looks like the last option you mentioned seems quite reasonable too. To 
only do what ISO 8601 does. Perhaps, it would be better to implement that first 
and then look for an improvement later on? Do you have a preference for this or 
the option 2) ?

I will expand the NEP and hopefully make it clearer what it entails.

Once again, thanks for the earlier write up.

Cheers,
Sankarshan

 

Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-20 Thread Alexander Belopolsky
On Thu, Mar 20, 2014 at 7:16 AM, Nathaniel Smith n...@pobox.com wrote:

 Your NEP suggests making all datetime64s be in UTC, and treating string
 representations from unknown timezones as UTC.


I recall that it was at some point suggested that epoch be part of dtype.
 I was not able to find the reasons for a rejection, but it would make
perfect sense to keep timezone offset in dtype and treat it effectively as
an alternative epoch.

The way I like to think about datetime is that YYYY-MM-DD hh:mm:ss.nnn is
just a fancy way to represent numbers which is more convoluted than decimal
notation, but conceptually not so different.  So different units, epochs or
timezones are just different ways to convert an abstract notion of a point
in time to a specific series of bits inside an array.  This is what dtype
is for - a description of how abstract numbers are stored in memory.
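
As a rough illustration of "datetime is just a number plus a description", here is a stdlib sketch that converts between a datetime and an integer count given a unit and an epoch. The UNITS table and both helper names are made up for this example and do not mirror datetime64 internals.

```python
from datetime import datetime, timedelta

UNITS = {
    "h": timedelta(hours=1),
    "s": timedelta(seconds=1),
    "ms": timedelta(milliseconds=1),
}
DEFAULT_EPOCH = datetime(1970, 1, 1)


def from_datetime(dt, unit, epoch=DEFAULT_EPOCH):
    """Encode a naive datetime as an integer count of `unit` since `epoch`."""
    return (dt - epoch) // UNITS[unit]


def to_datetime(count, unit, epoch=DEFAULT_EPOCH):
    """Decode an integer count of `unit` since `epoch` back to a datetime."""
    return epoch + count * UNITS[unit]


noon = datetime(2001, 1, 1, 12)
for unit in UNITS:
    count = from_datetime(noon, unit)
    print(unit, count, to_datetime(count, unit))
```

The same point in time maps to a different integer for each unit (and would for each epoch), but the round trip always returns the original value.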


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-20 Thread Alexander Belopolsky
On Thu, Mar 20, 2014 at 9:39 AM, Sankarshan Mudkavi
smudk...@uwaterloo.ca wrote:

 A naive datetime64 would be unable to handle this, and would either have
 to ignore the tzinfo or would have to throw up an exception.


This is not true.  Python's own datetime has no problem handling this:

>>> from datetime import datetime, timezone
>>> t1 = datetime(2000,1,1,12)
>>> t2 = datetime(2000,1,1,12,tzinfo=timezone.utc)
>>> print(t1)
2000-01-01 12:00:00
>>> print(t2)
2000-01-01 12:00:00+00:00


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-20 Thread Alexander Belopolsky
On Thu, Mar 20, 2014 at 7:27 PM, Chris Barker chris.bar...@noaa.gov wrote:

 On Thu, Mar 20, 2014 at 4:16 AM, Nathaniel Smith n...@pobox.com wrote:

 Your NEP suggests making all datetime64s be in UTC, and treating string
 representations from unknown timezones as UTC. How does this differ from,
 and why is it superior to, making all datetime64s be naive?

 This came up in the conversation before -- I think the fact is that a
 'naive' datetime and a UTC datetime are almost exactly the same. In essence
 you can use a UTC datetime and pretend it's naive in almost all cases.

 The difference comes down to I/O.


It is more than I/O.  It is also about interoperability with Python's
datetime module.

Here is the behavior that I don't like in the current implementation:

>>> from numpy import array
>>> d = array(['2001-01-01T12:00'], dtype='M8[ms]')
>>> d.item(0)
datetime.datetime(2001, 1, 1, 17, 0)

If I understand NEP correctly, the proposal is to make d.item(0) return

>>> d.item(0).replace(tzinfo=timezone.utc)
datetime.datetime(2001, 1, 1, 12, 0, tzinfo=datetime.timezone.utc)

instead.  But this is not what I would expect: I want

>>> d.item(0)
datetime.datetime(2001, 1, 1, 12, 0)

When I work with naive datetime objects I don't want to be exposed to
timezones at all.


Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-19 Thread Dave Hirschfeld
Sankarshan Mudkavi smudkavi at uwaterloo.ca writes:

 
 Hey all,
 It's been a while since the last datetime and timezones discussion thread 
was visited (linked below):
 
 http://thread.gmane.org/gmane.comp.python.numeric.general/53805
 
 It looks like the best approach to follow is the UTC only approach in the 
linked thread with an optional flag to indicate the timezone (to avoid 
confusing applications where they don't expect any timezone info). Since 
this is slightly more useful than having just a naive datetime64 package and 
would be open to extension if required, it's probably the best way to start 
improving the datetime64 library.
 
snip
 I would like to start writing a NEP for this followed by implementation, 
however I'm not sure what the format etc. is, could someone direct me to a 
page where this information is provided?
 
 Please let me know if there are any ideas, comments etc.
 
 Cheers,
 Sankarshan
 

See: http://article.gmane.org/gmane.comp.python.numeric.general/55191


You could use a current NEP as a template:
https://github.com/numpy/numpy/tree/master/doc/neps


I'm a huge +100 on the simplest UTC fix.

As is, using numpy datetimes is likely to silently give incorrect results - 
something I've already seen several times in end-user data analysis code.

Concrete Example:

In [16]: dates = pd.date_range('01-Apr-2014', '04-Apr-2014', freq='H')[:-1]
...: values = np.array([1,2,3]).repeat(24)
...: records = zip(map(str, dates), values)
...: pd.TimeSeries(values, dates).groupby(lambda d: d.date()).mean()
...: 
Out[16]: 
2014-04-01    1
2014-04-02    2
2014-04-03    3
dtype: int32

In [17]: df = pd.DataFrame(np.array(records, dtype=[('dates', 'M8[h]'), 
('values', float)]))
...: df.set_index('dates', inplace=True)
...: df.groupby(lambda d: d.date()).mean()
...: 
Out[17]: 
              values
2014-03-31  1.000000
2014-04-01  1.041667
2014-04-02  2.041667
2014-04-03  3.000000

[4 rows x 1 columns]

Try it in your timezone and see what you  get!
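
The numbers in Out[17] can be reproduced without numpy or pandas. The sketch below assumes the strings were treated as local time in a UTC+1 zone and stored as UTC, so every stamp silently moves back one hour; that one-hour shift is an assumption chosen to match the output above.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# 72 hourly stamps over 1-3 April 2014, with values 1, 2, 3 (one per day).
stamps = [datetime(2014, 4, 1) + timedelta(hours=h) for h in range(72)]
values = [1] * 24 + [2] * 24 + [3] * 24


def daily_means(stamps, values):
    """Group values by calendar date and return the mean for each date."""
    groups = defaultdict(list)
    for stamp, value in zip(stamps, values):
        groups[stamp.date()].append(value)
    return {day: sum(v) / len(v) for day, v in sorted(groups.items())}


as_entered = daily_means(stamps, values)

# Assumed silent local-to-UTC conversion for a UTC+1 zone: minus one hour.
shifted = daily_means([s - timedelta(hours=1) for s in stamps], values)

for day, mean in shifted.items():
    print(day, round(mean, 6))
```

One hour of each day leaks into the previous date, which gives four groups instead of three and the 1.041667 / 2.041667 means.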

-Dave





Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-19 Thread Jeff Reback
Dave,

your example is not a problem with numpy per se, rather that the default
generation is in local timezone (same as what python datetime does).
If you localize to UTC you get the results that you expect.

In [49]: dates = pd.date_range('01-Apr-2014', '04-Apr-2014', freq='H')[:-1]

In [50]: pd.TimeSeries(values, dates.tz_localize('UTC')).groupby(
    ...:     lambda d: d.date()).mean()
Out[50]:
2014-04-01    1
2014-04-02    2
2014-04-03    3
dtype: int64

In [51]: records = zip(map(str, dates.tz_localize('UTC')), values)

In [52]: df = pd.DataFrame(np.array(records, dtype=[('dates',
'M8[h]'),('values', float)]))

In [53]: df.set_index('dates').groupby(lambda x: x.date()).mean()
Out[53]:
values
2014-04-01   1
2014-04-02   2
2014-04-03   3

[3 rows x 1 columns]





Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-19 Thread Dave Hirschfeld
Jeff Reback jeffreback at gmail.com writes:

 
 Dave,
 
 your example is not a problem with numpy per se, rather that the default 
generation is in local timezone (same as what python datetime does).
 If you localize to UTC you get the results that you expect. 
 

The problem is that the default datetime generation in *numpy* is in local 
time.

Note that this *is not* the case in Python - it doesn't try to guess the 
timezone info based on where in the world you run the code, if it's not 
provided it sets it to None.

In [7]: pd.datetime?
Type:       type
String Form:<type 'datetime.datetime'>
Docstring:
datetime(year, month, day[, hour[, minute[, second[, microsecond[,tzinfo]]]]])
The year, month and day arguments are required. tzinfo may be None, or an
instance of a tzinfo subclass. The remaining arguments may be ints or longs.

In [8]: pd.datetime(2000,1,1).tzinfo is None
Out[8]: True


This may be the best solution, but as others have pointed out it is more
difficult to implement and may have other issues.

I don't want to wait for the best solution - the "assume UTC on input/output
if not specified" fix will solve the problem, and this desperately needs to be
fixed because it's completely broken as is IMHO.


 If you localize to UTC you get the results that you expect. 

That's the whole point - *numpy* needs to localize to UTC, not to whatever 
timezone you happen to be in when running the code. 

In a real-world data analysis problem you don't start with the data in a
DataFrame or a numpy array; it comes from the web, a csv, Excel or a database,
and you want to convert it to a DataFrame or numpy array. So what you have
from whatever source is a list of tuples of strings, and you want to convert
them into a typed array.

Obviously you can't localize a string - you have to convert it to a date 
first and if you do that with numpy the date you have is wrong. 

In [108]: dst = np.array(['2014-03-30 00:00', '2014-03-30 01:00',
     ...:                 '2014-03-30 02:00'], dtype='M8[h]')
     ...: dst
     ...: 
Out[108]: array(['2014-03-30T00+0000', '2014-03-30T00+0000',
                 '2014-03-30T02+0100'], dtype='datetime64[h]')

In [109]: dst.tolist()
Out[109]: 
[datetime.datetime(2014, 3, 30, 0, 0),
 datetime.datetime(2014, 3, 30, 0, 0),
 datetime.datetime(2014, 3, 30, 1, 0)]


AFAICS there's no way to get the original dates back once they've passed 
through numpy's parser!?
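
For contrast, this is the lossless behaviour one gets when the same strings are parsed as naive values with the stdlib. It is only a sketch of the desired round trip, not of anything numpy does.

```python
from datetime import datetime

raw = ["2014-03-30 00:00", "2014-03-30 01:00", "2014-03-30 02:00"]
fmt = "%Y-%m-%d %H:%M"

# Naive parsing: no time zone is guessed, so tzinfo stays None.
parsed = [datetime.strptime(s, fmt) for s in raw]
round_tripped = [d.strftime(fmt) for d in parsed]

print(round_tripped == raw)
print([d.tzinfo for d in parsed])
```

Because no locale-dependent shift is applied, three distinct input strings stay three distinct values and format back to exactly what was read in.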


-Dave







Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-19 Thread Sankarshan Mudkavi

On Mar 19, 2014, at 10:01 AM, Dave Hirschfeld novi...@gmail.com wrote:

 Jeff Reback jeffreback at gmail.com writes:
 
 
 Dave,
 
 your example is not a problem with numpy per se, rather that the default 
 generation is in local timezone (same as what python datetime does).
 If you localize to UTC you get the results that you expect. 
 
 
 The problem is that the default datetime generation in *numpy* is in local 
 time.
 
 Note that this *is not* the case in Python - it doesn't try to guess the 
 timezone info based on where in the world you run the code, if it's not 
 provided it sets it to None.
 
 In [7]: pd.datetime?
 Type:   type
 String Form:type 'datetime.datetime'
 Docstring:
 datetime(year, month, day[, hour[, minute[, second[, 
 microsecond[,tzinfo])
 
 The year, month and day arguments are required. tzinfo may be None, or an
 instance of a tzinfo subclass. The remaining arguments may be ints or longs.
 
 In [8]: pd.datetime(2000,1,1).tzinfo is None
 Out[8]: True
 
 
 This may be the best solution but as others have pointed out this is more 
 difficult to implement and may have other issues.
 
 I don't want to wait for the best solution - the "assume UTC on input/output 
 if not specified" approach will solve the problem, and this desperately needs 
 to be fixed because it's completely broken as is IMHO.
 
 
 If you localize to UTC you get the results that you expect. 
 
 That's the whole point - *numpy* needs to localize to UTC, not to whatever 
 timezone you happen to be in when running the code. 
 
 In a real-world data analysis problem you don't start with the data in a 
 DataFrame or a numpy array; it comes from the web, a csv, Excel, or a database, 
 and you want to convert it to a DataFrame or numpy array. So what you have 
 from whatever source is a list of tuples of strings, and you want to convert 
 them into a typed array.
 
 Obviously you can't localize a string - you have to convert it to a date 
 first and if you do that with numpy the date you have is wrong. 
 
 In [108]: dst = np.array(['2014-03-30 00:00', '2014-03-30 01:00', '2014-03-30 02:00'], dtype='M8[h]')
 ...: dst
 ...: 
 Out[108]: array(['2014-03-30T00+', '2014-03-30T00+', '2014-03-30T02+0100'], dtype='datetime64[h]')
 
 In [109]: dst.tolist()
 Out[109]: 
 [datetime.datetime(2014, 3, 30, 0, 0),
 datetime.datetime(2014, 3, 30, 0, 0),
 datetime.datetime(2014, 3, 30, 1, 0)]
 
 
 AFAICS there's no way to get the original dates back once they've passed 
 through numpy's parser!?
 
 
 -Dave
 
 
 
 
 


Hi all,

I've written a rather rudimentary NEP (lacking technical details, which I 
will hopefully add after some further discussion and after receiving 
clarification/help on this thread).

Please let me know how to proceed and what you think should be added to the 
current proposal (attached to this mail).

Here is a rendered version of the same:
https://github.com/Sankarshan-Mudkavi/numpy/blob/Enhance-datetime64/doc/neps/datetime-improvement-proposal.rst

Cheers,
Sankarshan

-- 
Sankarshan Mudkavi
Undergraduate in Physics, University of Waterloo
www.smudkavi.com








[Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-18 Thread Sankarshan Mudkavi
Hey all,

It's been a while since the last datetime and timezones discussion thread was 
visited (linked below):

http://thread.gmane.org/gmane.comp.python.numeric.general/53805

It looks like the best approach to follow is the UTC only approach in the 
linked thread with an optional flag to indicate the timezone (to avoid 
confusing applications where they don't expect any timezone info). Since this 
is slightly more useful than having just a naive datetime64 package and would 
be open to extension if required, it's probably the best way to start improving 
the datetime64 library.

If we do wish to have full timezone support it would very likely lead to 
performance drops (as reasoned in the thread) and we would need to have a 
dedicated, maintained tzinfo package, at which point it would make much more 
sense to just incorporate the pytz library. (I also don't have the expertise to 
implement this, so I would be unable to help resolve the current logjam)

I would like to start writing a NEP for this followed by implementation, 
however I'm not sure what the format etc. is, could someone direct me to a page 
where this information is provided?

Please let me know if there are any ideas, comments etc.

Cheers,
Sankarshan




Re: [Numpy-discussion] Dates and times and Datetime64 (again)

2014-03-18 Thread Chris Barker
On Tue, Mar 18, 2014 at 2:49 PM, Sankarshan Mudkavi
smudk...@uwaterloo.ca wrote:

 It's been a while since the last datetime and timezones discussion thread
 was visited (linked below):

 http://thread.gmane.org/gmane.comp.python.numeric.general/53805

 It looks like the best approach to follow is the UTC only approach in the
 linked thread with an optional flag to indicate the timezone (to avoid
 confusing applications where they don't expect any timezone info). Since
 this is slightly more useful than having just a naive datetime64 package
 and would be open to extension if required, it's probably the best way to
 start improving the datetime64 library.


IIUC, I agree -- which is why we need a NEP to specify the details. Thank
you for stepping up!

 If we do wish to have full timezone support it would very likely lead to
 performance drops (as reasoned in the thread) and we would need to have a
 dedicated, maintained tzinfo package, at which point it would make much
 more sense to just incorporate the pytz library.


yup -- there is the option of doing what the stdlib datetime does:
provide a hook to incorporate timezones, but don't provide
an implementation. But unless that is a low-level hook that must
be implemented in C, it's going to be slow -- slow enough that you might as
well use a list of stdlib datetimes. Also, this has gone far too long
without getting fixed -- we need something simple to implement more than
anything else.
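
To make the "hook without an implementation" point concrete, here is a rough sketch of what the stdlib offers: `datetime.tzinfo` is an abstract base class, and the user supplies the subclass. The class name `FixedOffset` and the 'CET' label are illustrative assumptions, not anything proposed for numpy. Every call below goes through Python-level methods, which is why this route would be slow for large arrays.

```python
from datetime import datetime, timedelta, tzinfo

class FixedOffset(tzinfo):
    """Illustrative tzinfo subclass: a constant UTC offset with no DST rules."""

    def __init__(self, minutes, name):
        self._offset = timedelta(minutes=minutes)
        self._name = name

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        # No daylight-saving adjustment in this sketch.
        return timedelta(0)

    def tzname(self, dt):
        return self._name

utc = FixedOffset(0, 'UTC')
cet = FixedOffset(60, 'CET')

stamp = datetime(2014, 3, 30, 1, 0, tzinfo=utc)
# astimezone() calls back into the Python-level hook methods above.
local = stamp.astimezone(cet)  # 02:00 at UTC+01:00, the same instant
```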


 I would like to start writing a NEP for this followed by implementation,
 however I'm not sure what the format etc. is, could someone direct me to a
 page where this information is provided?


I don't know that there is such a thing, but you'll find the existing NEPs
here:

https://github.com/numpy/numpy/tree/master/doc/neps

I'd grab one and follow the format.


 Please let me know if there are any ideas, comments etc.


Thanks again -- I look forward to seeing it written up -- I'm sure to have
something to say then!

-Chris


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov