Re: [Numpy-discussion] Numpy 1.9 release date

2013-11-12 Thread Chris Barker
On Sun, Nov 10, 2013 at 7:27 PM, Stéfan van der Walt ste...@sun.ac.za wrote:

  that the main thing missing at this point is fixing the datetime
 problems.

 What needs to be done, and what is the plan forward?


I'm not sure that's quite been decided, but my take:

1) remove the existing time zone handling: it simply isn't useful very often,
and it is a pain in the %^ far too often.
  - as far as I know, the only point of debate about simple, non-time-zone-aware
datetimes is whether they mean UTC, Local, or Not Known. These are pretty
subtle distinctions, and I think they really only have an impact when you
try to parse an ISO string with a time zone attached.

2) _maybe_ do something smarter -- though this takes a lot more work and
discussion as to what that should be.
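
To make the distinction in point 1 concrete, here is a minimal sketch that uses only the Python standard library, not NumPy's own parser. It assumes an ISO string with a +01:00 offset and shows two of the possible readings of a naive result. The variable names are illustrative only.

```python
from datetime import datetime, timezone

# An ISO 8601 string with an explicit UTC offset attached.
stamp = datetime.fromisoformat("2014-03-30T02:00:00+01:00")

# Reading 1: naive values mean UTC, so apply the offset, then drop it.
as_utc = stamp.astimezone(timezone.utc).replace(tzinfo=None)

# Reading 2: naive values mean "Not Known", so keep the wall-clock digits
# and simply discard the offset.
as_wall_clock = stamp.replace(tzinfo=None)

# A "Local" reading would depend on the machine running the code,
# so it is left out of this sketch.
print(as_utc)         # 2014-03-30 01:00:00
print(as_wall_clock)  # 2014-03-30 02:00:00
```

The two readings differ by exactly the attached offset, which is why the choice only matters once a string carries a time zone.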

I think the key points are captured here:

http://thread.gmane.org/gmane.comp.python.numeric.general/53805

There is an issue:

https://github.com/numpy/numpy/issues/3388,

but there is no detail there.

There are a number of other issues that come up in discussion:

* More precision with leap seconds, etc.

*  Allowing an epoch that can change -- this is really crucial if you want
picoseconds and friends to be remotely useful.

But these are orthogonal issues AFAIC, except that maybe once we open it up
it makes sense to do it all at once...
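
On the changeable-epoch point above, a rough back-of-the-envelope calculation shows why a fixed epoch limits fine units. It assumes a timestamp stored as a signed 64-bit count of picoseconds from a single fixed epoch; the constant names are illustrative only.

```python
# Largest value a signed 64-bit integer can hold.
INT64_MAX = 2**63 - 1

PS_PER_SECOND = 10**12
SECONDS_PER_DAY = 86400

# How far from the epoch a picosecond count can reach, in days.
span_seconds = INT64_MAX / PS_PER_SECOND
span_days = span_seconds / SECONDS_PER_DAY

print(f"picosecond range: about +/- {span_days:.2f} days around the epoch")
```

With the epoch pinned at 1970, that window of a few months on either side never reaches present-day dates, so picosecond resolution is only useful if the epoch can be moved.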

-Chris

  Is there perhaps an issue one can follow?

 Thanks
 Stéfan

 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion




-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR            (206) 526-6959   voice
7600 Sand Point Way NE  (206) 526-6329   fax
Seattle, WA  98115      (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Numpy-discussion] Numpy 1.9 release date

2013-11-10 Thread Ralf Gommers
On Fri, Nov 8, 2013 at 8:22 PM, Charles R Harris
charlesr.har...@gmail.com wrote:

 Hi All,

 The question has come up as to how much effort we should spend backporting
 fixes to 1.8.x. An alternative would be to tag 1.9.0 early next year,
 aiming for a release around April. I think there is almost enough in
 1.9-devel to justify a release. There is Sebastian's index work, Julian's
 continuing work on speedups, the removal of oldnumeric and numarray
 support, and various other deprecations and cleanups that add up to a
 significant number of changes. I've tended to think of 1.9 as a cleanup and
 consolidation release


Makes sense.


 and think that the main thing missing at this point is fixing the datetime
 problems.


Is anyone planning to work on this? If yes, you need a rough estimate of
when this is ready to go. If no, it needs to be decided if this is critical
for the release. From the previous discussion I tend to think so. If it's
critical but no one does it, why plan a release...

A suggestion for backporting strategy: do not backport things that have
just been merged, because (a) doing it PR by PR adds a lot of overhead,
and (b) if the commit causes issues that have to be fixed or reverted, you
have to fix things twice. Instead, just keep a list of backport candidates
in a github issue, then do it all at once when it's clear that a bugfix
release is needed.

Ralf


Re: [Numpy-discussion] Numpy 1.9 release date

2013-11-10 Thread Dave Hirschfeld
Ralf Gommers ralf.gommers at gmail.com writes:

 

 On Fri, Nov 8, 2013 at 8:22 PM, Charles R Harris charlesr.harris at gmail.com wrote:

 and think that the main thing missing at this point is fixing the datetime
 problems.

 Is anyone planning to work on this? If yes, you need a rough estimate of
 when this is ready to go. If no, it needs to be decided if this is critical
 for the release. From the previous discussion I tend to think so. If it's
 critical but no one does it, why plan a release...

 Ralf
 

Just want to pipe up here as to the criticality of the datetime bug.

Below is a minimal example from some data analysis code I found in our 
company that was giving incorrect results (fortunately it was caught by 
thorough testing):

In [110]: records = [
 ...:  ('2014-03-29 23:00:00', '2014-03-29 23:00:00'),
 ...:  ('2014-03-30 00:00:00', '2014-03-30 00:00:00'),
 ...:  ('2014-03-30 01:00:00', '2014-03-30 01:00:00'),
 ...:  ('2014-03-30 02:00:00', '2014-03-30 02:00:00'),
 ...:  ('2014-03-30 03:00:00', '2014-03-30 03:00:00'),
 ...:  ('2014-10-25 23:00:00', '2014-10-25 23:00:00'),
 ...:  ('2014-10-26 00:00:00', '2014-10-26 00:00:00'),
 ...:  ('2014-10-26 01:00:00', '2014-10-26 01:00:00'),
 ...:  ('2014-10-26 02:00:00', '2014-10-26 02:00:00'),
 ...:  ('2014-10-26 03:00:00', '2014-10-26 03:00:00')]
 ...: 
 ...: 
 ...: data = np.asarray(records, dtype=[('date obj', 'M8[h]'), ('str repr', object)])
 ...: df = pd.DataFrame(data)

In [111]: df
Out[111]: 
             date obj             str repr
0 2014-03-29 23:00:00  2014-03-29 23:00:00
1 2014-03-30 00:00:00  2014-03-30 00:00:00
2 2014-03-30 00:00:00  2014-03-30 01:00:00
3 2014-03-30 01:00:00  2014-03-30 02:00:00
4 2014-03-30 02:00:00  2014-03-30 03:00:00
5 2014-10-25 22:00:00  2014-10-25 23:00:00
6 2014-10-25 23:00:00  2014-10-26 00:00:00
7 2014-10-26 01:00:00  2014-10-26 01:00:00
8 2014-10-26 02:00:00  2014-10-26 02:00:00
9 2014-10-26 03:00:00  2014-10-26 03:00:00


Note that `date obj` has been adjusted for the local timezone, which produces
a duplicate value at the clock change in March and a missing value at the
clock change in October. As you can imagine, this could very easily lead to
incorrect analysis.
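
The mechanism can be reproduced without NumPy. The following is a minimal sketch, assuming a hard-coded model of the 2014 UK clock changes; `local_to_utc`, `BST_START`, and `BST_END` are hypothetical names, and the boundary handling is chosen so that the result matches the output shown above. It is not NumPy's actual parsing code.

```python
from datetime import datetime, timedelta

# Hypothetical, hard-coded model of UK wall-clock time in 2014.
# Boundaries are picked to mirror the output posted above.
BST_START = datetime(2014, 3, 30, 1)   # clocks jump forward from 01:00 to 02:00
BST_END = datetime(2014, 10, 26, 1)    # 01:00 after the clocks have gone back

def local_to_utc(naive_local):
    """Treat a naive timestamp as UK local time and shift it to UTC."""
    in_bst = BST_START <= naive_local < BST_END
    return naive_local - timedelta(hours=1) if in_bst else naive_local

strings = [
    '2014-03-29 23:00:00', '2014-03-30 00:00:00', '2014-03-30 01:00:00',
    '2014-03-30 02:00:00', '2014-03-30 03:00:00', '2014-10-25 23:00:00',
    '2014-10-26 00:00:00', '2014-10-26 01:00:00', '2014-10-26 02:00:00',
    '2014-10-26 03:00:00',
]

parsed = [datetime.strptime(s, '%Y-%m-%d %H:%M:%S') for s in strings]
converted = [local_to_utc(p) for p in parsed]

# Ten distinct inputs collapse to nine distinct outputs.
for original, shifted in zip(strings, converted):
    print(shifted, ' ', original)
```

Ten distinct strings go in, but only nine distinct values come out: the March change maps two inputs to the same hour, and the October change skips an hour entirely.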

If you ran this exact same code in the (Eastern) US, you'd see the following
results:
             date obj             str repr
0 2014-03-30 03:00:00  2014-03-29 23:00:00
1 2014-03-30 04:00:00  2014-03-30 00:00:00
2 2014-03-30 05:00:00  2014-03-30 01:00:00
3 2014-03-30 06:00:00  2014-03-30 02:00:00
4 2014-03-30 07:00:00  2014-03-30 03:00:00
5 2014-10-26 03:00:00  2014-10-25 23:00:00
6 2014-10-26 04:00:00  2014-10-26 00:00:00
7 2014-10-26 05:00:00  2014-10-26 01:00:00
8 2014-10-26 06:00:00  2014-10-26 02:00:00
9 2014-10-26 07:00:00  2014-10-26 03:00:00


Unfortunately, I don't have the skills to meaningfully contribute in this
area, but it is a very real problem for users of numpy, many of whom are not
active on the mailing list.

HTH,
Dave




Re: [Numpy-discussion] Numpy 1.9 release date

2013-11-10 Thread Stéfan van der Walt
On 9 Nov 2013 03:22, Charles R Harris charlesr.har...@gmail.com wrote:

 that the main thing missing at this point is fixing the datetime problems.

What needs to be done, and what is the plan forward? Is there perhaps an
issue one can follow?

Thanks
Stéfan


[Numpy-discussion] Numpy 1.9 release date

2013-11-08 Thread Charles R Harris
Hi All,

The question has come up as to how much effort we should spend backporting
fixes to 1.8.x. An alternative would be to tag 1.9.0 early next year,
aiming for a release around April. I think there is almost enough in
1.9-devel to justify a release. There is Sebastian's index work, Julian's
continuing work on speedups, the removal of oldnumeric and numarray
support, and various other deprecations and cleanups that add up to a
significant number of changes. I've tended to think of 1.9 as a cleanup and
consolidation release and think that the main thing missing at this point
is fixing the datetime problems.

Thoughts?

Chuck