Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-03 Thread josef . pktd
On Tue, Apr 2, 2013 at 9:09 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Tue, Apr 2, 2013 at 7:09 PM,  josef.p...@gmail.com wrote:
 On Tue, Apr 2, 2013 at 5:52 PM, Nathaniel Smith n...@pobox.com wrote:
 On Tue, Apr 2, 2013 at 10:21 PM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 This is like observing that if I say go North then it's ambiguous
 about whether I want you to drive or walk, and concluding that we need
 new words for the directions depending on what sort of vehicle you
 use. So go North means drive North, go htuoS means walk North,
 etc. Totally silly. Makes much more sense to have one set of words for
 directions, and then make clear from context what the directions are
 used for -- drive North, walk North. Or iterate C-wards, store
 F-wards.

 C and Z mean exactly the same thing -- they describe a way of
 unraveling a cube into a straight line. The difference is what we do
 with the resulting straight line. That's why I'm suggesting that the
 distinction should be made in the name of the argument.

 Could you unpack that for the 'ravel' docstring?  Because these
 options all refer to the way of unraveling and not the memory layout
 that results.

 Z/C/row-major/whatever-you-want-to-call-it is a general strategy
 for converting between a 1-dim representation and an n-dim
 representation. In the case of memory storage, the 1-dim
 representation is the flat space of pointer arithmetic. In the case of
 ravel, the 1-dim representation is the flat space of a 1-dim indexed
 array. But the 1-dim-to-n-dim part is the same in both cases.

 I think that's why you're seeing people baffled by your proposal -- to
 them the C refers to this general strategy, and what's different is
 the context where it gets applied. So giving the same strategy two
 different names is silly; if anything it's the contexts that should
 have different names.

 And once we get into memory optimization (and avoiding copies and
 preserving contiguity), it is necessary to keep both orders in mind:
 is memory order F, and am I iterating/raveling in F order
 (or slicing columns)?

 I think having two separate keywords gives the impression that we can
 choose two different things at the same time.

 I guess it could not make sense to do this:

 np.ravel(a, index_order='C', memory_order='F')

 It could make sense to do this:

 np.reshape(a, (3,4), index_order='F', memory_order='F')

 but that just points out the inherent confusion between the uses of
 'order', and in this case, the fact that you can only do:

 np.reshape(a, (3, 4), index_order='F')

 correctly distinguishes between the meanings.

So, if index_order and memory_order are never in the same function,
then the context should be enough. It was always enough for me.

np.reshape(a, (3,4), index_order='F', memory_order='F')
really hurts my head because you mix a function that operates on
views, indexing and shapes with memory creation (or I have
no idea what memory_order should do in this case).

np.asarray(a.reshape((3, 4), order='F'), order='F')
or the example here
http://docs.scipy.org/doc/numpy/reference/generated/numpy.asfortranarray.html?highlight=asfortranarray#numpy.asfortranarray
http://docs.scipy.org/doc/numpy/reference/generated/numpy.asarray.html
keeps functions with index_order and functions with memory_order
nicely separated.

(It might be useful but very confusing to add memory_order to every function
that creates a view if possible and a copy if necessary: if you have to make
a copy, then I want F memory order; otherwise give me a view.
But I cannot find a candidate function right now, except for ravel and reshape;
see the first notes in
http://docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html
)
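
A small sketch (not part of the original message) of the two notions josef
keeps separate -- index order on the ravel/reshape side, memory order on the
asarray/asfortranarray side:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # [[0 1 2], [3 4 5]], C-contiguous

# Index order: 'F' ravels by columns regardless of the memory layout.
print(a.ravel(order='C'))        # [0 1 2 3 4 5]
print(a.ravel(order='F'))        # [0 3 1 4 2 5]

# Memory order: asfortranarray changes the layout, not the values.
b = np.asfortranarray(a)
print(np.array_equal(a, b))      # True
print(b.flags['F_CONTIGUOUS'])   # True
```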


a day later (haven't changed my mind):

isn't specifying index order in the Parameters section enough as an
explanation?

something like:

```
def ravel(a, order='C'):
    """
    Parameters
    ----------
    order : {'C', 'F'}
        Index order in which the array is stacked into a 1-d array.
        'F' means we stack by columns (Fortran order, first index
        varies fastest); 'C' means we stack by rows (C order, last
        index varies fastest).
    """
```

most array *creation* functions explicitly mention memory layout in
the docstring

Josef


 Best,

 Matthew
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] timezones and datetime64

2013-04-03 Thread Dave Hirschfeld
Andreas Hilboll lists at hilboll.de writes:

 
  
  I think your point about using current timezone in interpreting user
  input being dangerous is probably correct --- perhaps UTC all the way
  would be a safer (and simpler) choice?
 
 +1
 

+10 from me!

I've recently come across a bug due to the fact that numpy interprets dates as 
being in the local timezone.

The data comes from a database query where there is no timezone information 
supplied (and dates are stored as strings). It is assumed that the user doesn't 
need to know the timezone - i.e. the dates are timezone naive.

Working out the correct timezones would be fairly laborious, but whatever the 
correct timezones are, they're certainly not the timezone the current user 
happens to find themselves in!

e.g.

In [32]: rs = [
...: (u'2000-01-17 00:00:00.00', u'2000-02-01', u'2000-02-29', 0.1203),
...: (u'2000-01-26 00:00:00.00', u'2000-02-01', u'2000-02-29', 0.1369),
...: (u'2000-01-18 00:00:00.00', u'2000-03-01', u'2000-03-31', 0.1122),
...: (u'2000-02-25 00:00:00.00', u'2000-03-01', u'2000-03-31', 0.1425)
...: ]
...: dtype = [('issue_date', 'datetime64[ns]'),
...:  ('start_date', 'datetime64[D]'),
...:  ('end_date', 'datetime64[D]'),
...:  ('value', float)]
...: #

In [33]: # What I see in London, UK
...: recordset = np.array(rs, dtype=dtype)
...: df = pd.DataFrame(recordset)
...: df = df.set_index('issue_date')
...: df
...: 
Out[33]: 
            start_date            end_date   value
issue_date
2000-01-17 2000-02-01 00:00:00 2000-02-29 00:00:00  0.1203
2000-01-26 2000-02-01 00:00:00 2000-02-29 00:00:00  0.1369
2000-01-18 2000-03-01 00:00:00 2000-03-31 00:00:00  0.1122
2000-02-25 2000-03-01 00:00:00 2000-03-31 00:00:00  0.1425

In [34]: # What my colleague sees in Auckland, NZ
...: recordset = np.array(rs, dtype=dtype)
...: df = pd.DataFrame(recordset)
...: df = df.set_index('issue_date')
...: df
...: 
Out[34]: 
                     start_date            end_date   value
issue_date 
2000-01-16 11:00:00 2000-02-01 00:00:00 2000-02-29 00:00:00  0.1203
2000-01-25 11:00:00 2000-02-01 00:00:00 2000-02-29 00:00:00  0.1369
2000-01-17 11:00:00 2000-03-01 00:00:00 2000-03-31 00:00:00  0.1122
2000-02-24 11:00:00 2000-03-01 00:00:00 2000-03-31 00:00:00  0.1425


Oh dear!

This isn't acceptable for my use case (in a multinational company) and I found 
no reasonable way around it other than bypassing the numpy conversion entirely 
by setting the dtype to object, manually parsing the strings and creating an 
array from the list of datetime objects.
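
A minimal sketch of that workaround (the format string and variable names
here are illustrative, not from the original report): parse the strings
yourself and store the naive datetime objects in an object array:

```python
from datetime import datetime
import numpy as np

rows = [u'2000-01-17 00:00:00.00', u'2000-01-26 00:00:00.00']

# Parse the strings as naive datetimes; an object array sidesteps
# datetime64's local-timezone interpretation entirely.
naive = np.array(
    [datetime.strptime(s, '%Y-%m-%d %H:%M:%S.%f') for s in rows],
    dtype=object)
print(naive[0])   # 2000-01-17 00:00:00 -- identical in London and Auckland
```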

Regards,
Dave



Re: [Numpy-discussion] timezones and datetime64

2013-04-03 Thread Nathaniel Smith
On Wed, Apr 3, 2013 at 2:26 PM, Dave Hirschfeld
dave.hirschf...@gmail.com wrote:
 Andreas Hilboll lists at hilboll.de writes:
  I think your point about using current timezone in interpreting user
  input being dangerous is probably correct --- perhaps UTC all the way
  would be a safer (and simpler) choice?

 +1


 +10 from me!

 I've recently come across a bug due to the fact that numpy interprets dates as
 being in the local timezone.

 The data comes from a database query where there is no timezone information
 supplied (and dates are stored as strings). It is assumed that the user 
 doesn't
 need to know the timezone - i.e. the dates are timezone naive.

 Working out the correct timezones would be fairly laborious, but whatever the
 correct timezones are, they're certainly not the timezone the current user
 happens to find themselves in!


 This isn't acceptable for my use case (in a multinational company) and I found
 no reasonable way around it other than bypassing the numpy conversion entirely
 by setting the dtype to object, manually parsing the strings and creating an
 array from the list of datetime objects.

Wow, that's truly broken. I'm sorry.

I'm skeptical that just switching to UTC everywhere is actually the
right solution. It smells like one of those solutions that's simple,
neat, and wrong. (I don't know anything about calendar-time series
handling, so I have no ability to actually judge this stuff, but
wouldn't one problem be if you want to know about business days/hours?
You lose the original day-of-year once you move everything to UTC.)
Maybe datetime dtypes should be parametrized by both granularity and
timezone? Or we could just declare that datetime64 is always
timezone-naive and adjust the code to match?

I'll CC the pandas list in case they have some insight. Unfortunately
AFAIK no-one who's regularly working on numpy at this point works with
datetimes, so we have limited ability to judge solutions... please
help!

-n


Re: [Numpy-discussion] timezones and datetime64

2013-04-03 Thread Dave Hirschfeld
Nathaniel Smith njs at pobox.com writes:

 
 On Wed, Apr 3, 2013 at 2:26 PM, Dave Hirschfeld
 dave.hirschfeld at gmail.com wrote:
 
  This isn't acceptable for my use case (in a multinational company) and I 
found
  no reasonable way around it other than bypassing the numpy conversion 
entirely
  by setting the dtype to object, manually parsing the strings and creating an
  array from the list of datetime objects.
 
 Wow, that's truly broken. I'm sorry.
 
 I'm skeptical that just switching to UTC everywhere is actually the
 right solution. It smells like one of those solutions that's simple,
 neat, and wrong. (I don't know anything about calendar-time series
 handling, so I have no ability to actually judge this stuff, but
 wouldn't one problem be if you want to know about business days/hours?
 You lose the original day-of-year once you move everything to UTC.)
 Maybe datetime dtypes should be parametrized by both granularity and
 timezone? Or we could just declare that datetime64 is always
 timezone-naive and adjust the code to match?
 
 I'll CC the pandas list in case they have some insight. Unfortunately
 AFAIK no-one who's regularly working on numpy at this point works with
 datetimes, so we have limited ability to judge solutions... please
 help!
 
 -n
 

I think simply setting the timezone to UTC if it's not specified would solve 
99% of use cases, because IIUC the internal representation is UTC, so numpy 
would be doing no conversion of the dates that were passed in. It was the 
conversion which was the source of the error in my example.

The only potential issue with this is that the dates might pick up an 
incorrect UTC timezone, making it more difficult to work with naive datetimes.

e.g.

In [42]: d = np.datetime64('2014-01-01 00:00:00', dtype='M8[ns]')

In [43]: d
Out[43]: numpy.datetime64('2014-01-01T00:00:00+')

In [44]: str(d)
Out[44]: '2014-01-01T00:00:00+'

In [45]: pydate(str(d))
Out[45]: datetime.datetime(2014, 1, 1, 0, 0, tzinfo=tzutc())

In [46]: pydate(str(d)) == datetime.datetime(2014, 1, 1)
Traceback (most recent call last):

  File ipython-input-46-abfc0fee9b97, line 1, in module
pydate(str(d)) == datetime.datetime(2014, 1, 1)

TypeError: can't compare offset-naive and offset-aware datetimes


In [47]: pydate(str(d)) == datetime.datetime(2014, 1, 1, tzinfo=tzutc())
Out[47]: True

In [48]: pydate(str(d)).replace(tzinfo=None) == datetime.datetime(2014, 1, 1)
Out[48]: True


In this case it may be best to have numpy not try to set the timezone at all if 
none was specified. Given the internal representation is UTC, I'm not sure this 
is feasible though, so defaulting to UTC may be the best solution.

Regards,
Dave




Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-03 Thread Chris Barker - NOAA Federal
On Wed, Apr 3, 2013 at 6:24 AM, Sebastian Berg
sebast...@sipsolutions.net wrote:
 the context where it gets applied. So giving the same strategy two
 different names is silly; if anything it's the contexts that should
 have different names.


 Yup, that's how I think about it too...

me too...

 But I would really love if someone would try to make the documentation
 simpler!

yes, I think this is where the solution lies.

 There is also never a mention of contiguity, even though when
 we refer to memory order, then having a C/F contiguous array is often
 the reason why

good point -- in fact, I have no idea what would happen in many of
these cases for a discontiguous array (or one with arbitrarily weird
strides...)
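
For what it's worth, a quick check (a sketch, not from the thread) suggests
ravel simply follows index order and makes a copy when no contiguous view
exists:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
col = a[:, ::2]                     # non-contiguous view of a
print(col.flags['C_CONTIGUOUS'])    # False
# ravel follows index order and copies, since no contiguous view exists
print(col.ravel(order='C'))         # [ 0  2  4  6  8 10]
```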

  Also 'A' seems often explained not
 quite correctly (though that does not matter (except for reshape, where
 its explanation is fuzzy), it will matter more in the future -- even if
 I don't expect 'A' to be actually used).

I wonder about having an 'A' option in reshape at all -- what the heck
does it mean? Why do we need it? Again, I come back to the fact that
memory order is kind of orthogonal to index order. So for reshape (or
ravel, which is really just a special case of reshape...) the 'A' and
'K' flags (huh?) are pretty dangerous and prone to error. I think
of it this way:

Much of the beauty of numpy is that it presents a consistent interface
to various forms of strided data -- that way, folks can write code
that works the same way for any ndarray, while still being able to
have internal storage be efficient for the use at hand -- i.e. C order
for the common case, Fortran order for interaction with libraries that
expect that order (or for algorithms that are more efficient in that
order, though that's mostly external libs..), and non-contiguous data
so one can work on sub-parts of arrays without copying data around.

In most places, the numpy API hides the internal memory order -- this
is a good thing, most people have no need to think about it (or most
code, anyway), and you can write code that works (even if not
optimally) for any (strided) memory layout. All is good.

There are times when you really need to understand, or control or
manipulate the memory layout, to make sure your routines are
optimized, or the data is in the right form to pass of to an external
lib, or to make sense of raw data read from a file, or... That's what
we have .view() and friends for.

However, the 'A' and 'K' flags mix and match these concepts -- and I
think that's dangerous. It would be easy for a user to use the 'A'
flag and have everything work fine and dandy with all their test
cases, only to have it blow up when someone passes in a
different-than-expected array. So really, they should only be used in
cases where the code has checked memory order beforehand, or in a
really well-defined interface where you know exactly what you're
getting. In those cases, it makes the code far clearer and less
error prone to do your re-arranging of the memory in a separate step,
rather than built in to a ravel() or reshape() call.

[note] -- I wrote earlier that I wasn't confused by the ravel()
examples -- true for the 'C' and 'F' flags, but I'm still not at all
clear what 'A' and 'K' would give me -- particularly for 'A' and
reshape()

So I think the cause of the confusion here is not that we use 'order'
in two different contexts, nor the fact that 'C' and 'F' may not mean
anything to some people, but that we are conflating two different
processes in one function, and with one flag.

My (maybe) proposal: we deprecate the 'A' and 'K' flags in ravel() and
reshape() (maybe even deprecate ravel() -- does it add anything to
reshape?). If not deprecate, at least encourage people in the docs not
to use them, and rather do their memory-structure manipulations with
.view() or stride manipulation, or...

I'm still trying to figure out when you'd want the 'A' flag -- it
seems at the end of your operation you will want:

The resulting array to be a particular shape, with the elements in a
particular order

and

You _may_ want the in-memory layout a certain way.

but 'A' can't ensure both of those.
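
A short sketch of that ambiguity: two arrays that compare equal can ravel
differently under 'A', because 'A' follows whatever the memory layout happens
to be:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)       # C-contiguous
f = np.asfortranarray(a)             # same values, Fortran memory layout

# 'A' reads elements in whatever order memory happens to be in, so two
# arrays that compare equal can ravel differently.
print(a.ravel(order='A'))            # [0 1 2 3 4 5]
print(f.ravel(order='A'))            # [0 3 1 4 2 5]
```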

-Chris


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Numpy-discussion] timezones and datetime64

2013-04-03 Thread Chris Barker - NOAA Federal
dave.hirschf...@gmail.com wrote:
  I found no reasonable way around it other than bypassing the numpy 
 conversion entirely

Exactly -- we have come to the same conclusion. By the way, it's also
inconsistent -- an ISO string without a TZ is interpreted to mean
use the locale, but a datetime object without a TZ is interpreted as
UTC, so you get this:

In [68]: dt
Out[68]: datetime.datetime(2013, 4, 3, 12, 0)


In [69]: np.datetime64(dt)
Out[69]: numpy.datetime64('2013-04-03T05:00:00.00-0700')


In [70]: np.datetime64(dt.isoformat())
Out[70]: numpy.datetime64('2013-04-03T12:00:00-0700')

two different results!

(and as it happens, datetime.datetime does not have an ISO string
parser, so it's not completely trivial to round-trip through that...)
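
A sketch of the manual round-trip (assuming a naive timestamp and an explicit
format string, since the datetime module of that era has no ISO parser):

```python
from datetime import datetime

# strptime with an explicit format fills the gap for naive timestamps.
s = '2013-04-03T12:00:00'
dt = datetime.strptime(s, '%Y-%m-%dT%H:%M:%S')
print(dt.isoformat())   # 2013-04-03T12:00:00
```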

On Wed, Apr 3, 2013 at 6:49 AM, Nathaniel Smith n...@pobox.com wrote:

 Wow, that's truly broken. I'm sorry.

Did you put this in? break out the pitchforks! (  ;-) )


 I'm skeptical that just switching to UTC everywhere is actually the
 right solution. It smells like one of those solutions that's simple,
 neat, and wrong.

well, actually, I don't think UTC everywhere is quite what's proposed
-- really it's naive datetimes -- it would be up to the
user/application to make sure the time zones are consistent.

Which does mean that parsing an ISO string with a timezone becomes problematic...

 (I don't know anything about calendar-time series
 handling, so I have no ability to actually judge this stuff, but
 wouldn't one problem be if you want to know about business days/hours?

right -- then you'd want to use local time, so numpy might think it's
UTC, but it'd actually be local time. Anyway, at the moment, I don't
think datetime64 does this right anyway. I don't see mention of the
timezone in the busday functions. I haven't checked to see if they use
the locale TZ or ignore it, but either way is wrong (actually, using
the locale setting is worse...)

 Maybe datetime dtypes should be parametrized by both granularity and
 timezone?

That may be a good option. However, I suspect it's pretty hard to
actually use the timezone correctly and consistently, so I'm nervous
about that. In any case, we'd need to make sure that the user could
specify timezone on I/O and busday calculations, etc., and *never*
assume the locale TZ (or anything else about locale) unless asked for.
Using the locale TZ is almost never the right thing to do for the kind
of applications numpy is used for.

 Or we could just declare that datetime64 is always
 timezone-naive and adjust the code to match?

That would be the easy way to handle it -- from the numpy side, anyway.

 I'll CC the pandas list in case they have some insight.

I suspect pandas has their own way of dealing with all these issues
already. Which makes me think that numpy should take the same approach
as the python stdlib: provide a core datatype, but leave the use-case
specific stuff for others to build on. For instance, it seems really
odd to have the busday* functions in core numpy...

 Unfortunately
 AFAIK no-one who's regularly working on numpy this point works with
 datetimes, so we have limited ability to judge solutions...

well, that explains how this happened!

 please help!

in 1.7, it is still listed as experimental, so you could say this is
all going as planned: release something we can try to use, and see
what we find out when using it!

I _think_ one reasonable option may be:

1) Internal is UTC
2) On input:
   a) Default for no-time-zone-specified is UTC (both from datetime
objects and ISO strings)
   b) respect TZ if given, converting to UTC
3) On output:
   a) default to UTC
   b) provide a way for the user to specify the timezone desired
      (perhaps a TZ attribute somewhere, or functions to specifically
convert to ISO strings and datetime objects that take an optional TZ
parameter.)
4) busday* and the like allow a way to specify TZ

Issues I immediate see with this:
   Respecting the TZ on output is a problem because:
 1)  if people want naive datetimes, they will get UTC ISO strings, i.e.:
  '2013-04-03T05:00:00Z' rather than '2013-04-03T05:00:00'
 - so there should be a way to specify naive or None as a timezone.

 2)  the python datetime module doesn't have any tzinfo objects
built in -- so to respect timezones, numpy would need to maintain its
own, or depend on pytz

Given all this, maybe naive is the way to go, perhaps mirroring
datetime.datetime, and having an optional tzinfo object attribute. (By
the way, I'm confused where that would live -- in the dtype instance?
In the array?)

Issue with Naive: what do you do with an ISO string that specifies a TZ offset?

I'm beginning to see why the datetime module doesn't support parsing ISO
strings -- it would need to deal with timezones in that case!
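
One possible answer, sketched here as an illustration (the helper name and
the limited '+HHMM'/'-HHMM' handling are assumptions, not anything proposed
in the thread): apply the offset yourself and keep the result naive, in UTC:

```python
from datetime import datetime, timedelta

def to_naive_utc(iso):
    # Hypothetical helper: accept 'YYYY-MM-DDTHH:MM:SS' with an optional
    # '+HHMM'/'-HHMM' suffix and return a naive datetime normalized to UTC.
    base, sign, off = iso[:19], iso[19:20], iso[20:]
    dt = datetime.strptime(base, '%Y-%m-%dT%H:%M:%S')
    if sign in ('+', '-') and off:
        delta = timedelta(hours=int(off[:2]), minutes=int(off[2:] or 0))
        dt = dt - delta if sign == '+' else dt + delta
    return dt

print(to_naive_utc('2013-04-03T05:00:00-0700'))  # 2013-04-03 12:00:00
print(to_naive_utc('2013-04-03T12:00:00'))       # 2013-04-03 12:00:00
```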

Another note about Timezones and ISO 

Re: [Numpy-discussion] timezones and datetime64

2013-04-03 Thread Travis Oliphant
Mark Wiebe and I are both still tracking NumPy development and can provide
context and even help when needed. Apologies if we've left a different
impression. We have to be prudent about the time we spend as we have
other projects we are pursuing as well, but we help clients with NumPy
issues all the time and are eager to continue to improve the code base.

It seems to me that the biggest issue is just the automatic conversion that
is occurring on string or date-time input.   We should stop using the local
time-zone (explicit is better than implicit strikes again) and not use any
time-zone unless time-zone information is provided in the string.  I am
definitely +1 on that.

It may be necessary to carry around another flag in the data-type to
indicate whether or not the date-time is naive (not time-zone aware) or
time-zone aware so that string printing does not print a time-zone if it
didn't have one to begin with as well.

If others agree that this is the best way forward, then Mark or I can
definitely help contribute a patch.

Best,

-Travis







-- 
---
Travis Oliphant
Continuum Analytics, Inc.
http://www.continuum.io


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-03 Thread Matthew Brett
Hi,

On Wed, Apr 3, 2013 at 5:19 AM,  josef.p...@gmail.com wrote:

 So, if index_order and memory_order are never in the same function,
 then the context should be enough. It was always enough for me.

It was not enough for me or the three others who will publicly admit
to the shame of finding it confusing without further thought.

Again, I just can't see a reason not to separate these ideas.  We are
not arguing about backwards compatibility here, only about clarity.  I
guess you do accept that some people, other than yourself, might be
less likely to get tripped up by:

np.reshape(a, (3, 4), index_order='F')

than

np.reshape(a, (3, 4), order='F')

?

 np.reshape(a, (3,4), index_order='F', memory_order='F')
 really hurts my head because you mix a function that operates on
 views, indexing and shapes with memory creation, (or I have
 no idea what memory_order should do in this case).

Right.   I think you may now be close to my own discomfort when faced
with working out (fast) what:

np.reshape(a, (3,4), order='F')

means, given 'order' means two different things, and both might be
relevant here.

Or are you saying that my brain should have quickly calculated that
'order' would be difficult to understand as memory layout, and
therefore rejected that and seen immediately that index order was the
meaning?   Speaking as a psychologist, I don't think that's the way
it works.

Cheers,

Matthew


[Numpy-discussion] try to solve issue #2649 and revisit #473

2013-04-03 Thread huangkan...@gmail.com
Hello, all

I am trying to solve issue 2649, which is related to 473, concerning
multiplication of a matrix and an array. As 2649 shows:

   import numpy as np
   x = np.arange(5)
   I = np.asmatrix(np.identity(5))
   print np.dot(I, x).shape
   # - (1, 5)

First of all, I assume we expect that I.dot(x) and I * x behave the same, so
I suggest adding a dot method to matrix, like

def dot(self, other):
  return self * other

Then the major issue is that the constructors of array and matrix interpret a
list differently: array([0,1]).shape is (2,) while matrix([0,1]).shape is (1,
2). An error will be thrown when running np.dot(I, x), because in __mul__, x
will be converted to a 1*5 matrix first. That's not consistent with
np.dot(np.identity(5),
x), which returns x. To fix that, I suggest checking the dimension of the
array when converting it to a matrix. If it's a 1-d array, convert it to a
column vector explicitly, like this

 if isinstance(data, N.ndarray):
+   if len(data.shape) == 1:
+   data = data.reshape(data.shape[0], 1)
 if dtype is None:
 intype = data.dtype
 else:

Any comments?
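To make the report concrete, here is a hedged sketch of the inconsistency and of the proposed 1-d-to-column conversion as a standalone helper (the helper name is mine, not the patch's):

```python
import numpy as np

x = np.arange(5)
I = np.asmatrix(np.identity(5))

# Plain ndarrays keep x one-dimensional...
assert np.dot(np.identity(5), x).shape == (5,)
# ...but the matrix product promotes x to a 1x5 row matrix first.
assert np.dot(I, x).shape == (1, 5)

def as_column(data):
    # Sketch of the proposed fix: treat a 1-d operand as a column vector.
    data = np.asarray(data)
    if data.ndim == 1:
        data = data.reshape(data.shape[0], 1)
    return np.asmatrix(data)

assert (I * as_column(x)).shape == (5, 1)
```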

-- 
Kan Huang
Department of Applied Math & Statistics
Stony Brook University
917-767-8018


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-03 Thread Matthew Brett
Hi,

On Wed, Apr 3, 2013 at 8:52 AM, Chris Barker - NOAA Federal
chris.bar...@noaa.gov wrote:
 On Wed, Apr 3, 2013 at 6:24 AM, Sebastian Berg
 sebast...@sipsolutions.net wrote:
 the context where it gets applied. So giving the same strategy two
 different names is silly; if anything it's the contexts that should
 have different names.


 Yup, that's how I think about it too...

 me too...

 But I would really love if someone would try to make the documentation
 simpler!

 yes, I think this is where the solution lies.

No question that better docs would be an improvement, let's all agree on that.

We all agree that 'order' is used with two different and orthogonal
meanings in numpy.

I think we are now more or less agreeing that:

np.reshape(a, (3, 4), index_order='F')

is at least as clear as:

np.reshape(a, (3, 4), order='F')

Do I have that right so far?

Cheers,

Matthew


Re: [Numpy-discussion] try to solve issue #2649 and revisit #473

2013-04-03 Thread Alan G Isaac
On 4/3/2013 2:44 PM, huangkan...@gmail.com wrote:
 I suggest add function dot to matrix

  import numpy as np; x = np.arange(5); I = np.asmatrix(np.identity(5));
  I.dot(x)
matrix([[ 0.,  1.,  2.,  3.,  4.]])


Alan Isaac



Re: [Numpy-discussion] timezones and datetime64

2013-04-03 Thread Mark Wiebe
On Wed, Apr 3, 2013 at 9:33 AM, Chris Barker - NOAA Federal 
chris.bar...@noaa.gov wrote:

 dave.hirschf...@gmail.com wrote:
   I found no reasonable way around it other than bypassing the numpy
 conversion entirely

 Exactly - we have come to the same conclusion. By the way, it's also
 inconsistent -- an ISO string without a TZ is interpreted to mean "use
 the locale", but a datetime object without a TZ is interpreted as
 UTC, so you get this:

 In [68]: dt
 Out[68]: datetime.datetime(2013, 4, 3, 12, 0)


 In [69]: np.datetime64(dt)
 Out[69]: numpy.datetime64('2013-04-03T05:00:00.00-0700')


 In [70]: np.datetime64(dt.isoformat())
 Out[70]: numpy.datetime64('2013-04-03T12:00:00-0700')

 two different results!

 (and as it happens, datetime.datetime does not have an ISO string
 parser, so it's not completely trivial to round-trip though that...)
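(For what it's worth, a round trip is doable with strptime and an explicit format string; a minimal sketch with no timezone handling:)

```python
from datetime import datetime

dt = datetime(2013, 4, 3, 12, 0)
s = dt.isoformat()                 # '2013-04-03T12:00:00'

# strptime needs the format spelled out; add %f when microseconds
# are present, since isoformat() only emits them if nonzero.
parsed = datetime.strptime(s, '%Y-%m-%dT%H:%M:%S')
assert parsed == dt
```

(Much later, Python 3.7 added datetime.fromisoformat, which handles this directly.)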



 On Wed, Apr 3, 2013 at 6:49 AM, Nathaniel Smith n...@pobox.com wrote:

  Wow, that's truly broken. I'm sorry.

 Did you put this in? Break out the pitchforks! ( ;-) )


Many of the aspects of how datetime64 works are from me. I started out
from the datetime64 NEP, but it wasn't fleshed out enough, so I had to fill
in lots of details. I guess your pitchforks are pointing at me. ;)

For this specific part of the code, I think it's hard not to have it
broken one way or another, no matter how we do it. One thing I
observed is that printing the current time looks weird if you're
looking at it interactively. In general, if you get the current time and
print it in UTC, it reads as the wrong time unless you're in UTC. Python's
datetime doesn't help the situation by having datetime.now() return a
'local' time.

In [1]: import numpy as np

In [2]: from datetime import datetime

In [3]: np.datetime64('now')
Out[3]: numpy.datetime64('2013-04-03T12:17:58-0700')

In [4]: np.datetime_as_string(np.datetime64('now'), timezone='UTC')
Out[4]: '2013-04-03T19:17:59Z'

In [5]: datetime.now()
Out[5]: datetime.datetime(2013, 4, 3, 12, 18, 2, 582000)

In [6]: datetime.now().isoformat()
Out[6]: '2013-04-03T12:18:06.796000'

In [7]: np.datetime64(datetime.now())
Out[7]: numpy.datetime64('2013-04-03T05:18:15.525000-0700')

In [8]: np.datetime64(datetime.now().isoformat())
Out[8]: numpy.datetime64('2013-04-03T12:18:25.291000-0700')


 I'm skeptical that just switching to UTC everywhere is actually the
  right solution. It smells like one of those solutions that's simple,
  neat, and wrong.

 well, actually, I don't think UTC everywhere is quite what's proposed
 -- really it's naive datetimes -- it would be up to the
 user/application to make sure the time zones are consistent.


It seems to me that adding a time zone to the datetime64 metadata might be
a good idea, and then allowing it to be None to behave like the Python
naive datetimes. This wouldn't be a trivial addition, though. Using
Python's timezone object doesn't seem like a good idea, because it would
require things to be converted to/from Python's datetime every time they
are processed, which would remove the performance benefits of NumPy. The boost
datetime library has a nice timezone object which could be used as
inspiration for an equivalent in NumPy, but I think any way we cut it would
be a lot of work.
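For illustration only (Python's stdlib, not a NumPy API): a fixed-offset zone next to a naive value shows the two behaviors such metadata would need to cover:

```python
from datetime import datetime, timedelta, timezone

pacific = timezone(timedelta(hours=-7), 'PDT')   # fixed offset, for illustration

aware = datetime(2013, 4, 3, 12, 18, tzinfo=pacific)
naive = datetime(2013, 4, 3, 12, 18)             # the timezone=None case

# The aware value converts meaningfully to UTC...
assert aware.astimezone(timezone.utc).hour == 19
# ...while the naive value carries no offset information at all.
assert naive.tzinfo is None
```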


 Which does mean that parsing a ISO string with a timezone becomes
 problematic...


Yeah, there are a number of cases.

How would it transform '2013-04-03T12:18' to a datetime64 with a timezone
by default? I guess it would probably use the datetime64's metadata.
How would it transform '2013-04-03T12:18Z' or '2013-04-03T12:18-0700' to a
datetime64 with no timezone? Do we throw an error in the default
conversion, and have a separate parsing function that allows more control?
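The case analysis might look like this as a tiny dispatcher (a sketch only; the category names and strictness policy are placeholders, not a proposed API):

```python
def classify_iso(s):
    """Classify an ISO 8601 datetime string by its timezone marker."""
    if s.endswith('Z'):
        return 'utc'
    tail = s[10:]          # skip the YYYY-MM-DD date part
    if '+' in tail or '-' in tail:
        return 'offset'    # explicit numeric offset like -0700
    return 'naive'

# A strict converter could raise for 'utc'/'offset' input when the
# target dtype carries no timezone, per the question above.
assert classify_iso('2013-04-03T12:18') == 'naive'
assert classify_iso('2013-04-03T12:18Z') == 'utc'
assert classify_iso('2013-04-03T12:18-0700') == 'offset'
```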


  (I don't know anything about calendar-time series
  handling, so I have no ability to actually judge this stuff, but
  wouldn't one problem be if you want to know about business days/hours?

 right -- then you'd want to use local time, so numpy might think it's
 ISO, but it'd actually be local time. Anyway, at the moment, I don't
 think datetime64 does this right anyway. I don't see mention of the
 timezone in the busday functions. I haven't checked to see if they use
 the locale TZ or ignore it, but either way is wrong (actually, using
 the locale setting is worse...)


The busday functions just operate on datetime64[D]. There is no timezone
interaction there, except for how a datetime with a date unit converts
to/from a datetime which includes time.
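That date-only behavior is easy to verify with a quick sketch:

```python
import numpy as np

# busday functions consume datetime64[D]; no timezone is involved.
# 2013-04-01 was a Monday, so the half-open range [Apr 1, Apr 8)
# contains exactly the five weekdays Mon-Fri.
assert np.busday_count('2013-04-01', '2013-04-08') == 5
assert not np.is_busday(np.datetime64('2013-04-06'))   # a Saturday
```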


  Maybe datetime dtypes should be parametrized by both granularity and
  timezone?

 That may be a good option. However, I suspect it's pretty hard to
 actually use the timezone correctly and consistently, so Im nervous
 about that. In any case, we'd need to make sure that the user could
 specify timezone on I/O and 

Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-03 Thread Chris Barker - NOAA Federal
On Wed, Apr 3, 2013 at 11:39 AM, Matthew Brett matthew.br...@gmail.com wrote:
 It was not enough for me or the three others who will publicly admit
 to the shame of finding it confusing without further thought.

I would submit that some of the confusion came from the fact that with
ravel(), and the 'A' and 'K' flags, you are forced to figure out BOTH
index_order and memory_order -- with one flag -- I know I'm still not
clear what I'd get in complex situations.

 Again, I just can't see a reason not to separate these ideas.

I agree -- but only by really separating them: ideally having a given
function deal with only one or the other, not both at once.

  We are
 not arguing about backwards compatibility here, only about clarity.

While it could be changed while strictly maintaining backward
compatibility, it is a change that would need to filter through the
docs, examples, random blog posts, Stack Overflow questions, etc.

Is that worth it? I'm not convinced.

 Right.   I think you may now be close to my own discomfort when faced
 with working out (fast) what:

 np.reshape(a, (3,4), order='F')

I still think it's cause you know too much ;-)

-Chris


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-03 Thread Ralf Gommers
On Wed, Apr 3, 2013 at 11:52 PM, Chris Barker - NOAA Federal 
chris.bar...@noaa.gov wrote:

 On Wed, Apr 3, 2013 at 11:39 AM, Matthew Brett matthew.br...@gmail.com
 wrote:
  It was not enough for me or the three others who will publicly admit
  to the shame of finding it confusing without further thought.

 I would submit that some of the confusion came from the fact that with
 ravel(), and the 'A' and 'K' flags, you are forced to figure out BOTH
 index_order and memory_order -- with one flag -- I know I'm still not
 clear what I'd get in complex situations.

  Again, I just can't see a reason not to separate these ideas.

 I agree, but really separating them -- but ideally having a given
 function only deal with one or the other, not both at once.

   We are
  not arguing about backwards compatibility here, only about clarity.

 while it could be changed while strictly maintaining backward
 compatibility -- it is a change that would need to filter through the
 docs, example, random blog posts, stack=overflow questions, etc..


Not only that, we would then also be in the situation of having `order`
*and* `xxx_order` keywords. This is also confusing, at least as much as the
current situation imho.

Ralf


 Is that worth it? I'm not convinced

  Right.   I think you may now be close to my own discomfort when faced
  with working out (fast) what:
 
  np.reshape(a, (3,4), order='F')

 I still think it's cause you know too much ;-)

 -Chris





Re: [Numpy-discussion] Please stop bottom posting!!

2013-04-03 Thread Nathaniel Smith
On Wed, Apr 3, 2013 at 11:00 PM, Chris Barker - NOAA Federal
chris.bar...@noaa.gov wrote:
 Best of all is intelligent editing of the thread so far -- edit it
 down to the key points you are commenting on, and intersperse your
 comments. That way your email stands on its own as meaningful, but
 there is not a big pile of left over crap to wade through to read your
 fabulous pithy opinions

Traditionally this is what the phrase "bottom posting" meant, as a
term of art, and is the key reason why those old netiquette guides
recommend it. I guess the unexpressed nuances of such definitions get
lost over time as people encounter them without the relevant context,
though -- sort of like how the full in-context meaning of order= gets
lost ;-).

-n


Re: [Numpy-discussion] try to solve issue #2649 and revisit #473

2013-04-03 Thread Chris Barker - NOAA Federal
On Wed, Apr 3, 2013 at 1:03 PM, Alan G Isaac alan.is...@gmail.com wrote:
 On 4/3/2013 3:18 PM, huangkan...@gmail.com wrote:

 In my view, the result should be a 1d array,
 the same as I.A.dot(x).

 But the maintainers wanted operations with matrices to
 return matrices whenever possible.  So instead of
 returning x it returns np.matrix(x).

the matrix object is a fine idea, but the key problem is that it
provides a 2-d matrix, but no concept of a 1-d vector. I think it
would all be cleaner if there were row-vector and column-vector
objects to accompany matrix -- then things that naturally return a
vector could do so. You can't use a regular 1-d array because there is
no way to distinguish between a row and a column version.
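A short sketch of that ambiguity with plain ndarrays:

```python
import numpy as np

M = np.arange(4).reshape(2, 2)
v = np.array([1, 2])

# The same 1-d array acts as a column on the right and a row on the
# left, and neither result remembers its orientation.
assert np.dot(M, v).shape == (2,)
assert np.dot(v, M).shape == (2,)

# Explicit 2-d row/column vectors keep the orientation.
col = v.reshape(2, 1)
row = v.reshape(1, 2)
assert np.dot(M, col).shape == (2, 1)
assert np.dot(row, M).shape == (1, 2)
```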

But as Alan said, this was all hashed out a few years back -- a bunch
of great ideas, but no one to implement them.

The truth is that matrix has little value outside of teaching, so no
one with the skills to push it forward uses it themselves.

-Chris

-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Numpy-discussion] Please stop bottom posting!!

2013-04-03 Thread Steve Waterbury
On 04/03/2013 08:06 PM, Charles R Harris wrote:
snip

Nice editing!  ;)

Steve


Re: [Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

2013-04-03 Thread josef . pktd
On Wed, Apr 3, 2013 at 9:13 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Wed, Apr 3, 2013 at 11:44 AM, Matthew Brett matthew.br...@gmail.com 
 wrote:
 Hi,

 On Wed, Apr 3, 2013 at 8:52 AM, Chris Barker - NOAA Federal
 chris.bar...@noaa.gov wrote:
 On Wed, Apr 3, 2013 at 6:24 AM, Sebastian Berg
 sebast...@sipsolutions.net wrote:
 the context where it gets applied. So giving the same strategy two
 different names is silly; if anything it's the contexts that should
 have different names.


 Yup, that's how I think about it too...

 me too...

 But I would really love if someone would try to make the documentation
 simpler!

 yes, I think this is where the solution lies.

 No question that better docs would be an improvement, let's all agree on 
 that.

 We all agree that 'order' is used with two different and orthogonal
 meanings in numpy.

 I think we are now more or less agreeing that:

 np.reshape(a, (3, 4), index_order='F')

 is at least as clear as:

 np.reshape(a, (3, 4), order='F')

 I believe our job here is to come to some consensus.

 In that spirit, I think we do agree on these statements above.

 Now we have the cost / benefit.

 Benefit : Some people may find it easier to understand numpy when
 these constructs are separated.

 Cost : There might be some confusion because we have changed the
 default keywords.

 Benefit
 ---

 What proportion of people would find it easier to understand with the
 order constructs separated?   Clearly Chris and Josef and Sebastian --
 for you I estimate no change in understanding, because your
 understanding was near complete already.

 At least I, Paul Ivanov, and JB Poline found the current state strikingly
 confusing.   I think we have other votes for that position here.  It's
 difficult to estimate the proportions now because my original email
 and the subsequent discussion are based on the distinction already
 being made.  So, it is hard for us to be objective about whether a new
 user is likely to get confused.  At least it seems reasonable to say
 that some moderate proportion of users will get confused.

 In that situation, it seems to me the long-term benefit for separating
 these ideas is relatively high.   The benefit will continue over the
 long term.

 Cost
 ---

 The ravel docstring would look something like this:

 index_order : {'C','F', 'A', 'K'}, optional
 ...   This keyword used to be called simply 'order', and you can
 also use the keyword 'order' to specify index_order (this parameter).

 The problem would then be that, for a while, there will be older code
 and docs using 'order' instead of 'index_order'.  I think this would
 not cause much trouble.  Reading the docstring will explain the
 change.  The old code will continue to work.
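The transition could be a thin shim along these lines (an illustration only, not the actual numpy implementation):

```python
import numpy as np

def ravel(a, index_order=None, order=None):
    # Transitional shim: 'order' stays as an alias for 'index_order'.
    if index_order is not None and order is not None:
        raise TypeError("pass either 'order' or 'index_order', not both")
    resolved = index_order if index_order is not None else (order or 'C')
    return np.ravel(a, order=resolved)

a = np.arange(4).reshape(2, 2)
assert ravel(a, index_order='F').tolist() == ravel(a, order='F').tolist()
assert ravel(a).tolist() == [0, 1, 2, 3]
```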

 This cost will decrease to zero over time.

 So, if we are planning for the long-term for numpy, I believe the
 benefit to the change considerably outweighs the cost.

 I'm happy to do the code changes, so that's not an issue.

 Cheers,

 Matthew


[Numpy-discussion] Moving linalg c code

2013-04-03 Thread Charles R Harris
Hi All,

There is a PR https://github.com/numpy/numpy/pull/2954 that adds some
blas and lapack functions to numpy. I'm thinking that if that PR is merged
it would be good to move all of the blas and lapack functions, including
the current ones in numpy/linalg into a single directory somewhere in
numpy/core/src. So there are two questions here: should we be adding the
new functions, and if so, should we consolidate all the blas and lapack C
code into its own directory somewhere in numpy/core/src?

Thoughts?

Chuck


Re: [Numpy-discussion] timezones and datetime64

2013-04-03 Thread Benjamin Root
On Wed, Apr 3, 2013 at 7:52 PM, Chris Barker - NOAA Federal 
chris.bar...@noaa.gov wrote:


 Personally, I never need finer resolution than seconds, nor more than
 a century, so it's no big deal to me, but just wondering


A use case for finer resolution than seconds (in our field, no less!) is
lightning data.  At the last SciPy conference,  a fellow meteorologist
mentioned how difficult it was to plot out lightning data at resolutions
finer than microseconds (which is the resolution of Python's datetime
objects).  Matplotlib does not yet support the datetime64 object (John
passed away before he could write up that patch).
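For what it's worth, datetime64 itself already goes far below microseconds; a quick sketch:

```python
import numpy as np

# Nine fractional digits make numpy pick nanosecond resolution,
# finer than datetime.datetime's microseconds.
t = np.datetime64('2013-04-03T12:00:00.123456789')
assert str(t.dtype) == 'datetime64[ns]'

# Sub-microsecond differences survive arithmetic.
delta = np.datetime64('2013-04-03T12:00:00.123456790') - t
assert delta == np.timedelta64(1, 'ns')
```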

Cheers!
Ben

By the way, my 12th Rule of Programming is "Never roll your own datetime".


Re: [Numpy-discussion] timezones and datetime64

2013-04-03 Thread Warren Weckesser
On 4/3/13, Benjamin Root ben.r...@ou.edu wrote:
 On Wed, Apr 3, 2013 at 7:52 PM, Chris Barker - NOAA Federal 
 chris.bar...@noaa.gov wrote:


 Personally, I never need finer resolution than seconds, nor more than
 a century, so it's no big deal to me, but just wondering


 A use case for finer resolution than seconds (in our field, no less!) is
 lightning data.  At the last SciPy conference,  a fellow meteorologist
 mentioned how difficult it was to plot out lightning data at resolutions
 finer than microseconds (which is the resolution of the python datetime
 objects).  Matplotlib has not supported the datetime64 object yet (John
 passed before he could write up that patch).

 Cheers!
 Ben

 By the way, my 12th Rule of Programming is Never roll your own datetime


A rule on par with "never get involved in a land war in Asia": both
equally Fraught With Peril. :)


Warren

