On Thu, Jun 2, 2011 at 10:57 AM, Christopher Barker <chris.bar...@noaa.gov> wrote:
> Mark Wiebe wrote:
> > I'm following what I understand the NEP to mean for combining dates and
> > deltas of different units. This means for timedeltas, the metadata
> > becomes more precise, in particular it becomes the GCD of the input
> > metadata, and between timedelta and datetime the datetime always
> > dominates.
> >
> > https://github.com/numpy/numpy/blob/master/doc/neps/datetime-proposal.rst
>
> Thanks for posting this link -- a few comments on that doc follow.
>
> > Only Years, Months, and Business Days have a nonlinear relationship with
> > the other units, so they're the only problem case for this. They can be
> > arbitrarily special-cased based on what is decided to make the most
> > sense.
>
> As mentioned in my recent post -- this stuff should be handled by some
> sort of "calendar" classes -- there is no one way to do that! So numpy
> should provide datetime and timedelta data types that can be used, but a
> timedelta should _not_ ever be defined by these weird variable units.
>
> I guess what I'm getting at is that:
>
> a_date_time + a_timedelta
>
> is a fundamentally different operation than:
>
> a_date_time + a_calendar_defined_timespan
>
> The former can follow all the usual math properties for addition, but
> the latter doesn't.
>
> About the NEP:
>
> """
> A representation is also supported such that the stored date-time
> integer can encode both the number of a particular unit as well as a
> number of sequential events tracked for each unit.
> """
>
> I'm not sure I understand what this really means, but I _think_ I agree
> with Pierre that this is unnecessary complication - couldn't it be
> handled by multiple arrays, or maybe a structured dtype?
>
> """
> The datetime64 represents an absolute time. Internally it is represented
> as the number of time units between the intended time and the epoch
> (12:00am on January 1, 1970 --- POSIX time including its lack of leap
> seconds).
> """
>
> The CF netcdf metadata standard provides for times to be specified as
> "units since a_date_time". units can be seconds, hours, days, etc (it
> does allow months and years, but it shouldn't!). This is a nice, flexible
> system that makes it easy to capture the wildly different scales needed:
> from nanoseconds to millennia. Similarly, we might want to consider a
> datetime dtype as containing a reference datetime, and a tick unit.
>
> I think the "Time units" section does specify that you can use various
> units, but it looks like the NEP sticks with the single POSIX epoch.
>
> I see later in the NEP:
> """
> However, after thinking more about this, we found that the combination
> of an absolute datetime64 with a relative timedelta64 does offer the
> same functionality while removing the need for the additional origin
> metadata. This is why we have removed it from this proposal.
> """
> hmmm -- I don't think that's the case -- you need the "origin" if you
> want to represent something like nanoseconds as a datetime, far away
> from the epoch. Sure, you can supply your own by keeping the origin and
> a timedelta array separately, but you could do that for all uses, also,
> and the point of this is to make working with datetimes easy. If we're
> going to allow different units, we might as well have different "origins".

+1

> I also don't think that units like "month", "year", "business day"
> should be allowed -- it just adds confusion. It's not a killer if they
> are defined in the spec:
>
> 1 year = 365.25 days (for instance)
> 1 month = 1 year/12
>
> But I think it's better to simply disallow them, and keep that use for
> what I'm calling the "Calendar" functions. And "business day" is
> particularly ugly and, I'm sure, defined differently in different places.

Chuck
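
For anyone following the thread, here is a minimal sketch of the two ideas in the quoted message, using only the Python standard library rather than the proposed numpy dtypes. The names decode_since, add_months and TICK are hypothetical, made up for illustration, and the clamp-to-end-of-month rule is just one assumed calendar convention among several possible ones. It shows integer ticks stored relative to a chosen origin ("units since a_date_time"), and why a calendar-defined timespan does not behave like a fixed-length timedelta.

```python
from datetime import datetime, timedelta
import calendar

# Hypothetical table of fixed-length tick units (no months or years).
TICK = {
    "seconds": timedelta(seconds=1),
    "hours": timedelta(hours=1),
    "days": timedelta(days=1),
}

def decode_since(ticks, unit, origin):
    """Decode integer tick counts stored as '<unit> since <origin>'."""
    step = TICK[unit]
    return [origin + n * step for n in ticks]

def add_months(d, months):
    """Assumed calendar rule: shift by whole months, clamping the day
    to the last day of the target month."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return d.replace(year=year, month=month, day=min(d.day, last_day))

# Integer ticks relative to a user-chosen origin instead of the POSIX epoch.
origin = datetime(2011, 6, 2)
times = decode_since([0, 6, 30], "hours", origin)

# Fixed-length timedelta: adding then subtracting returns the start.
jan31 = datetime(2011, 1, 31)
fixed_roundtrip = (jan31 + timedelta(days=30)) - timedelta(days=30)

# Calendar month under the clamping rule: the round trip loses days.
calendar_roundtrip = add_months(add_months(jan31, 1), -1)

print(times[-1])            # last decoded time
print(fixed_roundtrip)      # same as jan31
print(calendar_roundtrip)   # differs from jan31 under this rule
```

Under the clamping rule, Jan 31 plus one month lands on Feb 28, and going back one month gives Jan 28, so the "usual math properties for addition" do not hold. A different calendar rule would give a different answer, which is the argument for keeping such rules out of the timedelta type.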
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion