Pierre GM wrote:
> Using the ISO as reference, you have a good definition of months.
Yes, but only one. There are others. For instance, climate modelers like to
use a calendar with 360 days a year: twelve 30-day months. That way they get
something on the same timescale as months and years, but with nice, linear,
easy-to-use units (differentiable, and all that).

Mark Wiebe wrote:
>     Code    Interpreted as
>     Y       12M, 52W, 365D
>     M       4W, 30D, 720h
>
> This is even self-inconsistent:
>
> 1Y == 365D
> 1Y == 12M == 12 * 30D == 360D
> 1Y == 12M == 12 * 4W == 12 * 4 * 7D == 336D
> 1Y == 52W == 52 * 7D == 364D
>
> Is it not clear from this what a mess of misinterpretation might result
> from all that?
>
> This part of the code is used for mapping metadata like [Y/4] -> [3M],
> or [Y/26] -> [2W]. I agree that this '/' operator in the unit metadata
> is weird, and wouldn't object to removing it.

Weird, dangerous, and unnecessary. I can see how some data may be on, for
example, quarters, but that should require a more rigorous definition of what
a quarter is.

> This goes to heck if the data is expressed in something like "months
> since 1995-01-01",
>
> because months are only defined on a calendar.
>
> Here's what the current implementation can do with that one:
>
> >>> np.datetime64('1995-01-01', 'M') + 13
> numpy.datetime64('1996-02','M')

I see -- I have a better idea of the intent here, and I can see that as long
as you keep everything in the same unit (say, months, in this case), this can
be a clean and effective way to deal with this sort of data.

As I said, the netCDF case is a different use case, but I think the issue
there was that the creator of the data was thinking of it as being used like
the above: "months since January, 1995". Since the data was all integer
values of months, it made perfect sense, and was well defined.
The problem in that case is that the standard has no specification enforcing
that the units stay months and that the intervals are integers -- so software
looked at that, converted it to, for example, Python datetime instances using
some pre-defined length of a month, and got something that misrepresented the
data.

The numpy use case is different, but my concern is that the same kind of
thing could easily happen, because people want to write generic code that
deals with arbitrary np.datetime64 instances.

I suppose we could consider this analogous to issues with integer and
floating point dtypes -- when you convert between those, it's user-beware.
But I think that would be clearer if we had a set of dtypes:

datetime_months
datetime_hours
datetime_seconds

But that list would get big in a hurry!

Also, what I like about the Python datetime module, for instance, is that I
don't have to know or care how it's stored internally -- all I need to know
is what range and precision it can deal with. numpy has performance
constraints that may not make that possible, but I still like it.

Maybe two types:

datetime_calendar: for calendar-type units (months, business days, ...)
datetime_continuous: for "linear" units (seconds, hours, ...)

or something like that?

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion