On Jun 11, 2009, at 3:47 PM, Robert Kern wrote:
> On Thu, Jun 11, 2009 at 14:37, Pierre GM<[email protected]> wrote:
>>
>> On Jun 11, 2009, at 3:07 PM, Travis Oliphant wrote:
>>
>>>> BTW, what is the metadata that is going to be added to the types?
>>>> What purpose does it serve?
>>>
>>> In the date-time case, it holds what frequency the integer in the
>>> data-type represents. There will only be 2 new static data-types.
>>> "Datetime" and "Timedelta" that use 8 bytes each.
>>>
>>> What those 8 bytes represent will be determined by the metadata
>>> (years, months, seconds, etc...).
>>
>> As Charles pointed out, it'd be quite useful for units as well. Or to
>> store some extra information like the filling_value of a
>> MaskedArray...
>>
>> So, this metadata would be attached to an array, right ?
>
> No. The metadata is on the dtype.
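[Editorial illustration, not part of the original message.] The point that the metadata lives on the dtype rather than on the array can be sketched with the `datetime64` and `timedelta64` dtypes that NumPy later shipped. This assumes a NumPy release that includes them; the API postdates this thread and is not quoted from it, so treat the snippet as an illustration of the idea rather than of the 2009 design.

```python
import numpy as np

# The unit (the "frequency" in the thread) is a property of the dtype.
day = np.dtype("datetime64[D]")
sec = np.dtype("datetime64[s]")

# Both dtypes store an 8-byte integer; only the unit metadata differs.
assert day.itemsize == sec.itemsize == 8
assert day != sec

# np.datetime_data reports the unit and step count held by the dtype.
unit, count = np.datetime_data(day)

dates = np.array(["2009-06-11", "2009-06-12"], dtype=day)

# Indexing yields a scalar whose dtype carries the same unit.
first = dates[0]
same_unit = first.dtype == dates.dtype

# Subtracting two datetimes gives a timedelta with the matching unit.
delta = dates[1] - dates[0]
```

Two arrays built with the same dtype share the unit automatically, which is the practical consequence of attaching the metadata to the dtype instead of to each array.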

Ah, OK. Still could be used for units, then. And it'll probably make
things easier to define custom dtypes (I was thinking about a standard
problem where all the fields of a structured array have the same dtype.
A flag could be attached to the main dtype telling that it's OK to
perform some functions on fields, for example... Thinking aloud here).

>> Scalars would
>> be considered as 0d array for that purpose, right ? eg, given a 1d
>> array of dates w/ a given frequency, accessing a single element would
>> give me a scalar w/ the same frequency ?
>
> It should. The details still need to be worked out.

OK.

>
>>> The ufunc machinery needs to change to handle passing
>>> that information in somehow. The approaches we take to doing that
>>> will also hopefully allow us to define ufuncs for string, unicode,
>>> and void * arrays as well.
>>
>> In that case, could we also think about what Darren was suggesting
>> for his units package, viz, a pre-processing function
>> (__array_unwrap__ ?) that complements the current __array_wrap__ one ?
>> The idea being that any operation would be performed on a ndarray, the
>> corresponding metadata would be just passed along during the
>> operation, and modifications to the metadata would be taken care of in
>> the pre- and/or post-processing steps ?
>
> Neither here nor there, I think.
>
>> Oh, just another question: why trying to put datetime and timedelta
>> in the type ordering ? My understanding is that underneath, they're
>> just long/longlong. It's only because they have a particular metadata
>> that they should be processed differently, right ?
>
> No. They need to be different types such that the ufunc mechanism can
> find the right loop implementations.

Meh. I'm not familiar enough with the details of C ufuncs, so bear with
me for a minute. A datetime is basically a long + a frequency attribute.
All the operations recognized as valid for a datetime object will deal
w/ the long part, the frequency is just patched back at the end, right ?
So, a ufunc could first check the underlying type (here, long or
longlong), then check whether there's a value for the 'unit': if there's
one, choose the corresponding loop, if None, use the default (the one we
currently have). I really fail to see why we need to see
datetime/timedelta as intrinsically different from the other types
(apart from the fact that they carry some extra info), and why the
mechanism should be different for datetime/timedelta than for units,
say.

>> So, if soon we add units
>> to floats, the underneath object would still be considered float,
>> dealing w/ the unit has to be left to ufuncs ?
>
> This is why I don't think this mechanism can be used for units.

Robert, would you mind pointing me offlist to the relevant part of the
code so that I can try to figure it out by myself ? Or just explain it
in plain English (which would then be the basis for a documentation of
these new features)...

_______________________________________________
Numpy-discussion mailing list
[email protected]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
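
[Editorial illustration, not part of the original message.] The dispatch scheme proposed above (check the underlying type, then check whether a unit is present, pick a loop, and patch the unit back at the end) can be sketched in plain Python. Every name here (`ToyDType`, `LOOPS`, `toy_add`) is hypothetical and none of it is NumPy's ufunc machinery; the rule that both operands must share a unit is an added assumption the thread does not state.

```python
# Toy model only: none of these names exist in NumPy.
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class ToyDType:
    base: str                   # underlying storage, e.g. "int64"
    unit: Optional[str] = None  # metadata such as "D" or "s"; None if absent


def default_loop(a: int, b: int) -> int:
    # The loop used when no unit is attached.
    return a + b


def unit_loop(a: int, b: int) -> int:
    # Works on the integer part only; the unit is handled by the caller.
    return a + b


# Loop table keyed on (base type, "has a unit?").
LOOPS: Dict[Tuple[str, bool], Callable[[int, int], int]] = {
    ("int64", False): default_loop,
    ("int64", True): unit_loop,
}


def toy_add(a: int, a_dt: ToyDType, b: int, b_dt: ToyDType):
    # Step 1: check the underlying type.
    if a_dt.base != b_dt.base:
        raise TypeError("mismatched base types")
    # Assumption (not from the thread): operands must share the same unit.
    if a_dt.unit != b_dt.unit:
        raise TypeError("mismatched units")
    # Step 2: pick the loop depending on whether a unit is present.
    loop = LOOPS[(a_dt.base, a_dt.unit is not None)]
    # Step 3: run the loop on the integers, then patch the unit back on.
    return loop(a, b), ToyDType(a_dt.base, a_dt.unit)
```

For example, `toy_add(3, ToyDType("int64", "D"), 4, ToyDType("int64", "D"))` returns the integer sum together with a dtype that still carries the `"D"` unit. Robert Kern's reply in the thread is that real ufuncs instead need distinct types to locate the right loop, which this toy does not model.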
