numpy's functions, especially ufuncs, have had some ability to support subclasses through the ndarray.__array_wrap__ method, which provides masked arrays or quantities (for example) with an opportunity to set the class and metadata of the output array at the end of an operation. An example is:

    q1 = Quantity(1, 'meter')
    q2 = Quantity(2, 'meters')
    numpy.add(q1, q2) # yields Quantity(3, 'meters')

At SciPy2009 we committed a change to the numpy trunk that provides a chance to determine the class and some metadata of the output *before* the ufunc performs its calculation, but after the output array has been established (while its data is still uninitialized). Consider:

    q1 = Quantity(1, 'meter')
    q2 = Quantity(2, 'J')
    numpy.add(q1, q2, q1)
    # or equivalently:
    # q1 += q2

With only __array_wrap__, the attempt to propagate the units happens after q1's data has been updated in place. That is too late to raise an error: the data is already corrupted. __array_prepare__ solves that problem, because an exception can be raised in time.

Now I'd like to suggest one more improvement to numpy to make its functions more generic. Consider one more example:

    q1 = Quantity(1, 'meter')
    q2 = Quantity(2, 'feet')
    numpy.add(q1, q2)

In this case, I'd like an opportunity to operate on the input arrays on the way in to the ufunc, to rescale the second input to meters. I think it would be a hack to try to stuff this capability into __array_prepare__. One form of this particular example is already supported in quantities, "q1 + q2", by overriding the __add__ method to rescale the second input, but there are ufuncs that do not have an associated special method. So I'd like to look into adding another check for a special method, perhaps called __input_prepare__.

My time is really tight for the next month, so I'd rather not start if there are strong objections, but otherwise I'd like to try to get it in in time for numpy-1.4. (Has a timeline been established?)
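To make the idea concrete, here is a rough pure-Python sketch of what an input hook could do. It is a toy model, not numpy's dispatch machinery and not the quantities package: the Quantity class, the TO_METERS table, the add() function and the (func, inputs) signature of __input_prepare__ are all made up for illustration.

```python
# Toy model only: nothing here calls into numpy. Quantity is a stand-in
# class, and __input_prepare__ is the hypothetical hook proposed above.

# Assumed conversion factors to meters, for illustration only.
TO_METERS = {"meter": 1.0, "feet": 0.3048}


class Quantity:
    def __init__(self, value, units):
        self.value = float(value)
        self.units = units

    def __input_prepare__(self, func, inputs):
        # Return rescaled copies of the inputs, expressed in self.units.
        # The original inputs are never modified in place.
        prepared = []
        for q in inputs:
            if q.units not in TO_METERS or self.units not in TO_METERS:
                raise ValueError(
                    "incompatible units: %s and %s" % (self.units, q.units)
                )
            factor = TO_METERS[q.units] / TO_METERS[self.units]
            prepared.append(Quantity(q.value * factor, self.units))
        return prepared


def add(a, b, out=None):
    # Let the first input rescale (or reject) all inputs on the way in.
    a2, b2 = a.__input_prepare__(add, (a, b))
    # Establish the output; passing out=a models an in-place add.
    if out is None:
        out = Quantity(0.0, a2.units)
    # Only now is the output's data written.
    out.value = a2.value + b2.value
    out.units = a2.units
    return out
```

In this sketch, adding meters and feet rescales the second input before the sum is computed, and an in-place add of meters and 'J' raises before the first operand's data is touched.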
I think it will not be too difficult to document this overall scheme:

When calling numpy functions:

1) __input_prepare__ provides an opportunity to operate on the inputs to yield versions that are compatible with the operation (they should obviously not be modified in place)

2) the output array is established

3) __array_prepare__ is used to determine the class of the output array, as well as any metadata that needs to be established before the operation proceeds

4) the ufunc performs its operations

5) __array_wrap__ provides an opportunity to update the output array based on the results of the computation

Comments, criticisms? If PEP 3124^ were already a part of the standard library, that could serve as the basis for generalizing numpy's functions. But I think the PEP will not be approved in its current form, and it is unclear when and if the author will revisit the proposal. The scheme I'm imagining might be sufficient for our purposes.

Darren

^ http://www.python.org/dev/peps/pep-3124/
