On Wed, Oct 31, 2018 at 3:59 PM Allan Haldane <allanhald...@gmail.com> wrote:
> On 10/30/18 5:04 AM, Matti Picus wrote:
> > TL;DR - should we revert the attribute-hiding constructs in
> > ndarraytypes.h and unify PyArrayObject_fields with PyArrayObject?
> >
> > Background
> >
> > NumPy 1.8 deprecated direct access to PyArrayObject fields. It made
> > PyArrayObject "opaque", and hid the fields behind a PyArrayObject_fields
> > structure
> > https://github.com/numpy/numpy/blob/v1.15.3/numpy/core/include/numpy/ndarraytypes.h#L659
> > with a comment about moving this to a private header. In order to access
> > the fields, users are supposed to use PyArray_FIELDNAME functions, like
> > PyArray_DATA and PyArray_NDIM. It seems there were thoughts at the time
> > that numpy might move away from a C-struct based underlying data
> > structure. Other changes were also made to enum names, but those are
> > relatively painless to find-and-replace.
> >
> > NumPy has a mechanism to manage deprecating APIs: C users define
> > NPY_NO_DEPRECATED_API to a desired level, say NPY_1_8_API_VERSION, and
> > can then access the API "as if" they were using NumPy 1.8. Users who do
> > not define NPY_NO_DEPRECATED_API get a warning when compiling, and
> > default to the pre-1.8 API (aliasing of PyArrayObject to
> > PyArrayObject_fields and direct access to the C struct fields). This is
> > convenient for downstream users, both because the new API does not
> > provide much added value and because it is much easier to write a->nd
> > than PyArray_NDIM(a). For instance, pandas uses direct assignment to the
> > data field for fast json parsing
> > https://github.com/pandas-dev/pandas/blob/master/pandas/_libs/src/ujson/python/JSONtoObj.c#L203
> > via chunks. Working around the new API in pandas would require more
> > engineering.
> > Also, for example, cython has a mechanism to transpile
> > python code into C, mapping slow python attribute lookup to fast C
> > struct field access
> > https://cython.readthedocs.io/en/latest/src/userguide/extension_types.html#external-extension-types
> >
> > In a parallel but not really related universe, cython recently upgraded
> > the object mapping so that we can quiet the annoying "size changed"
> > runtime warning https://github.com/numpy/numpy/issues/11788 without
> > requiring warning filters, but that requires updating the numpy.pxd file
> > provided with cython, and it was proposed that NumPy actually vendor its
> > own file rather than depending on the cython one
> > (https://github.com/numpy/numpy/issues/11803).
> >
> > The problem
> >
> > We have now made further changes to our API. In NumPy 1.14 we changed
> > UPDATEIFCOPY to WRITEBACKIFCOPY, and in 1.16 we would like to deprecate
> > PyArray_SetNumericOps and PyArray_GetNumericOps. The strange warning
> > when NPY_NO_DEPRECATED_API is not defined is annoying. The new API cannot
> > be supported by cython without some deep surgery
> > (https://github.com/cython/cython/pull/2640). When I tried dogfooding an
> > updated numpy.pxd for the only cython code in NumPy, mtrand.pyx, I came
> > across some of these issues (https://github.com/numpy/numpy/pull/12284).
> > Forcing the new API will require downstream users to refactor code or
> > re-engineer constructs, as in the pandas example above.
>
> I haven't understood the cython issue, but just want to mention that for
> optimization purposes it's nice to be able to modify the fields, like in
> the pandas/json example above.
>
> In particular, PyArray_ConcatenateArrays uses some tricks which
> temporarily clobber the data pointer and shape of an array to
> concatenate arrays efficiently. It seems fairly safe to me. These tricks
> would be nice to re-use in a C port of the new block code we merged
> recently.
>
> Those optimizations aren't possible if only using PyArrayObject.
>

It's OK for numpy internals to directly access the structures, as presumably
they will be updated if anything changes. Maybe it would be useful for Cython
to have a flag like Py_LIMITED_API?

Chuck
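
For anyone following the thread, here is a rough sketch of the kind of trick being discussed: temporarily overwriting an output array's data pointer and extent so that a generic copy routine fills one window at a time. Everything named demo_* is hypothetical and 1-D only; it is not NumPy's struct layout and not the actual PyArray_ConcatenateArrays code. The accessors stand in for the PyArray_DATA / PyArray_NDIM style, and the direct field writes are the part an accessor-only API would not permit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an array struct (1-D only, NOT NumPy's layout). */
typedef struct {
    char *data;       /* first byte of the buffer */
    int nd;           /* number of dimensions, always 1 in this sketch */
    size_t length;    /* extent of the single dimension */
    size_t itemsize;  /* bytes per element */
} demo_array;

/* Read-only accessors, analogous in spirit to PyArray_DATA / PyArray_NDIM. */
static inline char *demo_data(const demo_array *a) { return a->data; }
static inline int demo_ndim(const demo_array *a) { return a->nd; }
static inline size_t demo_length(const demo_array *a) { return a->length; }

/* Generic copy: dst and src must agree in length and itemsize. */
void demo_copy_into(demo_array *dst, const demo_array *src)
{
    assert(demo_length(dst) == demo_length(src));
    assert(dst->itemsize == src->itemsize);
    memcpy(demo_data(dst), demo_data(src), demo_length(src) * src->itemsize);
}

/*
 * Concatenate n inputs into out by sliding a window over out's buffer.
 * The data pointer and length of out are clobbered during the loop and
 * restored before returning. Returns the number of elements written.
 */
size_t demo_concatenate(demo_array *out, const demo_array *const *inputs,
                        size_t n)
{
    char *saved_data = out->data;
    size_t saved_length = out->length;
    size_t used = 0;

    assert(demo_ndim(out) == 1);
    for (size_t i = 0; i < n; i++) {
        assert(used + inputs[i]->length <= saved_length);
        out->length = inputs[i]->length;      /* shrink to the window */
        demo_copy_into(out, inputs[i]);
        out->data += inputs[i]->length * out->itemsize;  /* advance window */
        used += inputs[i]->length;
    }

    /* Restore the fields that were clobbered. */
    out->data = saved_data;
    out->length = saved_length;
    return used;
}
```

A caller would build an array of `const demo_array *` pointers and pass it with the count. With only the accessors available, the window would instead have to be a freshly constructed view object for each input.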
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion