On Tue, Jun 26, 2012 at 2:40 PM, Dag Sverre Seljebotn <[email protected]> wrote:
> On 06/26/2012 01:48 PM, David Cournapeau wrote:
>> Hi,
>>
>> I am just continuing the discussion around ABI/API, the technical side
>> of things that is, as this is unrelated to 1.7.x. release.
>>
>> On Tue, Jun 26, 2012 at 11:41 AM, Dag Sverre Seljebotn
>> <[email protected]> wrote:
>>> On 06/26/2012 11:58 AM, David Cournapeau wrote:
>>>> On Tue, Jun 26, 2012 at 10:27 AM, Dag Sverre Seljebotn
>>>> <[email protected]> wrote:
>>>>> On 06/26/2012 05:35 AM, David Cournapeau wrote:
>>>>>> On Tue, Jun 26, 2012 at 4:10 AM, Ondřej Čertík<[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>> My understanding is that Travis is simply trying to stress "We have to
>>>>>>> think about the implications of our changes on existing users." and
>>>>>>> also that little changes (with the best intentions!) that however mean
>>>>>>> either a breakage or confusion for users (due to historical reasons)
>>>>>>> should be avoided if possible. And I very strongly feel the same way.
>>>>>>> And I think that most people on this list do as well.
>>>>>>
>>>>>> I think Travis is more concerned about API than ABI changes (in that
>>>>>> example for 1.4, the ABI breakage was caused by a change that was
>>>>>> pushed by Travis IIRC).
>>>>>>
>>>>>> The relative importance of API vs ABI is a tough one: I think ABI
>>>>>> breakage is as bad as API breakage (but matter in different
>>>>>> circumstances), but it is hard to improve the situation around our ABI
>>>>>> without changing the API (especially everything around macros and
>>>>>> publicly accessible structures). Changing this is politically
>>>>>
>>>>> But I think it is *possible* to get to a situation where ABI isn't
>>>>> broken without changing API. I have posted such a proposal.
>>>>> If one uses the kind of C-level duck typing I describe in the link
>>>>> below, one would do
>>>>>
>>>>> typedef PyObject PyArrayObject;
>>>>>
>>>>> typedef struct {
>>>>> ...
>>>>> } NumPyArray; /* used to be PyArrayObject */
>>>>
>>>> Maybe we're just in violent agreement, but whatever ends up being used
>>>> would require to change the *current* C API, right ? If one wants to
>>>
>>> Accessing arr->dims[i] directly would need to change. But that's been
>>> discouraged for a long time. By "API" I meant access through the macros.
>>>
>>> One of the changes under discussion here is to change PyArray_SHAPE from
>>> a macro that accepts both PyObject* and PyArrayObject* to a function
>>> that only accepts PyArrayObject* (hence breakage). I'm saying that under
>>> my proposal, assuming I or somebody else can find the time to implement
>>> it under, you can both make it a function and have it accept both
>>> PyObject* and PyArrayObject* (since they are the same), undoing the
>>> breakage but allowing to hide the ABI.
>>>
>>> (It doesn't give you full flexibility in ABI, it does require that you
>>> somewhere have an "npy_intp dims[nd]" with the same lifetime as your
>>> object, etc., but I don't consider that a big disadvantage).
>>>
>>>> allow for changes in our structures more freely, we have to hide them
>>>> from the headers, which means breaking the code that depends on the
>>>> structure binary layout. Any code that access those directly will need
>>>> to be changed.
>>>>
>>>> There is the particular issue of iterator, which seem quite difficult
>>>> to make "ABI-safe" without losing significant performance.
>>>
>>> I don't agree (for some meanings of "ABI-safe"). You can export the data
>>> (dataptr/shape/strides) through the ABI, then the iterator uses these in
>>> whatever way it wishes consumer-side. Sort of like PEP 3118 without the
>>> performance degradation. The only sane way IMO of doing iteration is
>>> building it into the consumer anyway.
>>
>> (I have not read the whole cython discussion yet)
>
> I'll try to write a summary and post it when I can get around to it.
>
>>
>> What do you mean by "building iteration in the consumer" ? My
>
> "consumer" is the user of the NumPy C API. So I meant that the iteration
> logic is all in C header files and compiled again for each such
> consumer. Iterators don't cross the ABI boundary.
>
>> understanding is that any data export would be done through a level of
>> indirection (dataptr/shape/strides). Conceptually, I can't see how one
>> could keep ABI without that level of indirection without some compile.
>> In the case of iterator, that means multiple pointer chasing per
>> sample -- i.e. the tight loop issue you mentioned earlier for
>> PyArray_DATA is the common case for iterator.
>
> Even if you do indirection, iterator utilities that are compiled in the
> "consumer"/user code can cache the data that's retrieved.
>
> Iterators just do
>
> // setup crossing ABI
> npy_intp *shape = PyArray_DIMS(arr);
> npy_intp *strides = PyArray_STRIDES(arr);
> ...
> // performance-sensitive code just accesses cached pointers and don't
> // cross ABI
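
For concreteness, here is a minimal, self-contained sketch of the pattern
quoted above: the consumer crosses the ABI once to fetch data/shape/strides
through function calls, and the tight loop then only touches locally cached
values. Everything named here (opaque_array, arr_data, arr_ndim, arr_shape,
arr_strides, sum_2d) is made up for illustration and is not NumPy API;
ptrdiff_t stands in for npy_intp, and the struct definition sits in the same
file only so that the example compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* ---- "library" side ------------------------------------------------- */
/* Hypothetical stand-in for an array object. In a real split, only the  */
/* forward declaration and the accessor prototypes would be in a public  */
/* header; the struct layout would stay private to the library.          */
typedef struct opaque_array opaque_array;

struct opaque_array {
    char      *data;     /* first element                     */
    int        nd;       /* number of dimensions              */
    ptrdiff_t *shape;    /* shape[nd], lives as long as array */
    ptrdiff_t *strides;  /* strides[nd], in bytes             */
};

/* Accessors: the only things that cross the (assumed) ABI boundary. */
char      *arr_data(opaque_array *a)    { return a->data; }
int        arr_ndim(opaque_array *a)    { return a->nd; }
ptrdiff_t *arr_shape(opaque_array *a)   { return a->shape; }
ptrdiff_t *arr_strides(opaque_array *a) { return a->strides; }

/* ---- "consumer" side -------------------------------------------------- */
/* Sum a 2-D array of doubles. Compiled into the consumer, so the loop     */
/* can be inlined and optimized without the library being involved.        */
double sum_2d(opaque_array *arr)
{
    assert(arr_ndim(arr) == 2);

    /* setup: cross the ABI once and cache what we got */
    const char      *data    = arr_data(arr);
    const ptrdiff_t *shape   = arr_shape(arr);
    const ptrdiff_t *strides = arr_strides(arr);
    const ptrdiff_t n0 = shape[0],   n1 = shape[1];
    const ptrdiff_t s0 = strides[0], s1 = strides[1];

    /* performance-sensitive part: cached values only, no ABI crossing */
    double total = 0.0;
    for (ptrdiff_t i = 0; i < n0; ++i) {
        const char *row = data + i * s0;
        for (ptrdiff_t j = 0; j < n1; ++j) {
            total += *(const double *)(row + j * s1);
        }
    }
    return total;
}
```

Since the loop only depends on the cached strides, the same function also
handles a transposed or sliced view without any further call across the
boundary.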

The problem is that iterators need more than this. But thinking more
about it, I am not so dead sure we could not get there. I will need to
play with some code.

>
> Going slightly OT, then IMO, the *only* long-term solution in 2012 is
> LLVM. That allows you to do any level of inlining and special casing and
> optimization at run-time, which is the only way of matching needs for
> performance with using Python at all.
>
> Mark Florisson is heading down that road this summer with his 'minivect'
> project (essentially, code generation for optimal iteration over NumPy
> (or NumPy-like) arrays that can be used both by Cython (C code
> generation backend) and Numba (LLVM code generation backend)).
>
> Relying on C++ metaprogramming to implement iterators is like using the
> technology of the 80's to build the NumPy of the 2010's. It can only be
> exported to Python in a crippled form, so kind of useless. (C++ to
> implement the core that sits behind an ABI is another matter, I don't
> have an opinion on that. But iterators can't be behind the ABI, as I
> think we agree on.)

Well, no need to convince me about which of the two solutions is the
most appropriate. I was just trying to appear more unbiased than I
really am :)

David
_______________________________________________
NumPy-Discussion mailing list
[email protected]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
