On Fri, May 11, 2012 at 8:37 AM, mark florisson <markflorisso...@gmail.com> wrote:
> On 11 May 2012 12:13, Dag Sverre Seljebotn <d.s.seljeb...@astro.uio.no> wrote:
> > (NumPy devs: I know, I get too many ideas. But this time I *really* believe
> > in it, I think this is going to be *huge*. And if Mark F. likes it it's not
> > going to be without manpower; and as his mentor I'd pitch in too here and
> > there.)
> >
> > (Mark F.: I believe this is *very* relevant to your GSoC. I certainly don't
> > want to micro-manage your GSoC, just have your take.)
> >
> > Travis, thank you very much for those good words in the "NA-mask
> > interactions..." thread. It put most of my concerns away. If anybody is
> > leaning towards opaqueness because of its OOP purity, I want to refer to
> > C++ and its walled-garden of ideological purity -- it has, what, 3-4
> > different OOP array libraries, none of which is able to out-compete the
> > others. Meanwhile the rest of the world happily cooperates using pointers,
> > strides, CSR and CSC.
> >
> > Now, there are limits to what you can do with strides and pointers. No one's
> > denying the need for more. In my mind that's an API where you can do
> > fetch_block and put_block of cache-sized, N-dimensional blocks on an array;
> > but it might be something slightly different.
> >
> > Here's what I'm asking: DO NOT simply keep extending ndarray and the NumPy C
> > API to deal with this issue.
> >
> > What we need is duck-typing/polymorphism at the C level. If you keep
> > extending ndarray and the NumPy C API, what we'll have is a one-to-many
> > relationship: One provider of array technology, multiple consumers (with
> > hooks, I'm sure, but all implementations of the hook concept in the NumPy
> > world I've seen so far are a total disaster!).
> >
> > What I think we need instead is something like PEP 3118 for the "abstract"
> > array that is only available block-wise with getters and setters. On the
> > Cython list we've decided that what we want for CEP 1000 (for boxing
> > callbacks etc.)
> > is to extend PyTypeObject with our own fields; we could create CEP 1001 to
> > solve this issue and make any Python object an exporter of
> > "block-getter/setter-arrays" (better name needed).
> >
> > What would be exported is (of course) a simple vtable:
> >
> > typedef struct {
> >     int (*get_block)(void *ctx, ssize_t *upper_left, ssize_t *lower_right,
> >                      ...);
> >     ...
> > } block_getter_setter_array_vtable;
> >
> > Let's please discuss the details *after* the fundamentals. But the reason I
> > put void* there instead of PyObject* is that I hope this could be used
> > beyond the Python world (say, Python<->Julia); the void* would be handed to
> > you at the time you receive the vtable (however we handle that).
>
> I suppose it would also be useful to have some way of predicting the
> output format polymorphically for the caller. E.g. dense *
> block_diagonal results in block diagonal, but dense + block_diagonal
> results in dense, etc. It might be useful for the caller to know
> whether it needs to allocate a sparse, dense or block-structured
> array. Or maybe the polymorphic function could even do the allocation.
> This needs to happen recursively of course, to avoid intermediate
> temporaries. The compiler could easily handle that, and so could numpy
> when it gets lazy evaluation.
>
> I think if the heavy lifting of allocating output arrays and exporting
> these arrays works in numpy, then support in Cython could use that (I
> can already hear certain people object to more complicated array stuff
> in Cython :). Even better here would be an external project that each of
> our projects could use (I still think the nditer sorting functionality
> of arrays should be numpy-agnostic and externally available).
>

It might be nice to expose something which gives an nditer-style looping
primitive through the CEP 1001 mechanism. I could imagine a pure C version
of this and an LLVM bitcode version which could inline into numba or other
LLVM-producing systems.

-Mark

>
> > I think this would fit neatly in Mark F.'s GSoC (Mark F.?), because you
> > could embed the block-transposition that's needed for efficient "arr +
> > arr.T" at this level.
> >
> > Imagine being able to do this in Cython:
> >
> >     a[...] = b + c * d
> >
> > and have that essentially compile to the numexpr blocked approach, *but*
> > where b, c, and d can have whatever type that exports CEP 1001? So c could
> > be a "diagonal" array which uses O(n) storage to export O(n^2) elements, for
> > instance, and the unrolled Cython code never needs to know.
> >
> > As far as NumPy goes, something along these lines should hopefully mean that
> > new C code being written doesn't rely so much on what exactly goes into
> > "ndarray" and what goes into other classes; so that we don't get the same
> > problem again that we do now with code that doesn't use PEP 3118.
> >
> > Dag
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
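
For concreteness, here is a rough sketch in C of how a consumer might evaluate
"a[...] = b + c * d" block by block through a vtable like the one quoted above.
It is only an illustration under several assumptions: the arrays are 2-D and
hold doubles, the "out" parameter stands in for the elided "..." in the quoted
signature, and the names block_getter_vtable, block_array, dense_ctx, diag_ctx
and eval_b_plus_c_times_d are all made up here. None of this is an agreed
CEP 1001 API.

```c
#include <assert.h>
#include <sys/types.h>  /* ssize_t (POSIX) */

/* Hypothetical 2-D, double-only variant of the quoted vtable.
 * The `out` parameter is an assumption standing in for the elided "...". */
typedef struct {
    /* Copy rows [upper_left[0], lower_right[0]) and columns
     * [upper_left[1], lower_right[1]) into `out`, row-major; 0 on success. */
    int (*get_block)(void *ctx, const ssize_t *upper_left,
                     const ssize_t *lower_right, double *out);
} block_getter_vtable;

/* An exporter: a vtable plus the opaque context handed over with it. */
typedef struct {
    const block_getter_vtable *vtable;
    void *ctx;
} block_array;

/* Dense row-major n x n matrix. */
typedef struct { ssize_t n; const double *data; } dense_ctx;

static int dense_get_block(void *ctx, const ssize_t *ul, const ssize_t *lr,
                           double *out)
{
    const dense_ctx *m = ctx;
    ssize_t cols = lr[1] - ul[1];
    for (ssize_t i = ul[0]; i < lr[0]; i++)
        for (ssize_t j = ul[1]; j < lr[1]; j++)
            out[(i - ul[0]) * cols + (j - ul[1])] = m->data[i * m->n + j];
    return 0;
}

/* Diagonal matrix: O(n) storage exporting O(n^2) elements. */
typedef struct { ssize_t n; const double *diag; } diag_ctx;

static int diag_get_block(void *ctx, const ssize_t *ul, const ssize_t *lr,
                          double *out)
{
    const diag_ctx *m = ctx;
    ssize_t cols = lr[1] - ul[1];
    for (ssize_t i = ul[0]; i < lr[0]; i++)
        for (ssize_t j = ul[1]; j < lr[1]; j++)
            out[(i - ul[0]) * cols + (j - ul[1])] = (i == j) ? m->diag[i] : 0.0;
    return 0;
}

static const block_getter_vtable dense_vtable = { dense_get_block };
static const block_getter_vtable diag_vtable = { diag_get_block };

enum { BLOCK = 2 };  /* tiny block edge for illustration only */

/* Blocked elementwise a = b + c * d over n x n operands. The loop only
 * sees vtables, so it never learns how b, c or d are stored. */
static int eval_b_plus_c_times_d(ssize_t n, double *a, block_array b,
                                 block_array c, block_array d)
{
    double bb[BLOCK * BLOCK], cb[BLOCK * BLOCK], db[BLOCK * BLOCK];
    assert(n >= 0);
    for (ssize_t i0 = 0; i0 < n; i0 += BLOCK) {
        for (ssize_t j0 = 0; j0 < n; j0 += BLOCK) {
            ssize_t ul[2] = { i0, j0 };
            ssize_t lr[2] = { i0 + BLOCK < n ? i0 + BLOCK : n,
                              j0 + BLOCK < n ? j0 + BLOCK : n };
            ssize_t rows = lr[0] - ul[0], cols = lr[1] - ul[1];
            if (b.vtable->get_block(b.ctx, ul, lr, bb) ||
                c.vtable->get_block(c.ctx, ul, lr, cb) ||
                d.vtable->get_block(d.ctx, ul, lr, db))
                return -1;  /* propagate an exporter failure */
            for (ssize_t i = 0; i < rows; i++)
                for (ssize_t j = 0; j < cols; j++)
                    a[(i0 + i) * n + (j0 + j)] =
                        bb[i * cols + j] + cb[i * cols + j] * db[i * cols + j];
        }
    }
    return 0;
}
```

A caller would wrap each operand as a block_array, e.g. { &diag_vtable, &ctx }
for the diagonal case, and pass all three to eval_b_plus_c_times_d. A real
version would presumably size the blocks to the cache and write the result
through a put_block counterpart rather than into a plain dense buffer.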