Sorry about the medium-sized length, but I'd like this to be close to my last email on the subject. I'd just refer to Robert's mail, but I guess some more explanation about NumPy semantics is in order for the benefit of non-NumPy-users, so I've made a summary of that.
Stefan Behnel wrote:
> Dag Sverre Seljebotn wrote:
>> Stefan Behnel wrote:
>>> we have three types:
>>>
>>> 1) a dynamic array type
>>>    - allocates memory on creation
>>>    - reallocates on (explicit) resizing, e.g. a .resize() method
>>>    - supports PEP 3118 (and disables shrinking with live buffers)
>>>    - returns a typed value on indexing
>>>    - returns a typed array copy on slicing
>>>    - behaves like a tuple otherwise
>>>
>>> 2) a typed memory view
>>>    - created on top of a buffer (or array)
>>>    - never allocates memory (for data, that is)
>>>    - creates a new view object on slicing
>>>    - behaves like an array otherwise
>> This last point is dangerous as we seem to disagree about what an array
>> is.
>
> It's what I described under 1).
>
>
>>> 3) a SIMD memory view
>>>    - created on top of a buffer, array or memory view
>>>    - supports parallel per-item arithmetic
>>>    - behaves like a memory view otherwise
>> Good summary. Starting from this: I want int[:,:] to be the combination
>> of 2) and 3)
>
> You mean "3) and not 2)", right? Could you explain why you need a syntax
> for this if it's only a view?

I suppose I meant some variation of 3) with some extra bullet points (slicing in particular).

We need a syntax because SIMD operations must be special-cased at compile time.

Robert put it well; what I want is the core NumPy array semantics on a view to any array memory -- builtin, so that it can be optimized at compile time. We need to return to that; trying to distill something else and more generic out of this seems to only bring confusion. (This is about 1) and 2) in Robert's mail only though.)

I'll make a list of what we mean by NumPy semantics below. At the very bottom are some things which I think should *not* be included.

First:

1) Nobody is claiming this is elegant or Pythonic. It is catering for a numerical special interest, nothing more nor less.
2) As Robert put it: He won't use it himself, but the rest of Cython indirectly benefits from all the Cython interest from the numerical users.
3) The proposed semantics below are really not up for in-detail discussion; what I'm really after is a "yes" or "no" -- I just don't have the time, and NumPy is the de facto standard for Python numerics and what everybody expects anyway. I don't want to invent something entirely new.

That said, here's a long list of what I mean by NumPy semantics, assuming both CEPs are implemented.

# make x a compile-time-optimizable 2D view on memoryview(obj)
cdef int[:,:] x = obj
# make an unassigned 1D view
cdef int[:] y

# Indexing
x[2,3]

# Access shape, stride info, raw data pointer
x.shape
x.strides
x.data

# Slicing out new view of third row (in two ways)
y = x[2,:]
y = x[2,...]
# Now, modifying y modifies what x points to too.
# Make a copy so that y points to separate memory:
y = y.copy()

# Indexing with None creates new, 1-length axis
x = y[None, :] # x.shape == (1, y.shape[0])
x = y[:, None] # x.shape == (y.shape[0], 1)
# Now, this:
x[3, 0] = 2
# modifies y[3] too.

# New view of exactly same data
x[:,:]
x[...]

# Set all entries in array to 12
x[...] = 12
# Set only first row to 10
x[0, :] = 10

# Some ways of multiplying all elements with 2
x *= 2
x[...] *= 2
x[:,:] *= 2
x += x
x[...] += x

# A more complicated expression...allocates memory
x = stdmath.sqrt(x*x + x*(x+1)/(x+2))
# A more complicated expression...overwrites existing
# memory
x[...] = stdmath.sqrt(x*x + x*(x+1)/(x+2))

# Boolean operators
cdef bint[:,:] b # perhaps we could support 8-bit bool too
b = (x == 2)
# b is now an array the shape of x, containing True where x[i,j] == 2

# Get sum of elements
import numpy as np
np.sum(x)

# As for printing/coercion to Python object, that remains
# TBD.
# Either memoryview, or a pretty-printing subclass
# of memoryview, implementing NumPy's __array__ protocol
# as well for better compatibility

Here's what I do NOT want to include from NumPy:

# Get sum and mean
x.sum()
x.mean()
# and so on, you have to do np.sum(x).

# "Fancy indexing" is a mess because the returned object
# (due to implementation constraints) is a copy, not a view,
# thus being inconsistent with the above. My stance is that
# this can go in when we can support treating it as a view,
# instead of following NumPy with making a copy. I have ideas
# for how to do this.

# Get the intersecting array of rows 1, 4 and 5 and
# columns 2 and 1
new_data_copy = x[[1,4,5], [2,1]]
# Set the same intersection to 0. This is where NumPy gets
# really inconsistent; making an exception specifically
# in __setitem__ for this case.
x[[1,4,5], [2,1]] = 0 # modified x

# If y has length 4, pick out elements 0 and 3
y[[True, False, False, True]]

...and so on.

-- 
Dag Sverre
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev
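
For non-NumPy users: the basic view semantics in the list above (slicing gives a new view on the same memory, an explicit copy gives separate memory, shape and strides are accessible) can be tried out in plain Python with the builtin memoryview. This is only a minimal sketch and not the proposed Cython syntax. It assumes CPython 3.5 or later for multi-dimensional indexing, and the variable names and sample data are made up for illustration. Multi-dimensional slicing and per-item arithmetic are not provided by the builtin type, so they are left out.

```python
from array import array

# Twelve C ints in one contiguous buffer (hypothetical sample data).
data = array('i', range(12))

# A 1D view on the buffer; no data is copied.
flat = memoryview(data)
assert flat.shape == (12,)
assert flat.strides == (flat.itemsize,)

# Slicing creates a new view on the same memory...
y = flat[4:8]
y[0] = 100
# ...so writing through y is visible in the original array.
assert data[4] == 100

# An explicit copy points to separate memory.
y_copy = memoryview(array('i', y.tolist()))
y_copy[0] = -1
assert data[4] == 100

# Reinterpret the same memory as a 3x4 view (the cast has to go via bytes).
x = flat.cast('B').cast('i', shape=[3, 4])
assert x.shape == (3, 4)
assert x.strides == (4 * x.itemsize, x.itemsize)

# Tuple indexing reads and writes the shared buffer directly.
assert x[1, 0] == 100
x[2, 3] = 7
assert data[11] == 7
```

The builtin type stops at element access and 1D slicing; the sketch is only meant to make the "view, not copy" behaviour concrete.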
