[Cython] getitem operator

Dag Sverre Seljebotn Sun, 25 May 2008 03:04:44 -0700

After getting a night's sleep (and Greg's emails too!) I've definitely 
come around: having "generic" on the return type complicates the issue 
way too much. I won't say I've put it away entirely, but it should not be 
anything close to a priority, and I won't think more about it until after 
summer at least.

Robert Bradshaw wrote:
> On May 24, 2008, at 2:58 PM, Dag Sverre Seljebotn wrote:
> 
>> So you propose that one keeps the name __getitem__, yet changes the
>> interface completely compared to Python? (Don't pass slices, unpack the
>> tuple prior to calling...) I don't see the benefit of keeping the name
>> then, it only creates confusion...
> 
> I'm thinking of extending the semantics (note, for inline functions  
> only, otherwise none of this optimization can happen), and not in a  
> way that is backwards incompatible. __getitem__(self, [object] index)  
> will always be available, and always called for non-inlined code.

I don't like binding this to the question of inline vs. non-inline. (It 
might well work that way if overloading is only enabled for inline 
functions at first, but that would be a side-effect.)

It is perfectly reasonable to want to write container classes as Cython 
"cdef class" wholly in Cython, and want Cython client code to avoid 
tuple packing/unpacking when indexing.

If the __getitem__ name is kept I don't think slices should be specially 
treated -- they're really just object keys. This is what I think we 
should do:

- Whenever the []-operator is encountered, an overload of __getitem__ 
with the same number of arguments as there are dimensions in the 
operator is looked for first. So a[x,y,i:j:k] would look first for 
a.__getitem__(x, y, slice(i,j,k)), and if that can't be called, will 
call a.__getitem__((x, y, slice(i,j,k))).

- If you type the arguments with "int" the overload will fail matching 
when slices are used.
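
The lookup order in the first bullet can be emulated at run time in plain 
Python. This is only a sketch: the proposal is about compile-time overload 
resolution in Cython, and the helper index() and the classes Grid and Table 
are hypothetical names made up for the illustration. Matching on argument 
types (the second bullet) is not modelled here, only the argument count.

```python
import inspect


def index(obj, *keys):
    """Emulate the proposed lookup for obj[k0, k1, ...] (hypothetical helper)."""
    getitem = type(obj).__getitem__
    try:
        # First choice: one argument per dimension, no tuple packing.
        inspect.signature(getitem).bind(obj, *keys)
    except TypeError:
        # Fallback: the usual Python protocol with a single (tuple) key.
        return getitem(obj, keys if len(keys) != 1 else keys[0])
    return getitem(obj, *keys)


class Grid:
    """Container whose __getitem__ takes one argument per dimension."""

    def __init__(self, rows):
        self.rows = rows

    def __getitem__(self, x, y, s):
        return self.rows[x][y][s]


class Table:
    """Container with the ordinary one-argument __getitem__."""

    def __getitem__(self, key):
        return key


grid = Grid([[list(range(10))]])
print(index(grid, 0, 0, slice(1, 7, 2)))    # uses the three-argument form
print(index(Table(), 1, 2, slice(0, 3)))    # falls back to a single tuple key
```

Note that for a single key both forms are the same call, which is where the 
ambiguity discussed further down comes from.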

However, this means there's no way to disallow slices while still allowing, 
say, tuple packing to be skipped on ND hash lookups using object keys.

It also still leaves the one-argument case ambiguous (the argument may or 
may not be a tuple). So I'm really still in favor of being very explicit, 
and stating that in the case of exact, non-slice, non-Ellipsis indexing 
*only* do we also support a __cgetitem__ operator (or perhaps 
__getsingleitem__) which always takes the exact same number of arguments 
as passed to [].

(One could even drop overloading, and have __getsingleitem1__, 
__getsingleitem2__, ...)
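
To make the explicit alternative concrete, here is a small Python emulation. 
It assumes the name __cgetitem__ from above; the helper index_explicit() and 
the class Probe are hypothetical, and in Cython the choice would be made at 
compile time from the syntax of the index expression rather than by 
inspecting values at run time.

```python
def index_explicit(obj, *keys):
    """Emulate obj[k0, k1, ...] under the explicit proposal (hypothetical helper)."""
    # "Exact" indexing means no slice and no Ellipsis among the keys.
    exact = not any(isinstance(k, slice) or k is Ellipsis for k in keys)
    fast = getattr(type(obj), "__cgetitem__", None)
    if exact and fast is not None:
        # Pass the keys through one by one; no tuple is handed to the method.
        return fast(obj, *keys)
    # Slices, Ellipsis, or no fast path: ordinary __getitem__ semantics.
    return obj[keys if len(keys) != 1 else keys[0]]


class Probe:
    """Reports which of the two methods was chosen."""

    def __cgetitem__(self, k0, k1):
        return ("exact", k0, k1)

    def __getitem__(self, key):
        return ("generic", key)


p = Probe()
print(index_explicit(p, 1, 2))            # exact indexing
print(index_explicit(p, 1, slice(0, 2)))  # a slice forces the generic path
```

Because __cgetitem__ receives exactly the keys written inside [], a 
one-argument __cgetitem__ would never have to guess whether it was handed a 
tuple.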

On the return types:

I suppose declaring "object" and relying on inlining to remove it again is 
preferable.

There's still an alternative: Allow "self.dtype" as a return type. This 
is more explicit and would mean that type inference could be done 
*before* the inlining/optimization happens (not sure if that is helpful 
or not). It doesn't look nice syntactically but it's not such a problem 
for the compiler ("self.dtype" would be "under assumption", and if 
"self.dtype" is _not_ under assumption that method is simply considered 
non-existent).

(I think we'll end up with "object", but I'm mentioning it anyway. You seem 
to have thought a lot more about type inference than I have.)

> Your whole (..) can be encoded in the argument signature (fixing  
> the number of arguments, which I'll admit is undesirable). The  
> "tuple" representing the indices never gets created or used in this  
> case, which makes things much clearer. Otherwise, I still don't see  
> how you're going to unpack the tuple argument in the generic case  
> without resorting to Python tuples.

No, it would be Python tuples, but evaluated at compile time through the 
whole inlining/unrolling machinery I've dreamed up. But I've put that 
away now.

-- 
Dag Sverre
