I'll follow up with a sample getitem implementation, so you need not
follow up this thread until then. But I really wanted to explain
compile-time duck typing of return types properly (see below).
Robert Bradshaw wrote:
> On May 24, 2008, at 10:16 AM, Dag Sverre Seljebotn wrote:
>> In order to solve these problems, one could start to create
>> complicated solutions like allowing "self.dtype" as a return type in
>> the method signature if an assumption is placed on self, specify
>> "nested types" (ie. tuple(int, int, int)), with a following
>> combinatorial explosions in manual overloads one needs to create)
>> etc. etc.
>
> Ah, but in this case it seems much simpler (from the users
> perspective) to resolve arr[1,2,3] as __getitem__(1,2,3) and let
> function overloading handle this the normal way. BTW, in terms of
> focus, I think handling slicing is much lower on the priority list
> than a lot of other things (the relative gain here is much smaller).
But this makes our __getitem__ different from Python's! If we do that,
we should rather make up a wholly different, new syntax (__cgetitem__,
__cgetslice__, and so on), but I would rather not go in that direction.
It's OK to not do any optimization for slicing, but it's very important
that slices correctly fall back to the Python [] operator. As long as
the Python __getitem__ interface is kept, I must fall back to the []
operator manually (and also take tuples for n-d indices).
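To illustrate the interface constraint in plain Python: arr[1, 2, 3]
reaches __getitem__ as one tuple argument, and slices arrive as slice
objects. The sketch below is only an illustration; Grid and
_generic_getitem are hypothetical names, and the "fallback" tuple
merely stands in for whatever the ordinary Python [] path would do.

```python
class Grid:
    """Hypothetical n-d container; shows what Python hands to __getitem__."""

    def __init__(self, shape):
        self.shape = shape
        size = 1
        for extent in shape:
            size *= extent
        self.data = list(range(size))

    def __getitem__(self, index):
        # arr[1, 2, 3] arrives here as the single tuple (1, 2, 3).
        if (isinstance(index, tuple)
                and len(index) == len(self.shape)
                and all(isinstance(i, int) for i in index)):
            # Fast path: all-integer n-d index, row-major offset.
            offset = 0
            for i, extent in zip(index, self.shape):
                if not 0 <= i < extent:
                    raise IndexError("index out of range")
                offset = offset * extent + i
            return self.data[offset]
        # Slices and mixed indices take the generic path unchanged.
        return self._generic_getitem(index)

    def _generic_getitem(self, index):
        # Stand-in (assumption) for the ordinary Python [] operator.
        return ("fallback", index)
```

With this shape, a single __getitem__ serves every number of
dimensions, and anything it cannot handle natively is passed on
untouched rather than rejected.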
(Also I find the prospect of manually creating multiple overloads
depending on the number of dimensions somewhat distasteful. Of course,
I'll do it if there's not enough time, but I'd like to at least have a
path forward that *can* lead there eventually, and *then* hack it.)
>> (In order for this to work there's a small hitch: One must support
>> code like this:
>>
>>     cdef generic get_something(as_str):
>>         if as_str: return "asdf"
>>         else: return 3432
>>
>> This can be fixed simply by having return type mismatches for
>> instantiated generics converted into runtime errors rather than halt
>> compilation, this emulates Python behaviour nicely.)
>
> So here "generic" would become the more general of the two, i.e. an
> object. For generic inline functions, would it get optimized away
> (i.e. if one knew as_str at compile time, it would know the return
> type exactly?)
No, this is all wrong.
If having "generic" as the return value simply resulted in the more
general of the types, I wouldn't bother with it -- after all, the
programmer knows which types can be returned, and would be able to
specify object manually!
I'll exemplify using the function above. If you don't like what you see,
read footnote [1].
Working calling code:
(1): cdef char* chbuf = get_something(as_str=True) # chbuf = "asdf"
(2): cdef object s = get_something(as_str=True) # s = str("asdf")
(3): cdef object o_n = get_something(as_str=False) # o_n = int(3432)
(4): cdef int i_n = get_something(as_str=False) # i_n = int(3432)
I.e. this creates three different instantiations of get_something, each
one with different semantics because of the return type. (1) instantiates
    cdef char* get_something(as_str): ....
which of course makes 'return "asdf"' return a string literal pointer.
(I suppose this will change into an error if that auto-coercion is
removed :-)). (2) and (3) both use the same instantiation, and their
code returns object (like your guessed behaviour). (4) turns into
    cdef int get_something(as_str): ...
OK, so obviously for (1) and (4) there will be a type mismatch in the
line of code that's not run. That's where I proposed to change it into a
run-time error (because "those spots should not be reachable"). I.e.,
suppose this call is done:
    cdef int n = get_something(as_str=True)
This uses instantiation (4) from above (the int return one). I'll now
write out the proposed body of this function:
    cdef int get_something(as_str):
        if as_str:
            "asdf" # evaluate and discard expression in the return statement
            # Then explicitly, and always, raise the coercion error.
            # The point is: Usually this place is not reached!
            raise TypeError("Cannot coerce str to C int")
            # or whatever you have now for <int><object>"a"...
        else:
            return 3432
Instantiation (1) of the function is symmetric to this, raising an
exception if control reaches the place where the integer is returned.
So the end result is that the "int-return-type" instantiation of
get_something returns the proper, native C int when called with
as_str=False, and raises a coercion exception when called with as_str=True.
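The behaviour of the different instantiations can be emulated in plain
Python. This is only a sketch under stated assumptions:
instantiate_get_something and coerce are hypothetical helpers, str
stands in for char*, and the type check runs at call time here. In the
proposal the compiler would instead decide, per return statement,
whether to emit a plain return or an unconditional raise.

```python
def instantiate_get_something(return_type):
    """Build one emulated instantiation for a given return type.

    Hypothetical helper: it mimics the proposal, not real Cython output.
    """

    def coerce(value):
        # The 'object' instantiation accepts anything, like (2) and (3).
        if return_type is object or isinstance(value, return_type):
            return value
        # A mismatching return statement becomes a run-time error.
        raise TypeError("Cannot coerce %s to %s"
                        % (type(value).__name__, return_type.__name__))

    def get_something(as_str):
        if as_str:
            return coerce("asdf")
        else:
            return coerce(3432)

    return get_something


get_int = instantiate_get_something(int)     # like instantiation (4)
get_str = instantiate_get_something(str)     # stands in for (1), char*
get_obj = instantiate_get_something(object)  # shared by (2) and (3)
```

Here get_int(False) gives 3432 and get_int(True) raises TypeError, while
get_obj accepts both branches, which mirrors the cases listed above.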
[1] Even if this may seem hard to wrap one's head around, the end of the
story for the end-user is rather pleasing; one gets more or less the
same behaviour as if get_something was declared with an "object" return
type. It should be natural to use. But the compiled code involves no
object coercion, so speed is maintained.
--
Dag Sverre
_______________________________________________
Cython-dev mailing list
http://codespeak.net/mailman/listinfo/cython-dev