Stefan Behnel wrote:
> Robert Bradshaw, 06.09.2010 19:01:
>
>> On Mon, Sep 6, 2010 at 9:36 AM, Dag Sverre Seljebotn wrote:
>>
>>> I don't understand this suggestion. What happens in each of these cases,
>>> for different settings of "from __future__ import unicode_literals"?
>>>
>>> cdef char* x1 = 'abc\u0001'
>>>
>
> As I said in my other mail, I don't think anyone would use the above in
> real code. The alternative below is just too obvious and simple.
>
>>> cdef char* x2 = 'abc\x01'
>>>
>> from __future__ import unicode_literals (or -3)
>>
>> len(x1) == 4
>> len(x2) == 4
>>
>> Otherwise
>>
>> len(x1) == 9
>> len(x2) == 4
>>
>
> Hmm, now *that* looks unexpected to me. The way I see it, a C string is the
> C equivalent of a Python byte string and should always and predictably
> behave like a Python byte string, regardless of the way Python object
> literals are handled.
>
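For reference, plain Python 2 already gives exactly the counts Robert
lists, because \u is not a recognized escape in byte string literals
(the backslash is kept, leaving the six characters \u0001):

    # plain Python 2
    len('abc\u0001')    # -> 9: 'abc' plus the six literal characters \u0001
    len('abc\x01')      # -> 4: \x01 is a valid escape in byte strings
    len(u'abc\u0001')   # -> 4: \u0001 is one character in a unicode literal
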
While the "cdef char*" case isn't that horrible,

    f('abc\x01')

is. Imagine adding a type to the signature of f and then getting
different data passed in.
I really, really don't like having the value of a literal depend on the
type of the variable it gets assigned to (I know, I know about ints and
so on, but let's try to keep the number of instances down).
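To make that concrete, here is a hypothetical sketch of the proposed
type-driven coercion (f_bytes and f_text are made-up names, not
anything Cython defines):

    cdef void f_bytes(char* s):
        pass

    cdef void f_text(unicode s):
        pass

    f_bytes('abc\u0001')   # would receive 9 bytes (literal backslash-u)
    f_text('abc\u0001')    # would receive a 4-character unicode string

The same source literal silently turns into different data depending on
which signature it happens to hit.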
My vote is for identifying a set of completely safe strings (ASCII-only,
no \x or \u escapes) that mean the same thing regardless of any setting,
and allowing those. For anything else, demand a b'' prefix before
assigning to a char*. Typing in a b'' isn't THAT hard.
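Under that rule, the outcome would look like this (a sketch of the
proposal, not current Cython behaviour):

    cdef char* ok  = 'abc'        # ASCII-only, no escapes: always safe
    cdef char* esc = b'abc\x01'   # escapes are fine with an explicit b'' prefix
    cdef char* bad = 'abc\x01'    # rejected under this proposal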
Dag Sverre
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev