On Fri, Sep 26, 2008 at 2:36 PM, Christian Heimes <[EMAIL PROTECTED]> wrote:
> Lisandro Dalcin wrote:
>> If we do 'isinstance(obj, str)', then what should be the behavior in
>> Py2 and Py3? In fact, this question is actually related to the map in
>> builtin_types_table at Cython/Compiler/Builtin.py. Does it make sense to
>> do the map like this
>>
>> bytes -> PyBytes_Type
>> str -> PyString_Type
>> unicode -> PyUnicode_Type
>
> It doesn't make sense for some applications of str. Python 3.0 uses
> PyUnicode for identifiers (attribute names, function and class names
> etc.) while Python 2.x uses PyString. You can use PyUnicode instances
> for attributes in Python 2.x through obj.__dict__[u"key"] = key but you
> cannot use PyBytes in 3.0.
>
> This mapping makes more sense for me:
>
> bytes::
>   whatever Python uses to store binary data
>   Python 2.x: PyString
>   Python 3.x: PyBytes
>
> unicode::
>   text data
>   PyUnicode for all Python versions
>
> str::
>
>   Python 2.x: PyString
>   Python 3.x: PyUnicode

Perhaps I was not clear enough, but the mapping in my mind is just the
mapping you proposed.

On the Python side, 'str' should be the type Python uses for
attributes and names. On the C side, that would be 'PyString_Type'.
Of course, Cython has to #define PyString_Type to PyUnicode_Type for
Py3 (currently, PyString_Type is #define'd as PyBytes_Type for Py3).
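
To make the mapping concrete, here is a rough sketch in plain Python.
It is not the actual builtin_types_table from
Cython/Compiler/Builtin.py; the names BUILTIN_TYPE_MAP and c_type_for
are hypothetical and only restate the table discussed above.

```python
# Hypothetical lookup table, keyed by Python major version.
# It restates the proposed mapping; it is NOT Cython's real table.
BUILTIN_TYPE_MAP = {
    2: {
        "bytes": "PyString_Type",    # binary data in Python 2.x
        "unicode": "PyUnicode_Type",  # text data
        "str": "PyString_Type",      # type used for identifiers in 2.x
    },
    3: {
        "bytes": "PyBytes_Type",     # binary data in Python 3.x
        "unicode": "PyUnicode_Type",  # text data
        "str": "PyUnicode_Type",     # type used for identifiers in 3.x
    },
}


def c_type_for(name, major):
    """Return the C type name that `name` would map to (illustrative)."""
    return BUILTIN_TYPE_MAP[major][name]
```

With this table, 'unicode' maps to the same C type on both versions,
while 'bytes' and 'str' follow whatever the runtime uses for binary
data and for identifiers, respectively.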


> The suggestion follows my forward compatibility code for Python 2.6. In
> Python 2.6 b"" and bytes are simple aliases for PyString. "from
> __future__ import unicode_literals" turns "" into PyUnicode objects.

The only gotcha I have with all this is that I do not have an easy way
to write a string literal that matches what Python uses for
identifiers, in such a way that it can run in both Py2 and Py3. Let's
take an example: suppose I want to create a new type by calling
'type(name, bases, dict)'. Then I can write in Cython code:

# Note: Cython code inside a pyx file
type( b'A', (), {}) # -> fails in Py3

type( u'A', (), {}) # -> fails in Py2

type( 'A', (), {}) # -> fails in Py2 or Py3 depending on __future__ import

So I really believe Cython needs an extension, let's say

type( s'A', (), {}) # Success in Py2 and Py3!

Then the 's' prefix builds a string of whatever type the Python
runtime uses for identifiers.
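
Until something like that exists, the same effect can be approximated
at run time. The following is only a sketch in plain Python, not a
Cython feature: native_str is a hypothetical helper, and the ASCII
default encoding is an assumption that suits identifier-like names.

```python
import sys


def native_str(text, encoding="ascii"):
    """Return `text` as the runtime's native `str` (hypothetical helper)."""
    if isinstance(text, str):
        # Already the type the runtime uses for identifiers.
        return text
    if sys.version_info[0] >= 3:
        # Python 3: str is text, so a bytes value has to be decoded.
        return text.decode(encoding)
    # Python 2: str is a byte string, so a unicode value has to be encoded.
    return text.encode(encoding)


# Both spellings are accepted by type() on either major version.
A = type(native_str(b"A"), (), {})
B = type(native_str(u"B"), (), {})
```

The cost is a function call where a literal prefix would be resolved
at compile time, which is why a dedicated prefix still seems
preferable to me.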

What do you think, Christian? Do my latter comments make sense to you?






-- 
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev
