On 5/16/08, Stefan Behnel <[EMAIL PROTECTED]> wrote:
> Hi Lisandro,
>
> please read what I post. You've been stating repeatedly that your experience
> with Unicode is limited, and I think I have told you a lot of what I know
> about the subject. I appreciate discussion, but please consider that I might
> have more reasons for my opinions about the subject than I state in each
> single post.

You are completely right, and take it for granted that at any point I will
accept whatever you decide is better. I just believe, perhaps because of lack
of knowledge, that "abc" should always be the builtin 'str' type, both in Py2
(byte strings) and in Py3 (unicode strings). If we want literals that
represent data, then we have to use b"abc"; in Py2 that would be the 'str'
(byte string) type, and in Py3 the new 'bytes' type. Note that in Python 2.6
"abc" and b"abc" both return a byte string of type 'str', and 'bytes' is an
alias for 'str'. What's wrong with the Python 2.6 way? What do you believe was
the whole point, in 2.6, of adding support for b"abc" literals and aliasing
'bytes' to 'str'?

>
> Lisandro Dalcin wrote:
> > On 5/16/08, Stefan Behnel <[EMAIL PROTECTED]> wrote:
> >> The thing is that if you write
> >>
> >>     getattr(o, u"attr")
> >>
> >> in Cython, it will work in both Py2 and Py3. However,
> >>
> >>     getattr(o, "attr")
> >>
> >> will only work in Py2, unless you do the future import.
> >
> > Stefan, I understood that one of the targets of Cython is to
> > efficiently compile Python code. Please note that
> >
> >     getattr(o, u"attr")
> >
> > is not valid Python 3 code at all !!
>
> That's why I said "in Cython".
>
> > You are proposing that if I do "def foo(): ..." then the identifier
> > 'foo' will be implicitly treated as unicode for Py3,
>
> Sure. You didn't state in your source that you wanted the identifier name to
> be a byte string, did you? (which was obviously because Python doesn't allow
> you to do that).
>
> > but a string literal "abc" will not !!.
>
> Because the syntax of Python2, which Cython currently implements, dictates
> that "abc" is a byte string. This is explicit in Python2, as the unicode
> string would be u"abc".
>
> The difference between identifiers and strings is that one is a name and the
> other is a piece of data.
> The language can do whatever it likes with the names
> (it can even strip them from the compiled result completely), but it must
> *never* corrupt data.
>
> Py3 has come a long way since the initial Unicode support in Py 2.0, almost
> eight years back. We shouldn't throw all lessons learned away and think we
> can do better.
>
> Stefan
>
> _______________________________________________
> Cython-dev mailing list
> [email protected]
> http://codespeak.net/mailman/listinfo/cython-dev
>

-- 
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594
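
For reference, the Py3 side of the name-versus-data distinction discussed in
this thread can be checked with a short sketch. It is plain Python 3, not
Cython, so it says nothing about how Cython compiles the same source. The
names Obj and foo are made up for illustration. The Python 2.6 behaviour
(where b"abc" is a 'str' and 'bytes' is an alias for 'str') is only mentioned
in a comment, since this snippet is meant to run under Python 3.

```python
# Illustrative only: Obj and foo are hypothetical names, not from Cython.
class Obj:
    attr = 42


def foo():
    # An unprefixed literal and a bytes literal, side by side.
    return "abc", b"abc"


text, data = foo()

# In Python 3 an unprefixed literal is the builtin (unicode) str type,
# while b"abc" is the separate bytes type. (In Python 2.6 both literals
# would be the byte string 'str' type.)
text_is_str = type(text) is str
data_is_bytes = type(data) is bytes

# Identifier names are text: the function's name is a str, never bytes.
name_is_str = type(foo.__name__) is str

# Attribute lookup by a text name works ...
value = getattr(Obj(), "attr")

# ... while a byte string is data, not a name, and is rejected.
try:
    getattr(Obj(), b"attr")
except TypeError:
    bytes_name_rejected = True
else:
    bytes_name_rejected = False

# Data only becomes text through an explicit decode step.
decoded = data.decode("ascii")
```

Each flag above records one of the points argued in the thread: names are
always text in Py3, and byte data has to be decoded explicitly before it can
be used as text.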
