On 07.09.2010 10:27, Stefan Behnel wrote:
> Kay Hayen, 07.09.2010 00:15:
>> I may be misunderstanding people
>> because of my different goal to stay close to CPython.
>
> You may want to keep the level of unnecessary FUD down, especially on this
> list. Cython has very good and seamless support for Unicode and CPython's
> string type semantics.

What you quoted was not a statement about Cython, but about me. I have a lot of doubts about myself. So I may not have understood that the discussion is about "Cython C string literals", because I don't understand what those are supposed to be, or why they should exist technically.

To me at least, if you have "unicode_literals", every literal is unicode unless you say otherwise; no guessing is needed at all.

My proposal is to have a C type that represents the string type, dependent on unicode_literals, and then have the user convert it to UTF-8 if that is what he needs to work with. That conversion is obviously very simple without "unicode_literals", and for a constant (we are even talking about literals) it could be pre-determined in an optimization step by either the C++ compiler or Cython.

This has the benefit of avoiding conversions where they are not needed, and of allowing the same source code to target both the Python2 and Python3 runtimes, with no magic "how it is used" conversions behind the user's back. It has the further benefit of working with string objects as well, in the same way as with Python3.

The drawback is that you would need to cast if you use string objects or literals. But whether that is needed is a question for the user to answer anyway. He may not be calling C functions that want UTF-8, but instead just optimizing some string operations, for which converting to UTF-8 and back would be detrimental. I actually see avoiding unnecessary conversions as another benefit.

So what's wrong with that approach in the first place?

Yours,
Kay
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev
