I tried the code below with Python 2.x. For a given str or unicode object, it copies the bytes in memory (char*) to a list of 1-character strings. I'm getting:

  "hello" = ['h', 'e', 'l', 'l', 'o']
  u"hello" = ['h', '\x00', 'e', '\x00', 'l', '\x00', 'l', '\x00', 'o', '\x00']
  U"hello" = ['h', '\x00', 'e', '\x00', 'l', '\x00', 'l', '\x00', 'o', '\x00']

on platforms with sizeof(PY_UNICODE_TYPE) = 2 and

  "hello" = ['h', 'e', 'l', 'l', 'o']
  u"hello" = ['h', '\x00', '\x00', '\x00', 'e', '\x00', '\x00', '\x00', 'l', '\x00', '\x00', '\x00', 'l', '\x00', '\x00', '\x00', 'o', '\x00', '\x00', '\x00']
  U"hello" = ['h', '\x00', '\x00', '\x00', 'e', '\x00', '\x00', '\x00', 'l', '\x00', '\x00', '\x00', 'l', '\x00', '\x00', '\x00', 'o', '\x00', '\x00', '\x00']

on platforms with sizeof(PY_UNICODE_TYPE) = 4.

Will the results be different using Python 3?
I have quite a few C++ functions with const char* arguments, expecting
one byte per character.

> - convert char* and std::string to/from Python 3 unicode string.

How would this work exactly? Is the plan to copy the unicode data to a
temporary one-byte-per-character buffer?

----- Original Message ----
From: Stefan Seefeld <seef...@sympatico.ca>
To: Development of Python/C++ integration <cplusplus-sig@python.org>
Sent: Wednesday, March 18, 2009 11:18:03 AM
Subject: Re: [C++-sig] Some thoughts on py3k support

Haoyu Bai wrote:
>
> Yes of course we should allow users to set policy. So the problem is
> what the default behavior should be when there is no policy set by
> user explicitly. The candidates are:
>
> - raise an error
> - convert char* and std::string to/from Python bytes
> - convert char* and std::string to/from Python 3 unicode string.
>
> I personally like the last one because it would keep most of the
> existing code compatible.
>

I agree.
Thanks,
        Stefan


#include <boost/python.hpp>
#include <stdexcept>
#include <string>

// Python 2.x only: returns the raw in-memory bytes of a str or unicode
// object as a list of 1-character strings.
boost::python::list
str_or_unicode_as_char_list(
  boost::python::object const& O)
{
  PyObject* obj = O.ptr();
  boost::python::ssize_t n;
  const char* c;
  if (PyString_Check(obj)) {
    // str: one byte per character.
    n = PyString_GET_SIZE(obj);
    c = PyString_AS_STRING(obj);
  }
  else if (PyUnicode_Check(obj)) {
    // unicode: size in bytes of the internal Py_UNICODE buffer.
    n = PyUnicode_GET_DATA_SIZE(obj);
    c = PyUnicode_AS_DATA(obj);
  }
  else {
    throw std::invalid_argument("str or unicode object expected.");
  }
  boost::python::list result;
  for(boost::python::ssize_t i=0;i<n;i++) {
    result.append(std::string(c+i, 1u));
  }
  return result;
}
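
To make the question about a temporary one-byte-per-character buffer concrete, here is a minimal sketch in plain C++11, without Python or Boost.Python. The helper names raw_bytes_as_char_list and utf8_from_code_points are hypothetical, and using UTF-8 for the temporary buffer is only an assumption for illustration, not a statement of what Boost.Python does or will do. The first helper mimics the byte dump above for 1-, 2- and 4-byte code units; the second copies 4-byte code points into a char buffer.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: expose the in-memory bytes of a string with any
// code-unit width as a list of 1-character strings (same idea as the
// Python 2 experiment, but on std::basic_string).
template <typename CharT>
std::vector<std::string>
raw_bytes_as_char_list(std::basic_string<CharT> const& s)
{
  const char* c = reinterpret_cast<const char*>(s.data());
  std::size_t n = s.size() * sizeof(CharT);  // size in bytes, not characters
  std::vector<std::string> result;
  result.reserve(n);
  for (std::size_t i = 0; i < n; i++) {
    result.push_back(std::string(c + i, 1u));
  }
  return result;
}

// Hypothetical helper: copy code points into a temporary char buffer,
// encoded as UTF-8 (an assumed choice of encoding). No validation of
// surrogates or out-of-range values is done in this sketch.
std::string
utf8_from_code_points(std::u32string const& s)
{
  std::string out;
  for (char32_t cp : s) {
    if (cp < 0x80) {
      // ASCII: exactly one byte per character.
      out += static_cast<char>(cp);
    }
    else if (cp < 0x800) {
      out += static_cast<char>(0xC0 | (cp >> 6));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else if (cp < 0x10000) {
      out += static_cast<char>(0xE0 | (cp >> 12));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else {
      out += static_cast<char>(0xF0 | (cp >> 18));
      out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
  }
  return out;
}
```

With such a conversion, ASCII text like "hello" arrives as one byte per character, so existing const char* functions would see what they expect. Non-ASCII characters become multi-byte sequences (U+00E9 becomes two bytes), so code that assumes one byte per character would still need care for those inputs.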