> Hi,
>
> I am using ctypes to get and set data, some of which is unicode. I am
> looking for a clean way to encode and decode basestrings.
> The code below illustrates the problem.
>
> import ctypes
>
> s = u'\u0627\u0644\u0633\u0644\u0627\u0645 \u0639\u0644\u064a\u0643\u0645'
> good = ctypes.c_char_p(s.encode("utf-8"))
> bad = ctypes.c_char_p(s)
> print good, bad
> # prints:
> # c_char_p('\xd8\xa7\xd9\x84\xd8\xb3\xd9\x84\xd8\xa7\xd9\x85 \xd8\xb9\xd9\x84\xd9\x8a\xd9\x83\xd9\x85')
> # c_char_p('?????? ?????')
>
> I find it ugly to encode and decode the strings everywhere in my code.
> Moreover, the strings are usually contained in dictionaries, which would
> make it even uglier and more cluttered.
> So I wrote a @transcode decorator:
> http://pastecode.org/index.php/view/29608996 ... only to discover that,
> brrrrrr, this is so complicated! (It works, though.)
> Is there a simpler solution?
Hmmm, I simply used c_wchar_p instead of c_char_p, and that seems to work.
I thought the C prototype "const char *s" corresponded to c_char_p only
(c_wchar_p corresponds to a NUL-terminated wchar_t *, see
http://docs.python.org/2/library/ctypes.html). Weird.
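
A possible explanation, offered as a guess rather than a verdict: creating a
c_wchar_p object always succeeds for a unicode string, but that says nothing
about what the C function on the other end reads. A function declared with
"const char *" expects bytes, while c_wchar_p points at wchar_t elements. The
sketch below only compares the two buffer layouts; it calls no C function, and
the names utf8_buf and wide_buf are made up for the illustration.

```python
from __future__ import print_function  # lets the sketch run on Python 2 and 3
import ctypes

s = u'\u0627\u0644\u0633\u0644\u0627\u0645'

# What a "const char *" parameter needs: UTF-8 encoded bytes plus a NUL.
utf8_buf = ctypes.create_string_buffer(s.encode("utf-8"))

# What c_wchar_p points at: wchar_t elements plus a wide NUL.
wide_buf = ctypes.create_unicode_buffer(s)

# Six Arabic letters take two bytes each in UTF-8, plus one NUL byte.
print(ctypes.sizeof(utf8_buf))  # 13

# Six wchar_t elements plus the terminator; the byte size is platform dependent.
print(ctypes.sizeof(wide_buf) // ctypes.sizeof(ctypes.c_wchar))  # 7

# Both buffers round-trip back to the original unicode string.
print(utf8_buf.value.decode("utf-8") == s, wide_buf.value == s)
```

So if the library really takes "const char *", I would still expect to need
the encoded bytes, even though the c_wchar_p object itself looks fine from
the Python side.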
<START cognitive_dissonance_reduction>
Well, at least I learnt something from that juicy decorator code I wrote:
http://pastecode.org/index.php/view/29608996
</END cognitive_dissonance_reduction> ;-)

import ctypes

s = u'\u0627\u0644\u0633\u0644\u0627\u0645'
v = ctypes.c_wchar_p(s)
print v  # prints c_wchar_p(u'\u0627\u0644\u0633\u0644\u0627\u0645')
print repr(v.value)  # prints u'\u0627\u0644\u0633\u0644\u0627\u0645'
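
For the case where the C side does want "const char *" and the strings live in
dictionaries, something simpler than a decorator might be a pair of small
recursive helpers. This is only a sketch: encode_text and decode_text are
hypothetical names, UTF-8 is an assumption about what the library expects, and
only dict values (not keys) are converted.

```python
from __future__ import print_function  # lets the sketch run on Python 2 and 3
import ctypes

TEXT = type(u"")   # unicode on Python 2, str on Python 3
BYTES = type(b"")  # str on Python 2, bytes on Python 3


def encode_text(obj, encoding="utf-8"):
    """Return a copy of obj with every unicode string encoded to bytes."""
    if isinstance(obj, TEXT):
        return obj.encode(encoding)
    if isinstance(obj, dict):
        return dict((k, encode_text(v, encoding)) for k, v in obj.items())
    if isinstance(obj, (list, tuple)):
        return type(obj)(encode_text(item, encoding) for item in obj)
    return obj  # numbers, None, etc. pass through untouched


def decode_text(obj, encoding="utf-8"):
    """Inverse of encode_text. On Python 2 this decodes every plain str."""
    if isinstance(obj, BYTES):
        return obj.decode(encoding)
    if isinstance(obj, dict):
        return dict((k, decode_text(v, encoding)) for k, v in obj.items())
    if isinstance(obj, (list, tuple)):
        return type(obj)(decode_text(item, encoding) for item in obj)
    return obj


record = {"greeting": u'\u0627\u0644\u0633\u0644\u0627\u0645', "count": 3}
encoded = encode_text(record)

# The encoded value can go straight into c_char_p.
p = ctypes.c_char_p(encoded["greeting"])
print(decode_text(p.value) == record["greeting"])  # True
print(decode_text(encoded) == record)              # True
```

The encoding and decoding then happen once, at the boundary with the C
library, instead of at every place a string is used.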