Liam Clarke wrote:
> Basically, I have data which is coming straight from struct.unpack()
> and it's an UTF-16 string, and I'm just trying to get my head around
> dealing with the data coming in from struct, and putting my data out
> through struct.
>
> It doesn't help overly that struct considers all strings to consist of
> one byte per char, whereas UTF-16 is two. And I was having trouble as
> to how to write UTF-16 stuff out properly.
>
> But, if I understand it correctly, I could use
>
> j = #some unicode string
> out = j.encode("UTF-16")
> pattern = "%ds" % len(out)
> struct.pack(pattern, out)

Yes, that looks good. Note that you will get a byte-order mark as the
first two bytes. If you don't want that, use utf-16le or utf-16be. The
correct choice depends on what the consumer of the data expects and can
deal with.

>>> 'Hi'.encode('utf-16le')
'H\x00i\x00'
>>> 'Hi'.encode('utf-16be')
'\x00H\x00i'
>>> 'Hi'.encode('utf-16')
'\xff\xfeH\x00i\x00'

Kent
_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
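
For reference, the round trip discussed in this thread could look roughly like the sketch below. It assumes the consumer wants little-endian data without a byte-order mark; the sample string and the variable names are only illustrative and do not come from the original posts.

```python
import struct

text = u"Hi"                     # illustrative sample string (an assumption)
out = text.encode("utf-16le")    # explicit byte order, so no BOM is written
pattern = "%ds" % len(out)       # "4s" here: struct counts bytes, not characters
packed = struct.pack(pattern, out)

# Reading it back: unpack the raw bytes, then decode with the same codec.
(raw,) = struct.unpack(pattern, packed)
restored = raw.decode("utf-16le")
```

The same codec name has to be used on both sides. If the data was written with plain "utf-16", decoding with "utf-16" will consume the byte-order mark for you.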