Tom Plunket wrote:
> I am building a file with the help of the struct module.
>
> I would like to be able to put Unicode strings into this file, but I'm
> not sure how to do it.
>
> The format I'm trying to write is basically this C structure:
>
> struct MyFile
> {
>     int magic;
>     int flags;
>     short otherFlags;
>     char pad[22];
>
>     wchar_t line1[32];
>     wchar_t line2[32];
>
>     // ... other data which is easy. :)
> };
>
> (I'm writing data on a PC to be read on a big-endian machine.)
>
> So I can write the four leading members with the output of
> struct.pack('>IIH22x', magic, flags, otherFlags). Unfortunately I
> can't figure out how to write the unicode strings, since:
>
>     message = unicode('Hello, world')
>     myFile.write(message)
>
> results in 'message' being converted back to a string before being
> written. Is the way to do this to do something hideous like this:
>
>     for c in message:
>         myFile.write(struct.pack('>H', ord(unicode(c))))
>
> ?

I'd suggest UTF-encoding it as a string, using the encoding that matches
whatever wchar_t means on the target machine. For example, assuming
big-endian and sizeof(wchar_t) == 2:

    utf_line1 = unicode_line1.encode('utf_16_be')
    etc
    struct.pack(">.........64s64s", ......, utf_line1, utf_line2)

This presumes that (1) you have already checked that you don't have more
than 32 characters in each "line", and (2) padding with unichr(0) is
acceptable. The 's' format pads short strings with null bytes, which is
the same thing as trailing unichr(0) characters in UTF-16.

HTH,
John

-- 
http://mail.python.org/mailman/listinfo/python-list
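
For reference, here is a minimal sketch of the suggestion above, filled in for the header layout from the question. The helper names (encode_line, pack_header) and the sample values are made up for illustration, and it assumes sizeof(wchar_t) == 2 on the big-endian target. It should run unchanged on Python 2.7 and Python 3.

```python
import struct

WCHARS_PER_LINE = 32                   # from wchar_t line1[32] in the C struct
BYTES_PER_LINE = WCHARS_PER_LINE * 2   # assumes sizeof(wchar_t) == 2 on the target


def encode_line(text):
    """Encode text as big-endian UTF-16 and check that it fits in one line."""
    data = text.encode('utf_16_be')
    # Checking the byte length also catches characters outside the BMP,
    # which take two 16-bit code units each.
    if len(data) > BYTES_PER_LINE:
        raise ValueError('line needs %d bytes, only %d available'
                         % (len(data), BYTES_PER_LINE))
    return data


def pack_header(magic, flags, other_flags, line1, line2):
    """Pack the leading members plus the two fixed-size wide strings."""
    # '>' selects big-endian with no alignment padding; '22x' writes the
    # pad bytes; '64s' pads a short string with NUL bytes on the right.
    return struct.pack('>IIH22x64s64s', magic, flags, other_flags,
                       encode_line(line1), encode_line(line2))


# Hypothetical sample values, only to show the call.
header = pack_header(0x12345678, 0, 1, u'Hello, world', u'')
```

Note that a line of exactly 32 characters fills the field completely and leaves no terminating NUL, so whether that is acceptable depends on how the reader on the target machine treats the arrays.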