I have committed the changes toward a 32-bit internal representation of Unicode and removed the lock from the src directory. These changes cover only the main module XP, win32 and gtk code and the wordperfect importer. I will leave the other platforms and plugins for others to do; see the notes below.
I have been able to do only very limited testing, and this transition is bound to introduce bugs, since, among other things, the code in places uses UT_uint16 instead of UT_UCSChar and hardcodes the character size to 2 instead of using sizeof(UT_UCSChar). Hopefully I have got rid of most of these.

Summary of the changes
-------------------------------------

There are three new types now: UT_UCS4Char, UT_UCS2Char and UT_GrowBufElement. There is a new string class, UT_UCS4String, and new sets of UT_UCS4_ and UT_UCS2_ string functions replacing the UT_UCS_ functions. All internal Unicode processing should be done using the UT_UCS4Char type and the UT_UCS4_ functions.

I have left the UT_UCSChar type in place for the time being, as an equivalent of the new UT_UCS4Char type. This is a temporary measure meant to make the transition easier; once we are done, we will do a global replace and remove UT_UCSChar from the ut_type.h file. Consequently, all new code should use only UT_UCS4Char.

Notes on transferring the remaining code:

(1) Replace any UT_UCS_ calls with UT_UCS4_ or UT_UCS2_ as appropriate; replace any UT_UCS2String instances with UT_UCS4String, where appropriate. Outside of the impexp code, the input methods and the platform-specific text drawing calls, this can be done blindly; in these special cases more care is needed.

(2) Make sure that the platform's input methods and drawing methods still work, and make any changes where necessary.

(3) In the unlikely case that the platform-specific code makes use of the UT_GrowBuf class, add explicit casts using UT_UCS4Char and UT_GrowBufElement as needed.

Tomas
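
For anyone porting the remaining code, here is a minimal sketch of the hardcoded-size problem mentioned above. It is an illustration only and is not taken from the tree: the typedefs are local stand-ins for the real definitions, and copyCharsHardcoded and copyCharsPortable are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Stand-in typedefs for illustration only (assumption: the real ones
// live in the project's own type header and may be defined differently).
typedef std::uint32_t UT_UCS4Char;
typedef UT_UCS4Char   UT_UCSChar;   // temporary alias during the transition

// Hypothetical helper showing the fragile pattern: the byte count
// assumes every character is 2 bytes wide.
inline void copyCharsHardcoded(UT_UCSChar* dest, const UT_UCSChar* src,
                               std::size_t count)
{
    assert(dest != nullptr && src != nullptr);
    // With a 32-bit UT_UCSChar this copies only half of the data.
    std::memcpy(dest, src, count * 2);
}

// Hypothetical helper showing the portable pattern: the byte count
// follows whatever width the character type currently has.
inline void copyCharsPortable(UT_UCSChar* dest, const UT_UCSChar* src,
                              std::size_t count)
{
    assert(dest != nullptr && src != nullptr);
    std::memcpy(dest, src, count * sizeof(UT_UCSChar));
}
```

With a 4-byte character type, the hardcoded version silently truncates a four-character buffer after two characters, which is the kind of bug to look for when grepping for a literal 2 next to UT_UCSChar or UT_uint16.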
