On Thu, 2003-10-23 at 13:31, Wayne Venables wrote:
> Unfortunately that still means there is a performance hit converting all
> data in and out of the library from UTF-8 to UCS16. A large number of
> operating systems and programming languages store strings natively as UCS16.

If you're actually writing a portable application, you'll be happy to know that your last statement may not be true: most operating systems and programming languages do NOT store strings "natively as UCS16". I'm aware of very few operating systems OR languages that use a double-byte Unicode encoding as their native character set. Even if you only meant that more than 1% of operating systems AND languages store strings natively as UCS16, you'd still be incorrect.

That said, you're probably not writing a portable application. You can trade space for speed: keep the UTF-8 form in one column for sorting and collating, and store a sqlite_encode_binary'd UCS-16 form in an additional column.

If your catalog is constant but the order of records isn't, consider storing the UCS-16 strings in a constant database, or rebuilding one periodically as a cache (google for CDB for source).

Another option (instead of using sqlite_encode_binary) is to select a code point that never occurs in any text you'll be using (I used one of the user-defined code points) and treat its two octets as a key repeated along the whole string. XOR your string with that key before storing and again after fetching. This keeps null bytes out of the stored value, provided neither octet of the key equals the octet at the same position in any character of your text. You'll still need to keep it in a separate column, as collating and sorting won't work on the XORed form (unless, of course, you don't need those things).
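
For what it's worth, here is a rough sketch of the XOR idea in Python. It is only an illustration under my own assumptions: the key U+F8FF, the big-endian byte order, and the function names are all made up for the example and are not part of SQLite. It refuses to encode text that would still produce a null byte, which happens whenever a character shares an octet with the key at the same position.

```python
# Hypothetical key: U+F8FF from the Private Use Area, assumed never to
# appear in the text being stored. Its two octets repeat along the string.
KEY = b"\xF8\xFF"


def xor_mask(data: bytes, key: bytes = KEY) -> bytes:
    """XOR each octet with the key octet at the same position (mod 2)."""
    return bytes(b ^ key[i % 2] for i, b in enumerate(data))


def encode_for_storage(text: str) -> bytes:
    """Return a null-free masked copy of the big-endian 16-bit form of text."""
    masked = xor_mask(text.encode("utf-16-be"))
    if b"\x00" in masked:
        # Some character has a high octet of 0xF8 or a low octet of 0xFF,
        # so this key is not safe for this text.
        raise ValueError("text collides with the XOR key; choose another key")
    return masked


def decode_from_storage(blob: bytes) -> str:
    """Undo the mask; XOR with the same key is its own inverse."""
    return xor_mask(blob).decode("utf-16-be")


if __name__ == "__main__":
    stored = encode_for_storage("héllo wörld")
    assert b"\x00" not in stored
    assert decode_from_storage(stored) == "héllo wörld"
```

The masked value is only good for fetching quickly; ordering by it is meaningless, which is why the UTF-8 column stays around for sorting and collating.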