Dear all SQLite3 users,

Recently I have been working on a dictionary-style project that had to work with UNICODE non-latin1 strings. I tried the ICU project, but I wasn't satisfied with the extra baggage that came with it. I would like to recommend the following possible solution to the long-standing UNICODE issue. It was built as an alternative to ICU (excluding collations) and could easily be included in the SQLite core as default behavior.
http://ioannis.mpsounds.net/blog/?dl=sqlite3_unicode.c

The above file contains character mapping tables for lower(), upper(), title() and fold()*, based on the UNICODE mapping tables as currently described by the UNICODE standard v5.1.0 beta. The functions use these tables to transform characters to their respective cases. (The tables were built with a modified version of Loic Dachary's builder, changed in order to include the required case transformations.)

* UNICODE uses case folding mapping tables to implement case-insensitive comparisons (e.g. LIKE).

The above file utilizes the existing ICU infrastructure built into SQLite in order to activate the extra functionality and automatically:

- override the LIKE operation, to support full UNICODE case-insensitive comparison
- override upper() and lower(), to support case transformation of UNICODE characters based on the UNICODE mapping tables as currently described by the UNICODE standard v5.1.0 beta
- provide title() and fold() functions, also based on the same UNICODE mapping tables
- provide an unaccent() function (based on the unac library designed for Linux by Loic Dachary) to decompose UNICODE characters to their unaccented equivalents, in order to allow simpler queries that return a wider range of results (e.g. ά -> α, æ -> ae; in the latter example the string automatically grows by one code point)

In comparison to ICU, no collation sequences have been implemented yet.

The above functionalities have been designed to be included or excluded independently, according to specific needs, in order to minimize the size of the library. The total overhead on the SQLite library size with all functionality enabled is approximately 70~80KB.

The above file has not been thoroughly tested, but I consider the implementation to be stable. You can leave comments, bug reports and suggestions on this board or at http://ioannis.mpsounds.net/blog/2007/12/19/sqlite-native-unicode-like-support

(PS. I am not an SQLite expert, and I had to improvise to some extent on this matter.)

Thank you very much.
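
For readers who want a feel for the table-driven approach described above, here is a minimal sketch in C. It is not taken from sqlite3_unicode.c: the names demo_fold, demo_equal_nocase and demo_unaccent are hypothetical, the tables are tiny hand-picked excerpts rather than generated from the UNICODE data files, and strings are assumed to be already decoded into arrays of code points.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry of a simple (one-to-one) case folding table. */
typedef struct {
    uint32_t from;  /* code point as it appears in the input */
    uint32_t to;    /* its case-folded form */
} fold_entry;

/* Hand-picked excerpt, sorted by `from` so it can be binary searched.
   A real table would be generated from the UNICODE data files. */
static const fold_entry fold_table[] = {
    { 0x00C6, 0x00E6 },  /* Æ -> æ */
    { 0x0386, 0x03AC },  /* Ά -> ά */
    { 0x0391, 0x03B1 },  /* Α -> α */
    { 0x03A3, 0x03C3 },  /* Σ -> σ */
    { 0x03C2, 0x03C3 },  /* ς -> σ (final sigma folds to sigma) */
};

/* Hypothetical helper: fold a single code point.
   Code points missing from the table are returned unchanged. */
uint32_t demo_fold(uint32_t cp)
{
    if (cp >= 'A' && cp <= 'Z')
        return cp + ('a' - 'A');

    size_t lo = 0;
    size_t hi = sizeof fold_table / sizeof fold_table[0];
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (fold_table[mid].from == cp)
            return fold_table[mid].to;
        if (fold_table[mid].from < cp)
            lo = mid + 1;
        else
            hi = mid;
    }
    return cp;
}

/* Hypothetical helper: case-insensitive equality of two code point
   arrays, comparing the folded form of each position. Returns 1 if
   equal, 0 otherwise. */
int demo_equal_nocase(const uint32_t *a, size_t alen,
                      const uint32_t *b, size_t blen)
{
    if (alen != blen)
        return 0;
    for (size_t i = 0; i < alen; i++) {
        if (demo_fold(a[i]) != demo_fold(b[i]))
            return 0;
    }
    return 1;
}

/* Hypothetical helper: strip accents using two hard-coded mappings
   (ά -> α and æ -> ae). Writes at most `cap` code points to `out` and
   returns the number of code points the full result needs, which can
   exceed `n` because æ expands to two code points. */
size_t demo_unaccent(const uint32_t *in, size_t n,
                     uint32_t *out, size_t cap)
{
    size_t w = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t rep[2] = { in[i], 0 };
        size_t rlen = 1;
        if (in[i] == 0x03AC) {          /* ά */
            rep[0] = 0x03B1;            /* α */
        } else if (in[i] == 0x00E6) {   /* æ */
            rep[0] = 'a';
            rep[1] = 'e';
            rlen = 2;
        }
        for (size_t k = 0; k < rlen; k++) {
            if (w < cap)
                out[w] = rep[k];
            w++;
        }
    }
    return w;
}
```

With these definitions, comparing the code points of "ΣΑ" and "ςα" through demo_equal_nocase reports a match, and passing "æά" to demo_unaccent yields "aeα", one code point longer than the input. A LIKE override would presumably apply the same folding to both the pattern and the text before matching wildcards, but that part is not shown here.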