That the same character is found in both encodings is no surprise. You need to look at the actual sequence of bytes.
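As a rough illustration of looking at the bytes, here is a minimal sketch assuming Python 3 and its bundled sqlite3 module. The table name t and column name v are invented for the example. It prints the byte sequences for the same character in Latin-1 and UTF-8, checks whether the Latin-1 byte decodes as UTF-8, and reads a stored 0xC4 byte back from SQLite without any decoding.

```python
import sqlite3

ch = "\u00c4"  # LATIN CAPITAL LETTER A WITH DIAERESIS

# Same character, two different byte sequences.
latin1 = ch.encode("latin-1")
utf8 = ch.encode("utf-8")
print("Latin-1:", latin1.hex())
print("UTF-8:  ", utf8.hex())

# Check whether the lone Latin-1 byte is acceptable as UTF-8.
try:
    latin1.decode("utf-8")
    valid_as_utf8 = True
except UnicodeDecodeError:
    valid_as_utf8 = False
print("Latin-1 byte valid as UTF-8:", valid_as_utf8)

# Store the raw byte 0xC4 as TEXT and fetch it back undecoded.
# Table "t" and column "v" are hypothetical names for this sketch.
con = sqlite3.connect(":memory:")
con.text_factory = bytes  # return TEXT columns as raw bytes
con.execute("CREATE TABLE t (v TEXT)")
con.execute("INSERT INTO t VALUES (CAST(X'C4' AS TEXT))")
stored = con.execute("SELECT v FROM t").fetchone()[0]
print("Read back from SQLite:", stored.hex())
con.close()
```

Setting text_factory to bytes is only there to keep Python from decoding the column on the way out, so the comparison is between what went in and what came out.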
Comparing a file containing just the "capital A with diaeresis" yields:

  A 1-byte sequence 0xC4 in ANSI
  A 2-byte sequence 0xC384 in en_US.UTF8 on a RH5 Linux system
  A 3-byte sequence 0xFFFEC4 when converting 0xC4 to UTF-8 in UltraEdit

If you store the single byte 0xC4, then SQLite will retrieve the single byte 0xC4. If you change the representation layer to expect 0xFFFEC4 or 0xC384, then you will be disappointed.

If you put a cat into a box labeled "cat" and then change the label to "dog", will that change what is inside? If you sell the box, will the buyer not complain?

-----Original Message-----
From: sqlite-users-boun...@mailinglists.sqlite.org [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Wang, Wei
Sent: Wednesday, June 08, 2016 03:49
To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Latin-1 characters cannot be supported for Unicode

Thanks for your reply! But I found that the Latin-1 encoded characters are listed in the Unicode chart.
http://unicode.org/charts/PDF/U0080.pdf

Best Regards,
Wang Wei

-----Original Message-----
From: sqlite-users-boun...@mailinglists.sqlite.org [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Igor Tandetnik
Sent: Tuesday, June 07, 2016 10:20 PM
To: sqlite-users@mailinglists.sqlite.org
Subject: Re: [sqlite] Latin-1 characters cannot be supported for Unicode

On 6/7/2016 3:43 AM, Wang, Wei wrote:
> I ran into a problem that may be caused by the encoding of SQLite. I inserted
> an item that includes some Latin-1 characters, such as Ç and à, into a table.
> Then I opened the database with SQLite Developer. After setting the encoding
> to ANSI, the display and the query results for that table were OK.
> However, after setting the encoding to Unicode, these Latin-1 characters
> were not displayed correctly and could not be found by queries. Please see
> the attached pictures for the details.

A byte sequence containing Latin-1-encoded characters Ç or à is not in fact a valid byte sequence in any Unicode encoding - neither UTF-8 nor UTF-16 nor any other. If you want Unicode data in your database, then store Unicode data, and not ANSI, in your database.
--
Igor Tandetnik

_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___________________________________________
Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: h...@scigames.at