That the same character is found in both encodings is no surprise. You need to 
look at the actual sequence of bytes.

Comparing files containing just the "capital A with diaeresis" (Ä) yields

A 1-byte sequence 0xC4 in ANSI
A 2-byte sequence 0xC384 in en_US.UTF8 on a RH5 Linux system
A 3-byte sequence 0xFFFEC4 when converting 0xC4 to UTF-8 in UltraEdit

If you store the single byte 0xC4 then SQLite will retrieve the single byte 
0xC4. If you change the representation layer to expect 0xFFFEC4 or 0xC384 then 
you will be disappointed.
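
As a minimal sketch of the point about byte sequences, the following Python snippet encodes the same character in Latin-1 and in UTF-8 and shows that the two byte sequences are not interchangeable. The variable names are illustrative only, and Python's "latin-1" codec is assumed here as a stand-in for "ANSI", which on Windows usually means code page 1252 (both agree for this character).

```python
# U+00C4, LATIN CAPITAL LETTER A WITH DIAERESIS
ch = "\u00c4"

latin1_bytes = ch.encode("latin-1")  # one byte: C4
utf8_bytes = ch.encode("utf-8")      # two bytes: C3 84

print(latin1_bytes.hex(), utf8_bytes.hex())

# The lone Latin-1 byte 0xC4 is not a valid UTF-8 sequence:
# 0xC4 announces a two-byte sequence, but no continuation byte follows.
try:
    latin1_bytes.decode("utf-8")
    decodes_as_utf8 = True
except UnicodeDecodeError:
    decodes_as_utf8 = False

# Reading the UTF-8 bytes as Latin-1 does not fail, but it yields two
# wrong characters (U+00C3 and U+0084) instead of the original one.
mojibake = utf8_bytes.decode("latin-1")

print(decodes_as_utf8, len(mojibake))
```

The stored bytes never change in either case; only the label used to interpret them does.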

If you put a cat into a box labeled "cat" and then change the label to "dog", 
will that change what is inside? If you sell the box, will the buyer not 
complain?

-----Original Message-----
From: sqlite-users-boun...@mailinglists.sqlite.org 
[mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Wang, Wei
Sent: Wednesday, June 08, 2016 03:49
To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Latin-1 characters cannot be supported for Unicode

Thanks for your reply! But I found that the Latin-1 characters are listed in 
the Unicode chart: http://unicode.org/charts/PDF/U0080.pdf


Best Regards,
Wang Wei

-----Original Message-----
From: sqlite-users-boun...@mailinglists.sqlite.org 
[mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Igor 
Tandetnik
Sent: Tuesday, June 07, 2016 10:20 PM
To: sqlite-users@mailinglists.sqlite.org
Subject: Re: [sqlite] Latin-1 characters cannot be supported for Unicode

On 6/7/2016 3:43 AM, Wang, Wei wrote:
> I ran into a problem that may be caused by the encoding of SQLite. I inserted 
> an item that includes some Latin-1 characters, such as Ç and Ã, into a table. 
> Then I opened the database with SQLite Developer. After I set the encoding to 
> ANSI, the display and the query result for that table were OK.
> However, after I set the encoding to Unicode, these Latin-1 characters 
> could not be displayed correctly and could not be found by queries. Please 
> see the attached pictures for the details.

A byte sequence containing Latin-1-encoded characters Ç or à is not in fact a 
valid byte sequence in any Unicode encoding - neither UTF-8 nor UTF-16 nor any 
other. If you want Unicode data in your database, then store Unicode data, and 
not ANSI, in your database.
--
Igor Tandetnik
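
The advice to store Unicode data rather than ANSI bytes could be sketched as follows, using Python's standard sqlite3 module. This is only an illustration under stated assumptions: the table name `item`, the column `name`, and the `legacy` input bytes are hypothetical, and the legacy data is assumed to be Latin-1. The idea is to decode the legacy bytes once, at the boundary, and hand SQLite a proper text value.

```python
import sqlite3

# Hypothetical legacy input: the characters Ç and Ã as Latin-1 bytes.
legacy = b"\xc7\xc3"

# Decode with the encoding the bytes were actually written in.
text = legacy.decode("latin-1")

con = sqlite3.connect(":memory:")  # a new database defaults to UTF-8
con.execute("CREATE TABLE item (name TEXT)")

# A Python str bound as a parameter is stored as Unicode text.
con.execute("INSERT INTO item (name) VALUES (?)", (text,))

# hex() exposes the bytes SQLite actually stored for the text value.
stored_hex, roundtrip = con.execute(
    "SELECT hex(name), name FROM item"
).fetchone()

# A query using the same Unicode characters now finds the row.
found = con.execute(
    "SELECT count(*) FROM item WHERE name = ?", ("\u00c7\u00c3",)
).fetchone()[0]

con.close()
print(stored_hex, roundtrip, found)
```

A tool that reads this database as Unicode sees valid UTF-8, so display and queries no longer depend on how the viewer's encoding setting happens to be labeled.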

_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___________________________________________
 Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: h...@scigames.at


