Jean-Christophe: Thanks for your advice; it was very helpful and illustrative for me.
Fortunately I'm not in the same horror story, thanks to the people on this list, and because I use the old and simple way: the plain SQLite C interface, the plain Win32 C API, and plain Visual C++ (multibyte build in the existing versions, Unicode build for the new international release of the app).

Despite that, I'm aware that I have more than pure US-ASCII in the blob objects. In fact I'm close to your situation, because I used the Spanish language and have 8-bit extended ASCII with some special characters (accented characters and so on).

So the answer to the question is yes: I have upper-ANSI data stored, and I need to convert it to MS VC++ wchar_t strings to rebuild the database. At this point I simply use mbstowcs() to do it.

Because there are some users in the field, the new app can detect whether the database corresponds to a previous version and rebuild it to the new version if needed. So I can be almost sure that the user has the same system code page as the one in use when the data was originally inserted.

Is there anything weird in my thinking?

A.J. Millan

----- Original Message -----
From: "Jean-Christophe Deschamps" <j...@q-e-d.org>
To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org>
Sent: Thursday, October 29, 2009 3:04 PM
Subject: Re: [sqlite] Some clarification needed about Unicode

Hi,

Please follow Igor's advice; he is right.

>[1] Read the actual textual data with sqlite3_column_blob()

Which you can directly convert to TEXT if, as you say, you entered only 7-bit ASCII or UTF-8 compliant data.

>[2] Assuming the system code page matches the one used when the data was
>originally inserted, convert with mbstowcs()

Forget that.

>[3] (Doubt) The result can be directly written with sqlite3_bind_text() -I
>want to store in UTF-8-

Why not convert the column(s) directly from blob to TEXT?

>OR must I write the result with sqlite3_bind_text16()? Then, is the data
>stored as UTF-16? or as UTF-8?

Forget this too.

>[4] Afterward, once the database is converted

I take this to mean "convert your blobs to TEXT": yes, this is the _only_ step you need for the whole thing.

> and in regular use:
>
>[4-1a] Read with sqlite3_column_text()
>
>[4-1b] convert with WideCharToMultiByte(CP_UTF8)

Why? Read it off as text16 directly into your C++ string; there must exist zillions of [working] wrappers for VC++.

>[4-1c] Use the result with Win32 api -SetTex()-

No particular API. Try a simple MsgBox to see for yourself, or even a basic "Héllô wörld!" as a Windows _console_ program (printf).

>OR?
>
>[4-2a] Read with sqlite3_column_text16()
>[4-2b] No conversion needed.
>[4-2c] Use the result ...

YES !!!

The only catch would be if you have (knowingly or not) entered 8-bit data (the upper 128 characters in the Windows code page) encoded in __ANSI__ (1 char = 1 byte) and stored in your blobs. In such a case, the representation of the characters is not the expected UTF-8. As far as I understand it, current SQLite _should_ read/write non-compliant UTF-8 without problems, but I haven't checked whether the latest versions are still byte-neutral. But that is not the right road to take. If you have upper-ANSI strings, convert them to UTF-*, where * is your encoding choice for the new database.

<horror story>
I've spent an indecent amount of time sorting out all these questions, just because someone decided to silently convert native Windows UTF-16 strings to ANSI for every SQLite UTF-8 interface in the SQLite wrapper built into the development tool I use. It was very hard for me to figure this out, because a call similar to printf _also_ converted to ANSI (silently) and a call used to display a 2D table _also_ converted to ANSI (silently). So sometimes I had French and other European diacritics converted (i.e. completely destroyed) like this: UTF-16 --> ANSI --> UTF-16 --> ANSI. It was just as if I had tried to read the original plain text by looking only at its MD5. It's been a _real_ nightmare, and I had data from 15 countries, some using "weird" (to me) scripts.
</horror story>

But you're not even close to this terrible situation. Just determine if you have upper-ANSI data stored and convert it if needed.

Well, stay tuned, I'll do something for you, just allow me some time...

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
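
For illustration only, the "determine if you have upper-ANSI data stored and convert it if needed" step could be sketched in portable C roughly as below. This is not what either poster wrote: the function names (has_high_bytes, is_valid_utf8, latin1_to_utf8) are hypothetical, and the conversion is a hand-rolled stand-in for mbstowcs() or the Win32 code page functions. It assumes the stored bytes are Latin-1, which matches the Windows-1252 code page only for bytes 0xA0-0xFF (enough for Spanish accented letters); bytes 0x80-0x9F differ in Windows-1252 and would need a lookup table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if any byte is outside 7-bit ASCII (>= 0x80), else 0. */
static int has_high_bytes(const unsigned char *s, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (s[i] >= 0x80)
            return 1;
    return 0;
}

/* Simplified structural UTF-8 check: each lead byte must be followed by
 * the right number of continuation bytes. It does not reject every
 * overlong form or surrogate code points. Returns 1 if plausible UTF-8. */
static int is_valid_utf8(const unsigned char *s, size_t n)
{
    size_t i = 0;
    while (i < n) {
        unsigned char c = s[i];
        size_t extra;
        if (c < 0x80)                    extra = 0;
        else if (c >= 0xC2 && c <= 0xDF) extra = 1;
        else if (c >= 0xE0 && c <= 0xEF) extra = 2;
        else if (c >= 0xF0 && c <= 0xF4) extra = 3;
        else return 0;                   /* stray continuation or bad lead */
        if (extra > n - i - 1)
            return 0;                    /* sequence truncated */
        for (size_t k = 1; k <= extra; k++)
            if ((s[i + k] & 0xC0) != 0x80)
                return 0;
        i += extra + 1;
    }
    return 1;
}

/* Converts n Latin-1 bytes (ASSUMED encoding, see above) to a
 * NUL-terminated UTF-8 string in dst, which has room for cap bytes.
 * Returns the number of bytes written, not counting the NUL,
 * or (size_t)-1 if dst is too small. */
static size_t latin1_to_utf8(const unsigned char *src, size_t n,
                             char *dst, size_t cap)
{
    size_t o = 0;
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; i++) {
        unsigned char c = src[i];
        if (c < 0x80) {
            if (o + 1 >= cap) return (size_t)-1;
            dst[o++] = (char)c;
        } else {
            /* U+0080..U+00FF always encode as two UTF-8 bytes. */
            if (o + 2 >= cap) return (size_t)-1;
            dst[o++] = (char)(0xC0 | (c >> 6));
            dst[o++] = (char)(0x80 | (c & 0x3F));
        }
    }
    if (o >= cap) return (size_t)-1;
    dst[o] = '\0';
    return o;
}
```

One way to use this during a rebuild: for each value read with sqlite3_column_blob(), leave it alone if has_high_bytes() reports 0 or is_valid_utf8() reports 1, and otherwise convert it before writing it back with sqlite3_bind_text(). Short ANSI strings can pass the UTF-8 check by accident, so a spot check of real data is still advisable.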