Re: [sqlite] Degree character not displayed correctly.
Nico, Igor,

You're both right to point out that giving SQLite text that is not valid UTF-8 or UTF-16 can produce unexpected results. It is still possible to store such data as BLOBs. Sorry for the confusion.
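The BLOB route mentioned above could look roughly like this. It is only a sketch using Python's standard sqlite3 module (the thread does not say which language or tool is in use); the table name `readings` and the sample record are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, raw BLOB)")

# Hypothetical legacy record: "32°F" in Latin-1, i.e. not valid UTF-8.
legacy = b"32\xb0F"

# Python bytes objects are bound as BLOBs, so SQLite treats them as opaque bytes.
conn.execute("INSERT INTO readings (raw) VALUES (?)", (legacy,))

stored, kind = conn.execute("SELECT raw, typeof(raw) FROM readings").fetchone()

# The application stays responsible for knowing the encoding when it reads back.
decoded = stored.decode("latin-1")
print(kind, stored, decoded)
```

The trade-off is that text functions and text comparisons in SQL no longer apply to that column; the bytes only become text again in the application.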
Re: [sqlite] Degree character not displayed correctly.
On Mon, Oct 26, 2009 at 10:01:43AM -0700, Roger Binns wrote:
> Jean-Christophe Deschamps wrote:
> > First decide or determine what is (or shall be) your database
> > encoding. Even if SQLite has no problem storing ANSI (or EBCDIC or
> > anything else) strings untouched,
>
> This isn't particularly good advice. SQLite works solely in Unicode. When
> you supply text it *must* be in either UTF8 or UTF16 according to the API
> being used. Sometimes it will appear that you can get by using a different
> encoding but that is just luck and things that operate on text will fail.
> The actual encoding used by the database is pretty much irrelevant and other
> than the pragma you can't even tell what it is nor would you care.

Indeed. IIRC SQLite3 is actually 8-bit clean, but that doesn't matter: by stating that it uses UTF-8 (and UTF-16) the SQLite3 developers are actually allowing themselves the freedom to do a variety of things that would break your application if you used non-UTF-8 (and non-UTF-16) text.

For example, the SQLite3 developers might add code to reject strings with invalid UTF-8 sequences, or strings which use unassigned code points (unassigned in the version of Unicode supported by the SQLite3 that you are running). Or they might add support for case-insensitive matching for non-ASCII text. Or they might add normalization-insensitive matching. And so on.

Nico
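One way to stay on the safe side of the scenario described above is to check input before it ever reaches the database. This is a sketch in Python, not anything SQLite itself provides; the helper name `is_valid_utf8` is made up for illustration.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes as strict UTF-8, False otherwise."""
    try:
        raw.decode("utf-8")  # strict error handling is the default
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical sample records: the same string in UTF-8 and in Latin-1.
as_utf8 = b"32\xc2\xb0F"
as_latin1 = b"32\xb0F"

print(is_valid_utf8(as_utf8), is_valid_utf8(as_latin1))
```

A lone 0xB0 byte is a UTF-8 continuation byte with no lead byte, which is why the Latin-1 form fails the check.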
Re: [sqlite] Degree character not displayed correctly.
Jean-Christophe Deschamps wrote:
> First decide or determine what is (or shall be) your database
> encoding. Even if SQLite has no problem storing ANSI (or EBCDIC or
> anything else) strings untouched,

This isn't particularly good advice. SQLite works solely in Unicode. When you supply text it *must* be in either UTF8 or UTF16 according to the API being used. Sometimes it will appear that you can get by using a different encoding but that is just luck and things that operate on text will fail. The actual encoding used by the database is pretty much irrelevant and other than the pragma you can't even tell what it is nor would you care.

Roger
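In practice, "supply text in the encoding the API expects" means decoding legacy bytes into real Unicode before binding them. The sketch below uses Python's standard sqlite3 module, which passes `str` parameters to SQLite as UTF-8. The table name `notes` and the sample record are hypothetical, and Latin-1 is an assumption about the legacy data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (body TEXT)")

legacy = b"32\xb0F"              # hypothetical Latin-1 input
text = legacy.decode("latin-1")  # now a real Unicode string

# str parameters are handed to SQLite as UTF-8 text.
conn.execute("INSERT INTO notes (body) VALUES (?)", (text,))

# Functions that operate on text now see well-formed input.
char_count, utf8_hex, matches = conn.execute(
    "SELECT length(body), hex(body), body LIKE '%\u00b0F' FROM notes"
).fetchone()
print(char_count, utf8_hex, matches)
```

With well-formed text, `length()` counts four characters and the stored bytes are the UTF-8 form of the string.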
Re: [sqlite] Degree character not displayed correctly.
On Mon, 26 Oct 2009 16:06:33 +0100 Jean-Christophe Deschamps wrote:
> Ted,
>
> >I didn't insert it. I 'inherited' it from a (mercifully nameless)
> >predecessor.
> >I want to put this data into a database to make it easily accessible
>
> I'm no SQLite guru (really NO), but here is my 2 cent advice.
>
> First decide or determine what is (or shall be) your database
> encoding. Even if SQLite has no problem storing ANSI (or EBCDIC or
> anything else) strings untouched, I would strongly recommend you
> select either UTF-8 or UTF-16 if your situation doesn't impose
> something else. This way your stored data is guaranteed to display
> independently of the user's codepage (if applicable). This choice
> is to be made at database creation and can't be changed, short of
> dumping the base and re-loading it into a fresh one using another
> (internal) encoding.
>
> Independently of the selected Unicode internal encoding, you can use
> either of the two UTF interfaces to SQLite: the xxx or the xxx16
> functions. But of course, supply data encoded consistently with the
> functions you invoke.
>
> As I understand it, your data is not yet stored in the base. When/if
> this is the case, use whatever transcoding tool you find handy to
> re-encode your data before pushing it into the SQLite base, if needed.
>
> For instance, the 'degree symbol' is {0xB0} in ANSI (Latin1 codepage),
> {0xC2 0xB0} in UTF-8 and {0x00B0} in UTF-16. But even if the
> ANSI (Latin1) characters between 0x80 and 0xFF map to the corresponding
> Unicode codepoints, beware that they need to be UTF-8 encoded if you
> want them to display correctly using a UTF-8 tool.
>
> OTOH you can as well choose to store ANSI (for instance) data, but
> you need to retrieve/display back data using the same encoding. The
> catch is that non-Unicode (e.g. ANSI) tools are fading away, even in
> the Win* world.
>
> J-C

I really appreciate your insights.

I did a 'pragma encoding;' and SQLite3 returned 'UTF-8'. Perfect. If I had enough disk space, I'd have gone for UTF-16. But this is good.

Apparently SQLite3 stores the characters correctly. (I just verified this with TextPad.) Now it's just a representational (DOS?) issue.

Bit by bit, we're getting there!

Thanks.

Ted
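The same check can be scripted. This is a minimal sketch with Python's standard sqlite3 module against throwaway in-memory databases; Ted used an interactive tool, so the script, the table name `notes` and the choice of 'UTF-16le' are illustrative assumptions.

```python
import sqlite3

# A fresh database reports UTF-8 unless another encoding was chosen at creation.
conn = sqlite3.connect(":memory:")
default_encoding = conn.execute("PRAGMA encoding").fetchone()[0]

# The encoding has to be set before the database has any content.
conn16 = sqlite3.connect(":memory:")
conn16.execute("PRAGMA encoding = 'UTF-16le'")
conn16.execute("CREATE TABLE notes (body TEXT)")
utf16_encoding = conn16.execute("PRAGMA encoding").fetchone()[0]

# Text round-trips the same way whichever internal encoding is in use.
conn16.execute("INSERT INTO notes (body) VALUES (?)", ("32\u00b0F",))
round_trip = conn16.execute("SELECT body FROM notes").fetchone()[0]

print(default_encoding, utf16_encoding, round_trip)
```

The internal encoding affects on-disk size, not what the application sees, so a display problem like the one described has to be looked for on the output side.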
Re: [sqlite] Degree character not displayed correctly.
> I didn't insert it. I 'inherited' it from a (mercifully nameless)
> predecessor.
> I want to put this data into a database to make it easily accessible

I don't understand that. "Put data into a database" == "Insert data" (read: into a database). So either you inserted it (== want to put it into ...) or you did not insert it (== you already have it in the database and didn't insert it).

And regarding my other unanswered questions: SQLite doesn't display data by itself. You either retrieve it in your program and display it in your program, or you use the sqlite3 command line tool to retrieve and display it.

And now pay attention: when you insert data into the database you can do it in whatever encoding you like. SQLite doesn't care, doesn't check and doesn't complain if something is incorrectly encoded. When you retrieve data in your program you can also do it in whatever encoding you like. SQLite doesn't care. But if you retrieve data using the sqlite3 command line tool, it does care: it assumes that your data is in the correct database encoding (UTF-8 in most cases by default) and decodes it accordingly when it tries to display it. So you can get a problem here.

Pavel

On Mon, Oct 26, 2009 at 10:28 AM, Ted Rolle wrote:
> On Mon, 26 Oct 2009 10:12:03 -0400
> Pavel Ivanov wrote:
>
>> How do you insert it? How do you retrieve it? How do you display it?
>> I bet the problem is in the first question, not in the last one.
>>
>> Pavel
>
> I didn't insert it. I 'inherited' it from a (mercifully nameless)
> predecessor.
> I want to put this data into a database to make it easily accessible
>
> Ted
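The "doesn't check on the way in, but a reader may choke on the way out" behaviour can be reproduced in a few lines. This is a sketch with Python's standard sqlite3 module standing in for a UTF-8-assuming reader; it is not the sqlite3 command line tool, and the table name `notes` and the sample record are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (body TEXT)")

# Hypothetical legacy record: Latin-1 bytes, which are not valid UTF-8.
legacy = b"32\xb0F"

# CAST relabels the bytes as TEXT; SQLite stores them without validating.
conn.execute("INSERT INTO notes (body) VALUES (CAST(? AS TEXT))", (legacy,))
stored_type = conn.execute("SELECT typeof(body) FROM notes").fetchone()[0]

# A reader that assumes UTF-8 (Python's default text handling) now fails.
try:
    conn.execute("SELECT body FROM notes").fetchone()
    decode_failed = False
except (sqlite3.Error, UnicodeDecodeError):
    decode_failed = True

# A reader that asks for raw bytes gets back exactly what was stored.
conn.text_factory = bytes
raw_back = conn.execute("SELECT body FROM notes").fetchone()[0]

print(stored_type, decode_failed, raw_back)
```

So the data is intact, but only a consumer that knows the real encoding can display it correctly.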
Re: [sqlite] Degree character not displayed correctly.
Ted,

>I didn't insert it. I 'inherited' it from a (mercifully nameless)
>predecessor.
>I want to put this data into a database to make it easily accessible

I'm no SQLite guru (really NO), but here is my 2 cent advice.

First decide or determine what is (or shall be) your database encoding. Even if SQLite has no problem storing ANSI (or EBCDIC or anything else) strings untouched, I would strongly recommend you select either UTF-8 or UTF-16 if your situation doesn't impose something else. This way your stored data is guaranteed to display independently of the user's codepage (if applicable). This choice is to be made at database creation and can't be changed, short of dumping the base and re-loading it into a fresh one using another (internal) encoding.

Independently of the selected Unicode internal encoding, you can use either of the two UTF interfaces to SQLite: the xxx or the xxx16 functions. But of course, supply data encoded consistently with the functions you invoke.

As I understand it, your data is not yet stored in the base. When/if this is the case, use whatever transcoding tool you find handy to re-encode your data before pushing it into the SQLite base, if needed.

For instance, the 'degree symbol' is {0xB0} in ANSI (Latin1 codepage), {0xC2 0xB0} in UTF-8 and {0x00B0} in UTF-16. But even if the ANSI (Latin1) characters between 0x80 and 0xFF map to the corresponding Unicode codepoints, beware that they need to be UTF-8 encoded if you want them to display correctly using a UTF-8 tool.

OTOH you can as well choose to store ANSI (for instance) data, but you need to retrieve/display back data using the same encoding. The catch is that non-Unicode (e.g. ANSI) tools are fading away, even in the Win* world.

J-C
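The byte values quoted above are easy to verify, and the same few lines double as a small transcoding tool. This is a sketch in Python; the helper name `transcode_latin1_to_utf8` and the sample record are made up, and it assumes the legacy data really is Latin-1 (on Windows, 'ANSI' often means code page 1252, where 0xB0 is also the degree sign).

```python
# The degree sign is Unicode code point U+00B0.
degree = "\u00b0"

latin1_bytes = degree.encode("latin-1")   # one byte
utf8_bytes = degree.encode("utf-8")       # two bytes
utf16_bytes = degree.encode("utf-16-be")  # one 16-bit unit, big-endian, no BOM


def transcode_latin1_to_utf8(raw: bytes) -> bytes:
    """Re-encode Latin-1 bytes as UTF-8 (assumes the input really is Latin-1)."""
    return raw.decode("latin-1").encode("utf-8")


legacy = b"32\xb0F"  # hypothetical legacy record
converted = transcode_latin1_to_utf8(legacy)

print(latin1_bytes.hex(), utf8_bytes.hex(), utf16_bytes.hex(), converted)
```

Every Latin-1 byte decodes to some character, so the helper never raises; it cannot detect input that was not Latin-1 in the first place.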
Re: [sqlite] Degree character not displayed correctly.
On Mon, 26 Oct 2009 10:12:03 -0400 Pavel Ivanov wrote:
> How do you insert it? How do you retrieve it? How do you display it?
> I bet the problem is in the first question, not in the last one.
>
> Pavel

I didn't insert it. I 'inherited' it from a (mercifully nameless) predecessor.
I want to put this data into a database to make it easily accessible.

Ted
Re: [sqlite] Degree character not displayed correctly.
How do you insert it? How do you retrieve it? How do you display it?
I bet the problem is in the first question, not in the last one.

Pavel

On Mon, Oct 26, 2009 at 10:04 AM, Ted Rolle wrote:
> How can I get the degree character ° (0xB0) (as in 32 degrees Fahrenheit)
> to display correctly?
> My text editors (Vim and TextPad) and Claws-mailer display this
> character correctly.
> TextPad uses the 'ANSI' character set.
>
> Ted
[sqlite] Degree character not displayed correctly.
How can I get the degree character ° (0xB0) (as in 32 degrees Fahrenheit) to display correctly?
My text editors (Vim and TextPad) and Claws-mailer display this character correctly.
TextPad uses the 'ANSI' character set.

Ted