Re: [sqlite] Degree character not displayed correctly.

2009-10-26 Thread Jean-Christophe Deschamps
Nico, Igor,


You're both right to point out that storing non-UTF-* compliant data as 
text in SQLite can produce unexpected results.  It is still possible to 
store such data as blobs.
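
A minimal sketch of the blob approach, using Python's standard sqlite3 module. The table name, column name and sample value are made up for illustration; nothing here comes from Ted's actual data.

```python
import sqlite3

# Hypothetical legacy value: "32°F" in Latin-1, where the degree sign is the single byte 0xB0.
legacy_bytes = b"32\xb0F"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (raw BLOB)")  # illustrative schema

# A Python bytes object is bound as a BLOB, so SQLite stores it byte for byte
# and never tries to interpret it as UTF-8 or UTF-16.
conn.execute("INSERT INTO readings (raw) VALUES (?)", (legacy_bytes,))

stored, kind = conn.execute("SELECT raw, typeof(raw) FROM readings").fetchone()

# The application, not SQLite, decides how the bytes are to be read.
text = stored.decode("latin-1")
```

The price is that SQL text functions and collations no longer apply to the column; the application has to decode the bytes itself.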

Sorry for the confusion.



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Degree character not displayed correctly.

2009-10-26 Thread Nicolas Williams
On Mon, Oct 26, 2009 at 10:01:43AM -0700, Roger Binns wrote:
> Jean-Christophe Deschamps wrote:
> > First decide or determine what is (or shall be) your database 
> > encoding.  Even if SQLite has no problem storing ANSI (or EBCDIC or 
> > anything else) strings untouched,
> 
> This isn't particularly good advice.  SQLite works solely in Unicode.  When
> you supply text it *must* be in either UTF8 or UTF16 according to the API
> being used.  Sometimes it will appear that you can get by using a different
> encoding but that is just luck and things that operate on text will fail.
> The actual encoding used by the database is pretty much irrelevant and other
> than the pragma you can't even tell what it is nor would you care.

Indeed.  IIRC SQLite3 is actually 8-bit clean, but that doesn't matter:
by stating that it uses UTF-8 (and UTF-16) the SQLite3 developers are
actually allowing themselves the freedom to do a variety of things that
would break your application if you used non-UTF-8 (and non-UTF-16)
text.

For example, the SQLite3 developers might add code to reject strings
with invalid UTF-8 sequences, or strings which use unassigned code
points (unassigned in the version of Unicode supported by the SQLite3
that you are running).  Or they might add support for case-insensitive
matching for non-ASCII text.  Or they might add normalization-
insensitive matching.  And so on.

Nico


Re: [sqlite] Degree character not displayed correctly.

2009-10-26 Thread Roger Binns
Jean-Christophe Deschamps wrote:
> First decide or determine what is (or shall be) your database 
> encoding.  Even if SQLite has no problem storing ANSI (or EBCDIC or 
> anything else) strings untouched,

This isn't particularly good advice.  SQLite works solely in Unicode.  When
you supply text it *must* be in either UTF8 or UTF16 according to the API
being used.  Sometimes it will appear that you can get by using a different
encoding but that is just luck and things that operate on text will fail.
The actual encoding used by the database is pretty much irrelevant and other
than the pragma you can't even tell what it is nor would you care.

Roger


Re: [sqlite] Degree character not displayed correctly.

2009-10-26 Thread Ted Rolle
On Mon, 26 Oct 2009 16:06:33 +0100
Jean-Christophe Deschamps  wrote:

> Ted,
> 
> 
> >I didn't insert it.  I 'inherited' it from a (mercifully nameless)
> >predecessor.
> >I want to put this data into a database to make it easily accessible
> 
> I'm no SQLite guru (really NO), but here is my 2 cent advice.
> 
> First decide or determine what is (or shall be) your database 
> encoding.  Even if SQLite has no problem storing ANSI (or EBCDIC or 
> anything else) strings untouched, I would strongly recommend you
> select either UTF-8 or UTF-16 if your situation doesn't impose
> something else.  This way your stored data is guaranteed to display
> correctly, independent of the user's codepage (if applicable).  This choice
> is to be made at database creation and can't be changed, short of
> dumping the base and re-loading it into a fresh one using another
> (internal) encoding.
> 
> Independently of the selected Unicode internal encoding, you can use 
> either of the two UTF interfaces to SQLite: the xxx or the xxx16 functions.
> But of course, supply data encoded consistently with the functions
> you invoke.
> 
> As I understand it, your data is not yet stored in the base.  When/if 
> this is the case, use whatever transcoding tool you find handy to 
> re-encode your data before pushing it into the SQLite base, if needed.
> 
> For instance, the 'degree symbol' is {0xB0} ANSI (Latin1 codepage),
> is {0xC2 0xB0 } as UTF-8 and {0x00B0} as UTF-16.  But even if the
> ANSI (Latin1) charset between 0x80 and 0xFF map to corresponding
> Unicode codepoints, beware that they need to be UTF-8 encoded if you
> want them to display correctly using a UTF-8 tool.
> 
> OTOH you can as well choose to store ANSI (for instance) data, but
> you need to retrieve/display back data using the same encoding.  The
> catch is that non-Unicode (e.g. ANSI) tools are fading away, even in
> the Win* world.
> 
> J-C
> 
I really appreciate your insights.
I did a 'pragma encoding;' and SQLite3 returned 'UTF-8'.  Perfect.  If
I had enough disk space, I'd have gone for UTF-16.  But this is good.
Apparently SQLite3 stores the characters correctly.  (I just verified
this with TextPad.)  Now it's just a representational (DOS?) issue.
Bit by bit, we're getting there!  Thanks.
Ted
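
If the remaining problem really is the DOS console, the following Python sketch shows what correctly stored UTF-8 bytes look like when a console reads them with an OEM code page. It assumes code page 437, which the thread does not confirm; the actual console code page may differ.

```python
degree = "\u00b0"                    # U+00B0 DEGREE SIGN

utf8_bytes = degree.encode("utf-8")  # the two bytes 0xC2 0xB0, as stored in a UTF-8 database

# A console that assumes code page 437 shows two box/shade characters instead.
shown_in_cp437 = utf8_bytes.decode("cp437")

# A tool that assumes Windows-1252 ('ANSI') shows a stray A-circumflex before the degree sign.
shown_in_cp1252 = utf8_bytes.decode("cp1252")

# In code page 437 the degree sign lives at 0xF8, not at 0xB0.
cp437_degree = degree.encode("cp437")
```

In that case the stored data is fine and only the output needs re-encoding for the console.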


Re: [sqlite] Degree character not displayed correctly.

2009-10-26 Thread Pavel Ivanov
> I didn't insert it.  I 'inherited' it from a (mercifully nameless)
> predecessor.
> I want to put this data into a database to make it easily accessible

I don't understand that. "Put data into a database" == "Insert data"
(read: into a database). So either you insert it (== want to put it into
...) or you don't (== you already have it in the database and didn't
insert it). And regarding my other unanswered questions: SQLite
doesn't display data by itself. You either retrieve it in your program
and display it in your program, or you use the sqlite3 command line tool
to retrieve and display it.

And now pay attention: when you insert data into the database you can do
it in whatever encoding you like - SQLite doesn't care, doesn't check and
doesn't complain if something is incorrectly encoded. When you retrieve
data in your program you can also do it in whatever encoding you like -
SQLite doesn't care. But if you retrieve data using the sqlite3 command
line tool, it does care: it assumes that your data is in the correct
database encoding (UTF-8 by default in most cases) and decodes it
accordingly when it tries to display it. So you can get a problem here.
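
A small Python sketch of that behaviour. Python's sqlite3 module stands in here for "a tool that assumes UTF-8"; it is not the sqlite3 command line tool, and the table and sample bytes are invented for the example. The CAST is only a way to get non-UTF-8 bytes stored with type TEXT from Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (v TEXT)")  # illustrative schema

# Latin-1 bytes for "32°F"; the lone 0xB0 byte is not valid UTF-8.
latin1_bytes = b"32\xb0F"

# CAST(blob AS TEXT) relabels the bytes as text without validating them,
# so SQLite accepts and stores the value without complaint.
conn.execute("INSERT INTO t (v) VALUES (CAST(? AS TEXT))", (latin1_bytes,))
kind = conn.execute("SELECT typeof(v) FROM t").fetchone()[0]

# A reader that assumes UTF-8 (here: the module's default str conversion) trips over it.
try:
    conn.execute("SELECT v FROM t").fetchone()
    decode_failed = False
except (sqlite3.Error, UnicodeDecodeError):
    decode_failed = True

# A reader that asks for raw bytes gets back exactly what was stored.
conn.text_factory = bytes
raw = conn.execute("SELECT v FROM t").fetchone()[0]
```

So the bytes survive the round trip, but only a reader that knows the real encoding can display them.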

Pavel

On Mon, Oct 26, 2009 at 10:28 AM, Ted Rolle  wrote:
> On Mon, 26 Oct 2009 10:12:03 -0400
> Pavel Ivanov  wrote:
>
>> How do you insert it? How do you retrieve it? How do you display it?
>> I bet the problem is in the first question, not in the last one.
>>
>> Pavel
>
> I didn't insert it.  I 'inherited' it from a (mercifully nameless)
> predecessor.
> I want to put this data into a database to make it easily accessible
>
> Ted
>


Re: [sqlite] Degree character not displayed correctly.

2009-10-26 Thread Jean-Christophe Deschamps
Ted,


>I didn't insert it.  I 'inherited' it from a (mercifully nameless)
>predecessor.
>I want to put this data into a database to make it easily accessible

I'm no SQLite guru (really NO), but here is my 2 cent advice.

First decide or determine what is (or shall be) your database 
encoding.  Even if SQLite has no problem storing ANSI (or EBCDIC or 
anything else) strings untouched, I would strongly recommend you select 
either UTF-8 or UTF-16 if your situation doesn't impose something 
else.  This way your stored data is guaranteed to display correctly, 
independent of the user's codepage (if applicable).  This choice is to 
be made at database creation and can't be changed, short of dumping the 
base and re-loading it into a fresh one using another (internal) encoding.
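
As a rough illustration in Python (standard sqlite3 module; the file name and table are arbitrary), the encoding pragma only takes effect while the database is still empty:

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    conn = sqlite3.connect(os.path.join(tmp, "example.db"))  # hypothetical file name

    # Must be issued before anything is written to the new database.
    conn.execute('PRAGMA encoding = "UTF-16le"')
    conn.execute("CREATE TABLE t (v TEXT)")
    conn.commit()
    before = conn.execute("PRAGMA encoding").fetchone()[0]

    # Once the database exists, a later attempt to switch is silently ignored.
    conn.execute('PRAGMA encoding = "UTF-8"')
    after = conn.execute("PRAGMA encoding").fetchone()[0]

    conn.close()
```

Both `before` and `after` report the encoding chosen at creation, which is why a dump and reload is needed to change it.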

Independently of the selected Unicode internal encoding, you can use 
either of the two UTF interfaces to SQLite: the xxx or the xxx16 functions.  But 
of course, supply data encoded consistently with the functions you invoke.

As I understand it, your data is not yet stored in the base.  When/if 
this is the case, use whatever transcoding tool you find handy to 
re-encode your data before pushing it into the SQLite base, if needed.
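
For example, a transcoding step could look like the Python sketch below. It assumes the inherited data is Windows-1252 ('ANSI'), which is a guess based on the TextPad remark; `to_text`, the table and the sample value are hypothetical.

```python
import sqlite3


def to_text(raw: bytes, source_encoding: str = "cp1252") -> str:
    """Decode legacy bytes to a Unicode string; the source encoding is an assumption."""
    return raw.decode(source_encoding)


conn = sqlite3.connect(":memory:")  # a new database defaults to UTF-8
conn.execute("CREATE TABLE readings (label TEXT)")

legacy = b"32\xb0F"  # hypothetical inherited value with an 'ANSI' degree sign

# Python str values are handed to SQLite as UTF-8, so the re-encoding happens here.
conn.execute("INSERT INTO readings (label) VALUES (?)", (to_text(legacy),))

label, stored_hex = conn.execute(
    "SELECT label, hex(CAST(label AS BLOB)) FROM readings"
).fetchone()
```

`stored_hex` shows the degree sign stored as C2 B0, i.e. proper UTF-8, regardless of how the source file was encoded.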

For instance, the 'degree symbol' is {0xB0} ANSI (Latin1 codepage), is 
{0xC2 0xB0 } as UTF-8 and {0x00B0} as UTF-16.  But even if the ANSI 
(Latin1) charset between 0x80 and 0xFF map to corresponding Unicode 
codepoints, beware that they need to be UTF-8 encoded if you want them 
to display correctly using a UTF-8 tool.
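
The byte sequences above can be checked with a few lines of Python (used here only as a convenient calculator; the variable names are arbitrary):

```python
degree = "\u00b0"  # U+00B0 DEGREE SIGN

latin1 = degree.encode("latin-1")      # one byte: 0xB0
utf8 = degree.encode("utf-8")          # two bytes: 0xC2 0xB0
utf16_be = degree.encode("utf-16-be")  # code unit 0x00B0, big-endian byte order
utf16_le = degree.encode("utf-16-le")  # same code unit, little-endian byte order

# The Latin-1 byte on its own is not valid UTF-8, so a UTF-8 tool cannot show it.
try:
    latin1.decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False
```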

OTOH you can as well choose to store ANSI (for instance) data, but you 
need to retrieve/display back data using the same encoding.  The catch 
is that non-Unicode (e.g. ANSI) tools are fading away, even in the Win* 
world.

J-C





Re: [sqlite] Degree character not displayed correctly.

2009-10-26 Thread Ted Rolle
On Mon, 26 Oct 2009 10:12:03 -0400
Pavel Ivanov  wrote:

> How do you insert it? How do you retrieve it? How do you display it?
> I bet the problem is in the first question, not in the last one.
> 
> Pavel

I didn't insert it.  I 'inherited' it from a (mercifully nameless)
predecessor.
I want to put this data into a database to make it easily accessible

Ted


Re: [sqlite] Degree character not displayed correctly.

2009-10-26 Thread Pavel Ivanov
How do you insert it? How do you retrieve it? How do you display it?
I bet the problem is in the first question, not in the last one.

Pavel

On Mon, Oct 26, 2009 at 10:04 AM, Ted Rolle  wrote:
> How can I get the degree character ° (0xB0) (as in 32 degrees Fahrenheit)
> to display correctly?
> My text editors (Vim and TextPad) and Claws-mailer display this
> character correctly.
> TextPad uses the 'ANSI' character set.
>
> Ted


[sqlite] Degree character not displayed correctly.

2009-10-26 Thread Ted Rolle
How can I get the degree character ° (0xB0) (as in 32 degrees Fahrenheit)
to display correctly?
My text editors (Vim and TextPad) and Claws-mailer display this
character correctly.
TextPad uses the 'ANSI' character set.

Ted