On Mon, Oct 26, 2009 at 10:01:43AM -0700, Roger Binns wrote:
> Jean-Christophe Deschamps wrote:
> > First decide or determine what is (or shall be) your database 
> > encoding.  Even if SQLite has no problem storing ANSI (or EBCDIC or 
> > anything else) strings untouched,
> 
> This isn't particularly good advice.  SQLite works solely in Unicode.  When
> you supply text it *must* be in either UTF8 or UTF16 according to the API
> being used.  Sometimes it will appear that you can get by using a different
> encoding but that is just luck and things that operate on text will fail.
> The actual encoding used by the database is pretty much irrelevant and other
> than the pragma you can't even tell what it is nor would you care.

Indeed.  IIRC SQLite3 is actually 8-bit clean, but that doesn't matter:
by stating that it uses UTF-8 (and UTF-16), the SQLite3 developers
reserve the freedom to make changes that would break your application
if you stored text in any other encoding.

For example, the SQLite3 developers might add code to reject strings
with invalid UTF-8 sequences, or strings which use unassigned code
points (unassigned in the version of Unicode supported by the SQLite3
that you are running).  Or they might add support for case-insensitive
matching for non-ASCII text.  Or they might add
normalization-insensitive matching.  And so on.
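
The safe approach is to convert legacy-encoded bytes to Unicode at the
boundary, before they ever reach SQLite.  Here is a minimal sketch in
Python, assuming the source encoding is known (Windows-1252 in this
example).  The helper names to_text and store_and_dump are made up for
illustration and are not part of any SQLite API; only the standard
sqlite3 module is used.

```python
import sqlite3


def to_text(raw: bytes, source_encoding: str) -> str:
    """Decode bytes from a known legacy encoding (hypothetical helper).

    Decoding is strict, so invalid input raises UnicodeDecodeError
    instead of being stored silently.
    """
    return raw.decode(source_encoding, errors="strict")


def store_and_dump(values):
    """Insert str values into an in-memory database (hypothetical helper).

    Returns (text, hex) rows so the stored bytes can be inspected.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (v TEXT)")
        # The sqlite3 module hands str parameters to SQLite as UTF-8.
        conn.executemany("INSERT INTO t (v) VALUES (?)", [(v,) for v in values])
        return conn.execute("SELECT v, hex(v) FROM t ORDER BY rowid").fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    legacy = b"caf\xe9"  # "café" as Windows-1252 bytes; not valid UTF-8
    print(store_and_dump([to_text(legacy, "cp1252")]))
```

Because the conversion is done up front, the bytes that land in the
database are valid UTF-8, and the application does not depend on SQLite
continuing to pass arbitrary bytes through untouched.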

Nico
