Staffan Tylen wrote:
> I must admit that I'm a bit confused here. If I'm not wrong, UTF-8 differs
> from ASCII when the value is higher than '7f'x, but storing data in SQLite
> as text with character values between '80'x and 'ff'x seems to be no
> problem. I previously thought that this could only be done in blob format.
>
> create table t (a);
> insert into t values(cast(x'ff' as text));
> select a,length(a),hex(a) from t;
>  |1|FF
>
> My conclusion is that storing single-byte characters of any value is
> allowed. Is this true?

SQLite assumes that all strings you give it are encoded in UTF-8, and
does not actually check the encoding.  It gives you the same string
back, so you could, in theory, use a different encoding, as long as you
do not use any database string processing functions (such as length(),
substr(), or LIKE), which interpret the bytes as UTF-8.  In practice, I
would not recommend this.
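
As a rough illustration, here is a minimal sketch using Python's
standard sqlite3 module (my choice for the example; nothing in the
original question uses Python).  It repeats the statements quoted above
and reads the value back as raw bytes.  The text_factory setting is
needed only because Python itself, not SQLite, would otherwise try to
decode the invalid byte as UTF-8.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Return TEXT values as raw bytes.  Python's default str conversion
# would fail on the byte 0xFF, which is not valid UTF-8.
conn.text_factory = bytes

conn.execute("create table t (a)")
conn.execute("insert into t values (cast(x'ff' as text))")

# SQLite stored the byte without validating it and hands it back unchanged.
value, n_chars, hex_repr, type_name = conn.execute(
    "select a, length(a), hex(a), typeof(a) from t"
).fetchone()

print(value, n_chars, hex_repr, type_name)
```

The value comes back as the single byte FF with type text, which matches
the output quoted above.  It only shows that the byte survives a round
trip, not that processing such strings inside the database is safe.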

> At the bottom it says: *Note to Windows users:* The encoding used for the
> filename argument of sqlite3_open() and sqlite3_open_v2() must be UTF-8,
> not whatever codepage is currently defined. Filenames containing
> international characters must be converted to UTF-8 prior to passing them
> into sqlite3_open() or sqlite3_open_v2().
>
> So what does "must be converted" mean? I don't know how sqlite3.exe works
> here but if I do
>
> sqlite3  ?.db
>
> where '?' as we've seen is '82'x it happily creates a file with a
> non-displayable character in the first position that seems to be 1 byte
> long.

The shell assumes that its arguments are UTF-8, and gives the filename
unchanged to sqlite3_open*().  When you've entered the filename in the
Windows console with the default settings, it is not encoded in UTF-8.
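
To show what "must be converted" amounts to, here is a small Python
sketch.  It assumes the console uses code page 850, where '82'x is
"é"; the code page and the file name are assumptions for illustration
only, and your console may well use a different one.

```python
# Bytes as the console delivered them (hypothetical file name).
raw = b"\x82.db"

# Assumption: the console code page is 850; adjust to the real one.
console_codepage = "cp850"

# The raw bytes are not valid UTF-8: 0x82 cannot start a character.
try:
    raw.decode("utf-8")
    raw_is_valid_utf8 = True
except UnicodeDecodeError:
    raw_is_valid_utf8 = False

# Convert: code page bytes -> Unicode text -> UTF-8 bytes.
name = raw.decode(console_codepage)
utf8_name = name.encode("utf-8")

print(raw_is_valid_utf8, utf8_name)
```

The UTF-8 form of the name is two bytes longer than what was typed
would suggest, because "é" takes two bytes (C3 A9) in UTF-8.  This is
the form that sqlite3_open() and sqlite3_open_v2() expect.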


Regards,
Clemens
