Christian,

Christian Smith <[EMAIL PROTECTED]>
03/11/2004 02:33 AM
Please respond to sqlite-users

 
        To:     [EMAIL PROTECTED]
        cc: 
        Subject:        Re: [sqlite] Question about UTF8 encoding in SQLite version 2.8.13


> On Tue, 2 Nov 2004, Liz Steel wrote:
> >To clarify: I have a database name with Swedish characters in, which are
> >converted to multibyte characters, however, the filename that is created
> >treats each of the characters separately, which then causes problems later.
> >As an example, the string "Ändrad" is converted to "Ã"ndrad".
> The code to parse filenames is not UTF8 aware, and so will cause problems
> when splitting a filename into directory and filename components if the
> string is a UTF8 string. The offending function appears to be
> sqlitepager_open in pager.c, which steps backwards through the path name a
> character at a time looking for the directory separator character, which
> will obviously be tripped up by a multi-byte character.

I wonder if you could add some explanation for your comments above. UTF-8
is a Unicode encoding that produces no null bytes (other than for U+0000
itself), preserves the ASCII code range verbatim, and never uses a byte in
the ASCII range as part of a multi-byte sequence. That is to say, each byte
is either an ASCII character (0-127) or is in the high byte range (128-255)
and therefore can't be confused with an ASCII character. I would have
thought that any special path characters (e.g. '/', '\', ...) would fall
within the ASCII range and therefore require no special Unicode-aware
handling, even when the path is scanned one byte at a time. The function
that SQLite calls to actually create the file, on the other hand, would
have to be Unicode-aware for such filenames to work.
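
To illustrate what I mean, here is a minimal sketch in Python. It is not
SQLite's actual code: split_path_bytes and the example path are made up
for illustration, and '/' is assumed to be the only separator. It scans
the UTF-8 bytes backwards one at a time, the way the pager is described as
doing, and still splits at the right place because no byte of a multi-byte
sequence can equal '/'.

```python
def split_path_bytes(path, sep=b"/"):
    """Split a byte string at the last separator byte.

    Hypothetical helper: walks backwards one byte at a time with no
    knowledge of UTF-8, mimicking a byte-oriented path parser.
    """
    i = len(path) - 1
    while i >= 0 and path[i:i + 1] != sep:
        i -= 1
    # Everything up to and including the separator is the directory part.
    return path[:i + 1], path[i + 1:]


# Example path (an assumption, not taken from the original report).
path = "/home/liz/Ändrad.db".encode("utf-8")
directory, filename = split_path_bytes(path)

# Both bytes that encode "Ä" are in the high range (128-255).
a_umlaut_bytes = list("Ä".encode("utf-8"))

print(directory)                  # directory component, as bytes
print(filename.decode("utf-8"))   # filename still decodes cleanly
print(a_umlaut_bytes)
```

If the split itself is safe, then the corruption Liz sees would have to
come from wherever the bytes are later interpreted in a different
character set, not from the backwards scan.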

Benjamin
