Christian,
Christian Smith <[EMAIL PROTECTED]>
03/11/2004 02:33 AM
Please respond to sqlite-users

To: [EMAIL PROTECTED]
cc:
Subject: Re: [sqlite] Question about UTF8 encoding in SQLite version 2.8.13

> On Tue, 2 Nov 2004, Liz Steel wrote:
> >To clarify: I have a database name with Swedish characters in, which are
> >converted to multibyte characters, however, the filename that is created
> >treats each of the characters separately, which then causes problems later.
> >As an example, the string "Ändrad" is converted to "Ã"ndrad".
> The code to parse filenames is not UTF8 aware, and so will cause problems
> when splitting a filename into directory and filename components if the
> string is a UTF8 string. The offending function appears to be
> sqlitepager_open in pager.c, which steps backwards through the path name a
> character at a time looking the directory seperator character, which will
> obviously be tripped up by a multi-byte character.

I wonder if you could add some explanation for your comments above.

UTF-8 is a Unicode encoding that never produces a null byte except for the
NUL character itself, preserves the ASCII code range verbatim, and does not
include any bytes that "look like" ASCII characters. That is to say, each
byte is either an ASCII character (0-127) or is in the high byte range
(128-255), and every byte of a multi-byte sequence falls in the high range,
so it can't be confused with an ASCII character.

I would have thought that any special path characters (e.g. '/', '\'...)
would be a subset of the ASCII range and therefore require no special
Unicode-aware handling when stepping backwards through the path a byte at a
time. The function that sqlite calls to actually create the file, on the
other hand, would have to be Unicode-aware for such filenames to work.

Benjamin
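
P.S. To illustrate what I mean, here is a minimal sketch in Python. It is
not the code from pager.c; split_path_bytes and the example path are
made up for illustration. It steps backwards through a UTF-8 path one byte
at a time looking for '/', which is the approach described above, and the
split should still land on the real separator because no byte of a
multi-byte sequence can equal 0x2F.

```python
def split_path_bytes(path):
    """Split a byte string at the last '/' by scanning backwards.

    Hypothetical helper: it mimics a byte-at-a-time backwards scan,
    with no knowledge of UTF-8 at all.
    """
    i = len(path) - 1
    while i >= 0 and path[i] != 0x2F:  # 0x2F is ASCII '/'
        i -= 1
    if i < 0:
        return b"", path  # no separator found
    return path[:i], path[i + 1:]


# "\u00c4" is the letter A with diaeresis; UTF-8 encodes it as 0xC3 0x84.
utf8_path = "/tmp/\u00c4ndrad.db".encode("utf-8")
directory, filename = split_path_bytes(utf8_path)

# Both bytes of the multi-byte sequence are in the 128-255 range,
# so neither can be mistaken for '/' (or any other ASCII character).
high_bytes = list("\u00c4".encode("utf-8"))

print(directory, filename, [hex(b) for b in high_bytes])
```

If that holds, the mangled name in Liz's example would have to come from
wherever the bytes are reinterpreted in a different character set, not from
the directory/filename split itself.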