https://bugs.freedesktop.org/show_bug.cgi?id=59679
--- Comment #6 from Stephan Bergmann <[email protected]> ---
This only happens when using a GTK VCL plugin (SAL_USE_VCLPLUGIN=gtk or SAL_USE_VCLPLUGIN=gtk3) and having "Tools - Options... - LibreOffice - General - Open/Save dialogs - Use LibreOffice dialogs" unticked. This is not easily fixable in the LO code.

When saving a newly created database, LO passes the suggested filename ("új adatbázis") to the GTK file chooser's gtk_file_chooser_set_current_name as a UTF-8 encoded string (SalGtkFilePicker::setDefaultName in vcl/unx/gtk/fpicker/SalGtkFilePicker.cxx). The filename that is ultimately chosen by the user is passed back from the GTK file chooser to LO via gtk_file_chooser_get_uris (SalGtkFilePicker::getSelectedFiles in vcl/unx/gtk/fpicker/SalGtkFilePicker.cxx) as a file URL ("file:///.../%C3%BAj adatb%C3%A1zis").

When the G_FILENAME_ENCODING environment variable is unset, GLib assumes that pathnames (which are just sequences of 8-bit bytes after all) use UTF-8 encoding, so the pathname that the GTK file chooser computes for the suggested filename would, in C string notation (where "\xXX" denotes a byte with hexadecimal value XX), end in ".../\xC3\xBAj adatb\xC3\xA1zis". As GLib apparently represents file URLs (whose "path payload" is likewise just a sequence of 8-bit bytes) with an identity mapping between the bytes of the pathname and the bytes encoded (via percent-encoding) in the URL's path, the URL returned from gtk_file_chooser_get_uris above reads "file:///.../%C3%BAj adatb%C3%A1zis".

Now, LO internally uses a different representation of pathnames as file URLs, where the "payload bytes" in the URL's path are interpreted as UTF-8, but the bytes in the pathname are interpreted according to the system locale's encoding (see osl_getThreadTextEncoding), so a translation between those two text encodings is involved.
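
The byte-for-byte mapping between the pathname and the URL described above can be reproduced outside of GLib. The following is only a rough Python sketch, not GLib or LO code: it assumes G_FILENAME_ENCODING is unset (so the pathname bytes are the UTF-8 encoding of the suggested name), the variable names are made up for illustration, and the space is left unescaped only to match the URL as quoted in this comment.

```python
from urllib.parse import quote

# Suggested filename from the report.
suggested_name = "új adatbázis"

# Assumption: with G_FILENAME_ENCODING unset, the pathname bytes are
# simply the UTF-8 encoding of the suggested name.
pathname_bytes = suggested_name.encode("utf-8")

# Identity mapping: every pathname byte is percent-encoded as-is.
# safe=" " keeps the space literal, as in the URL quoted in the comment.
url_last_segment = quote(pathname_bytes, safe=" ")

print(pathname_bytes)
print(url_last_segment)
```

Under these assumptions the last URL segment comes out as "%C3%BAj adatb%C3%A1zis", i.e. the same bytes as in the pathname, merely percent-encoded.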
There are SalGtkPicker::uritounicode and SalGtkPicker::unicodetouri in vcl/unx/gtk/fpicker/SalGtkPicker.cxx to convert between the different interpretations of file URLs in LO and GLib, and for communication about pre-existing files they appear to work reasonably well (esp. in the common case where the system locale's encoding is UTF-8). However, they break down in the above scenario of communication about a not-yet-existing file (passing a filename string that is always UTF-8 encoded in one direction via gtk_file_chooser_set_current_name, but getting back a URL via gtk_file_chooser_get_uris) when the system locale's encoding is not UTF-8.

In that case, SalGtkPicker::uritounicode assumes the "payload bytes" of its input URL's path ("file:///.../%C3%BAj adatb%C3%A1zis") should be treated according to the system locale's encoding (which is plain 7-bit ASCII for the POSIX locale; but LO then effectively treats any bytes with the high bit set as ISO-8859-1), so LO converts the URL into its internal file URL format as "file:///.../%C3%83%C2%BAj adatb%C3%83%C2%A1zis". The result is that the pathname of the file that LO will create on disk is (in C string notation) ".../\xC3\xBAj adatb\xC3\xA1zis" (so it will decode to "új adatbázis" when viewed with a tool that assumes the pathname bytes are UTF-8, like the GTK file chooser), but the title that LO will display for it contains those odd "Ã", U+00BA, "Ã", and "¡".

Probably the best way out is to use a system locale with a UTF-8 encoding (hu_HU.utf8, say), or at least set the G_FILENAME_ENCODING environment variable to fix GLib's assumptions (via G_FILENAME_ENCODING=@locale, see <http://developer.gnome.org/glib/stable/glib-Character-Set-Conversion.html#file-name-encodings> and <http://developer.gnome.org/glib/stable/glib-running.html#G_FILENAME_ENCODING>).
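
To make the double encoding concrete, here is a small Python model of the misinterpretation. It is an illustration only and does not reproduce the actual SalGtkPicker::uritounicode code: it assumes that high-bit bytes are treated as ISO-8859-1 (as described above for the POSIX locale), and all variable names are hypothetical.

```python
from urllib.parse import quote, unquote_to_bytes

# Last segment of the URL handed back by the GTK file chooser.
glib_segment = "%C3%BAj adatb%C3%A1zis"

# Undo the percent-encoding to recover the raw payload bytes.
payload = unquote_to_bytes(glib_segment)

# Assumption: the payload bytes are read as ISO-8859-1 rather than UTF-8,
# which turns each UTF-8 byte pair into two separate characters.
misread_title = payload.decode("iso-8859-1")

# The internal URL format percent-encodes the UTF-8 form of that text.
internal_segment = quote(misread_title.encode("utf-8"), safe=" ")

# Going back to a pathname with the same ISO-8859-1 assumption yields
# the original bytes, so the name on disk is still valid UTF-8.
on_disk = misread_title.encode("iso-8859-1")

print(misread_title)
print(internal_segment)
print(on_disk.decode("utf-8"))
```

In this model the displayed title is the four-characters-too-long mojibake string, the internal segment is "%C3%83%C2%BAj adatb%C3%83%C2%A1zis", and the on-disk bytes still decode to "új adatbázis" as UTF-8, which matches the behaviour reported in this bug.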
