https://bugs.freedesktop.org/show_bug.cgi?id=59679

--- Comment #6 from Stephan Bergmann <[email protected]> ---
This only happens when using a GTK VCL plugin (SAL_USE_VCLPLUGIN=gtk or
SAL_USE_VCLPLUGIN=gtk3) and having "Tools - Options... - LibreOffice - General
- Open/Save dialogs - Use LibreOffice dialogs" unticked.

This is not easily fixable in the LO code.  When saving a newly created
database, LO passes the suggested filename ("új adatbázis") to the GTK file
chooser's gtk_file_chooser_set_current_name as a UTF-8 encoded string
(SalGtkFilePicker::setDefaultName in vcl/unx/gtk/fpicker/SalGtkFilePicker.cxx).
 The filename that is ultimately chosen by the user is passed back from the GTK
file chooser to LO via gtk_file_chooser_get_uris
(SalGtkFilePicker::getSelectedFiles in
vcl/unx/gtk/fpicker/SalGtkFilePicker.cxx) as a file URL ("file:///.../%C3%BAj
adatb%C3%A1zis").

When the G_FILENAME_ENCODING environment variable is unset, GLib assumes that
pathnames (which are just sequences of 8-bit bytes after all) use UTF-8
encoding, so the pathname that the GTK file chooser computes for the suggested
filename would, in C string notation (where "\xXX" denotes a byte with
hexadecimal value XX), end in ".../\xC3\xBAj adatb\xC3\xA1zis".  As GLib
apparently represents file URLs (whose "path payloads" are just sequences of
8-bit bytes after all) with an identity-mapping between the bytes of the
pathname and the bytes encoded (via percent-encoding) in the URL's path, the
URL returned from gtk_file_chooser_get_uris above reads "file:///.../%C3%BAj
adatb%C3%A1zis".
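
The byte-for-byte mapping described above can be illustrated with a short
Python sketch.  It only models the behavior described here and does not call
GLib; the variable names are made up for illustration, and the space is left
unescaped purely to match the notation used in this comment.

```python
from urllib.parse import quote_from_bytes

# The suggested filename, as LO hands it to the GTK file chooser.
suggested_name = "új adatbázis"

# With G_FILENAME_ENCODING unset, the pathname bytes are assumed to be UTF-8.
pathname_bytes = suggested_name.encode("utf-8")

# Identity mapping: every non-ASCII pathname byte is percent-encoded as is.
uri_path = quote_from_bytes(pathname_bytes, safe=" ")

print(pathname_bytes)  # b'\xc3\xbaj adatb\xc3\xa1zis'
print(uri_path)        # %C3%BAj adatb%C3%A1zis
```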

Now, LO internally uses a different representation of pathnames as file URLs,
where the "payload bytes" in the URL's path are interpreted as UTF-8, but the
bytes in the pathname are interpreted according to the system locale's encoding
(see osl_getThreadTextEncoding), so there is a translation between those two
text encodings involved.

There are SalGtkPicker::uritounicode and SalGtkPicker::unicodetouri in
vcl/unx/gtk/fpicker/SalGtkPicker.cxx to convert between the different
interpretations of file URLs in LO and GLib, and for communication about
pre-existing files they appear to work reasonably well (esp. in the common case
where the system locale's encoding is UTF-8).  However, they break down in the
above scenario of communication about a not-yet-existing file (passing a
filename string that is always UTF-8 encoded in one direction via
gtk_file_chooser_set_current_name, but getting back a URL via
gtk_file_chooser_get_uris) when the system locale's encoding is not UTF-8.  In
that case, SalGtkPicker::uritounicode assumes the "payload bytes" of its input
URL's path ("file:///.../%C3%BAj adatb%C3%A1zis") should be treated according
to the system locale's encoding (which is plain 7-bit ASCII for the POSIX
locale; but any bytes with the high bit set are effectively treated as
ISO-8859-1 by LO then), so is converted by LO into its internal file URL format
as "file:///.../%C3%83%C2%BAj adatb%C3%83%C2%A1zis".  The result is that the
pathname of the file that LO will create on disk is (in C string notation)
".../\xC3\xBAj adatb\xC3\xA1zis" (so will decode to "új adatbázis" when viewed
with a tool that assumes the pathname bytes are UTF-8, like the GTK file
chooser), but the title that LO will display for it contains those odd "Ã",
U+00BA, "Ã", and "¡".
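
The double encoding can be reproduced with the following Python sketch.  The
function glib_uri_to_lo_url is a hypothetical model of the conversion
described above, not the actual SalGtkPicker::uritounicode code, and
ISO-8859-1 stands in for how LO effectively treats high-bit bytes in the POSIX
locale.

```python
from urllib.parse import quote_from_bytes, unquote_to_bytes

def glib_uri_to_lo_url(glib_path: str, locale_encoding: str) -> str:
    """Hypothetical model: reinterpret the URL's payload bytes as text in
    the locale's encoding, then store that text as UTF-8 in LO's URL."""
    payload = unquote_to_bytes(glib_path)
    text = payload.decode(locale_encoding)
    return quote_from_bytes(text.encode("utf-8"), safe=" ")

glib_path = "%C3%BAj adatb%C3%A1zis"

# Non-UTF-8 locale: the UTF-8 payload bytes are misread as ISO-8859-1.
lo_path = glib_uri_to_lo_url(glib_path, "iso-8859-1")
print(lo_path)  # %C3%83%C2%BAj adatb%C3%83%C2%A1zis

# The title LO displays is the UTF-8 interpretation of its own URL payload.
title = unquote_to_bytes(lo_path).decode("utf-8")
print(title)    # Ãºj adatbÃ¡zis

# Mapping the title back to a pathname in the locale's encoding yields the
# bytes that a UTF-8-assuming tool shows as the intended name.
on_disk = title.encode("iso-8859-1")
print(on_disk)  # b'\xc3\xbaj adatb\xc3\xa1zis'

# UTF-8 locale: both interpretations coincide and nothing is mangled.
print(glib_uri_to_lo_url(glib_path, "utf-8"))  # %C3%BAj adatb%C3%A1zis
```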

Probably the best way out is to use a system locale with a UTF-8 encoding
(hu_HU.utf8, say), or at least set the G_FILENAME_ENCODING environment variable
to fix GLib's assumptions (via G_FILENAME_ENCODING=@locale, see
<http://developer.gnome.org/glib/stable/glib-Character-Set-Conversion.html#file-name-encodings>
and
<http://developer.gnome.org/glib/stable/glib-running.html#G_FILENAME_ENCODING>).
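
As a rough sketch of the second workaround, the variable can be set for the LO
process only rather than globally.  The helper name env_for_locale_filenames
and the "soffice" launch command are assumptions for illustration; only the
variable name and the "@locale" value come from the GLib documentation linked
above.

```python
import os

def env_for_locale_filenames(base_env):
    """Return a copy of base_env that tells GLib to use the locale's
    encoding for filenames instead of assuming UTF-8."""
    env = dict(base_env)
    env["G_FILENAME_ENCODING"] = "@locale"
    return env

env = env_for_locale_filenames(os.environ)
print(env["G_FILENAME_ENCODING"])  # @locale

# Hypothetical launch, assuming "soffice" is on the PATH:
# import subprocess
# subprocess.run(["soffice"], env=env)
```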
