Ondrej Linhart created GUACAMOLE-2203:
-----------------------------------------
Summary: RDP Printer Redirection corrupts characters in PDF
filenames
Key: GUACAMOLE-2203
URL: https://issues.apache.org/jira/browse/GUACAMOLE-2203
Project: Guacamole
Issue Type: Bug
Components: guacamole-server
Affects Versions: 1.6.0
Environment: Debian, non-dockerized installation
Reporter: Ondrej Linhart
When using an RDP connection via Guacamole, it is possible to enable Printer
Redirection.
This creates a virtual printer inside the RDP session. Printing anything
to this printer results in the printed document being downloaded in
the client's browser as a PDF generated by Ghostscript.
In print-job.c we can see that the filename of this downloaded PDF is first
seeded by a string constant, then later overwritten by the printed document's
title (if present as Title in the raw PostScript stream). For document formats
that do not have titles (like plaintext files), most editors use the filename
as the Title. If the file has not been saved yet, programs tend to fill in a
placeholder name (Untitled, or Bez názvu in Czech Windows).
However, PostScript does not have a flag for content encoding, so Guacamole has
no way of knowing the encoding of this Title. As a result, all special
characters are escaped, and the filename of the produced PDF is corrupted
(those characters are replaced by their numeric representation) whenever the
Title contains special characters.
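
As a rough illustration of the flow described above, here is a minimal sketch
of deriving the filename from the Title comment of a PostScript stream. It is
not the code from print-job.c: the function name, the macros and the default
filename are placeholders chosen for this example, and it assumes the title
appears as a "%%Title: " line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Placeholder names for this sketch, not the identifiers in print-job.c */
#define DEFAULT_FILENAME "guacamole-print.pdf"
#define TITLE_PREFIX     "%%Title: "

/*
 * Derives a download filename from the beginning of a PostScript stream.
 * The bytes following "%%Title: " are copied up to the end of that line and
 * ".pdf" is appended. If no title is present, the default filename is used.
 * The title bytes are copied as-is: nothing in the stream states which
 * encoding they are in.
 */
static void filename_from_postscript(const char* ps, char* out,
        size_t out_size) {

    assert(out_size >= sizeof(DEFAULT_FILENAME));

    /* Seed with the default, as if no title had been found */
    snprintf(out, out_size, "%s", DEFAULT_FILENAME);

    const char* title = strstr(ps, TITLE_PREFIX);
    if (title == NULL)
        return;

    title += strlen(TITLE_PREFIX);

    /* The title runs to the end of the current line */
    size_t length = strcspn(title, "\r\n");
    if (length == 0)
        return;

    /* Leave room for ".pdf" and the null terminator */
    size_t max_length = out_size - sizeof(".pdf");
    if (length > max_length)
        length = max_length;

    /* Overwrite the default with the raw title bytes plus extension */
    snprintf(out, out_size, "%.*s.pdf", (int) length, title);
}
```

Because the bytes are copied without interpretation, a title such as
"Bez názvu" reaches the later stages in whatever encoding Windows used to
write it.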
By testing, I have determined that the encoding of the Windows-produced
PostScript stream when printing to the virtual printer follows a configuration
option in Control Panel - Region - Administrative (tab) - setting for
non-Unicode apps. This option tells Windows how to interpret/save text in
applications that use non-Unicode (legacy code page) encodings.
When I change this option, the characters produced in the corrupted filenames
are different. I can even check the (beta-marked) UTF-8 option here, and it
will produce two escaped numbers instead of one.
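
The "two escaped numbers instead of one" observation is consistent with how
the same character is stored in different encodings. The sketch below assumes
the title "Bez názvu" and compares its bytes in Windows-1250 (the usual
non-Unicode code page for Czech) with its bytes in UTF-8. The helper name is
made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* "Bez názvu" in Windows-1250: á is the single byte 0xE1 */
static const char TITLE_CP1250[] = "Bez n\xE1zvu";

/* "Bez názvu" in UTF-8: á is the two bytes 0xC3 0xA1 */
static const char TITLE_UTF8[] = "Bez n\xC3\xA1zvu";

/*
 * Counts the bytes outside the ASCII range, i.e. the bytes that cannot be
 * interpreted without knowing the encoding of the string.
 */
static size_t count_non_ascii(const char* s) {

    assert(s != NULL);

    size_t count = 0;
    for (; *s != '\0'; s++) {
        if ((unsigned char) *s >= 0x80)
            count++;
    }

    return count;
}
```

count_non_ascii(TITLE_CP1250) returns 1 and count_non_ascii(TITLE_UTF8)
returns 2, so a consumer that escapes each non-ASCII byte individually would
emit one number in the first case and two in the second.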
By contrast, when using Drive redirection, filenames with special
characters are not corrupted, so the pipeline for downloading files can
definitely carry Unicode filenames all the way from the guacd backend to the
frontend and client browser.
This is further demonstrated by running guacd at the DEBUG log level, where
this log line already shows corrupted filenames when printing:
{code:java}
guac_client_log(job->client, GUAC_LOG_DEBUG, "Beginning print stream: %s",
        job->filename);
{code}
whereas this log line shows clean filenames with special characters when
downloading:
{code:java}
guac_user_log(user, GUAC_LOG_DEBUG, "%s: Initiating download of \"%s\"",
        __func__, path);
{code}
I understand this happens because the file download path has a pre-determined
encoding of 16-bit Unicode (UTF-16), so guacd correctly interprets it as such,
while the PostScript stream does not carry any encoding indication (I assume?),
so guacd doesn't try to guess and just defaults to some single-byte encoding
(probably ISO-8859-1?) and sanitizes the special characters.
Proposed solution:
Add a new configuration parameter to the RDP connection properties for
selecting the encoding. The administrator would check the mentioned option for
the encoding of non-Unicode apps (which affects the encoding of the PostScript
Title) on their Windows machine, then set this parameter to the same encoding
in the properties of that RDP connection.
Guacamole would then interpret the title in the selected encoding, producing a
cleaner filename even for documents with special characters in their
Title. If the parameter is not set, it would default to the current
behaviour.
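
To make the proposal more concrete, below is a minimal sketch of the
conversion step, assuming the selected encoding is available as a string such
as "CP1250". It uses the POSIX iconv API purely for illustration. The function
name and its parameters are hypothetical, and an actual patch might well use
different conversion facilities.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/*
 * Converts a print job title from the administrator-selected encoding to
 * UTF-8. Returns 0 on success, or -1 if the encoding name is not recognized
 * or the conversion fails, in which case the caller would fall back to the
 * current behaviour.
 */
static int title_to_utf8(const char* encoding, const char* title,
        char* out, size_t out_size) {

    assert(out_size > 0);

    iconv_t cd = iconv_open("UTF-8", encoding);
    if (cd == (iconv_t) -1)
        return -1;

    /* iconv() takes a non-const input pointer but does not modify the data */
    char* in_ptr = (char*) title;
    size_t in_left = strlen(title);

    /* Reserve one byte for the null terminator */
    char* out_ptr = out;
    size_t out_left = out_size - 1;

    size_t result = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
    iconv_close(cd);

    /* Invalid input or insufficient output space */
    if (result == (size_t) -1)
        return -1;

    *out_ptr = '\0';
    return 0;
}
```

With the encoding set to match the Windows option, the bytes "Bez n\xE1zvu"
would come out as the UTF-8 string "Bez názvu". The set of encoding names
accepted by iconv_open() depends on the platform, which is one reason the
failure path returns control to the existing behaviour.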
--
This message was sent by Atlassian Jira
(v8.20.10#820010)