Ondrej Linhart created GUACAMOLE-2203:
-----------------------------------------

             Summary: RDP Printer Redirection corrupts characters in PDF 
filenames
                 Key: GUACAMOLE-2203
                 URL: https://issues.apache.org/jira/browse/GUACAMOLE-2203
             Project: Guacamole
          Issue Type: Bug
          Components: guacamole-server
    Affects Versions: 1.6.0
         Environment: Debian, non-dockerized installation
            Reporter: Ondrej Linhart


When using an RDP connection via Guacamole, it is possible to enable Printer
Redirection.

This produces a virtual printer inside that RDP session. Printing anything to
this printer results in the printed document being downloaded in the client's
browser as a PDF generated by Ghostscript.

In print-job.c, the filename of the downloaded PDF is first seeded with a
string constant and later overwritten by the printed document's title (if one
is present as the Title in the raw PostScript stream). For document formats
that have no title of their own (such as plaintext files), most editors use
the filename as the Title. If the file has not been saved yet, programs tend
to fill in a placeholder name (Untitled, or Bez názvu on Czech Windows).

However, PostScript does not carry a flag for content encoding, so Guacamole
has no way of knowing the encoding of this Title. It therefore escapes all
special characters, and the filename of the resulting PDF is corrupted
(characters replaced by their numeric representation) whenever the Title
contains special characters.
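
To illustrate the point, here is a minimal sketch of pulling the Title out of
a raw PostScript buffer. It is not the actual print-job.c code: the function
name extract_ps_title and its signature are hypothetical, and the only thing
assumed about the stream is the standard "%%Title: " DSC comment. Note that
the title bytes are copied as-is, since nothing in the stream says which
encoding they use.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not the real print-job.c implementation): copies the
 * value of the first "%%Title: " DSC comment found in a PostScript buffer
 * into out and appends ".pdf". Returns 1 if a title was found, 0 otherwise. */
static int extract_ps_title(const char* ps, size_t length,
        char* out, size_t out_size) {

    static const char header[] = "%%Title: ";
    static const char suffix[] = ".pdf";
    const size_t header_len = sizeof(header) - 1;

    /* Need room for at least the suffix and its terminating NUL */
    if (out_size < sizeof(suffix))
        return 0;

    for (size_t i = 0; i + header_len <= length; i++) {

        /* DSC comments begin at the start of a line */
        if (i != 0 && ps[i - 1] != '\n' && ps[i - 1] != '\r')
            continue;

        if (memcmp(ps + i, header, header_len) != 0)
            continue;

        /* Copy the raw bytes up to the end of the line. Their encoding is
         * unknown at this point. */
        size_t n = 0;
        const size_t max_title = out_size - sizeof(suffix);
        for (size_t j = i + header_len; j < length && n < max_title; j++) {
            if (ps[j] == '\r' || ps[j] == '\n')
                break;
            out[n++] = ps[j];
        }

        /* Append ".pdf" including the terminating NUL */
        memcpy(out + n, suffix, sizeof(suffix));
        return 1;
    }

    return 0;
}
```

With a Czech title, the byte for "á" ends up in the filename exactly as
Windows emitted it, which is where the ambiguity begins.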

By testing, I have determined that the encoding of the Windows-produced
PostScript stream when printing to the virtual printer follows a configuration
option in Control Panel - Region - Administrative (tab) - setting for
non-Unicode apps. This option tells Windows how to interpret and save text in
applications that use non-Unicode (legacy code page) encodings.

When I change this option, the escaped characters in the corrupted filenames
change as well. If I enable the (beta-marked) UTF-8 option there, each special
character produces two escaped numbers instead of one.
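
The one-versus-two observation matches how the same character is encoded under
each setting. The sketch below is only an illustration: the helper name
count_non_ascii_bytes is made up, and the two byte strings are what I would
expect for "Bez názvu" under Windows-1250 and UTF-8 respectively.

```c
#include <stddef.h>

/* "Bez názvu" under two different "non-Unicode apps" settings */
static const char title_cp1250[] = "Bez n\xE1zvu";     /* á = 0xE1      */
static const char title_utf8[]   = "Bez n\xC3\xA1zvu"; /* á = 0xC3 0xA1 */

/* Hypothetical helper: counts the bytes outside 7-bit ASCII, i.e. the bytes
 * that would each end up as one escaped number in the filename. */
static size_t count_non_ascii_bytes(const char* s) {
    size_t count = 0;
    for (; *s != '\0'; s++) {
        if ((unsigned char) *s >= 0x80)
            count++;
    }
    return count;
}
```

Here count_non_ascii_bytes(title_cp1250) is 1 and
count_non_ascii_bytes(title_utf8) is 2, consistent with the escaped output
changing when the Windows setting changes.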

By contrast, when using Drive redirection, filenames with special characters
are not corrupted, so the pipeline for downloading files can definitely carry
Unicode filenames all the way from the guacd backend to the frontend and the
client browser.

This is further demonstrated by running guacd with debug logging enabled,
where this log line already shows corrupted filenames when printing
{code:java}
guac_client_log(job->client, GUAC_LOG_DEBUG, "Beginning print stream: %s",
        job->filename);
{code}
whereas this log line shows clean filenames with special characters when
downloading
{code:java}
guac_user_log(user, GUAC_LOG_DEBUG, "%s: Initiating download of \"%s\"",
        __func__, path);
{code}
I understand this happens because the file download path has a pre-determined
encoding of 16-bit Unicode, so guacd correctly interprets it as such, while
PostScript does not have any encoding indication (I assume?), so guacd doesn't
try to guess and just defaults to some single-byte encoding (probably
ISO-8859-1?) and sanitizes the special characters.

Proposed solution:

Add a new configuration variable to the RDP connection properties for
selecting the encoding. The administrator would look up the option mentioned
above for the encoding of non-Unicode apps (which affects the encoding of the
PostScript Title) on their Windows machine, then set this variable to the same
value in the properties of that RDP connection.

Guacamole would then interpret the title in the selected encoding, producing a
cleaner filename even for documents with special characters in their Title. If
the option is not set, Guacamole would keep the current behaviour.
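
A rough sketch of what the conversion step could look like, assuming POSIX
iconv is available. This is not existing Guacamole code: the function name
title_to_utf8 is hypothetical, and the encoding argument stands in for the
proposed connection parameter (which does not exist yet). When no encoding is
configured, or the conversion fails, the bytes are passed through untouched so
that the existing handling still applies.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: converts a PostScript title from the
 * administrator-selected encoding (e.g. "CP1250") to UTF-8. Returns 1 if the
 * title was converted, or 0 if the raw bytes were copied unchanged because no
 * encoding was given or the conversion failed. */
static int title_to_utf8(const char* encoding, const char* title,
        char* out, size_t out_size) {

    assert(title != NULL && out != NULL && out_size > 0);

    if (encoding != NULL) {
        iconv_t cd = iconv_open("UTF-8", encoding);
        if (cd != (iconv_t) -1) {

            /* iconv() takes non-const input pointers but does not modify
             * the input bytes */
            char* in_ptr = (char*) title;
            size_t in_left = strlen(title);
            char* out_ptr = out;
            size_t out_left = out_size - 1; /* reserve room for NUL */

            size_t result = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
            iconv_close(cd);

            /* Accept the result only if the whole title was converted */
            if (result != (size_t) -1 && in_left == 0) {
                *out_ptr = '\0';
                return 1;
            }
        }
    }

    /* Fallback: keep the raw bytes, as happens today */
    strncpy(out, title, out_size - 1);
    out[out_size - 1] = '\0';
    return 0;
}
```

For example, title_to_utf8("CP1250", "Bez n\xE1zvu", buf, sizeof(buf)) would
leave the UTF-8 bytes for "Bez názvu" in buf, while passing NULL as the
encoding leaves the title as it is now.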



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
