Ross Johnson created TIKA-2858:
----------------------------------

             Summary: JAXRS server: allow passwords with special chars (MIME 
encoded words)
                 Key: TIKA-2858
                 URL: https://issues.apache.org/jira/browse/TIKA-2858
             Project: Tika
          Issue Type: Improvement
          Components: server
    Affects Versions: 1.20
            Reporter: Ross Johnson
         Attachments: protected - 4 space password.pdf, protected - Unicode 
password.pdf

Tika Server allows passing a document password in a special {{Password}} 
request header; however, I don't believe this header allows for passwords with 
non-US-ASCII characters, or for passwords with leading or trailing spaces.

One potential solution would be to allow MIME encoded-word values (RFC 2047) in 
the password header so that one could specify any password with only US-ASCII. 
This extra decoding could be enabled / disabled with some other flag or header 
value, in order to avoid any breaking changes for clients that are not encoding 
this header (e.g. if the password happens to literally be "{{=?UTF-8?B??=}}").
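
As a rough illustration of the idea above, here is a minimal sketch of how a client might encode a password as an RFC 2047 "B" (base64) encoded-word, and how the server side might decode it. The class and method names ({{EncodedWordPassword}}, {{encode}}, {{decode}}) are hypothetical and not part of Tika. The sketch assumes UTF-8 on the client side, handles only the "B" encoding (not "Q"), and does not enforce the RFC's 75-character limit on an encoded-word.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Tika: handles RFC 2047 "B" encoded-words only.
final class EncodedWordPassword {

    // Matches a single encoded-word of the form =?charset?B?base64-text?=
    private static final Pattern ENCODED_WORD =
            Pattern.compile("=\\?([^?]+)\\?[bB]\\?([^?]*)\\?=");

    // Client side: wrap the UTF-8 bytes of the password in an encoded-word,
    // so that the header value contains only US-ASCII characters.
    static String encode(String password) {
        byte[] bytes = password.getBytes(StandardCharsets.UTF_8);
        return "=?UTF-8?B?" + Base64.getEncoder().encodeToString(bytes) + "?=";
    }

    // Server side: decode the header value if it is a "B" encoded-word;
    // otherwise return it unchanged. Throws IllegalArgumentException if the
    // charset name is unknown or the payload is not valid base64.
    static String decode(String headerValue) {
        Matcher m = ENCODED_WORD.matcher(headerValue);
        if (!m.matches()) {
            return headerValue;
        }
        Charset charset = Charset.forName(m.group(1));
        byte[] bytes = Base64.getDecoder().decode(m.group(2));
        return new String(bytes, charset);
    }
}
```

With this sketch, a password of 4 literal spaces would be sent as {{=?UTF-8?B?ICAgIA==?=}}. Note that {{decode}} also rewrites a literal password that merely looks like an encoded-word, which is why the decoding step would need to be opt-in.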

Attached are 2 sample PDF files that I'm unable to use with Tika Server due to 
their passwords. These passwords are a bit contrived, but I have come across 
this issue with real passwords. I've included the passwords in code blocks to 
prevent the issue editor / viewer from collapsing multiple spaces into one.

The file named "{{protected - 4 space password.pdf}}" has a password of 4 
literal spaces:
{code:java}
// Password is on line below (4 literal spaces)
    
{code}

The file named "{{protected - Unicode password.pdf}}" has a password of mostly 
special characters, with 2 leading spaces and 2 trailing spaces thrown in for 
good measure:
{code:java}
// Password is on following line (with 2 leading spaces, 2 trailing spaces)
  ! < > " \ € œ ¤ ¼ ½ 𠜎 𩶘 😀  
{code}
     



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
