[
https://issues.apache.org/jira/browse/OAK-9304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251456#comment-17251456
]
Matt Ryan commented on OAK-9304:
--------------------------------
I have an implementation in place for this. You can see the diff here:
[https://github.com/apache/jackrabbit-oak/compare/trunk...mattvryan:OAK-9304]
However, one use case is not passing. That use case is a filename with a
single double-quote in the middle of the filename, like {{my"file.txt}}.
Azure's blob storage service seems to be okay with this but S3 doesn't like it
and returns a 400 response when you try to issue a request with a URI that has
this filename in the query parameters.
Java's ISO-8859-1 encoder doesn't replace the " character (it is a valid
ISO-8859-1 character), so it passes through unchanged. But IIUC this
filename is a legal filename in Oak.
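To illustrate the encoder behavior, here is a minimal sketch. It is not the code from the linked diff; the class and the {{toLatin1}} helper are hypothetical names used only for this example. It round-trips a filename through ISO-8859-1 with plain JDK calls, so unmappable characters become {{?}} while the double-quote is left alone:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the code from the OAK-9304 branch.
final class Latin1FilenameSketch {

    // Round-trip through ISO-8859-1. String.getBytes(Charset) substitutes the
    // charset's replacement byte ('?') for any character it cannot map.
    static String toLatin1(String filename) {
        byte[] bytes = filename.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // 'a' followed by U+0308 (combining diaeresis), as in the issue description.
        System.out.println(toLatin1("Ausla\u0308ndische.jpg")); // Ausla?ndische.jpg
        // The double-quote is a valid ISO-8859-1 character, so nothing changes.
        System.out.println(toLatin1("my\"file.txt"));           // my"file.txt
    }
}
```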
My question is, should I move forward with the fix I have so far? It is
probably better than what's in trunk. Or do we first need to address the issue
of double-quotes in the filename - and if so, how to address it?
One option would be to search and replace " with %22, which is how the
RFC-8187 encoding represents it in the other filename value ({{filename*}}) in
the content disposition.
While not technically the correct value, it would probably work.
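A rough sketch of that search-and-replace option follows. The class and method names are hypothetical, and it assumes the filename has already been reduced to ISO-8859-1. Whether S3 accepts the result is untested here:

```java
// Hypothetical sketch of the "%22" workaround; names are illustrative only.
final class QuoteEscapeSketch {

    // Replace every literal double-quote with its percent-encoded form.
    static String escapeDoubleQuotes(String latin1Filename) {
        return latin1Filename.replace("\"", "%22");
    }

    // Build only the quoted filename parameter of the Content-Disposition value.
    static String filenameParam(String latin1Filename) {
        return "filename=\"" + escapeDoubleQuotes(latin1Filename) + "\"";
    }

    public static void main(String[] args) {
        System.out.println(filenameParam("my\"file.txt")); // filename="my%22file.txt"
    }
}
```

One caveat with this approach: a file whose name already contains the literal text {{%22}} would produce the same output as one containing a double-quote, so the substitution is not reversible.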
Note that both Azure and S3 do support a file with two double-quotes. For
example, if you name the file {{"myfile.txt"}} (with the double-quotes as a
part of the filename), this appears to work, although it might be working by
accident.
> Filename portion of direct download URI Content-Disposition should be
> ISO-8859-1 encoded
> ----------------------------------------------------------------------------------------
>
> Key: OAK-9304
> URL: https://issues.apache.org/jira/browse/OAK-9304
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: blob-cloud, blob-cloud-azure, blob-plugins
> Affects Versions: 1.36.0
> Reporter: Matt Ryan
> Assignee: Matt Ryan
> Priority: Major
>
> The "filename" portion of the Content-Disposition needs to be ISO-8859-1
> encoded, per [https://tools.ietf.org/html/rfc6266#section-4.3] in this
> paragraph:
> {quote}The parameters "filename" and "filename*" differ only in that
> "filename*" uses the encoding defined in RFC5987, allowing the use of
> characters not present in the ISO-8859-1 character set ([ISO-8859-1]).
> {quote}
> This is not usually a problem, but if the filename provided contains
> characters outside ISO-8859-1, it can cause the resulting signed URI to be
> invalid. This can lead to blob storage services being unable to service the
> URI request.
> For example, a filename of "Ausländische.jpg" currently requests a
> Content-Disposition header that looks like:
> {noformat}
> inline; filename="Ausländische.jpg"; filename*=UTF-8''Ausla%CC%88ndische.jpg
> {noformat}
> It instead should look like:
> {noformat}
> inline; filename="Ausla?ndische.jpg"; filename*=UTF-8''Ausla%CC%88ndische.jpg
> {noformat}
>
>