[
https://issues.apache.org/jira/browse/MESOS-2013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16206899#comment-16206899
]
Benjamin Mahler commented on MESOS-2013:
----------------------------------------
[~jgehrcke] the original intention was just to pass the file contents through
directly. Unfortunately, at the time we decided to return the data via JSON and
didn't quite realize that doing so meant interpreting the data as UTF-8
(this is the default JSON encoding, and JSON strings are Unicode). To be
agnostic of encoding, the original implementation would have had to encode the
data (e.g. base64) into the JSON string.
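To illustrate the base64 option, here is a minimal sketch in Python (not the Mesos code, which is C++). The helper names {{encode_file_chunk}} and {{decode_file_chunk}} and the {{data}} key are assumptions made for the example only:

```python
import base64
import json


def encode_file_chunk(raw: bytes) -> str:
    """Wrap raw bytes in a JSON object without assuming any text encoding.

    Hypothetical helper: base64 output is pure ASCII, so it is always a
    valid JSON string regardless of what the file contains.
    """
    return json.dumps({"data": base64.b64encode(raw).decode("ascii")})


def decode_file_chunk(body: str) -> bytes:
    """Client side: recover the exact bytes that were read from the file."""
    return base64.b64decode(json.loads(body)["data"])


# Bytes that are NOT valid UTF-8 as a whole (a lone Latin-1 0xE9 at the end).
raw = b"\xe2\x80\x98 and a Latin-1 byte: \xe9"
body = encode_file_chunk(raw)

# The round trip is lossless; the client decides how to interpret the bytes.
assert decode_file_chunk(body) == raw
```

The cost is that the payload grows by about a third and is no longer human-readable in the raw response, which is the trade-off for not having to guess the encoding on the server.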
I will resolve this ticket since the problem in the description is fixed (it now
works without issue if the file contains UTF-8).
The current state is that if the file uses any encoding other than UTF-8,
we'll spit out the wrong thing from the endpoint. I linked to MESOS-4642, which
is one case of this. Ideally, we would instead return base64 data at a
minimum, with potential support for the client telling us which encoding the
file is expected to be in (and us returning an error if the data is invalid),
or us trying to detect it, or something else.
Note that in the V1 API, this issue is resolved. The
[Response::ReadFile|https://github.com/apache/mesos/blob/1.4.0/include/mesos/v1/master/master.proto#L278-L284]
returns a {{bytes}} in protobuf or a base64 JSON string if the client wants
JSON.
[~jgehrcke] How did you run into this ticket?
> Slave read endpoint doesn't encode non-ascii characters correctly
> -----------------------------------------------------------------
>
> Key: MESOS-2013
> URL: https://issues.apache.org/jira/browse/MESOS-2013
> Project: Mesos
> Issue Type: Bug
> Components: json api
> Reporter: Whitney Sorenson
> Assignee: Anand Mazumdar
>
> Create a file in a sandbox with a non-ascii character, like this one:
> http://www.fileformat.info/info/unicode/char/2018/index.htm
> Hit the read endpoint for that file.
> The response will have something like:
> data: "\u00E2\u0080\u0098"
> It should actually be:
> data: "\u2018"
> If you put either into JSON.parse() in the browser you will see the first
> does not render correctly but the second does.
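The two strings in the description can be reproduced with a short Python sketch. It assumes the original bug amounted to treating each UTF-8 byte as its own code point (equivalent to decoding the bytes as Latin-1); that is inferred from the escapes shown above, not taken from the Mesos source:

```python
import json

# U+2018 (LEFT SINGLE QUOTATION MARK) is three bytes in UTF-8: E2 80 98.
raw = "\u2018".encode("utf-8")

# Assumed buggy behaviour: every byte becomes a separate code point.
wrong = raw.decode("latin-1")

# Expected behaviour: the bytes are decoded as UTF-8 into one character.
right = raw.decode("utf-8")

# json.dumps escapes non-ASCII characters by default, using lowercase hex
# digits (the report shows uppercase; JSON treats both the same).
wrong_json = json.dumps(wrong)
right_json = json.dumps(right)

print(wrong_json)
print(right_json)
```

The first string has three characters and the second has one, which is why only the second renders as the intended quotation mark after {{JSON.parse()}}.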