[
https://issues.apache.org/jira/browse/MESOS-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14939194#comment-14939194
]
Anand Mazumdar edited comment on MESOS-3562 at 10/16/15 1:50 PM:
-----------------------------------------------------------------
[~BenWhitehead] There seems to be some confusion here. Comments inline.
> This isn't really standard chunks though, there are chunks within chunks and
> the configuration of the client would have to know that.
Can you elaborate on what you mean by chunks within chunks here? We strictly
adhere to the standard chunked transfer encoding defined in RFC 2616. The only
difference is that the {{data}} in each chunk is itself encoded in
{{RecordIO}} format.
> What is the motivation behind using recordio format ?
Intermediaries on the network, e.g. proxies, are free to change the chunk
boundaries, and this should not have any effect on the recipient application.
We wanted a way to delimit/encode consecutive events consistently for both
JSON and Protobuf responses, and the RecordIO format allowed us to do that.
We could have done away with RecordIO for JSON responses by just delimiting
on {{\n}}, but that would have made the behavior inconsistent with Protobuf
responses.
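To make the framing concrete, here is a minimal sketch of how a sender could frame events. {{RecordIoEncoder}} is a hypothetical helper written for illustration, not Mesos code. It assumes the length prefix is the payload size in bytes, written as decimal ASCII and followed by a single {{\n}}. Because the reader relies on the length rather than on a delimiter, the same framing works for a binary payload that happens to contain a newline byte.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of Mesos): frames each payload as
// "<payload length in bytes, decimal ASCII>\n<payload>".
final class RecordIoEncoder {
    static byte[] encode(byte[]... records) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] record : records) {
            // Length header: decimal byte count terminated by '\n'.
            byte[] header = (record.length + "\n").getBytes(StandardCharsets.US_ASCII);
            out.write(header, 0, header.length);
            // Payload bytes are copied verbatim, whether JSON text or binary.
            out.write(record, 0, record.length);
        }
        return out.toByteArray();
    }
}
```

Under these assumptions, encoding the two payloads {{hello}} and {{world!}} yields the byte sequence {{5\nhello6\nworld!}}.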
> If standard encoding were used then every HTTP client would already have the
> necessary understanding to know how to deal with the chunks.
We use the standard chunked encoding as defined in the RFC. What do you mean here?
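For illustration, the {{6c}} in the captured response is the hexadecimal chunk size from the chunked transfer encoding: 0x6c is 108, i.e. the 4 bytes of {{104\n}} plus the 104 bytes of JSON. Below is a rough sketch of what an HTTP client does when it strips the chunk framing. The class {{ChunkedBodySketch}} is hypothetical, ignores chunk extensions and trailers, does no error handling, and assumes the whole body is already in memory; a real client handles all of this for you.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch (not Mesos or Netty code) of removing HTTP/1.1 chunk framing.
final class ChunkedBodySketch {
    // Simplifications: no chunk extensions, no trailers, no error handling.
    static byte[] dechunk(byte[] body) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int pos = 0;
        while (pos < body.length) {
            // The chunk-size line is hexadecimal and ends with CRLF.
            int eol = pos;
            while (body[eol] != '\r') {
                eol++;
            }
            String hex = new String(body, pos, eol - pos, StandardCharsets.US_ASCII);
            int size = Integer.parseInt(hex, 16);
            pos = eol + 2; // skip the CRLF after the size
            if (size == 0) {
                break; // a zero-size chunk marks the end of the body
            }
            out.write(body, pos, size);
            pos += size + 2; // skip the chunk data and its trailing CRLF
        }
        return out.toByteArray();
    }
}
```

Applied to the bytes from the report, starting at {{6c\r\n}}, this sketch returns only the RecordIO data beginning with {{104\n}}.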
> Where is the specification for what recordio format is? I have not been able
> to find anything online.
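We should add more information on this in our docs. Until we do that, here is
a brief description of what the format looks like. Each record is the length
of the record as a decimal string, followed by {{\n}}, followed by the record
data: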
{code}
5\n
hello
6\n
world!
{code}
Ideally, whatever client you are using should do the de-chunking for you. What
you get back from the client should be just the {{RecordIO}} encoded data, i.e.:
{code}
104\n{"subscribed":{"framework_id":{"value":"20150930-103028-16777343-5050-11742-0028"}},"type":"SUBSCRIBED"}
{code}
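Since chunk boundaries may not line up with record boundaries, a consumer has to buffer the de-chunked bytes until a full record is available. The following is a minimal sketch of such a reader, not the Mesos implementation: {{RecordIoDecoder}} and its {{feed}} method are hypothetical names, and it assumes a decimal ASCII byte count followed by {{\n}} and does no validation of malformed headers.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical incremental reader (not part of Mesos) for
// "<length>\n<payload>" records arriving in arbitrary fragments.
final class RecordIoDecoder {
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();

    // Accepts the next fragment of de-chunked bytes and returns every
    // record that is now complete; leftover bytes are kept for the next call.
    List<byte[]> feed(byte[] fragment) {
        pending.write(fragment, 0, fragment.length);
        byte[] buf = pending.toByteArray();
        List<byte[]> records = new ArrayList<>();
        int pos = 0;
        while (true) {
            int nl = pos;
            while (nl < buf.length && buf[nl] != '\n') {
                nl++;
            }
            if (nl == buf.length) {
                break; // length header not complete yet
            }
            int len = Integer.parseInt(
                new String(buf, pos, nl - pos, StandardCharsets.US_ASCII));
            int start = nl + 1;
            if (buf.length - start < len) {
                break; // payload not complete yet
            }
            records.add(Arrays.copyOfRange(buf, start, start + len));
            pos = start + len;
        }
        // Keep only the unconsumed tail for the next fragment.
        pending.reset();
        pending.write(buf, pos, buf.length - pos);
        return records;
    }
}
```

With this sketch, feeding {{5\nhel}} returns no records, and then feeding {{lo6\nworld!}} returns the two records {{hello}} and {{world!}}, regardless of where the fragments were split.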
cc'ing [~bmahler] in case I missed anything.
> Anomalous bytes in stream from HTTPI Api
> ----------------------------------------
>
> Key: MESOS-3562
> URL: https://issues.apache.org/jira/browse/MESOS-3562
> Project: Mesos
> Issue Type: Bug
> Components: HTTP API
> Affects Versions: 0.24.0
> Environment: Linux 3.16.7-24-desktop #1 SMP PREEMPT Mon Aug 3
> 14:37:06 UTC 2015 (ec183cc) x86_64 x86_64 GNU/Linux
> Mesos 0.24.0
> gcc (SUSE Linux) 4.8.3 20140627 [gcc-4_8-branch revision 212064]
> Reporter: Ben Whitehead
> Priority: Blocker
> Labels: http, mesosphere, wireprotocol
> Attachments: app.log, tcpdump.log
>
>
> When connecting to the new HTTP API and attempting to {{SUBSCRIBE}}, there are
> some anomalous bytes contained in the chunked stream that appear to be
> causing problems when I attempt to integrate.
> Attached are two log files. app.log represents my application trying to
> connect to Mesos using RxNetty. Netty has been configured to log all data it
> sends/receives over the wire; this can be seen in the byte blocks in the log.
> The client is constructing a protobuf in Java for the subscribe call:
> {code:java}
> final Call subscribeCall = Call.newBuilder()
>     .setType(Call.Type.SUBSCRIBE)
>     .setSubscribe(
>         Call.Subscribe.newBuilder()
>             .setFrameworkInfo(
>                 Protos.FrameworkInfo.newBuilder()
>                     .setUser("bill")
>                     .setName("testing")
>                     .build()
>             )
>     )
>     .build();
> {code}
>
> The client sends the protobuf to Mesos with the following request headers:
> {code}
> POST /api/v1/scheduler HTTP/1.1
> Content-Type: application/x-protobuf
> Accept: application/json
> Content-Length: 35
> Host: localhost:5050
> User-Agent: RxNetty Client
> {code}
> The body is then serialized via protobuf and sent.
> The response from the Mesos master has the following headers:
> {code}
> HTTP/1.1 200 OK
> Transfer-Encoding: chunked
> Date: Wed, 30 Sep 2015 21:07:16 GMT
> Content-Type: application/json
> {code}
> followed by
> {code}
> \r\n\r\n6c\r\n104\n{"subscribed":{"framework_id":{"value":"20150930-103028-16777343-5050-11742-0028"}},"type":"SUBSCRIBED"}
> {code}
> The {{\r\n\r\n}} is expected for standard HTTP bodies; however, {{6c\r\n}}
> doesn't appear to be attached to anything. {{104}} is the correct length of
> the Subscribe event's JSON.
> What is this extra number and why is it there?
> This is not the first time confusion has come up related to the wire format
> for the event stream from the new HTTP API; see
> [this|http://mail-archives.apache.org/mod_mbox/mesos-user/201508.mbox/%[email protected]%3E]
> message from the mailing list.
> In the [Design
> Doc|https://docs.google.com/document/d/1pnIY_HckimKNvpqhKRhbc9eSItWNFT-priXh_urR-T0/edit#]
> there is a statement that says
> {quote}
> All subsequent events that are relevant to this framework generated by Mesos
> are streamed on this connection. Master encodes each Event in RecordIO
> format, i.e., string representation of length of the event followed by JSON
> or binary Protobuf (possibly compressed) encoded event.
> {quote}
> There is no specification I've been able to find online that actually
> explains this format. The only reference I can find to it is some sample Go
> code.
> The attached tcpdump.log contains a TCP dump between the Mesos master and my
> client collected using the following command {{tcpdump -xx -n -i lo "dst port
> 5050" or "src port 5050" 2>&1 | tee /tmp/tcpdump.log}}