Anand, thanks for the explanation. I'm still a little puzzled why curl behaves so strangely. I will check how other clients behave as soon as I have a chance.
Vinod, what exactly is the benefit of using RecordIO here? Doesn't it make the content-type somewhat wrong? If I send 'Accept: application/json' and receive 'Content-Type: application/json', I actually expect to receive only JSON in the message.

Thanks,
Dario

> On 28.08.2015, at 18:13, Vinod Kone <[email protected]> wrote:
>
> I'm happy to add the "\n" after the event (note it's different from chunk) if
> that makes CURL play nicer. I'm not sure about the "\r" part though? Is that
> a nice to have or does it have some other benefit?
>
> The design doc is not set in the stone since this has not been released yet.
> So definitely want to do the right/easy thing.
>
>> On Fri, Aug 28, 2015 at 7:53 AM, Anand Mazumdar <[email protected]> wrote:
>> Dario,
>>
>> Thanks for the detailed explanation and for trying out the new API. However,
>> this is not a bug. The output from CURL is the encoding used by Mesos for
>> the events stream. From the user doc:
>>
>> "Master encodes each Event in RecordIO format, i.e., string representation
>> of length of the event in bytes followed by JSON or binary Protobuf
>> (possibly compressed) encoded event. Note that the value of length will
>> never be '0' and the size of the length will be the size of unsigned integer
>> (i.e., 64 bits). Also, note that the RecordIO encoding should be decoded by
>> the scheduler whereas the underlying HTTP chunked encoding is typically
>> invisible at the application (scheduler) layer."
>>
>> If you run CURL with tracing enabled i.e. --trace, the output would be
>> something similar to this:
>>
>> <= Recv header, 2 bytes (0x2)
>> 0000: 0d 0a                                           ..
>> <= Recv data, 115 bytes (0x73)
>> 0000: 36 64 0d 0a 31 30 35 0a 7b 22 73 75 62 73 63 72 6d..105.{"subscr
>> 0010: 69 62 65 64 22 3a 7b 22 66 72 61 6d 65 77 6f 72 ibed":{"framewor
>> 0020: 6b 5f 69 64 22 3a 7b 22 76 61 6c 75 65 22 3a 22 k_id":{"value":"
>> 0030: 32 30 31 35 30 38 32 35 2d 31 30 33 30 31 38 2d 20150825-103018-
>> 0040: 33 38 36 33 38 37 31 34 39 38 2d 35 30 35 30 2d 3863871498-5050-
>> 0050: 31 31 38 35 2d 30 30 31 30 22 7d 7d 2c 22 74 79 1185-0010"}},"ty
>> 0060: 70 65 22 3a 22 53 55 42 53 43 52 49 42 45 44 22 pe":"SUBSCRIBED"
>> 0070: 7d 0d 0a                                        }..
>> <others
>>
>> In the output above, the chunks are correctly delimited by 'CRLF' (0d 0a) as
>> per the HTTP RFC. As mentioned earlier, the output that you observe on
>> stdout with CURL is of the Record-IO encoding used for the events stream (
>> and is not related to the RFC ):
>>
>> event = event-size LF
>>         event-data
>>
>> Looking forward to more bug reports as you try out the new API !
>>
>> -anand
>>
>>> On Aug 28, 2015, at 12:56 AM, Dario Rexin <[email protected]> wrote:
>>>
>>> -1 (non-binding)
>>>
>>> I found a breaking bug in the new HTTP API. The messages do not conform to
>>> the HTTP standard for chunked transfer encoding. In RFC 2616 Sec. 3
>>> (http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html) a chunk is defined
>>> as:
>>>
>>> chunk = chunk-size [ chunk-extension ] CRLF
>>>         chunk-data CRLF
>>>
>>> The HTTP API currently sends a chunk as:
>>>
>>> chunk = chunk-size LF
>>>         chunk-data
>>>
>>> A standard conform HTTP client like curl can't correctly interpret the data
>>> as a complete chunk. In curl it currently looks like this:
>>>
>>> 104
>>> {"subscribed":{"framework_id":{"value":"20150820-114552-16777343-5050-43704-0000"}},"type":"SUBSCRIBED"}20
>>> {"type":"HEARTBEAT"}666
>>> … waiting …
>>> {"offers":{"offers":[{"agent_id":{"value":"20150820-114552-16777343-5050-43704-S0"},"framework_id":{"value":"20150820-114552-16777343-5050-43704-0000"},"hostname":"localhost","id":{"value":"20150820-114552-16777343-5050-43704-O0"},"resources":[{"name":"cpus","role":"*","scalar":{"value":8},"type":"SCALAR"},{"name":"mem","role":"*","scalar":{"value":15360},"type":"SCALAR"},{"name":"disk","role":"*","scalar":{"value":2965448},"type":"SCALAR"},{"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"role":"*","type":"RANGES"}],"url":{"address":{"hostname":"localhost","ip":"127.0.0.1","port":5051},"path":"\/slave(1)","scheme":"http"}}]},"type":"OFFERS"}20
>>> … waiting …
>>> {"type":"HEARTBEAT"}20
>>> … waiting …
>>>
>>> It will receive a couple of messages after successful registration with the
>>> master and the last thing printed is a number (in this case 666). Then
>>> after some time it will print the first offers message followed by the
>>> number 20. The explanation for this behavior is, that curl can't interpret
>>> the data it gets from Mesos as a complete chunk and waits for the missing
>>> data. So it prints what it thinks is a chunk (a message followed by the
>>> size of the next message) and keeps the rest of the message until another
>>> message arrives and so on. The fix for this is to terminate both lines, the
>>> message size and the message data, with CRLF.
>>>
>>> Cheers,
>>> Dario
>
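
For anyone following the thread, here is a minimal Python sketch of how a client might decode the framing Anand describes (event-size, LF, event-data) after the HTTP layer has already stripped the chunked transfer encoding. The names `RecordIODecoder` and `encode` are hypothetical and invented for illustration; they are not part of Mesos. The sketch assumes the length prefix is a decimal ASCII string, which is what the curl output in this thread suggests.

```python
import json


class RecordIODecoder:
    """Hypothetical incremental decoder for b'<length>\\n<data>' records."""

    def __init__(self):
        self._buffer = b""

    def feed(self, data):
        """Append bytes from the stream and return all complete records."""
        self._buffer += data
        records = []
        while True:
            newline = self._buffer.find(b"\n")
            if newline < 0:
                break  # the length line has not fully arrived yet
            # Assumption: the length is written as decimal ASCII digits.
            length = int(self._buffer[:newline].decode("ascii"))
            start = newline + 1
            end = start + length
            if len(self._buffer) < end:
                break  # the record body is still incomplete
            records.append(self._buffer[start:end])
            self._buffer = self._buffer[end:]
        return records


def encode(event):
    """Frame one event the same way, for testing the decoder."""
    body = json.dumps(event, separators=(",", ":")).encode("utf-8")
    return str(len(body)).encode("ascii") + b"\n" + body


if __name__ == "__main__":
    stream = encode({"type": "HEARTBEAT"}) + encode({"type": "SUBSCRIBED"})
    decoder = RecordIODecoder()
    # Deliver the bytes in two arbitrary pieces, as a socket read might.
    for piece in (stream[:7], stream[7:]):
        for record in decoder.feed(piece):
            print(json.loads(record))
```

Because the decoder buffers until a full record is available, it does not matter where the transport splits the bytes. This also illustrates why curl's stdout shows the size of the next event glued to the end of the previous one: the length prefix is part of the payload, not of the HTTP chunk framing.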

