Okay, thanks!

On 01.09.2015, at 20:57, Benjamin Mahler <[email protected]> wrote:

>> One more question. From the Mesos code it doesn’t look like events are being 
>> split or combined, so given I have a client that gives me access to the 
>> individual chunks, is it safe to assume that each chunk contains exactly one 
>> event? Because that would make parsing the events a lot easier for me.
> 
> No guarantee that a chunk is a full single event.
> 
>> On Tue, Sep 1, 2015 at 1:49 AM, Dario Rexin <[email protected]> wrote:
>> One more question. From the Mesos code it doesn’t look like events are being 
>> split or combined, so given I have a client that gives me access to the 
>> individual chunks, is it safe to assume that each chunk contains exactly one 
>> event? Because that would make parsing the events a lot easier for me.
>> 
>> Thanks,
>> Dario
>> 
>>> On Sep 1, 2015, at 8:42 AM, [email protected] wrote:
>>> 
>>> Hi Vinod,
>>> 
>>> thanks for the explanation, I got it now.
>>> 
>>> Thanks,
>>> Dario
>>> 
>>>> On 31.08.2015, at 23:47, Vinod Kone <[email protected]> wrote:
>>>> 
>>>> I think you might be confusing HTTP chunked encoding with RecordIO 
>>>> encoding. Most HTTP client libraries dechunk the stream before presenting 
>>>> it to the application. So the application needs to know the encoding of 
>>>> the dechunked data to be able to process it.
>>>> 
>>>> In Mesos's case, the server (master here) can encode it in JSON or 
>>>> Protobuf. We wanted to have a consistent way to encode both these formats 
>>>> and the Record-IO format was the one we settled on. Note that this format is 
>>>> also used by the Twitter streaming API (see delimited messages section).
>>>> 
>>>> HTH,
>>>> 
>>>>> On Mon, Aug 31, 2015 at 2:09 PM, Dario Rexin <[email protected]> wrote:
>>>>> Hi Vinod,
>>>>> 
>>>>>> On Aug 31, 2015, at 9:36 PM, Vinod Kone <[email protected]> wrote:
>>>>>> 
>>>>>> Hi Dario,
>>>>>> 
>>>>>> Can you test with "curl --no-buffer" option? Looks like your stdout 
>>>>>> might be line-buffered.
>>>>> 
>>>>> that did the trick, thanks!
>>>>> 
>>>>>> 
>>>>>> The reason we used record-io formatting is to be consistent in how we 
>>>>>> stream protobuf and json encoded data.
>>>>> 
>>>>> How does simple chunked encoding prevent you from doing this?
>>>>> 
>>>>> Thanks,
>>>>> Dario
>>>>> 
>>>>>>> On Fri, Aug 28, 2015 at 2:04 PM, <[email protected]> wrote:
>>>>>>> Anand,
>>>>>>> 
>>>>>>> thanks for the explanation. I didn't think about the case when you have 
>>>>>>> to split a message, now it makes sense.
>>>>>>> 
>>>>>>> But the case I observed with curl is still weird. Even when splitting a 
>>>>>>> message, it should still receive both parts almost at the same time. Do 
>>>>>>> you have any idea why it could behave like this?
>>>>>>> 
>>>>>>>> On 28.08.2015, at 21:31, Anand Mazumdar <[email protected]> wrote:
>>>>>>>> 
>>>>>>>> Dario,
>>>>>>>> 
>>>>>>>> Most HTTP libraries/parsers (including the one that Mesos uses internally) 
>>>>>>>> provide a way to specify a default size for each chunk. If a Mesos 
>>>>>>>> Event is too big, it gets split into smaller chunks, and 
>>>>>>>> vice versa.
>>>>>>>> 
>>>>>>>> -anand
>>>>>>>> 
>>>>>>>>> On Aug 28, 2015, at 11:51 AM, [email protected] wrote:
>>>>>>>>> 
>>>>>>>>> Anand,
>>>>>>>>> 
>>>>>>>>> in the example from my first mail you can see that curl prints the 
>>>>>>>>> size of a message and then waits for the next message and only when 
>>>>>>>>> it receives that message it will print the prior message plus the 
>>>>>>>>> size of the next message, but not the actual message.
>>>>>>>>> 
>>>>>>>>> What's the benefit of encoding multiple messages in a single chunk? 
>>>>>>>>> You could simply create a single chunk per event.
>>>>>>>>> 
>>>>>>>>> Cheers,
>>>>>>>>> Dario
>>>>>>>>> 
>>>>>>>>>> On 28.08.2015, at 19:43, Anand Mazumdar <[email protected]> wrote:
>>>>>>>>>> 
>>>>>>>>>> Dario,
>>>>>>>>>> 
>>>>>>>>>> Can you shed a bit more light on what you still find puzzling about 
>>>>>>>>>> the CURL behavior after my explanation?
>>>>>>>>>> 
>>>>>>>>>> PS: A single HTTP chunk can have 0 or more Mesos (Scheduler API) 
>>>>>>>>>> Events. So in your example, the first chunk had complete information 
>>>>>>>>>> about the first “event”, followed by partial information about the 
>>>>>>>>>> subsequent event from another chunk.
>>>>>>>>>> 
>>>>>>>>>> As for the benefit of using RecordIO format here, how else do you 
>>>>>>>>>> think we could have demarcated two events in the response?
>>>>>>>>>> 
>>>>>>>>>> -anand
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> On Aug 28, 2015, at 10:01 AM, [email protected] wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Anand,
>>>>>>>>>>> 
>>>>>>>>>>> thanks for the explanation. I'm still a little puzzled why curl 
>>>>>>>>>>> behaves so strangely. I will check how other clients behave as soon as 
>>>>>>>>>>> I have a chance.
>>>>>>>>>>> 
>>>>>>>>>>> Vinod,
>>>>>>>>>>> 
>>>>>>>>>>> what exactly is the benefit of using recordio here? Doesn't it make 
>>>>>>>>>>> the content-type somewhat wrong? If I send 'Accept: 
>>>>>>>>>>> application/json' and receive 'Content-Type: application/json', I 
>>>>>>>>>>> actually expect to receive only json in the message.
>>>>>>>>>>> 
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Dario
>>>>>>>>>>> 
>>>>>>>>>>>> On 28.08.2015, at 18:13, Vinod Kone <[email protected]> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> I'm happy to add the "\n" after the event (note it's different 
>>>>>>>>>>>> from chunk) if that makes CURL play nicer. I'm not sure about the 
>>>>>>>>>>>> "\r" part though? Is that a nice to have or does it have some 
>>>>>>>>>>>> other benefit?
>>>>>>>>>>>> 
>>>>>>>>>>>> The design doc is not set in stone since this has not been 
>>>>>>>>>>>> released yet. So definitely want to do the right/easy thing.
>>>>>>>>>>>> 
>>>>>>>>>>>>> On Fri, Aug 28, 2015 at 7:53 AM, Anand Mazumdar 
>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>> Dario,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks for the detailed explanation and for trying out the new 
>>>>>>>>>>>>> API. However, this is not a bug. The output from CURL is the 
>>>>>>>>>>>>> encoding used by Mesos for the events stream. From the user doc:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> "Master encodes each Event in RecordIO format, i.e., string 
>>>>>>>>>>>>> representation of length of the event in bytes followed by JSON 
>>>>>>>>>>>>> or binary Protobuf  (possibly compressed) encoded event. Note 
>>>>>>>>>>>>> that the value of length will never be ‘0’ and the size of the 
>>>>>>>>>>>>> length will be the size of unsigned integer (i.e., 64 bits). 
>>>>>>>>>>>>> Also, note that the RecordIO encoding should be decoded by the 
>>>>>>>>>>>>> scheduler whereas the underlying HTTP chunked encoding is 
>>>>>>>>>>>>> typically invisible at the application (scheduler) layer."
>>>>>>>>>>>>> 
>>>>>>>>>>>>> If you run CURL with tracing enabled, i.e. --trace, the output 
>>>>>>>>>>>>> would be something similar to this:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> <= Recv header, 2 bytes (0x2)
>>>>>>>>>>>>> 0000: 0d 0a                                           ..
>>>>>>>>>>>>> <= Recv data, 115 bytes (0x73)
>>>>>>>>>>>>> 0000: 36 64 0d 0a 31 30 35 0a 7b 22 73 75 62 73 63 72  6d..105.{"subscr
>>>>>>>>>>>>> 0010: 69 62 65 64 22 3a 7b 22 66 72 61 6d 65 77 6f 72  ibed":{"framewor
>>>>>>>>>>>>> 0020: 6b 5f 69 64 22 3a 7b 22 76 61 6c 75 65 22 3a 22  k_id":{"value":"
>>>>>>>>>>>>> 0030: 32 30 31 35 30 38 32 35 2d 31 30 33 30 31 38 2d  20150825-103018-
>>>>>>>>>>>>> 0040: 33 38 36 33 38 37 31 34 39 38 2d 35 30 35 30 2d  3863871498-5050-
>>>>>>>>>>>>> 0050: 31 31 38 35 2d 30 30 31 30 22 7d 7d 2c 22 74 79  1185-0010"}},"ty
>>>>>>>>>>>>> 0060: 70 65 22 3a 22 53 55 42 53 43 52 49 42 45 44 22  pe":"SUBSCRIBED"
>>>>>>>>>>>>> 0070: 7d 0d 0a                                        }..
>>>>>>>>>>>>> <others
>>>>>>>>>>>>> 
>>>>>>>>>>>>> In the output above, the chunks are correctly delimited by 'CRLF' 
>>>>>>>>>>>>> (0d 0a) as per the HTTP RFC. As mentioned earlier, the output 
>>>>>>>>>>>>> that you observe on stdout with CURL is of the Record-IO encoding 
>>>>>>>>>>>>> used for the events stream (and is not related to the RFC):
>>>>>>>>>>>>> 
>>>>>>>>>>>>> event = event-size LF
>>>>>>>>>>>>>              event-data
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Looking forward to more bug reports as you try out the new API!
>>>>>>>>>>>>> 
>>>>>>>>>>>>> -anand
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Aug 28, 2015, at 12:56 AM, Dario Rexin <[email protected]> 
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> -1 (non-binding)
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I found a breaking bug in the new HTTP API. The messages do not 
>>>>>>>>>>>>>> conform to the HTTP standard for chunked transfer encoding. In 
>>>>>>>>>>>>>> RFC 2616 Sec. 3 
>>>>>>>>>>>>>> (http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html) a chunk 
>>>>>>>>>>>>>> is defined as:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> chunk = chunk-size [ chunk-extension ] CRLF
>>>>>>>>>>>>>>         chunk-data CRLF
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> The HTTP API currently sends a chunk as:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> chunk = chunk-size LF
>>>>>>>>>>>>>>         chunk-data
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> A standards-conforming HTTP client like curl can’t correctly 
>>>>>>>>>>>>>> interpret the data as a complete chunk. In curl it currently 
>>>>>>>>>>>>>> looks like this:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 104
>>>>>>>>>>>>>> {"subscribed":{"framework_id":{"value":"20150820-114552-16777343-5050-43704-0000"}},"type":"SUBSCRIBED"}20
>>>>>>>>>>>>>> {"type":"HEARTBEAT"}666
>>>>>>>>>>>>>> … waiting …
>>>>>>>>>>>>>> {"offers":{"offers":[{"agent_id":{"value":"20150820-114552-16777343-5050-43704-S0"},"framework_id":{"value":"20150820-114552-16777343-5050-43704-0000"},"hostname":"localhost","id":{"value":"20150820-114552-16777343-5050-43704-O0"},"resources":[{"name":"cpus","role":"*","scalar":{"value":8},"type":"SCALAR"},{"name":"mem","role":"*","scalar":{"value":15360},"type":"SCALAR"},{"name":"disk","role":"*","scalar":{"value":2965448},"type":"SCALAR"},{"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"role":"*","type":"RANGES"}],"url":{"address":{"hostname":"localhost","ip":"127.0.0.1","port":5051},"path":"\/slave(1)","scheme":"http"}}]},"type":"OFFERS"}20
>>>>>>>>>>>>>> … waiting …
>>>>>>>>>>>>>> {"type":"HEARTBEAT"}20
>>>>>>>>>>>>>> … waiting …
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> It will receive a couple of messages after successful 
>>>>>>>>>>>>>> registration with the master and the last thing printed is a 
>>>>>>>>>>>>>> number (in this case 666). Then after some time it will print 
>>>>>>>>>>>>>> the first offers message followed by the number 20. The 
>>>>>>>>>>>>>> explanation for this behavior is that curl can’t interpret the 
>>>>>>>>>>>>>> data it gets from Mesos as a complete chunk and waits for the 
>>>>>>>>>>>>>> missing data. So it prints what it thinks is a chunk (a message 
>>>>>>>>>>>>>> followed by the size of the next message) and keeps the rest of 
>>>>>>>>>>>>>> the message until another message arrives and so on. The fix for 
>>>>>>>>>>>>>> this is to terminate both lines, the message size and the 
>>>>>>>>>>>>>> message data, with CRLF.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>> Dario
> 
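
Since a chunk is not guaranteed to hold exactly one event, a client has to buffer the dechunked bytes and cut events out by their length prefix. The following is only a minimal sketch of that idea, assuming the framing quoted in this thread (event-size as ASCII decimal, then LF, then event-data) and assuming the HTTP client has already removed the chunked transfer encoding. The class name RecordIODecoder and its feed method are hypothetical names chosen for illustration; they are not part of Mesos or of any library.

```python
from typing import List


class RecordIODecoder:
    """Incrementally decodes 'event-size LF event-data' records.

    Hypothetical helper for illustration only; not a Mesos API.
    """

    def __init__(self) -> None:
        # Bytes received so far that do not yet form a complete event.
        self._buffer = bytearray()

    def feed(self, data: bytes) -> List[bytes]:
        """Append dechunked bytes and return every event completed by them."""
        self._buffer.extend(data)
        events = []
        while True:
            newline = self._buffer.find(b"\n")
            if newline == -1:
                break  # the length prefix itself is still incomplete
            size = int(bytes(self._buffer[:newline]).decode("ascii"))
            start = newline + 1
            end = start + size
            if len(self._buffer) < end:
                break  # the event body has not fully arrived yet
            events.append(bytes(self._buffer[start:end]))
            del self._buffer[:end]  # keep only the unconsumed remainder
        return events


# Two heartbeat events, delivered in pieces that do not line up with events.
heartbeat = b'{"type":"HEARTBEAT"}'  # 20 bytes, as in the curl output above
stream = b"20\n" + heartbeat + b"20\n" + heartbeat

decoder = RecordIODecoder()
first = decoder.feed(stream[:30])   # one full event plus part of the next
second = decoder.feed(stream[30:])  # the rest of the second event
```

With this input, the first call yields one event and the second call yields the other, regardless of where the split falls. Each returned item is the raw event body, which would then be parsed as JSON or Protobuf depending on the Content-Type.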
