First, my apologies for getting to this party so late. It's great to see people interested in helping create native-language Mesos libraries.
Vladimir: my presentation was definitely referring to the low-level protocol between master, framework (scheduler), slave, and executors. I'll do my best here to clarify how the current protocol works and what we need to do to get it to the point where we can write native-language libraries. (Eventually it would be great to move some of this into documentation as necessary.)

As Nikita pointed out, the protocol is currently "HTTP-like". As my presentation describes, think actors and one-way message passing when considering how the protocol works. To send a message, an actor POSTs an HTTP request in which the first component of the request path names the actor that is supposed to receive the message, and the remainder of the path is the name of the message. To distinguish one of these "messages" from a normal HTTP request, we check whether the 'User-Agent' is 'libprocess/...'. For example:

    POST /master/mesos.internal.RegisterFrameworkMessage HTTP/1.1
    User-Agent: libprocess/scheduler(1)@10.0.1.7:53523
    ...

represents a message with the name 'mesos.internal.RegisterFrameworkMessage' destined for the actor 'master', coming from the actor 'scheduler(1)' at 10.0.1.7:53523. If the 'master' actor were to send a message back, it would look something like this:

    POST /scheduler(1)/mesos.internal.FrameworkRegisteredMessage HTTP/1.1
    User-Agent: libprocess/master@10.0.1.7:5050

So, one-way message passing via HTTP POST.

The message data is carried as the body of the HTTP request (which can be specified using _either_ Content-Length or a Transfer-Encoding, and as Nikita points out we use chunked transfer encoding internally). The data is arbitrary, and the receiving actor ultimately decides how it wants to "parse" it. In Mesos, 99% of our messages use serialized protobufs, but we also send a few messages with just arbitrary data. All this really means is that knowing the actor and message name is not enough; you also need to know what the body type is supposed to be for that message.
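To make the framing above concrete, here is a minimal Python sketch of how such a message could be put on the wire and recognized on the other end. It is not taken from the libprocess source: the function names `encode_message` and `decode_message` are hypothetical, it handles only Content-Length bodies (not the chunked encoding used internally), and it leaves out the other headers elided as "..." in the example.

```python
def encode_message(to_actor, message_name, from_pid, body):
    """Build the raw bytes of a one-way message: POST /<actor>/<message name>.

    from_pid is the sender, e.g. "scheduler(1)@10.0.1.7:53523".
    body is the already-serialized payload (usually a serialized protobuf).
    """
    head = (
        f"POST /{to_actor}/{message_name} HTTP/1.1\r\n"
        f"User-Agent: libprocess/{from_pid}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


def decode_message(raw):
    """Split a raw request into sender, recipient, message name and body.

    Returns None when the User-Agent does not start with "libprocess/",
    i.e. when the request looks like a normal HTTP request.
    """
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("ascii").split("\r\n")
    method, path, _version = lines[0].split(" ")
    if method != "POST":
        raise ValueError("messages are sent with POST")

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    agent = headers.get("user-agent", "")
    if not agent.startswith("libprocess/"):
        return None

    # First path component is the receiving actor, the rest is the message name.
    to_actor, _, message_name = path.lstrip("/").partition("/")
    length = int(headers.get("content-length", len(body)))
    return {
        "from": agent[len("libprocess/"):],
        "to": to_actor,
        "name": message_name,
        "body": body[:length],  # still opaque: the actor decides how to parse it
    }
```

Note that the decoded body stays an opaque byte string: the receiver still has to know which protobuf type (or other format) goes with that message name before it can parse it.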
In the future we'll probably enable messages with either JSON or serialized protobuf[1] ... for now, just serialized protobuf.

Okay, so where does this break down when trying to do this language-natively? I've had some of this in the works, and this conversation has motivated me to publish some reviews addressing the issues:

(1) We'll need to return a response if one plans to use a native HTTP library, since it'll expect request/response. https://reviews.apache.org/r/20276 introduces responding with a '202 Accepted' for these messages (from the HTTP specification, a '202 Accepted': "The request has been accepted for processing, but the processing has not been completed. The request might or might not eventually be acted upon, as it might be disallowed when processing actually takes place. There is no facility for re-sending a status code from an asynchronous operation such as this.").

(2) Most HTTP libraries will set their 'User-Agent' themselves, so https://reviews.apache.org/r/20277 introduces a 'libprocess-from' header that works similarly to User-Agent. There is still some cleanup I'd love to do around stringification of PIDs (the underlying type Mesos uses for remote actors, inspired by Erlang). Until then, the 'libprocess-from' string is unfortunately esoteric (see the test).

The combination of these two patches should make sending and receiving messages straightforward. However, we still plan to expose the low-level Event and Call protobuf messages, and that will be the preferred approach for building a native-language library. Follow along at https://issues.apache.org/jira/browse/MESOS-1127 for more details. (To be clear, you'd still be able to implement native-language libraries with the patches above, but we'll be deprecating the protobufs you'd be using in favor of Event and Call protobufs instead. If you're eager to get that going before Event and Call are committed, I'm happy to discuss the existing protobufs in more detail.)

I hope this helps.

Ben.
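For anyone who wants to experiment with (1) and (2) above, here is a rough Python sketch of both sides using only the standard library. It is an illustration under assumptions, not how libprocess implements it: `MessageHandler`, `start_receiver`, `send_message` and the `received` list are hypothetical names, replying 400 to non-message requests is my own choice for the sketch, and the 'libprocess-from' value is treated as an opaque string supplied by the caller because its exact format is defined by the patch and its test, not here.

```python
import http.client
import http.server
import threading

# Messages seen by the receiver: (sender, recipient actor, message name, body).
received = []


class MessageHandler(http.server.BaseHTTPRequestHandler):
    def do_POST(self):
        # Prefer 'libprocess-from'; fall back to a 'libprocess/...' User-Agent.
        sender = self.headers.get("libprocess-from")
        if sender is None:
            agent = self.headers.get("User-Agent", "")
            if agent.startswith("libprocess/"):
                sender = agent[len("libprocess/"):]
        if sender is None:
            # Assumption for this sketch: reject anything that is not a message.
            self.send_error(400, "not a libprocess message")
            return

        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        actor, _, name = self.path.lstrip("/").partition("/")
        received.append((sender, actor, name, body))

        # One-way message: acknowledge receipt only, no result is returned.
        self.send_response(202)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def start_receiver(host="127.0.0.1", port=0):
    """Start the receiver in a background thread; port 0 picks a free port."""
    server = http.server.HTTPServer((host, port), MessageHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def send_message(host, port, to_actor, message_name, from_pid, body):
    """POST one message with a stock HTTP client and return the status code."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request(
            "POST",
            f"/{to_actor}/{message_name}",
            body=body,
            headers={"libprocess-from": from_pid},
        )
        response = conn.getresponse()
        response.read()
        return response.status
    finally:
        conn.close()
```

A sender built this way should see status 202 for every accepted message. Any reply from the receiving actor arrives as a separate POST in the other direction, so a native library has to run a receiver of its own as well.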
On Fri, Apr 11, 2014 at 4:54 AM, Vladimir Vivien <vladimir.viv...@gmail.com> wrote:

> Nikita
> Thanks for the JIRA.
>
> On Wed, Apr 9, 2014 at 2:16 PM, Vetoshkin Nikita <nikita.vetosh...@gmail.com> wrote:
>
> > BTW, there is also somehow related ticket
> > https://issues.apache.org/jira/browse/MESOS-930
> >
> > On Wed, Apr 9, 2014 at 9:54 PM, Benjamin Mahler <benjamin.mah...@gmail.com> wrote:
> >
> > > > I thought the low-level api being referred in the video had to do with communication between master and framework|executor for scheduling. But, it's really administrative. I thought that would have been an opportunity for a Go binding that did not require the C++ libraries.
> > >
> > > Vladimir, the low-level API referred to in the talk is exactly what you're interpreting, it is for communication between master and scheduler, and slave and executor. You could definitely build pure go bindings as you described, just not with JSON.
> > >
> > > Forget I mentioned anything about the administrative endpoints and JSON, as I see that's leading to confusion. ;)
> > >
> > > On Wed, Apr 9, 2014 at 3:39 AM, Vladimir Vivien <vladimir.viv...@gmail.com> wrote:
> > >
> > > > Ben,
> > > > Thank you for clarifying. I thought the low-level api being referred in the video had to do with communication between master and framework|executor for scheduling. But, it's really administrative. I thought that would have been an opportunity for a Go binding that did not require the C++ libraries.
> > > >
> > > > Thanks anyway.
> > > >
> > > > On Tue, Apr 8, 2014 at 4:52 PM, Benjamin Mahler <benjamin.mah...@gmail.com> wrote:
> > > >
> > > > > Sorry, I was not referring to implementing a scheduler via JSON instead of protobuf, in theory that would be possible but there has been no planning in this area. Sorry for the confusion.
> > > > >
> > > > > I was referring to administrative endpoints. For example, kicking a framework out or telling the master a slave needs to be repaired. These endpoints may rely on the ability to convert JSON to internal protobufs.
> > > > >
> > > > > Can you clarify what you're looking to do? Are you looking to implement an API in Go that communicates with JSON instead of serialized protobuf?
> > > > >
> > > > > On Tue, Apr 8, 2014 at 1:19 PM, Vladimir Vivien <vladimir.viv...@gmail.com> wrote:
> > > > >
> > > > > > Ben,
> > > > > > That is exactly what I am asking.
> > > > > > Is that something coming up soon, is there a JIRA I can look at?
> > > > > > I wanna get early start on a native json Go api or even help out if possible.
> > > > > >
> > > > > > On Tue, Apr 8, 2014 at 3:25 PM, Benjamin Mahler <benjamin.mah...@gmail.com> wrote:
> > > > > >
> > > > > > > +vinod, benh
> > > > > > >
> > > > > > > Hey Vladimir, there will be some authenticated REST endpoints at some point, there is some work in this area underway.
> > > > > > >
> > > > > > > We have the ability to encode protobuf messages as JSON, so the plan was to have any REST endpoints directly use JSON to send us protobuf messages. I'm not sure if this is what you're asking though?
> > > > > > >
> > > > > > > On Tue, Apr 8, 2014 at 11:13 AM, Vetoshkin Nikita <nikita.vetosh...@gmail.com> wrote:
> > > > > > >
> > > > > > > > I'm not a mesos guy, just very curious. But in my opinion - I doubt it, HTTP is synchronous request-response protocol. Mesos needs something more robust for message passing. Websockets anyone? :)
> > > > > > > >
> > > > > > > > On Tue, Apr 8, 2014 at 10:08 PM, Vladimir Vivien <vladimir.viv...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Ben / Nikita
> > > > > > > > > Thanks for the pointers.
> > > > > > > > > So, (without digging yet) is it a fair summary to say that libprocess wraps protobufs-encoded calls and push them over HTTP to master/slaves? Will protobuf (eventually) be supplanted by direct HTTP via REST or similar?
> > > > > > > > >
> > > > > > > > > On Mon, Apr 7, 2014 at 2:54 PM, Vetoshkin Nikita <nikita.vetosh...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Or, just to get to know - you can take tcpdump and take a look :)
> > > > > > > > > >
> > > > > > > > > > I personally wouldn't call that HTTP. Something "HTTP-like" would describe it better. Because it's not request-response. It's just message passing, no need to wait for the answer - send new message one after another. Every message is POST with address and message type encoded in URI: POST /executor(1)/mesos.internal.RunTaskMessage. Sender is encoded in User-Agent header, e.g: libprocess/slave(1)@127.0.0.1:5051. Body contains protobuf message, Transfer-Encoding is always "chunked".
> > > > > > > > > >
> > > > > > > > > > On Mon, Apr 7, 2014 at 10:42 PM, Benjamin Mahler <benjamin.mah...@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Unfortunately you will need to learn this by looking at the code in libprocess, as the message passing format is not explicitly documented at the current time.
> > > > > > > > > > >
> > > > > > > > > > > Start with calls like ProtobufProcess::send() and dig your way down.
> > > > > > > > > > >
> > > > > > > > > > > On Sat, Apr 5, 2014 at 7:52 AM, Vladimir Vivien <vladimir.viv...@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > I was watching this video from https://www.youtube.com/watch?v=n5GT7OFSh58 from Ben where he talked about the wire protocol for Mesos being done in HTTP.
> > > > > > > > > > > >
> > > > > > > > > > > > Where can I learn about the low-level wire protocol either in documentation or browsing through the code.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks.