Cool

2016-03-01 2:48 GMT+08:00 Chong Chen <[email protected]>:

> Thanks, it is clear and helpful!
>
> *From:* haosdent [mailto:[email protected]]
> *Sent:* Saturday, February 27, 2016 2:28 AM
> *To:* user
> *Subject:* Re: How did the mesos master detect the disconnect of a
> framework (scheduler)
>
> Joseph's explanation is quite detailed. 👍
>
> On Feb 27, 2016 3:33 AM, "Joseph Wu" <[email protected]> wrote:
>
> Here's a brief(?) run-down:
>
> 1. https://github.com/apache/mesos/blob/4376803007446b949840d53945547d8a61b91339/src/master/master.cpp#L5739-L5748
>    <https://github.com/apache/mesos/blob/master/src/master/master.cpp#L5739-L5748>
>    When a new framework is added, the master opens a socket connection
>    with the framework.
>
>    - If this is a scheduler-driver-based framework, this is a plain
>      socket connection.
>    - If this is a new HTTP API framework, the master uses the
>      streaming HTTP connection instead.
>
>    1. The HTTP API framework's exit logic is simpler to explain. When
>       the streaming connection closes, the master considers the
>       framework to have exited. In the above code, see this chunk:
>
>           http.closed()
>             .onAny(defer(self(), &Self::exited, framework->id(), http));
>
>    2. The scheduler-driver-based framework exit is a bit more involved:
>
>       1. https://github.com/apache/mesos/blob/4376803007446b949840d53945547d8a61b91339/3rdparty/libprocess/src/process.cpp#L1326
>          Libprocess has a SocketManager which, as the name suggests,
>          manages sockets. Linking the master <-> framework spawns a
>          socket here.
>       2. https://github.com/apache/mesos/blob/4376803007446b949840d53945547d8a61b91339/3rdparty/libprocess/src/process.cpp#L1394-L1400
>          Linking will install a dispatch loop, which continually reads
>          the data from the socket until the socket closes.
>       3. https://github.com/apache/mesos/blob/4376803007446b949840d53945547d8a61b91339/3rdparty/libprocess/src/process.cpp#L1300-L1312
>          The dispatch loop calls "ignore_recv_data". This detects when
>          the socket closes and calls "SocketManager->close(s)".
>       4. https://github.com/apache/mesos/blob/4376803007446b949840d53945547d8a61b91339/3rdparty/libprocess/src/process.cpp#L1928
>          "SocketManager->close" will generate a libprocess
>          "ExitedEvent".
>       5. https://github.com/apache/mesos/blob/4376803007446b949840d53945547d8a61b91339/src/master/master.cpp#L1352
>          Master has a listener for "ExitedEvent" which rate-limits
>          these events.
>       6. https://github.com/apache/mesos/blob/4376803007446b949840d53945547d8a61b91339/src/master/master.cpp#L1161
>          The "ExitedEvent" eventually gets propagated to that ^ method
>          (through a libprocess event visitor).
>       7. https://github.com/apache/mesos/blob/4376803007446b949840d53945547d8a61b91339/src/master/master.cpp#L1165
>          Finally, the framework gets removed.
>
> Hope that helps,
>
> ~Joseph
>
> On Fri, Feb 26, 2016 at 10:45 AM, Chong Chen <[email protected]>
> wrote:
>
> Hi,
>
> When a running framework is disconnected (manually terminated), the
> Mesos master detects it immediately. The Master::exited() function is
> invoked with the log message “framework disconnected”.
>
> I'm just wondering how this disconnect detection is implemented in
> Mesos. I can’t find any place in the Mesos src directory where the
> Master::exited() function is called.
>
> Thanks!
>
> Best Regards,
>
> Chong

-- 
Deshi Xiao
Twitter: xds2000
E-mail: xiaods(AT)gmail.com
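
For readers who want to see the scheduler-driver path from Joseph's run-down in miniature (read from the linked socket until it closes, emit an exited event, then remove the framework), here is a rough, synchronous C++ sketch. It is not Mesos or libprocess code: `ExitedEvent` here is only a stand-in struct, and `ignoreUntilClosed`, `ToyMaster` and `simulateDisconnect` are hypothetical names made up for illustration. The real implementation is asynchronous and rate-limits the events, which this sketch does not attempt.

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the libprocess "ExitedEvent" mentioned in the
// thread; it only records which socket went away.
struct ExitedEvent {
  int socket;
};

// Reads and discards data from `fd` until the peer closes the connection,
// then closes our end and fires `onExited`. Returns the bytes discarded.
std::size_t ignoreUntilClosed(
    int fd, const std::function<void(const ExitedEvent&)>& onExited) {
  char buffer[256];
  std::size_t discarded = 0;
  for (;;) {
    const ssize_t n = ::recv(fd, buffer, sizeof(buffer), 0);
    if (n > 0) {
      discarded += static_cast<std::size_t>(n);  // data is ignored
      continue;
    }
    if (n < 0 && errno == EINTR) {
      continue;  // interrupted by a signal, retry
    }
    // n == 0 means the peer closed; other errors are treated the same here.
    ::close(fd);
    onExited(ExitedEvent{fd});
    return discarded;
  }
}

// Hypothetical, heavily simplified "master" that tracks frameworks by socket.
struct ToyMaster {
  std::map<int, std::string> frameworks;
  std::vector<std::string> removed;

  // Plays the role of the exited handler: removes the matching framework.
  void exited(const ExitedEvent& event) {
    auto it = frameworks.find(event.socket);
    if (it == frameworks.end()) {
      return;
    }
    removed.push_back(it->second);
    frameworks.erase(it);
  }
};

// Simulates a framework that sends some data and is then terminated.
// Returns the names of the frameworks the toy master removed.
std::vector<std::string> simulateDisconnect() {
  int fds[2];
  if (::socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) {
    return {};
  }

  ToyMaster master;
  master.frameworks[fds[0]] = "framework-1";

  const ssize_t sent = ::send(fds[1], "hello", 5, 0);
  assert(sent == 5);
  (void)sent;
  ::close(fds[1]);  // the "framework" goes away

  ignoreUntilClosed(fds[0], [&master](const ExitedEvent& event) {
    master.exited(event);
  });
  return master.removed;
}
```

Calling `simulateDisconnect()` returns a one-element list containing "framework-1": the master side never polls the framework, it simply notices that reading from the socket hit end-of-stream.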

