Repository: mesos
Updated Branches:
  refs/heads/master 2915a80bd -> be2ee8366
Removed embedded interfaces from the framework development guide.

This replaces embedded interfaces in the document to links to the
actual interfaces.

Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/be2ee836
Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/be2ee836
Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/be2ee836

Branch: refs/heads/master
Commit: be2ee83663226e145ef323400edb3aeaa6fbe0a5
Parents: 2915a80
Author: Benjamin Mahler <bmah...@apache.org>
Authored: Wed Jul 11 20:01:43 2018 -0700
Committer: Benjamin Mahler <bmah...@apache.org>
Committed: Wed Jul 11 20:01:43 2018 -0700

----------------------------------------------------------------------
 docs/app-framework-development-guide.md | 362 ++-------------------------
 1 file changed, 25 insertions(+), 337 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/mesos/blob/be2ee836/docs/app-framework-development-guide.md
----------------------------------------------------------------------
diff --git a/docs/app-framework-development-guide.md b/docs/app-framework-development-guide.md
index 835757e..035ac1f 100644
--- a/docs/app-framework-development-guide.md
+++ b/docs/app-framework-development-guide.md
@@ -15,277 +15,28 @@ and Scala.
 ## Create your Framework Scheduler
+### API
 If you are writing a scheduler against Mesos 1.0 or newer, it is recommended to
 use the new [HTTP API](scheduler-http-api.md) to talk to Mesos.
-If your framework needs to talk to Mesos 0.28.0 or older, you can write the
-scheduler in C++, Java/Scala, or Python. Your framework scheduler should inherit
-from the `Scheduler` class (see API below). Your scheduler should create a SchedulerDriver
+If your framework needs to talk to Mesos 0.28.0 or older, or you have not updated to the
+[HTTP API](scheduler-http-api.md), you can write the scheduler in C++, Java/Scala, or Python.
+Your framework scheduler should inherit from the `Scheduler` class
+(see: [C++](https://github.com/apache/mesos/blob/1.6.0/include/mesos/scheduler.hpp#L58-L177),
+[Java](http://mesos.apache.org/api/latest/java/org/apache/mesos/Scheduler.html),
+[Python](https://github.com/apache/mesos/blob/1.6.0/src/python/interface/src/mesos/interface/__init__.py#L34-L137)). Your scheduler should create a SchedulerDriver (which will mediate
+communication between your scheduler and the Mesos master) and then call `SchedulerDriver.run()`
+(see: [C++](https://github.com/apache/mesos/blob/1.6.0/include/mesos/scheduler.hpp#L180-L317),
+[Java](http://mesos.apache.org/api/latest/java/org/apache/mesos/SchedulerDriver.html),
+[Python](https://github.com/apache/mesos/blob/1.6.0/src/python/interface/src/mesos/interface/__init__.py#L140-L278)).
+
+Your scheduler should create a SchedulerDriver
 (which will mediate communication between your scheduler and the Mesos master) and then call
-`SchedulerDriver.run()`.
+`SchedulerDriver.run()`:
-### Scheduler API
+### High Availability
-Callback interface to be implemented by framework schedulers.
-
-Declared in `MESOS_HOME/include/mesos/scheduler.hpp`
-
-~~~{.cpp}
-/*
- * Invoked when the scheduler successfully registers with a Mesos
- * master. A unique ID (generated by the master) used for
- * distinguishing this framework from others and `MasterInfo`
- * with the ip and port of the current master are provided as arguments.
- */
-virtual void registered(
-    SchedulerDriver* driver,
-    const FrameworkID& frameworkId,
-    const MasterInfo& masterInfo);
-
-/*
- * Invoked when the scheduler reregisters with a newly elected Mesos master.
- * This is only called when the scheduler has previously been registered.
- * `MasterInfo` containing the updated information about the elected master
- * is provided as an argument.
- */
-virtual void reregistered(
-    SchedulerDriver* driver,
-    const MasterInfo& masterInfo);
-
-/*
- * Invoked when the scheduler becomes "disconnected" from the master
- * (e.g., the master fails and another is taking over).
- */
-virtual void disconnected(SchedulerDriver* driver);
-
-/*
- * Invoked when resources have been offered to this framework. A
- * single offer will only contain resources from a single slave.
- * Resources associated with an offer will not be re-offered to
- * _this_ framework until either (a) this framework has rejected
- * those resources (see SchedulerDriver::launchTasks) or (b) those
- * resources have been rescinded (see Scheduler::offerRescinded).
- * Note that resources may be concurrently offered to more than one
- * framework at a time (depending on the allocator being used). In
- * that case, the first framework to launch tasks using those
- * resources will be able to use them while the other frameworks
- * will have those resources rescinded (or if a framework has
- * already launched tasks with those resources then those tasks will
- * fail with a TASK_LOST status and a message saying as much).
- */
-virtual void resourceOffers(
-    SchedulerDriver* driver,
-    const std::vector<Offer>& offers);
-
-/*
- * Invoked when an offer is no longer valid (e.g., the slave was
- * lost or another framework used resources in the offer). If for
- * whatever reason an offer is never rescinded (e.g., dropped
- * message, failing over framework, etc.), a framework that attempts
- * to launch tasks using an invalid offer will receive TASK_LOST
- * status updates for those tasks (see Scheduler::resourceOffers).
- */
-virtual void offerRescinded(SchedulerDriver* driver, const OfferID& offerId);
-
-/*
- * Invoked when the status of a task has changed (e.g., a slave is
- * lost and so the task is lost, a task finishes and an executor
- * sends a status update saying so, etc). If implicit
- * acknowledgements are being used, then returning from this
- * callback _acknowledges_ receipt of this status update! If for
- * whatever reason the scheduler aborts during this callback (or
- * the process exits) another status update will be delivered (note,
- * however, that this is currently not true if the slave sending the
- * status update is lost/fails during that time). If explicit
- * acknowledgements are in use, the scheduler must acknowledge this
- * status on the driver.
- */
-virtual void statusUpdate(SchedulerDriver* driver, const TaskStatus& status);
-
-/*
- * Invoked when an executor sends a message. These messages are best
- * effort; do not expect a framework message to be retransmitted in
- * any reliable fashion.
- */
-virtual void frameworkMessage(
-    SchedulerDriver* driver,
-    const ExecutorID& executorId,
-    const SlaveID& slaveId,
-    const std::string& data);
-
-/*
- * Invoked when a slave has been determined unreachable (e.g.,
- * machine failure, network partition). Most frameworks will need to
- * reschedule any tasks launched on this slave on a new slave.
- *
- * NOTE: This callback is not reliably delivered. If a host or
- * network failure causes messages between the master and the
- * scheduler to be dropped, this callback may not be invoked.
- */
-virtual void slaveLost(SchedulerDriver* driver, const SlaveID& slaveId);
-
-/*
- * Invoked when an executor has exited/terminated. Note that any
- * tasks running will have TASK_LOST status updates automagically
- * generated.
- *
- * NOTE: This callback is not reliably delivered. If a host or
- * network failure causes messages between the master and the
- * scheduler to be dropped, this callback may not be invoked.
- */
-virtual void executorLost(
-    SchedulerDriver* driver,
-    const ExecutorID& executorId,
-    const SlaveID& slaveId,
-    int status);
-
-/*
- * Invoked when there is an unrecoverable error in the scheduler or
- * scheduler driver. The driver will be aborted BEFORE invoking this
- * callback.
- */
-virtual void error(SchedulerDriver* driver, const std::string& message);
-~~~
-
-### Scheduler Driver API
-
-The Scheduler Driver is responsible for managing the scheduler's lifecycle
-(e.g., start, stop, or wait to finish) and interacting with Mesos Master
-(e.g., launch tasks, kill tasks, etc.).
-
-Note that this interface is usually not implemented by a framework itself,
-but it describes the possible calls a framework scheduler can make to
-interact with the Mesos Master.
-
-Please note that usage of this interface requires an instantiated
-MesosSchedulerDiver.
-See `src/examples/test_framework.cpp` for an example of using the
-MesosSchedulerDriver.
-
-Declared in `MESOS_HOME/include/mesos/scheduler.hpp`
-
-~~~{.cpp}
-// Starts the scheduler driver. This needs to be called before any
-// other driver calls are made.
-virtual Status start();
-
-// Stops the scheduler driver. If the 'failover' flag is set to
-// false then it is expected that this framework will never
-// reconnect to Mesos. So Mesos will unregister the framework and
-// shutdown all its tasks and executors. If 'failover' is true, all
-// executors and tasks will remain running (for some framework
-// specific failover timeout) allowing the scheduler to reconnect
-// (possibly in the same process, or from a different process, for
-// example, on a different machine).
-virtual Status stop(bool failover = false);
-
-// Aborts the driver so that no more callbacks can be made to the
-// scheduler. The semantics of abort and stop have deliberately been
-// separated so that code can detect an aborted driver (i.e., via
-// the return status of SchedulerDriver::join, see below), and
-// instantiate and start another driver if desired (from within the
-// same process). Note that 'stop()' is not automatically called
-// inside 'abort()'.
-virtual Status abort();
-
-// Waits for the driver to be stopped or aborted, possibly
-// _blocking_ the current thread indefinitely. The return status of
-// this function can be used to determine if the driver was aborted
-// (see mesos.proto for a description of Status).
-virtual Status join();
-
-// Starts and immediately joins (i.e., blocks on) the driver.
-virtual Status run();
-
-// Requests resources from Mesos (see mesos.proto for a description
-// of Request and how, for example, to request resources from
-// specific slaves). Any resources available are offered to the
-// framework via Scheduler::resourceOffers callback, asynchronously.
-virtual Status requestResources(const std::vector<Request>& requests);
-
-// Launches the given set of tasks. Any remaining resources (i.e.,
-// those that are not used by the launched tasks or their executors)
-// will be considered declined. Note that this includes resources
-// used by tasks that the framework attempted to launch but failed
-// (with `TASK_ERROR`) due to a malformed task description. The
-// specified filters are applied on all unused resources (see
-// mesos.proto for a description of Filters). Available resources
-// are aggregated when multiple offers are provided. Note that all
-// offers must belong to the same slave. Invoking this function with
-// an empty collection of tasks declines offers in their entirety
-// (see Scheduler::declineOffer).
-virtual Status launchTasks(
-    const std::vector<OfferID>& offerIds,
-    const std::vector<TaskInfo>& tasks,
-    const Filters& filters = Filters());
-
-// Kills the specified task. Note that attempting to kill a task is
-// currently not reliable. If, for example, a scheduler fails over
-// while it was attempting to kill a task it will need to retry in
-// the future. Likewise, if unregistered / disconnected, the request
-// will be dropped (these semantics may be changed in the future).
-virtual Status killTask(const TaskID& taskId);
-
-// Accepts the given offers and performs a sequence of operations on
-// those accepted offers. See Offer.Operation in mesos.proto for the
-// set of available operations. Any remaining resources (i.e., those
-// that are not used by the launched tasks or their executors) will
-// be considered declined. Note that this includes resources used by
-// tasks that the framework attempted to launch but failed (with
-// `TASK_ERROR`) due to a malformed task description. The specified
-// filters are applied on all unused resources (see mesos.proto for
-// a description of Filters). Available resources are aggregated
-// when multiple offers are provided. Note that all offers must
-// belong to the same slave.
-virtual Status acceptOffers(
-    const std::vector<OfferID>& offerIds,
-    const std::vector<Offer::Operation>& operations,
-    const Filters& filters = Filters());
-
-// Declines an offer in its entirety and applies the specified
-// filters on the resources (see mesos.proto for a description of
-// Filters). Note that this can be done at any time, it is not
-// necessary to do this within the Scheduler::resourceOffers
-// callback.
-virtual Status declineOffer(
-    const OfferID& offerId,
-    const Filters& filters = Filters());
-
-// Removes all filters previously set by the framework (via
-// launchTasks()). This enables the framework to receive offers from
-// those filtered slaves.
-virtual Status reviveOffers();
-
-// Inform Mesos master to stop sending offers to the framework. The
-// scheduler should call reviveOffers() to resume getting offers.
-virtual Status suppressOffers();
-
-// Acknowledges the status update. This should only be called
-// once the status update is processed durably by the scheduler.
-// Not that explicit acknowledgements must be requested via the
-// constructor argument, otherwise a call to this method will
-// cause the driver to crash.
-virtual Status acknowledgeStatusUpdate(const TaskStatus& status);
-
-// Sends a message from the framework to one of its executors. These
-// messages are best effort; do not expect a framework message to be
-// retransmitted in any reliable fashion.
-virtual Status sendFrameworkMessage(
-    const ExecutorID& executorId,
-    const SlaveID& slaveId,
-    const std::string& data);
-
-// Allows the framework to query the status for non-terminal tasks.
-// This causes the master to send back the latest task status for
-// each task in 'statuses', if possible. Tasks that are no longer
-// known will result in a TASK_LOST update. If statuses is empty,
-// then the master will send the latest status for each task
-// currently known.
-virtual Status reconcileTasks(const std::vector<TaskStatus>& statuses);
-~~~
-
-### Handling Failures
-How to build Mesos frameworks that remain available in the face of failures is
+How to build Mesos frameworks that are highly available in the face of failures is
 discussed in a [separate document](high-availability-framework-guide.md).
 ## Working with Executors
@@ -365,79 +116,16 @@ If you are writing an executor against Mesos 1.0 or newer, it is recommended to
 use the new [HTTP API](executor-http-api.md) to talk to Mesos.
 If writing against Mesos 0.28.0 or older, your framework executor must inherit
-from the Executor class. It must override the launchTask() method. You can use
+from the Executor class (see (see: [C++](https://github.com/apache/mesos/blob/1.6.0/include/mesos/executor.hpp#L60-L137),
+[Java](http://mesos.apache.org/api/latest/java/org/apache/mesos/Executor.html),
+[Python](https://github.com/apache/mesos/blob/1.6.0/src/python/interface/src/mesos/interface/__init__.py#L280-L344)). It must override the launchTask() method. You can use
 the $MESOS_HOME environment variable inside of your executor to determine where
-Mesos is running from.
-
-#### Executor API
-
-Declared in `MESOS_HOME/include/mesos/executor.hpp`
-
-~~~{.cpp}
-/*
- * Invoked once the executor driver has been able to successfully
- * connect with Mesos. In particular, a scheduler can pass some
- * data to its executors through the `FrameworkInfo.ExecutorInfo`'s
- * data field.
- */
-virtual void registered(
-    ExecutorDriver* driver,
-    const ExecutorInfo& executorInfo,
-    const FrameworkInfo& frameworkInfo,
-    const SlaveInfo& slaveInfo);
-
-/*
- * Invoked when the executor reregisters with a restarted slave.
- */
-virtual void reregistered(ExecutorDriver* driver, const SlaveInfo& slaveInfo);
-
-/*
- * Invoked when the executor becomes "disconnected" from the slave
- * (e.g., the slave is being restarted due to an upgrade).
- */
-virtual void disconnected(ExecutorDriver* driver);
-
-/*
- * Invoked when a task has been launched on this executor (initiated
- * via Scheduler::launchTasks). Note that this task can be realized
- * with a thread, a process, or some simple computation, however, no
- * other callbacks will be invoked on this executor until this
- * callback has returned.
- */
-virtual void launchTask(ExecutorDriver* driver, const TaskInfo& task);
-
-/*
- * Invoked when a task running within this executor has been killed
- * (via SchedulerDriver::killTask). Note that no status update will
- * be sent on behalf of the executor, the executor is responsible
- * for creating a new TaskStatus (i.e., with TASK_KILLED) and
- * invoking ExecutorDriver::sendStatusUpdate.
- */
-virtual void killTask(ExecutorDriver* driver, const TaskID& taskId);
-
-/*
- * Invoked when a framework message has arrived for this
- * executor. These messages are best effort; do not expect a
- * framework message to be retransmitted in any reliable fashion.
- */
-virtual void frameworkMessage(ExecutorDriver* driver, const std::string& data);
-
-/*
- * Invoked when the executor should terminate all of its currently
- * running tasks. Note that after a Mesos has determined that an
- * executor has terminated any tasks that the executor did not send
- * terminal status updates for (e.g., TASK_KILLED, TASK_FINISHED,
- * TASK_FAILED, etc) a TASK_LOST status update will be created.
- */
-virtual void shutdown(ExecutorDriver* driver);
-
-/*
- * Invoked when a fatal error has occurred with the executor and/or
- * executor driver. The driver will be aborted BEFORE invoking this
- * callback.
- */
-virtual void error(ExecutorDriver* driver, const std::string& message);
-~~~
+Mesos is running from. Your executor should create an ExecutorDriver (which will
+mediate communication between your executor and the Mesos agent) and then call
+`ExecutorDriver.run()`
+(see: [C++](https://github.com/apache/mesos/blob/1.6.0/include/mesos/executor.hpp#L140-L188),
+[Java](http://mesos.apache.org/api/latest/java/org/apache/mesos/ExecutorDriver.html),
+[Python](https://github.com/apache/mesos/blob/1.6.0/src/python/interface/src/mesos/interface/__init__.py#L348-L401)).
 #### Install your custom Framework Executor