Modified: mesos/site/publish/documentation/latest/configuration/index.html URL: http://svn.apache.org/viewvc/mesos/site/publish/documentation/latest/configuration/index.html?rev=1638021&r1=1638020&r2=1638021&view=diff ============================================================================== --- mesos/site/publish/documentation/latest/configuration/index.html (original) +++ mesos/site/publish/documentation/latest/configuration/index.html Tue Nov 11 04:11:00 2014 @@ -123,378 +123,1409 @@ <p><em>These options can be supplied to both masters and slaves.</em></p> -<pre><code> --ip=VALUE IP address to listen on +<table class="table table-striped"> + <thead> + <tr> + <th width="30%"> + Flag + </th> + <th> + Explanation + </th> + </thead> + + <tr> + <td> + --ip=VALUE + </td> + <td> + IP address to listen on + + </td> + </tr> + <tr> + <td> + --[no-]help + </td> + <td> + Prints this help message (default: false) + + </td> + </tr> + <tr> + <td> + --[no-]initialize_driver_logging + </td> + <td> + Whether to automatically initialize google logging of scheduler + and/or executor drivers. 
(default: true) + + </td> + </tr> + <tr> + <td> + --log_dir=VALUE + </td> + <td> + Location to put log files (no default, nothing + is written to disk unless specified; + does not affect logging to stderr) + + </td> + </tr> + <tr> + <td> + --logbufsecs=VALUE + </td> + <td> + How many seconds to buffer log messages for (default: 0) + + </td> + </tr> + <tr> + <td> + --logging_level=VALUE + </td> + <td> + Log message at or above this level; possible values: + 'INFO', 'WARNING', 'ERROR'; if quiet flag is used, this + will affect just the logs from log_dir (if specified) (default: INFO) + + </td> + </tr> + <tr> + <td> + --port=VALUE + </td> + <td> + Port to listen on (master default: 5050 and slave default: 5051) + + </td> + </tr> + <tr> + <td> + --[no-]quiet + </td> + <td> + Disable logging to stderr (default: false) + + </td> + </tr> + <tr> + <td> + --[no-]version + </td> + <td> + Show version and exit. (default: false) +</table> - --[no-]help Prints this help message (default: false) - - --log_dir=VALUE Location to put log files (no default, nothing - is written to disk unless specified; - does not affect logging to stderr) - - --logbufsecs=VALUE How many seconds to buffer log messages for (default: 0) - - --logging_level=VALUE Log message at or above this level; possible values: - 'INFO', 'WARNING', 'ERROR'; if quiet flag is used, this - will affect just the logs from log_dir (if specified) (default: INFO) - - --port=VALUE Port to listen on (master default: 5050 and slave default: 5051) - - --[no-]quiet Disable logging to stderr (default: false) - - --[no-]version Show version and exit. (default: false) -</code></pre> <h2>Master Options</h2> <p><em>Required Flags</em></p> -<pre><code> --quorum=VALUE The size of the quorum of replicas when using 'replicated_log' based - registry. It is imperative to set this value to be a majority of - masters i.e., quorum > (number of masters)/2. - - --work_dir=VALUE Where to store the persistent information stored in the Registry. 
- - --zk=VALUE ZooKeeper URL (used for leader election amongst masters) - May be one of: - zk://host1:port1,host2:port2,.../path - zk://username:password@host1:port1,host2:port2,.../path - file://path/to/file (where file contains one of the above) -</code></pre> +<table class="table table-striped"> + <thead> + <tr> + <th width="30%"> + Flag + </th> + <th> + Explanation + </th> + </thead> + <tr> + <td> + --quorum=VALUE + </td> + <td> + The size of the quorum of replicas when using 'replicated_log' based + registry. It is imperative to set this value to be a majority of + masters i.e., quorum > (number of masters)/2. + + </td> + </tr> + <tr> + <td> + --work_dir=VALUE + </td> + <td> + Where to store the persistent information stored in the Registry. + + </td> + </tr> + <tr> + <td> + --zk=VALUE + </td> + <td> + ZooKeeper URL (used for leader election amongst masters) + May be one of: +<pre><code>zk://host1:port1,host2:port2,.../path +zk://username:password@host1:port1,host2:port2,.../path +file://path/to/file (where file contains one of the above)</code></pre> + </td> + </tr> +</table> + <p><em>Optional Flags</em></p> -<pre><code> --allocation_interval=VALUE Amount of time to wait between performing - (batch) allocations (e.g., 500ms, 1sec, etc). (default: 1secs) - --[no-]authenticate If authenticate is 'true' only authenticated frameworks are allowed - to register. If 'false' unauthenticated frameworks are also - allowed to register. (default: false) - --[no-]authenticate_slaves If 'true' only authenticated slaves are allowed to register. - If 'false' unauthenticated slaves are also allowed to register. (default: false) - --cluster=VALUE Human readable name for the cluster, - displayed in the webui. - --credentials=VALUE Path to a file with a list of credentials. - Each line contains 'principal' and 'secret' separated by whitespace. - Path could be of the form 'file:///path/to/file' or '/path/to/file'. 
- --framework_sorter=VALUE Policy to use for allocating resources - between a given user's frameworks. Options - are the same as for user_allocator. (default: drf) - --hostname=VALUE The hostname the master should advertise in ZooKeeper. - If left unset, the hostname is resolved from the IP address that the master binds to. - --[no-]log_auto_initialize Whether to automatically initialize the replicated log used for the - registry. If this is set to false, the log has to be manually - initialized when used for the very first time. (default: true) - - --recovery_slave_removal_limit=VALUE For failovers, limit on the percentage of slaves that can be removed - from the registry *and* shutdown after the re-registration timeout - elapses. If the limit is exceeded, the master will fail over rather - than remove the slaves. - This can be used to provide safety guarantees for production - environments. Production environments may expect that across Master - failovers, at most a certain percentage of slaves will fail - permanently (e.g. due to rack-level failures). - Setting this limit would ensure that a human needs to get - involved should an unexpected widespread failure of slaves occur - in the cluster. - Values: [0%-100%] (default: 100%) - - --registry=VALUE Persistence strategy for the registry; - available options are 'replicated_log', 'in_memory' (for testing). (default: replicated_log) - - --registry_fetch_timeout=VALUE Duration of time to wait in order to fetch data from the registry - after which the operation is considered a failure. (default: 1mins) - - --registry_store_timeout=VALUE Duration of time to wait in order to store data in the registry - after which the operation is considered a failure. (default: 5secs) - - --[no-]registry_strict Whether the Master will take actions based on the persistent - information stored in the Registry. Setting this to false means - that the Registrar will never reject the admission, readmission, - or removal of a slave. 
Consequently, 'false' can be used to - bootstrap the persistent state on a running cluster. - NOTE: This flag is *experimental* and should not be used in - production yet. (default: false) - - --roles=VALUE A comma separated list of the allocation - roles that frameworks in this cluster may - belong to. - - --[no-]root_submissions Can root submit frameworks? (default: true) - - --slave_reregister_timeout=VALUE The timeout within which all slaves are expected to re-register - when a new master is elected as the leader. Slaves that do not - re-register within the timeout will be removed from the registry - and will be shutdown if they attempt to communicate with master. - NOTE: This value has to be atleast 10mins. (default: 10mins) - - --user_sorter=VALUE Policy to use for allocating resources - between users. May be one of: - dominant_resource_fairness (drf) (default: drf) - - --webui_dir=VALUE Location of the webui files/assets (default: /usr/local/share/mesos/webui) - - --weights=VALUE A comma separated list of role/weight pairs - of the form 'role=weight,role=weight'. Weights - are used to indicate forms of priority. - - --whitelist=VALUE Path to a file with a list of slaves - (one per line) to advertise offers for. - Path could be of the form 'file:///path/to/file' or '/path/to/file'. (default: *) +<table class="table table-striped"> + <thead> + <tr> + <th width="30%"> + Flag + </th> + <th> + Explanation + </th> + </thead> + <tr> + <td> + --acls=VALUE + </td> + <td> + The value could be a JSON formatted string of ACLs + or a file path containing the JSON formatted ACLs used + for authorization. Path could be of the form <code>file:///path/to/file</code> + or <code>/path/to/file</code>. + <p/> + See the ACLs protobuf in mesos.proto for the expected format. 
+ <p/> + JSON file example: +<pre><code>{ + "register_frameworks": [ + { + "principals": { "type": "ANY" }, + "roles": { "values": ["a"] } + } + ], + "run_tasks": [ + { + "principals": { "values": ["a", "b"] }, + "users": { "values": ["c"] } + } + ], + "shutdown_frameworks": [ + { + "principals": { "values": ["a", "b"] }, + "framework_principals": { "values": ["c"] } + } + ] +}</code></pre> + </td> + </tr> + <tr> + <td> + --allocation_interval=VALUE + </td> + <td> + Amount of time to wait between performing + (batch) allocations (e.g., 500ms, 1sec, etc). (default: 1secs) + </td> + </tr> + <tr> + <td> + --[no-]authenticate + </td> + <td> + If authenticate is 'true' only authenticated frameworks are allowed + to register. If 'false' unauthenticated frameworks are also + allowed to register. (default: false) + </td> + </tr> + <tr> + <td> + --[no-]authenticate_slaves + </td> + <td> + If 'true' only authenticated slaves are allowed to register. + <p/> + If 'false' unauthenticated slaves are also allowed to register. (default: false) + </td> + </tr> + <tr> + <td> + --authenticators=VALUE + </td> + <td> + Authenticator implementation to use when authenticating frameworks + and/or slaves. Use the default <code>crammd5</code>, or + load an alternate authenticator module using <code>--modules</code>. (default: crammd5) + </td> + </tr> + <tr> + <td> + --cluster=VALUE + </td> + <td> + Human readable name for the cluster, + displayed in the webui. + </td> + </tr> + <tr> + <td> + --credentials=VALUE + </td> + <td> + Either a path to a text file with a list of credentials, + each line containing 'principal' and 'secret' separated by whitespace, + or, a path to a JSON-formatted file containing credentials. + Path could be of the form <code>file:///path/to/file</code> or <code>/path/to/file</code>. 
+ <p/> + JSON file Example: +<pre><code>{ + "credentials": [ + { + "principal": "sherman", + "secret": "kitesurf" + } + ] +}</code></pre> + + <p/> + Text file Example: +<pre><code> username secret </code></pre> + + </td> + </tr> + <tr> + <td> + --framework_sorter=VALUE + </td> + <td> + Policy to use for allocating resources + between a given user's frameworks. Options + are the same as for user_allocator. (default: drf) + </td> + </tr> + <tr> + <td> + --hostname=VALUE + </td> + <td> + The hostname the master should advertise in ZooKeeper. + If left unset, the hostname is resolved from the IP address + that the master binds to. + </td> + </tr> + <tr> + <td> + --[no-]log_auto_initialize + </td> + <td> + Whether to automatically initialize the replicated log used for the + registry. If this is set to false, the log has to be manually + initialized when used for the very first time. (default: true) + </td> + </tr> + <tr> + <td> + --modules=VALUE + </td> + <td> + List of modules to be loaded and be available to the internal + subsystems. + <p/> + Use <code>--modules=filepath</code> to specify the list of modules via a + file containing a JSON formatted string. 'filepath' can be + of the form <code>file:///path/to/file</code> or <code>/path/to/file</code>. + <p/> + Use <code>--modules="{...}"</code> to specify the list of modules inline. + <p/> + JSON file example: +<pre><code>{ + "libraries": [ + { + "file": "/path/to/libfoo.so", + "modules": [ + { + "name": "org_apache_mesos_bar", + "parameters": [ + { + "key": "X", + "value": "Y" + } + ] + }, + { + "name": "org_apache_mesos_baz" + } + ] + }, + { + "name": "qux", + "modules": [ + { + "name": "org_apache_mesos_norf" + } + ] + } + ] +}</code></pre> + </td> + </tr> + <tr> + <td> + --offer_timeout=VALUE + </td> + <td> + Duration of time before an offer is rescinded from a framework. + <p/> + This helps fairness when running frameworks that hold on to offers, + or frameworks that accidentally drop offers. 
+ + </td> + </tr> + <tr> + <td> + --rate_limits=VALUE + </td> + <td> + The value could be a JSON formatted string of rate limits + or a file path containing the JSON formatted rate limits used + for framework rate limiting. + <p/> + Path could be of the form <code>file:///path/to/file</code> + or <code>/path/to/file</code>. + <p/> + + See the RateLimits protobuf in mesos.proto for the expected format. + <p/> + + Example: +<pre><code>{ + "limits": [ + { + "principal": "foo", + "qps": 55.5 + }, + { + "principal": "bar" + } + ], + "aggregate_default_qps": 33.3 +}</code></pre> + </td> + </tr> + <tr> + <td> + --recovery_slave_removal_limit=VALUE + </td> + <td> + For failovers, limit on the percentage of slaves that can be removed + from the registry *and* shutdown after the re-registration timeout + elapses. If the limit is exceeded, the master will fail over rather + than remove the slaves. + <p/> + This can be used to provide safety guarantees for production + environments. Production environments may expect that across Master + failovers, at most a certain percentage of slaves will fail + permanently (e.g. due to rack-level failures). + <p/> + Setting this limit would ensure that a human needs to get + involved should an unexpected widespread failure of slaves occur + in the cluster. + <p/> + Values: [0%-100%] (default: 100%) + </td> + </tr> + <tr> + <td> + --registry=VALUE + </td> + <td> + Persistence strategy for the registry; + <p/> + available options are 'replicated_log', 'in_memory' (for testing). (default: replicated_log) + </td> + </tr> + <tr> + <td> + --registry_fetch_timeout=VALUE + </td> + <td> + Duration of time to wait in order to fetch data from the registry + after which the operation is considered a failure. (default: 1mins) + </td> + </tr> + <tr> + <td> + --registry_store_timeout=VALUE + </td> + <td> + Duration of time to wait in order to store data in the registry + after which the operation is considered a failure. 
(default: 5secs) + </td> + </tr> + <tr> + <td> + --[no-]registry_strict + </td> + <td> + Whether the Master will take actions based on the persistent + information stored in the Registry. Setting this to false means + that the Registrar will never reject the admission, readmission, + or removal of a slave. Consequently, 'false' can be used to + bootstrap the persistent state on a running cluster. + <p/> + NOTE: This flag is *experimental* and should not be used in + production yet. (default: false) + </td> + </tr> + <tr> + <td> + --roles=VALUE + </td> + <td> + A comma separated list of the allocation + roles that frameworks in this cluster may + belong to. + </td> + </tr> + <tr> + <td> + --[no-]root_submissions + </td> + <td> + Can root submit frameworks? (default: true) + </td> + </tr> + <tr> + <td> + --slave_reregister_timeout=VALUE + </td> + <td> + The timeout within which all slaves are expected to re-register + when a new master is elected as the leader. Slaves that do not + re-register within the timeout will be removed from the registry + and will be shut down if they attempt to communicate with the master. + <p/> + NOTE: This value has to be at least 10mins. (default: 10mins) + </td> + </tr> + <tr> + <td> + --user_sorter=VALUE + </td> + <td> + Policy to use for allocating resources + between users. May be one of: + <p/> + dominant_resource_fairness (drf) (default: drf) + </td> + </tr> + <tr> + <td> + --webui_dir=VALUE + </td> + <td> + Directory path of the webui files/assets (default: /usr/local/share/mesos/webui) + </td> + </tr> + <tr> + <td> + --weights=VALUE + </td> + <td> + A comma separated list of role/weight pairs + of the form 'role=weight,role=weight'. Weights + are used to indicate forms of priority. + </td> + </tr> + <tr> + <td> + --whitelist=VALUE + </td> + <td> + Path to a file with a list of slaves + (one per line) to advertise offers for. + <p/> + Path could be of the form <code>file:///path/to/file</code> or <code>/path/to/file</code>. 
(default: *) + </td> + </tr> + <tr> + <td> + --zk_session_timeout=VALUE + </td> + <td> + ZooKeeper session timeout. (default: 10secs) + </td> + </tr> +</table> - --zk_session_timeout=VALUE ZooKeeper session timeout. (default: 10secs) -</code></pre> <h2>Slave Options</h2> <p><em>Required Flags</em></p> -<pre><code> --master=VALUE May be one of: - zk://host1:port1,host2:port2,.../path - zk://username:password@host1:port1,host2:port2,.../path - file://path/to/file (where file contains one of the above) +<table class="table table-striped"> + <thead> + <tr> + <th width="30%"> + Flag + </th> + <th> + Explanation + </th> + </thead> + <tr> + <td> + --master=VALUE + </td> + <td> + This specifies how to connect to a master or a quorum of masters. The value may take one of three forms: + <ol> + <li> hostname or IP address of a master, or a comma-delimited list of masters, e.g., +<pre><code>--master=localhost:5050 +--master=10.0.0.5:5050,10.0.0.6:5050 </code></pre> + </li> -<p><em>Optional Flags</em></p> + <li> ZooKeeper or quorum hostname/IP + port + master registration path +<pre><code>--master=zk://host1:port1,host2:port2,.../path +--master=zk://username:password@host1:port1,host2:port2,.../path +</code></pre> + </li> -<pre><code> --attributes=VALUE Attributes of machine + <li> a path to a file containing one of the above options +<pre><code> --master=file://path/to/file (where file contains one of the above)</code></pre> + </li> + </ol> + + </td> + </tr> +</table> - --[no-]cgroups_enable_cfs Cgroups feature flag to enable hard limits on CPU resources - via the CFS bandwidth limiting subfeature. 
- (default: false) - - --cgroups_hierarchy=VALUE The path to the cgroups hierarchy root - (default: /sys/fs/cgroup) - - --cgroups_root=VALUE Name of the root cgroup - (default: mesos) - - --cgroups_subsystems=VALUE This flag has been deprecated and is no longer used, - please update your flags - - --[no-]checkpoint Whether to checkpoint slave and frameworks information - to disk. This enables a restarted slave to recover - status updates and reconnect with (--recover=reconnect) or - kill (--recover=cleanup) old executors (default: true) - - --containerizer_path=VALUE The path to the external containerizer executable used when - external isolation is activated (--isolation=external). - - --credential=VALUE Path to a file containing a single line with - the 'principal' and 'secret' separated by whitespace. - Path could be of the form 'file:///path/to/file' or '/path/to/file' - - --default_container_image=VALUE The default container image to use if not specified by a task, - when using external containerizer - - --default_role=VALUE Any resources in the --resources flag that - omit a role, as well as any resources that - are not present in --resources but that are - automatically detected, will be assigned to - this role. (default: *) - - --disk_watch_interval=VALUE Periodic time interval (e.g., 10secs, 2mins, etc) - to check the disk usage (default: 1mins) - - --executor_registration_timeout=VALUE Amount of time to wait for an executor - to register with the slave before considering it hung and - shutting it down (e.g., 60secs, 3mins, etc) (default: 1mins) - - --executor_shutdown_grace_period=VALUE Amount of time to wait for an executor - to shut down (e.g., 60secs, 3mins, etc) (default: 5secs) - - --frameworks_home=VALUE Directory prepended to relative executor URIs (default: ) - - --gc_delay=VALUE Maximum amount of time to wait before cleaning up - executor directories (e.g., 3days, 2weeks, etc). 
- Note that this delay may be shorter depending on - the available disk usage. (default: 1weeks) - - --hadoop_home=VALUE Where to find Hadoop installed (for - fetching framework executors from HDFS) - (no default, look for HADOOP_HOME in - environment or find hadoop on PATH) (default: ) - - --hostname=VALUE The hostname the slave should report. - If left unset, the hostname is resolved from the IP address that the slave binds to. - - --isolation=VALUE Isolation mechanisms to use, e.g., 'posix/cpu,posix/mem' - or 'cgroups/cpu,cgroups/mem' or 'external'. (default: posix/cpu,posix/mem) - - --launcher_dir=VALUE Location of Mesos binaries (default: /usr/local/libexec/mesos) - - --recover=VALUE Whether to recover status updates and reconnect with old executors. - Valid values for 'recover' are - reconnect: Reconnect with any old live executors. - cleanup : Kill any old live executors and exit. - Use this option when doing an incompatible slave - or executor upgrade!). - NOTE: If checkpointed slave doesn't exist, no recovery is performed - and the slave registers with the master as a new slave. (default: reconnect) - - --recovery_timeout=VALUE Amount of time alloted for the slave to recover. If the slave takes - longer than recovery_timeout to recover, any executors that are - waiting to reconnect to the slave will self-terminate. - NOTE: This flag is only applicable when checkpoint is enabled. - (default: 15mins) - - --registration_backoff_factor=VALUE Slave initially picks a random amount of time between [0, b], where - b = register_backoff_factor, to (re-)register with a new master. 
- Subsequent retries are exponentially backed off based on this - interval (e.g., 1st retry uses a random value between [0, b * 2^1], - 2nd retry between [0, b * 2^2], 3rd retry between [0, b * 2^3] etc) - up to a maximum of 1mins (default: 1secs) - - --resource_monitoring_interval=VALUE Periodic time interval for monitoring executor - resource usage (e.g., 10secs, 1min, etc) (default: 1secs) - - --resources=VALUE Total consumable resources per slave, in - the form 'name(role):value;name(role):value...'. - - --slave_subsystems=VALUE List of comma-separated cgroup subsystems to run the slave binary - in, e.g., 'memory,cpuacct'. The default is none. - Present functionality is intended for resource monitoring and - no cgroup limits are set, they are inherited from the root mesos - cgroup. - - --[no-]strict If strict=true, any and all recovery errors are considered fatal. - If strict=false, any expected errors (e.g., slave cannot recover - information about an executor, because the slave died right before - the executor registered.) during recovery are ignored and as much - state as possible is recovered. - (default: true) - - --[no-]switch_user Whether to run tasks as the user who - submitted them rather than the user running - the slave (requires setuid permission) (default: true) - --work_dir=VALUE Where to place framework work directories - (default: /tmp/mesos) -</code></pre> +<p><em>Optional Flags</em></p> -<h2>Mesos Build Configuration Options</h2> +<table class="table table-striped"> + <thead> + <tr> + <th width="30%"> + Flag + </th> + <th> + Explanation + </th> + </thead> + <tr> + <td> + --attributes=VALUE + </td> + <td> + Attributes of machine, in the form: + <p/> + <code>rack:2</code> or <code>'rack:2;u:1'</code> + </td> + </tr> + <tr> + <td> + --[no-]cgroups_enable_cfs + </td> + <td> + Cgroups feature flag to enable hard limits on CPU resources + via the CFS bandwidth limiting subfeature. 
+ (default: false) + </td> + </tr> + <tr> + <td> + --cgroups_hierarchy=VALUE + </td> + <td> + The path to the cgroups hierarchy root + (default: /sys/fs/cgroup) + </td> + </tr> + <tr> + <td> + --[no-]cgroups_limit_swap + </td> + <td> + Cgroups feature flag to enable memory limits on both memory and + swap instead of just memory. + (default: false) + </td> + </tr> + <tr> + <td> + --cgroups_root=VALUE + </td> + <td> + Name of the root cgroup + (default: mesos) + </td> + </tr> + <tr> + <td> + --cgroups_subsystems=VALUE + </td> + <td> + This flag has been deprecated and is no longer used, + please update your flags + </td> + </tr> + <tr> + <td> + --[no-]checkpoint + </td> + <td> + This flag is deprecated and will be removed in a future release. + Whether to checkpoint slave and frameworks information + to disk. This enables a restarted slave to recover + status updates and reconnect with (--recover=reconnect) or + kill (--recover=cleanup) old executors (default: true) + </td> + </tr> + <tr> + <td> + --containerizer_path=VALUE + </td> + <td> + The path to the external containerizer executable used when + external isolation is activated (--isolation=external). + + </td> + </tr> + <tr> + <td> + --containerizers=VALUE + </td> + <td> + Comma separated list of containerizer implementations + to compose in order to provide containerization. + <p/> + Available options are 'mesos', 'external', and + 'docker' (on Linux). The order the containerizers + are specified is the order they are tried + (--containerizers=mesos). + (default: mesos) + </td> + </tr> + <tr> + <td> + --credential=VALUE + </td> + <td> + Either a path to a text file with a single line + containing 'principal' and 'secret' separated by whitespace, + <p/> + or a path to a file containing the JSON formatted information used for one credential. + <p/> + Path could be of the form <code>file:///path/to/file</code> or <code>/path/to/file</code>. 
+ <p/> + JSON file example: +<pre><code>{ + "principal": "username", + "secret": "secret" +}</code></pre> + </td> + </tr> + <tr> + <td> + --default_container_image=VALUE + </td> + <td> + The default container image to use if not specified by a task, + when using external containerizer. + + </td> + </tr> + <tr> + <td> + --default_container_info=VALUE + </td> + <td> + JSON formatted ContainerInfo that will be included into + any ExecutorInfo that does not specify a ContainerInfo. + <p/> + See the ContainerInfo protobuf in mesos.proto for + the expected format. + <p/> + Example: +<pre><code>{ + "type": "MESOS", + "volumes": [ + { + "host_path": "./.private/tmp", + "container_path": "/tmp", + "mode": "RW" + } + ] +}</code></pre> + </td> + </tr> + <tr> + <td> + --default_role=VALUE + </td> + <td> + Any resources in the --resources flag that + omit a role, as well as any resources that + are not present in --resources but that are + automatically detected, will be assigned to + this role. (default: *) + </td> + </tr> + <tr> + <td> + --disk_watch_interval=VALUE + </td> + <td> + Periodic time interval (e.g., 10secs, 2mins, etc) + to check the disk usage (default: 1mins) + </td> + </tr> + <tr> + <td> + --docker=VALUE + </td> + <td> + The absolute path to the docker executable for docker + containerizer. + (default: docker) + </td> + </tr> + <tr> + <td> + --docker_remove_delay=VALUE + </td> + <td> + The amount of time to wait before removing docker containers + (e.g., 3days, 2weeks, etc). + (default: 6hrs) + </td> + </tr> + <tr> + <td> + --docker_sandbox_directory=VALUE + </td> + <td> + The absolute path for the directory in the container where the + sandbox is mapped to. 
+ (default: /mnt/mesos/sandbox) + </td> + </tr> + <tr> + <td> + --executor_registration_timeout=VALUE + </td> + <td> + Amount of time to wait for an executor + to register with the slave before considering it hung and + shutting it down (e.g., 60secs, 3mins, etc) (default: 1mins) + </td> + </tr> + <tr> + <td> + --executor_shutdown_grace_period=VALUE + </td> + <td> + Amount of time to wait for an executor + to shut down (e.g., 60secs, 3mins, etc) (default: 5secs) + </td> + </tr> + <tr> + <td> + --frameworks_home=VALUE + </td> + <td> + Directory path prepended to relative executor URIs (default: ) + </td> + </tr> + <tr> + <td> + --gc_delay=VALUE + </td> + <td> + Maximum amount of time to wait before cleaning up + executor directories (e.g., 3days, 2weeks, etc). + <p/> + Note that this delay may be shorter depending on + the available disk usage. (default: 1weeks) + </td> + </tr> + <tr> + <td> + --hadoop_home=VALUE + </td> + <td> + Path to find Hadoop installed (for + fetching framework executors from HDFS) + (no default, look for HADOOP_HOME in + environment or find hadoop on PATH) (default: ) + </td> + </tr> + <tr> + <td> + --hostname=VALUE + </td> + <td> + The hostname the slave should report. + <p/> + If left unset, the hostname is resolved from the IP address + that the slave binds to. + </td> + </tr> + <tr> + <td> + --isolation=VALUE + </td> + <td> + Isolation mechanisms to use, e.g., 'posix/cpu,posix/mem', or + 'cgroups/cpu,cgroups/mem', or network/port_mapping + (configure with flag: --with-network-isolator to enable), + or 'external', or load an alternate isolator module using + the <code>--modules</code> flag. (default: posix/cpu,posix/mem) + </td> + </tr> + <tr> + <td> + --launcher_dir=VALUE + </td> + <td> + Directory path of Mesos binaries (default: /usr/local/lib/mesos) + </td> + </tr> + <tr> + <td> + --modules=VALUE + </td> + <td> + List of modules to be loaded and be available to the internal + subsystems. 
+ <p/> + Use <code>--modules=filepath</code> to specify the list of modules via a + file containing a JSON formatted string. 'filepath' can be + of the form <code>file:///path/to/file</code> or <code>/path/to/file</code>. + <p/> + Use <code>--modules="{...}"</code> to specify the list of modules inline. + <p/> + JSON file example: +<pre><code> +{ + "libraries": [ + { + "file": "/path/to/libfoo.so", + "modules": [ + { + "name": "org_apache_mesos_bar", + "parameters": [ + { + "key": "X", + "value": "Y" + } + ] + }, + { + "name": "org_apache_mesos_baz" + } + ] + }, + { + "name": "qux", + "modules": [ + { + "name": "org_apache_mesos_norf" + } + ] + } + ] +}</code></pre> + </td> + </tr> + <tr> + <td> + --perf_duration=VALUE + </td> + <td> + Duration of a perf stat sample. The duration must be less + than the perf_interval. (default: 10secs) + </td> + </tr> + <tr> + <td> + --perf_events=VALUE + </td> + <td> + List of comma-separated perf events to sample for each container + when using the perf_event isolator. Default is none. + <p/> + Run command 'perf list' to see all events. Event names are + sanitized by downcasing and replacing hyphens with underscores + when reported in the PerfStatistics protobuf, e.g., cpu-cycles + becomes cpu_cycles; see the PerfStatistics protobuf for all names. + </td> + </tr> + <tr> + <td> + --perf_interval=VALUE + </td> + <td> + Interval between the start of perf stat samples. Perf samples are + obtained periodically according to perf_interval and the most + recently obtained sample is returned rather than sampling on + demand. For this reason, perf_interval is independent of the + resource monitoring interval (default: 1mins) + </td> + </tr> + <tr> + <td> + --recover=VALUE + </td> + <td> + Whether to recover status updates and reconnect with old executors. + <p/> + Valid values for 'recover' are + <p/> + reconnect: Reconnect with any old live executors. + <p/> + cleanup : Kill any old live executors and exit. 
+ <p/> + Use this option when doing an incompatible slave + or executor upgrade. + <p/> + NOTE: If no checkpointed slave exists, no recovery is performed + and the slave registers with the master as a new slave. (default: reconnect) + </td> + </tr> + <tr> + <td> + --recovery_timeout=VALUE + </td> + <td> + Amount of time allotted for the slave to recover. If the slave takes + longer than recovery_timeout to recover, any executors that are + waiting to reconnect to the slave will self-terminate. + <p/> + NOTE: This flag is only applicable when checkpoint is enabled. + (default: 15mins) + </td> + </tr> + <tr> + <td> + --registration_backoff_factor=VALUE + </td> + <td> + Slave initially picks a random amount of time between [0, b], where + b = registration_backoff_factor, to (re-)register with a new master. + <p/> + Subsequent retries are exponentially backed off based on this + interval (e.g., 1st retry uses a random value between [0, b * 2^1], + 2nd retry between [0, b * 2^2], 3rd retry between [0, b * 2^3] etc) + up to a maximum of 1mins (default: 1secs) + </td> + </tr> + <tr> + <td> + --resource_monitoring_interval=VALUE + </td> + <td> + Periodic time interval for monitoring executor + resource usage (e.g., 10secs, 1min, etc) (default: 1secs) + </td> + </tr> + <tr> + <td> + --resources=VALUE + </td> + <td> + Total consumable resources per slave, in the form + <p/> + <code>name(role):value;name(role):value...</code>. + </td> + </tr> + <tr> + <td> + --slave_subsystems=VALUE + </td> + <td> + List of comma-separated cgroup subsystems to run the slave binary + in, e.g., <code>memory,cpuacct</code>. The default is none. + Present functionality is intended for resource monitoring and + no cgroup limits are set; they are inherited from the root mesos + cgroup. + </td> + </tr> + <tr> + <td> + --[no-]strict + </td> + <td> + If strict=true, any and all recovery errors are considered fatal. 
+ <p/> + If strict=false, any expected errors (e.g., slave cannot recover + information about an executor, because the slave died right before + the executor registered.) during recovery are ignored and as much + state as possible is recovered. + (default: true) + </td> + </tr> + <tr> + <td> + --[no-]switch_user + </td> + <td> + Whether to run tasks as the user who + submitted them rather than the user running + the slave (requires setuid permission) (default: true) + </td> + </tr> + <tr> + <td> + --work_dir=VALUE + </td> + <td> + Directory path to place framework work directories + (default: /tmp/mesos) + </td> + </tr> +</table> -<p>The configure script has the following options:</p> -<pre><code>To assign environment variables (e.g., CC, CFLAGS...), specify them as -VAR=VALUE. See below for descriptions of some of the useful variables. +<h2>Mesos Build Configuration Options</h2> -Defaults for the options are specified in brackets. +<h3>The configure script has the following flags for optional features:</h3> -Configuration: - -h, --help display this help and exit - --help=short display options specific to this package - --help=recursive display the short help of all the included packages - -V, --version display version information and exit - -q, --quiet, --silent do not print `checking...' messages - --cache-file=FILE cache test results in FILE [disabled] - -C, --config-cache alias for `--cache-file=config.cache' - -n, --no-create do not create output files - --srcdir=DIR find the sources in DIR [configure dir or `..'] - -Installation directories: - --prefix=PREFIX install architecture-independent files in PREFIX - [/usr/local] - --exec-prefix=EPREFIX install architecture-dependent files in EPREFIX - [PREFIX] - -By default, `make install' will install all the files in -`/usr/local/bin', `/usr/local/lib' etc. You can specify -an installation prefix other than `/usr/local' using `--prefix', -for instance `--prefix=$HOME'. - -For better control, use the options below. 
- -Fine tuning of the installation directories: - --bindir=DIR user executables [EPREFIX/bin] - --sbindir=DIR system admin executables [EPREFIX/sbin] - --libexecdir=DIR program executables [EPREFIX/libexec] - --sysconfdir=DIR read-only single-machine data [PREFIX/etc] - --sharedstatedir=DIR modifiable architecture-independent data [PREFIX/com] - --localstatedir=DIR modifiable single-machine data [PREFIX/var] - --libdir=DIR object code libraries [EPREFIX/lib] - --includedir=DIR C header files [PREFIX/include] - --oldincludedir=DIR C header files for non-gcc [/usr/include] - --datarootdir=DIR read-only arch.-independent data root [PREFIX/share] - --datadir=DIR read-only architecture-independent data [DATAROOTDIR] - --infodir=DIR info documentation [DATAROOTDIR/info] - --localedir=DIR locale-dependent data [DATAROOTDIR/locale] - --mandir=DIR man documentation [DATAROOTDIR/man] - --docdir=DIR documentation root [DATAROOTDIR/doc/mesos] - --htmldir=DIR html documentation [DOCDIR] - --dvidir=DIR dvi documentation [DOCDIR] - --pdfdir=DIR pdf documentation [DOCDIR] - --psdir=DIR ps documentation [DOCDIR] - -Program names: - --program-prefix=PREFIX prepend PREFIX to installed program names - --program-suffix=SUFFIX append SUFFIX to installed program names - --program-transform-name=PROGRAM run sed PROGRAM on installed program names - -System types: - --build=BUILD configure for building on BUILD [guessed] - --host=HOST cross-compile to build programs to run on HOST [BUILD] - --target=TARGET configure for building compilers for TARGET [HOST] - -Optional Features: - --disable-option-checking ignore unrecognized --enable/--with options - --disable-FEATURE do not include FEATURE (same as --enable-FEATURE=no) - --enable-FEATURE[=ARG] include FEATURE [ARG=yes] - --enable-shared[=PKGS] build shared libraries [default=yes] - --enable-static[=PKGS] build static libraries [default=yes] - --enable-fast-install[=PKGS] - optimize for fast installation [default=yes] - 
--disable-dependency-tracking speeds up one-time build - --enable-dependency-tracking do not reject slow dependency extractors - --disable-libtool-lock avoid locking (might break parallel builds) - --disable-java don't build Java bindings - --disable-python don't build Python bindings - --disable-optimize don't try to compile with optimizations - --disable-bundled build against preinstalled dependencies instead of - bundled libraries - -Optional Packages: - --with-PACKAGE[=ARG] use PACKAGE [ARG=yes] - --without-PACKAGE do not use PACKAGE (same as --with-PACKAGE=no) - --with-pic[=PKGS] try to use only PIC/non-PIC objects [default=use - both] - --with-gnu-ld assume the C compiler uses GNU ld [default=no] - --with-sysroot=DIR Search for dependent libraries within DIR - (or the compiler's sysroot if not specified). - --with-zookeeper[=DIR] excludes building and using the bundled ZooKeeper - package in lieu of an installed version at a - location prefixed by the given path - --with-leveldb[=DIR] excludes building and using the bundled LevelDB - package in lieu of an installed version at a - location prefixed by the given path - --without-cxx11 builds Mesos without C++11 support (deprecated) - --with-network-isolator builds the network isolator - -Some influential environment variables: - CC C compiler command - CFLAGS C compiler flags - LDFLAGS linker flags, e.g. -L<lib dir> if you have libraries in a - nonstandard directory <lib dir> - LIBS libraries to pass to the linker, e.g. -l<library> - CPPFLAGS C/C++/Objective C preprocessor flags, e.g. 
-I<include dir> if - you have headers in a nonstandard directory <include dir> - CXX C++ compiler command - CXXFLAGS C++ compiler flags - CPP C preprocessor - CXXCPP C++ preprocessor - JAVA_HOME location of Java Development Kit (JDK) - JAVA_CPPFLAGS - preprocessor flags for JNI - JAVA_JVM_LIBRARY - full path to libjvm.so - MAVEN_HOME looks for mvn at MAVEN_HOME/bin/mvn - PYTHON which Python interpreter to use - PYTHON_VERSION - The installed Python version to use, for example '2.3'. This - string will be appended to the Python interpreter canonical - name. +<table class="table table-striped"> + <thead> + <tr> + <th width="30%"> + Flag + </th> + <th> + Explanation + </th> + </thead> + <tr> + <td> + --enable-shared[=PKGS] + </td> + <td> + build shared libraries [default=yes] + </td> + </tr> + <tr> + <td> + --enable-static[=PKGS] + </td> + <td> + build static libraries [default=yes] + </td> + </tr> + <tr> + <td> + --enable-fast-install[=PKGS] + </td> + <td> + + optimize for fast installation [default=yes] + </td> + </tr> + <tr> + <td> + --disable-libtool-lock + </td> + <td> + avoid locking (might break parallel builds) + </td> + </tr> + <tr> + <td> + --disable-java + </td> + <td> + don't build Java bindings + </td> + </tr> + <tr> + <td> + --disable-python + </td> + <td> + don't build Python bindings + </td> + </tr> + <tr> + <td> + --enable-debug + </td> + <td> + enable debugging. If CFLAGS/CXXFLAGS are set, this + option won't change them default: no + </td> + </tr> + <tr> + <td> + --enable-optimize + </td> + <td> + enable optimizations. 
If CFLAGS/CXXFLAGS are set, + this option won't change them default: no + </td> + </tr> + <tr> + <td> + --disable-bundled + </td> + <td> + build against preinstalled dependencies instead of + bundled libraries + </td> + </tr> + <tr> + <td> + --disable-bundled-distribute + </td> + <td> + + excludes building and using the bundled distribute + package in lieu of an installed version in + PYTHONPATH + </td> + </tr> + <tr> + <td> + --disable-bundled-pip + </td> + <td> + excludes building and using the bundled pip package + in lieu of an installed version in PYTHONPATH + </td> + </tr> + <tr> + <td> + --disable-bundled-wheel + </td> + <td> + excludes building and using the bundled wheel + package in lieu of an installed version in + PYTHONPATH + </td> + </tr> + <tr> + <td> + --disable-python-dependency-install + </td> + <td> + + when the python packages are installed during make + install, no external dependencies are downloaded or + installed + </td> + </tr> +</table> + + +<h3>The configure script has the following flags for optional packages:</h3> + +<table class="table table-striped"> + <thead> + <tr> + <th width="30%"> + Flag + </th> + <th> + Explanation + </th> + </thead> + <tr> + <td> + --with-gnu-ld + </td> + <td> + assume the C compiler uses GNU ld [default=no] + </td> + </tr> + <tr> + <td> + --with-sysroot=DIR + </td> + <td> + Search for dependent libraries within DIR + (or the compiler's sysroot if not specified). 
+ </td> + </tr> + <tr> + <td> + --with-zookeeper[=DIR] + </td> + <td> + excludes building and using the bundled ZooKeeper + package in lieu of an installed version at a + location prefixed by the given path + </td> + </tr> + <tr> + <td> + --with-leveldb[=DIR] + </td> + <td> + excludes building and using the bundled LevelDB + package in lieu of an installed version at a + location prefixed by the given path + </td> + </tr> + <tr> + <td> + --with-glog[=DIR] + </td> + <td> + excludes building and using the bundled glog package + in lieu of an installed version at a location + prefixed by the given path + </td> + </tr> + <tr> + <td> + --with-protobuf[=DIR] + </td> + <td> + excludes building and using the bundled protobuf + package in lieu of an installed version at a + location prefixed by the given path + </td> + </tr> + <tr> + <td> + --with-gmock[=DIR] + </td> + <td> + excludes building and using the bundled gmock + package in lieu of an installed version at a + location prefixed by the given path + </td> + </tr> + <tr> + <td> + --with-curl[=DIR] + </td> + <td> + specify where to locate the curl library + </td> + </tr> + <tr> + <td> + --with-sasl[=DIR] + </td> + <td> + specify where to locate the sasl2 library + </td> + </tr> + <tr> + <td> + --with-zlib[=DIR] + </td> + <td> + specify where to locate the zlib library + </td> + </tr> + <tr> + <td> + --with-apr[=DIR] + </td> + <td> + specify where to locate the apr-1 library + </td> + </tr> + <tr> + <td> + --with-svn[=DIR] + </td> + <td> + specify where to locate the svn-1 library + </td> + </tr> + <tr> + <td> + --with-network-isolator + </td> + <td> + builds the network isolator + </td> + </tr> +</table> + + +<h3>Some influential environment variables for configure script:</h3> + +<p>Use these variables to override the choices made by `configure' or to help +it to find libraries and programs with nonstandard names/locations.</p> + +<table class="table table-striped"> + <thead> + <tr> + <th width="30%"> + Variable + 
</th> + <th> + Explanation + </th> + </thead> + <tr> + <td> + JAVA_HOME + </td> + <td> + location of Java Development Kit (JDK) + </td> + </tr> + <tr> + <td> + JAVA_CPPFLAGS + </td> + <td> + preprocessor flags for JNI + </td> + </tr> + <tr> + <td> + JAVA_JVM_LIBRARY + </td> + <td> + full path to libjvm.so + </td> + </tr> + <tr> + <td> + MAVEN_HOME + </td> + <td> + looks for mvn at MAVEN_HOME/bin/mvn + </td> + </tr> + <tr> + <td> + PROTOBUF_JAR + </td> + <td> + full path to protobuf jar on prefixed builds + </td> + </tr> + <tr> + <td> + PYTHON + </td> + <td> + which Python interpreter to use + </td> + </tr> + <tr> + <td> + PYTHON_VERSION + </td> + <td> + The installed Python version to use, for example '2.3'. This + string will be appended to the Python interpreter canonical + name. + </td> + </tr> +</table> -Use these variables to override the choices made by `configure' or to help -it to find libraries and programs with nonstandard names/locations. -</code></pre> </div> </div>
Added: mesos/site/publish/documentation/latest/external-containerizer/index.html URL: http://svn.apache.org/viewvc/mesos/site/publish/documentation/latest/external-containerizer/index.html?rev=1638021&view=auto ============================================================================== --- mesos/site/publish/documentation/latest/external-containerizer/index.html (added) +++ mesos/site/publish/documentation/latest/external-containerizer/index.html Tue Nov 11 04:11:00 2014 @@ -0,0 +1,645 @@ +<!DOCTYPE html> +<!-- + + ______ __ + /\ _ \ /\ \ + \ \ \L\ \ _____ __ ___\ \ \___ __ + \ \ __ \/\ '__`\ /'__`\ /'___\ \ _ `\ /'__`\ + \ \ \/\ \ \ \L\ \/\ \L\.\_/\ \__/\ \ \ \ \/\ __/ + \ \_\ \_\ \ ,__/\ \__/.\_\ \____\\ \_\ \_\ \____\ + \/_/\/_/\ \ \/ \/__/\/_/\/____/ \/_/\/_/\/____/ + \ \_\ + \/_/ + + /'\_/`\ + /\ \ __ ____ ___ ____ + \ \ \__\ \ /'__`\ /',__\ / __`\ /',__\ + \ \ \_/\ \/\ __//\__, `\/\ \L\ \/\__, `\ + \ \_\\ \_\ \____\/\____/\ \____/\/\____/ + \/_/ \/_/\/____/\/___/ \/___/ \/___/ + +--> +<html> + <head> + <meta charset="utf-8"> + <title></title> + <meta name="viewport" content="width=device-width, initial-scale=1.0"> + <meta name="description" content=""> + <meta name="author" content=""> + + <!-- Le styles --> + <link href="//netdna.bootstrapcdn.com/bootstrap/3.1.1/css/bootstrap.min.css" rel="stylesheet"> + + <link href="../../../assets/css/main.css" media="screen" rel="stylesheet" type="text/css" /> + + + + <!-- Google Analytics Magic --> + <script type="text/javascript"> + var _gaq = _gaq || []; + _gaq.push(['_setAccount', 'UA-20226872-1']); + _gaq.push(['_setDomainName', 'apache.org']); + _gaq.push(['_trackPageview']); + + (function() { + var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true; + ga.src = ('https:' == document.location.protocol ? 
'https://ssl' : 'http://www') + '.google-analytics.com/ga.js'; + var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s); + })(); + </script> + </head> + <body> + <!-- magical breadcrumbs --> + <div class="topnav"> + <ul class="breadcrumb"> + <li> + <div class="dropdown"> + <a data-toggle="dropdown" href="#">Apache Software Foundation <span class="caret"></span></a> + <ul class="dropdown-menu" role="menu" aria-labelledby="dLabel"> + <li><a href="http://www.apache.org">Apache Homepage</a></li> + <li><a href="http://www.apache.org/licenses/">License</a></li> + <li><a href="http://www.apache.org/foundation/sponsorship.html">Sponsorship</a></li> + <li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li> + <li><a href="http://www.apache.org/security/">Security</a></li> + </ul> + </div> + </li> + <li><a href="http://mesos.apache.org">Apache Mesos</a></li> + + + <li><a href="/documentation +/">Documentation +</a></li> + + + </ul><!-- /breadcrumb --> + </div> + + <!-- navbar excitement --> + <div class="navbar navbar-static-top" role="navigation"> + <div class="navbar-inner"> + <div class="container"> + <a href="/" class="logo"><img src="/assets/img/mesos_logo.png" alt="Apache Mesos logo" /></a> + <div class="nav-collapse"> + <ul class="nav nav-pills navbar-right"> + <li><a href="/gettingstarted/">Getting Started</a></li> + <li><a href="/documentation/latest/">Documentation</a></li> + <li><a href="/downloads/">Downloads</a></li> + <li><a href="/community/">Community</a></li> + </ul> + </div> + </div> + </div> + </div><!-- /.navbar --> + + <div class="container"> + + <div class="row-fluid"> + <div class="col-md-4"> + <h4>If you're new to Mesos</h4> + <p>See the <a href="/gettingstarted/">getting started</a> page for more information about downloading, building, and deploying Mesos.</p> + + <h4>If you'd like to get involved or you're looking for support</h4> + <p>See our <a href="/community/">community</a> page for more 
details.</p> + </div> + <div class="col-md-8"> + <h1>External Containerizer</h1> + +<ul> +<li>EC = external containerizer. A part of the mesos slave that provides +an API for containerizing via external plugin executables.</li> +<li>ECP = external containerizer program. An external plugin executable +implementing the actual containerizing by interfacing with a +containerizing system (e.g. Docker).</li> +</ul> + + +<h1>Containerizing</h1> + +<h1>General Overview</h1> + +<p>EC invokes ECP as a shell process, passing the command as a parameter +to the ECP executable. Additional data is exchanged via stdin and +stdout.</p> + +<p>The ECP is expected to return a zero exit code for all commands it was +able to process. A non-zero status code signals an error. Below you +will find an overview of the commands that have to be implemented by +an ECP, as well as their invocation scheme.</p> + +<p>The ECP is expected to use stderr for state info and +additional debug information. That information is logged to +a file; see <a href="#sandbox">Environment: <strong>Sandbox</strong></a>.</p> + +<h3>Call and communication scheme</h3> + +<p>Interface describing the functions an ECP has to implement via +command calls. Many invocations on the ECP will also pass a +protobuf message along via stdin. Some invocations on the ECP are also +expected to deliver a result protobuf message back via stdout. +All protobuf messages are prefixed by their original length - +this is sometimes referred to as “Record-IO”-format. 
See +<a href="#record-io-deserializing-example">Record-IO De/Serializing Example</a>.</p> + +<p><strong>COMMAND < INPUT-PROTO > RESULT-PROTO</strong></p> + +<ul> +<li><code>launch < containerizer::Launch</code></li> +<li><code>update < containerizer::Update</code></li> +<li><code>usage < containerizer::Usage > mesos::ResourceStatistics</code></li> +<li><code>wait < containerizer::Wait > containerizer::Termination</code></li> +<li><code>destroy < containerizer::Destroy</code></li> +<li><code>containers > containerizer::Containers</code></li> +<li><code>recover</code></li> +</ul> + + +<h1>Command Ordering</h1> + +<h2>Make no assumptions</h2> + +<p>Commands may pretty much come in any order. There is only one +exception to this rule; when launching a task, the EC will make sure +that the ECP first receives a <code>launch</code> on that specific container, all +other commands are queued until <code>launch</code> returns from the ECP.</p> + +<h1>Use Cases</h1> + +<h2>Task Launching EC / ECP Overview</h2> + +<ul> +<li>EC invokes <code>launch</code> on the ECP.</li> +<li>Along with that call, the ECP will receive a containerizer::Launch +protobuf message via stdin.</li> +<li>ECP now makes sure the executor gets started. +<strong>Note</strong> that <code>launch</code> is not supposed to block. 
It should return +immediately after triggering the executor/command - that could be done +via fork-exec within the ECP.</li> +<li>EC invokes <code>wait</code> on the ECP.</li> +<li>Along with that call, the ECP will receive a containerizer::Wait +protobuf message via stdin.</li> +<li>ECP now blocks until the launched command is reaped - that could be +implemented via waitpid within the ECP.</li> +<li>Once the command is reaped, the ECP should deliver a +containerizer::Termination protobuf message via stdout, back to the +EC.</li> +</ul> + + +<h2>Container Lifecycle Sequence Diagrams</h2> + +<h3>Container Launching</h3> + +<p>A container is in a staging state and now gets started and observed +until it gets into a final state.</p> + +<p><img src="images/ec_launch_seqdiag.png?raw=true" alt="Container Launching Scheme" /></p> + +<h3>Container Running</h3> + +<p>A container has gotten launched at some point and now is considered +being in a non terminal state by the slave. The following commands +will get triggered multiple times at the ECP over the lifetime of a +container. Their order however is not determined.</p> + +<p><img src="images/ec_lifecycle_seqdiag.png?raw=true" alt="Container Running Scheme" /></p> + +<h3>Resource Limitation</h3> + +<p>While a container is active, a resource limitation was identified +(e.g. 
out of memory) by the ECP isolation mechanism of choice.</p> + +<p><img src="images/ec_kill_seqdiag.png?raw=true" alt="Resource Limitation Scheme" /></p> + +<h2>Slave Recovery Overview</h2> + +<ul> +<li>Slave recovers via check pointed state.</li> +<li>EC invokes <code>recover</code> on the ECP - there is no protobuf message sent +or expected as a result from this command.</li> +<li>The ECP may try to recover internal states via its own failover +mechanisms, if needed.</li> +<li>After <code>recover</code> returns, the EC will invoke <code>containers</code> on the ECP.</li> +<li>The ECP should return Containers which is a list of currently +active containers. +<strong>Note</strong> these containers are known to the ECP but might in fact +partially be unknown to the slave (e.g. slave failed after launch but +before or within wait) - those containers are considered to be +orphans.</li> +<li>The EC now compares the list of slave known containers to those +listed within <code>Containers</code>. For each orphan it identifies, the slave +will invoke a <code>wait</code> followed by a <code>destroy</code> on the ECP for those +containers.</li> +<li>Slave will now call <code>wait</code> on the ECP (via EC) for all recovered +containers. This does once again put <code>wait</code> into the position of the +ultimate command reaper.</li> +</ul> + + +<h2>Slave Recovery Sequence Diagram</h2> + +<h3>Recovery</h3> + +<p>While containers are active, the slave fails over.</p> + +<p><img src="images/ec_recover_seqdiag.png?raw=true" alt="Recovery Scheme" /></p> + +<h3>Orphan Destruction</h3> + +<p>Containers identified by the ECP as being active but not slave state +recoverable are getting terminated.</p> + +<p><img src="images/ec_orphan_seqdiag.png?raw=true" alt="Orphan Destruction Scheme" /></p> + +<h1>Command Details</h1> + +<h2>launch</h2> + +<h3>Start the containerized executor</h3> + +<p>Hands over all information the ECP needs for launching a task +via an executor. 
+This call should not wait for the executor/command to return. The +actual reaping of the containerized command is done via the <code>wait</code> +call.</p> + +<pre><code>launch < containerizer::Launch +</code></pre> + +<p>This call receives the containerizer::Launch protobuf via stdin.</p> + +<pre><code>/** + * Encodes the launch command sent to the external containerizer + * program. + */ +message Launch { + required ContainerID container_id = 1; + optional TaskInfo task_info = 2; + optional ExecutorInfo executor_info = 3; + optional string directory = 4; + optional string user = 5; + optional SlaveID slave_id = 6; + optional string slave_pid = 7; + optional bool checkpoint = 8; +} +</code></pre> + +<p>This call does not return any data via stdout.</p> + +<h2>wait</h2> + +<h3>Gets information on the containerized executor’s Termination</h3> + +<p>Is expected to reap the executor/command. This call should block +until the executor/command has terminated.</p> + +<pre><code>wait < containerizer::Wait > containerizer::Termination +</code></pre> + +<p>This call receives the containerizer::Wait protobuf via stdin.</p> + +<pre><code>/** + * Encodes the wait command sent to the external containerizer + * program. + */ +message Wait { + required ContainerID container_id = 1; +} +</code></pre> + +<p>This call is expected to return containerizer::Termination via stdout.</p> + +<pre><code>/** + * Information about a container termination, returned by the + * containerizer to the slave. + */ +message Termination { + // A container may be killed if it exceeds its resources; this will + // be indicated by killed=true and described by the message string. + required bool killed = 1; + required string message = 2; + + // Exit status of the process. + optional int32 status = 3; +} +</code></pre> + +<p>The Termination attribute <code>killed</code> is to be set only when the +containerizer or the underlying isolation had to enforce a limitation +by killing the task (e.g. 
task exceeded suggested memory limit).</p> + +<h2>update</h2> + +<h3>Updates the container’s resource limits</h3> + +<p>Sends (new) resource constraints for the given container. +Resource constraints on a container may vary over the lifetime of +the containerized task.</p> + +<pre><code>update < containerizer::Update +</code></pre> + +<p>This call receives the containerizer::Update protobuf via stdin.</p> + +<pre><code>/** + * Encodes the update command sent to the external containerizer + * program. + */ +message Update { + required ContainerID container_id = 1; + repeated Resource resources = 2; +} +</code></pre> + +<p>This call does not return any data via stdout.</p> + +<h2>usage</h2> + +<h3>Gathers resource usage statistics for a containerized task</h3> + +<p>Is used for polling the current resource usage of the given container.</p> + +<pre><code>usage < containerizer::Usage > mesos::ResourceStatistics +</code></pre> + +<p>This call receives the containerizer::Usage protobuf via stdin.</p> + +<pre><code>/** + * Encodes the usage command sent to the external containerizer + * program. + */ +message Usage { + required ContainerID container_id = 1; +} +</code></pre> + +<p>This call is expected to return mesos::ResourceStatistics via stdout.</p> + +<pre><code>/* + * A snapshot of resource usage statistics. + */ +message ResourceStatistics { + required double timestamp = 1; // Snapshot time, in seconds since the Epoch. + + // CPU Usage Information: + // Total CPU time spent in user mode, and kernel mode. + optional double cpus_user_time_secs = 2; + optional double cpus_system_time_secs = 3; + + // Number of CPUs allocated. + optional double cpus_limit = 4; + + // cpu.stat on process throttling (for contention issues). + optional uint32 cpus_nr_periods = 7; + optional uint32 cpus_nr_throttled = 8; + optional double cpus_throttled_time_secs = 9; + + // Memory Usage Information: + optional uint64 mem_rss_bytes = 5; // Resident Set Size. 
+ + // Amount of memory resources allocated. + optional uint64 mem_limit_bytes = 6; + + // Broken out memory usage information (files, anonymous, and mmaped files) + optional uint64 mem_file_bytes = 10; + optional uint64 mem_anon_bytes = 11; + optional uint64 mem_mapped_file_bytes = 12; +} +</code></pre> + +<h2>destroy</h2> + +<h3>Terminates the containerized executor</h3> + +<p>Is used in rare situations, like for graceful slave shutdown +but also in slave fail over scenarios - see Slave Recovery for more.</p> + +<pre><code>destroy < containerizer::Destroy +</code></pre> + +<p>This call receives the containerizer::Destroy protobuf via stdin.</p> + +<pre><code>/** + * Encodes the destroy command sent to the external containerizer + * program. + */ +message Destroy { + required ContainerID container_id = 1; +} +</code></pre> + +<p>This call does not return any data via stdout.</p> + +<h2>containers</h2> + +<h3>Gets all active container-id’s</h3> + +<p>Returns all container identifiers known to be currently active.</p> + +<pre><code>containers > containerizer::Containers +</code></pre> + +<p>This call does not receive any additional data via stdin.</p> + +<p>This call is expected to pass containerizer::Containers back via +stdout.</p> + +<pre><code>/** + * Information on all active containers returned by the containerizer + * to the slave. + */ +message Containers { + repeated ContainerID containers = 1; +} +</code></pre> + +<h2>recover</h2> + +<h3>Internal ECP state recovery</h3> + +<p>Allows the ECP to do a state recovery on its own. If the ECP +uses state check-pointing e.g. via file system, then this call would +be a good moment to de-serialize that state information. Make sure you +also see <a href="#slave-recovery-overview">Slave Recovery Overview</a> for more.</p> + +<pre><code>recover +</code></pre> + +<p>This call does not receive any additional data via stdin. 
+It does not return any data via stdout.</p> + +<h3>Protobuf Message Definitions</h3> + +<p>For possibly more up-to-date versions of the above mentioned protobufs +as well as protobuf messages referenced by them, please check:</p> + +<ul> +<li><p>containerizer::XXX are defined within +include/mesos/containerizer/containerizer.proto.</p></li> +<li><p>mesos::XXX are defined within include/mesos/mesos.proto.</p></li> +</ul> + + +<h1>Environment</h1> + +<h2><strong>Sandbox</strong></h2> + +<p>A sandbox environment is formed by <code>cd</code> into the work-directory of the +executor as well as a stderr redirect into the executor’s “stderr” +log-file. +<strong>Note</strong> not <strong>all</strong> invocations have a complete sandbox environment.</p> + +<h2>Additional Environment Variables</h2> + +<p>Additionally, there are a few new environment variables set when +invoking the ECP.</p> + +<ul> +<li><p>MESOS_LIBEXEC_DIRECTORY = path to mesos-executor, mesos-usage, … +This information is always present.</p></li> +<li><p>MESOS_WORK_DIRECTORY = slave work directory. This should be used for +distinguishing slave instances. +This information is always present.</p></li> +</ul> + + +<p><strong>Note</strong> that this is specifically helpful for being able to tie a set +of containers to a specific slave instance, thus allowing proper +recovery when needed.</p> + +<ul> +<li>MESOS_DEFAULT_CONTAINER_IMAGE = default image as provided via slave +flags (default_container_image). This variable is provided only in +calls to <code>launch</code>.</li> +</ul> + + +<h1>Debugging</h1> + +<h2>Enhanced Verbosity Logging</h2> + +<p>For receiving an increased level of status information from the EC +use the GLOG verbosity level. 
Prefix your mesos startup call by +setting the level to a value higher than or equal to two.</p> + +<p><code>GLOG_v=2 ./bin/mesos-slave --master=[...]</code></p> + +<h2>ECP stderr Logging</h2> + +<p>All output to stderr of your ECP will get logged to the executor’s +‘stderr’ log file. +The specific location can be extracted from the <a href="#enhanced-verbosity-logging">Enhanced Verbosity +Logging</a> of the EC.</p> + +<p>Example Log Output:</p> + +<pre><code>I0603 02:12:34.165662 174215168 external_containerizer.cpp:1083] Invoking external containerizer for method 'launch' +I0603 02:12:34.165675 174215168 external_containerizer.cpp:1100] calling: [/Users/till/Development/mesos-till/build/src/test-containerizer launch] +I0603 02:12:34.165678 175824896 slave.cpp:497] Successfully attached file '/tmp/ExternalContainerizerTest_Launch_lP22ci/slaves/20140603-021232-16777343-51377-7591-0/frameworks/20140603-021232-16777343-51377-7591-0000/executors/1/runs/558e0a69-70da-4d71-b4c4-c2820b1d6345' +I0603 02:12:34.165686 174215168 external_containerizer.cpp:1101] directory: /tmp/ExternalContainerizerTest_Launch_lP22ci/slaves/20140603-021232-16777343-51377-7591-0/frameworks/20140603-021232-16777343-51377-7591-0000/executors/1/runs/558e0a69-70da-4d71-b4c4-c2820b1d6345 +</code></pre> + +<p>The stderr output of the ECP for this call is found within the stderr file located in the directory displayed in the last quoted line.</p> + +<pre><code>cat /tmp/ExternalContainerizerTest_Launch_lP22ci/slaves/20140603-021232-16777343-51377-7591-0/frameworks/20140603-021232-16777343-51377-7591-0000/executors/1/runs/558e0a69-70da-4d71-b4c4-c2820b1d6345/stderr +</code></pre> + +<h1>Appendix</h1> + +<h2>Record-IO Proto Example: Launch</h2> + +<p>This is what a properly record-io formatted protobuf looks like.</p> + +<p><strong>name: offset</strong></p> + +<ul> +<li><p>length: 00 - 03 = record length in byte</p></li> +<li><p>payload: 04 - (length + 4) = protobuf payload</p></li> +</ul> + + 
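<p>The layout above can be sketched in a few lines of Python. This is a minimal sketch, not the code Mesos itself uses: the names <code>encode_record</code> and <code>decode_record</code> are hypothetical, and the explicit little-endian header is an assumption read off the example hexdump (the test containerizer excerpt in the appendix uses the platform's native byte order instead).</p>

```python
import io
import struct

# Assumption: the length prefix is an unsigned 32-bit little-endian integer,
# as the leading bytes "4002 0000" (= 0x00000240) of the hexdump suggest.
HEADER = struct.Struct("<I")


def encode_record(payload: bytes) -> bytes:
    """Prefix a serialized protobuf payload with its 4-byte length."""
    return HEADER.pack(len(payload)) + payload


def decode_record(stream) -> bytes:
    """Read one length-prefixed record from a binary stream."""
    header = stream.read(HEADER.size)
    if len(header) != HEADER.size:
        raise EOFError("truncated record header")
    (length,) = HEADER.unpack(header)
    payload = stream.read(length)
    if len(payload) != length:
        raise EOFError("truncated record payload")
    return payload


# Round trip with an arbitrary three-byte payload (not a real Mesos message).
record = encode_record(b"\x0a\x01\x31")
restored = decode_record(io.BytesIO(record))
```

<p>Here the three-byte payload becomes a seven-byte record: bytes 00 - 03 hold the length and the payload starts at offset 04, mirroring the offsets listed above.</p>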
+<p>Example length: 00000240h = 576 byte total protobuf size</p> + +<p>Example Hexdump:</p> + +<pre><code>00000000: 4002 0000 0a26 0a24 3433 3532 3533 6162 2d64 3234 362d 3437 :@....&.$435253ab-d246-47 +00000018: 6265 2d61 3335 302d 3335 3432 3034 3635 6438 3638 1a81 020a :be-a350-35420465d868.... +00000030: 030a 0131 2a16 0a04 6370 7573 1000 1a09 0900 0000 0000 0000 :...1*...cpus............ +00000048: 4032 012a 2a15 0a03 6d65 6d10 001a 0909 0000 0000 0000 9040 :@2.**...mem............@ +00000060: 3201 2a2a 160a 0464 6973 6b10 001a 0909 0000 0000 0000 9040 :2.**...disk............@ +00000078: 3201 2a2a 180a 0570 6f72 7473 1001 220a 0a08 0898 f201 1080 :2.**...ports.."......... +00000090: fa01 3201 2a3a 2a1a 2865 6368 6f20 274e 6f20 7375 6368 2066 :..2.*:*.(echo 'No such f +000000a8: 696c 6520 6f72 2064 6972 6563 746f 7279 273b 2065 7869 7420 :ile or directory'; exit +000000c0: 3142 2b0a 2932 3031 3430 3532 362d 3031 3530 3036 2d31 3637 :1B+.)20140526-015006-167 +000000d8: 3737 3334 332d 3535 3430 332d 3632 3536 372d 3030 3030 4a3d :77343-55403-62567-0000J= +000000f0: 436f 6d6d 616e 6420 4578 6563 7574 6f72 2028 5461 736b 3a20 :Command Executor (Task: +00000108: 3129 2028 436f 6d6d 616e 643a 2073 6820 2d63 2027 7768 696c :1) (Command: sh -c 'whil +00000120: 6520 7472 7565 203b 2e2e 2e27 2952 0131 22c5 012f 746d 702f :e true ;...')R.1"../tmp/ +00000138: 4578 7465 726e 616c 436f 6e74 6169 6e65 7269 7a65 7254 6573 :ExternalContainerizerTes +00000150: 745f 4c61 756e 6368 5f6c 5855 6839 662f 736c 6176 6573 2f32 :t_Launch_lXUh9f/slaves/2 +00000168: 3031 3430 3532 362d 3031 3530 3036 2d31 3637 3737 3334 332d :0140526-015006-16777343- +00000180: 3535 3430 332d 3632 3536 372d 302f 6672 616d 6577 6f72 6b73 :55403-62567-0/frameworks +00000198: 2f32 3031 3430 3532 362d 3031 3530 3036 2d31 3637 3737 3334 :/20140526-015006-1677734 +000001b0: 332d 3535 3430 332d 3632 3536 372d 3030 3030 2f65 7865 6375 :3-55403-62567-0000/execu +000001c8: 746f 7273 2f31 2f72 756e 732f 3433 3532 
3533 6162 2d64 3234 :tors/1/runs/435253ab-d24
+000001e0: 362d 3437 6265 2d61 3335 302d 3335 3432 3034 3635 6438 3638 :6-47be-a350-35420465d868
+000001f8: 2a04 7469 6c6c 3228 0a26 3230 3134 3035 3236 2d30 3135 3030 :*.till2(.&20140526-01500
+00000210: 362d 3136 3737 3733 3433 2d35 3534 3033 2d36 3235 3637 2d30 :6-16777343-55403-62567-0
+00000228: 3a18 736c 6176 6528 3129 4031 3237 2e30 2e30 2e31 3a35 3534 ::.slave(1)@127.0.0.1:554
+00000240: 3033 4000
+</code></pre>
+
+<h2>Record-IO De/Serializing Example</h2>
+
+<p>How to send and receive such a record-io formatted message
+using Python:</p>
+
+<p><em>taken from src/examples/python/test_containerizer.py</em></p>
+
+<pre><code>import struct
+import sys
+
+# Read a data chunk prefixed by its total size from stdin.
+def receive():
+    # Read size (uint32 => 4 bytes).
+    size = struct.unpack('I', sys.stdin.read(4))
+    if size[0] <= 0:
+        print >> sys.stderr, "Expected protobuf size over stdin. " \
+                             "Received 0 bytes."
+        return ""
+
+    # Read payload.
+    data = sys.stdin.read(size[0])
+    if len(data) != size[0]:
+        print >> sys.stderr, "Expected %d bytes protobuf over stdin. " \
+                             "Received %d bytes." % (size[0], len(data))
+        return ""
+
+    return data
+
+# Write a protobuf message prefixed by its total size (aka recordio)
+# to stdout.
+def send(data):
+    # Write size (uint32 => 4 bytes).
+    sys.stdout.write(struct.pack('I', len(data)))
+
+    # Write payload.
+    sys.stdout.write(data)
+</code></pre>
+
+ </div>
+</div>
+
+
+ <hr>
+
+ <!-- footer -->
+ <div class="footer">
+ <p>© 2012-2014 <a href="http://apache.org">The Apache Software Foundation</a>.
+ Apache Mesos, the Apache feather logo, and the Apache Mesos project logo are trademarks of The Apache Software Foundation.</p>
+ </div><!-- /footer -->
+
+ </div> <!-- /container -->
+
+ <!-- JS -->
+ <script src="//code.jquery.com/jquery-1.11.0.min.js" type="text/javascript"></script>
+ <script src="//netdna.bootstrapcdn.com/bootstrap/3.1.1/js/bootstrap.min.js" type="text/javascript"></script>
+ </body>
+</html>
\ No newline at end of file

Modified: mesos/site/publish/documentation/latest/index.html
URL: http://svn.apache.org/viewvc/mesos/site/publish/documentation/latest/index.html?rev=1638021&r1=1638020&r2=1638021&view=diff
==============================================================================
--- mesos/site/publish/documentation/latest/index.html (original)
+++ mesos/site/publish/documentation/latest/index.html Tue Nov 11 04:11:00 2014
@@ -118,6 +118,7 @@
 <ul>
 <li><a href="/documentation/latest/configuration/">Configuration</a> for command-line arguments.</li>
 <li><a href="/documentation/latest/docker-containerizer/">Docker Containerizer</a> for launching a Docker image as a Task, or as an Executor.</li>
+<li><a href="/documentation/latest/external-containerizer/">External Containerizer</a></li>
 <li><a href="/documentation/latest/authorization/">Framework Authorization</a></li>
 <li><a href="/documentation/latest/framework-rate-limiting/">Framework Rate Limiting</a></li>
 <li><a href="/documentation/latest/high-availability/">High Availability</a> for running multiple masters simultaneously.</li>
