[jira] [Created] (MESOS-8716) Freezer controller is not returned to thaw if task termination fails
Sargun Dhillon created MESOS-8716: - Summary: Freezer controller is not returned to thaw if task termination fails Key: MESOS-8716 URL: https://issues.apache.org/jira/browse/MESOS-8716 Project: Mesos Issue Type: Bug Components: agent, containerization Affects Versions: 1.3.2 Reporter: Sargun Dhillon This issue is related to https://issues.apache.org/jira/browse/MESOS-8004. A container may fail to terminate for a variety of reasons. One common reason in our system is that containers which rely on external storage run fsync before exiting (fsync on SIGTERM), which can make the termination time out. Even though Mesos has sent the requisite kill signals, the task will never terminate because the cgroup stays frozen. The intended behaviour on failure to terminate should be: if the pids isolator is running, set pids.max to 0 to prevent further processes from being created; walk the cgroup and SIGKILL each process; and then thaw the cgroup. Once the processes finish thawing, the kill signal will be delivered and processed, and the container will finally finish. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
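The proposed recovery sequence can be sketched against a cgroup v1 filesystem layout. This is an illustration only, not Mesos code: the controller paths under `cgroup_root` are assumptions, and the kill function is injectable purely so the sketch can be exercised without real processes.

```python
import os
import signal

def force_kill_cgroup(cgroup_root, kill=os.kill):
    """Best-effort kill sequence for a frozen cgroup that refuses to die,
    per the behaviour proposed in MESOS-8716. Assumes cgroup v1 with the
    'pids' and 'freezer' controllers mounted under cgroup_root."""
    pids_max = os.path.join(cgroup_root, "pids", "pids.max")
    procs = os.path.join(cgroup_root, "freezer", "cgroup.procs")
    state = os.path.join(cgroup_root, "freezer", "freezer.state")

    # 1. Forbid new processes so nothing forks while we walk the cgroup.
    with open(pids_max, "w") as f:
        f.write("0")

    # 2. SIGKILL every task currently in the (still frozen) cgroup.
    with open(procs) as f:
        pids = [int(line) for line in f if line.strip()]
    for pid in pids:
        kill(pid, signal.SIGKILL)

    # 3. Thaw the cgroup so the pending SIGKILLs can actually be delivered.
    with open(state, "w") as f:
        f.write("THAWED")
    return pids
```

The ordering matters: capping pids.max first closes the fork race, and thawing last lets the queued SIGKILLs land.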
[jira] [Commented] (MESOS-7744) Mesos Agent Sends TASK_KILL status update to Master, and still launches task
[ https://issues.apache.org/jira/browse/MESOS-7744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095569#comment-16095569 ] Sargun Dhillon commented on MESOS-7744: --- [~neilc] The task is still running. The agent and master think the task is killed. The framework receives TASK_KILLED. The framework "knows" via out-of-band mechanisms that the task is still alive (we have our own reconciliation mechanism outside Mesos), and it resends the kill, but the kill never reaches the executor. The executor sends TASK_RUNNING status updates to the agent, but these never make it to the master, nor to the framework. It occurs if the executor is already running and the task is killed almost immediately after being started; specifically, while the task is still on the queue. > Mesos Agent Sends TASK_KILL status update to Master, and still launches task > > > Key: MESOS-7744 > URL: https://issues.apache.org/jira/browse/MESOS-7744 > Project: Mesos > Issue Type: Bug >Affects Versions: 1.0.1 >Reporter: Sargun Dhillon >Priority: Minor > Labels: reliability > > We sometimes launch jobs, and cancel them in ~7 seconds, if we don't get a > TASK_STARTING back from the agent. Under certain conditions it can result in > Mesos losing track of the task. 
The chunk of the logs which is interesting is > here: > {code} > Jun 29 23:22:26 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c > mesos-slave[4290]: I0629 23:22:26.951799 5171 slave.cpp:1495] Got assigned > task Titus-7590548-worker-0-4476 for framework TitusFramework > Jun 29 23:22:26 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c > mesos-slave[4290]: I0629 23:22:26.952251 5171 slave.cpp:1614] Launching task > Titus-7590548-worker-0-4476 for framework TitusFramework > Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c > mesos-slave[4290]: I0629 23:22:37.484611 5171 slave.cpp:1853] Queuing task > ‘Titus-7590548-worker-0-4476’ for executor ‘docker-executor’ of framework > TitusFramework at executor(1)@100.66.11.10:17707 > Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c > mesos-slave[4290]: I0629 23:22:37.487876 5171 slave.cpp:2035] Asked to kill > task Titus-7590548-worker-0-4476 of framework TitusFramework > Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c > mesos-slave[4290]: I0629 23:22:37.488994 5171 slave.cpp:3211] Handling > status update TASK_KILLED (UUID: 898215d6-a244-4dbe-bc9c-878a22d36ea4) for > task Titus-7590548-worker-0-4476 of framework TitusFramework from @0.0.0.0:0 > Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c > mesos-slave[4290]: I0629 23:22:37.490603 5171 slave.cpp:2005] Sending queued > task ‘Titus-7590548-worker-0-4476’ to executor ‘docker-executor’ of framework > TitusFramework at executor(1)@100.66.11.10:17707{ > {code} > In our executor, we see that the launch message arrives after the master has > already gotten the kill update. We then send non-terminal state updates to > the agent, and yet it doesn't forward these to our framework. We're using a > custom executor which is based on the older mesos-go bindings. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
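The ordering in these logs can be condensed into a toy model (pure illustration of the race, not agent internals): killing a task that is still queued for an already-registered executor produces a terminal TASK_KILLED, yet the queued launch is still delivered afterwards.

```python
from collections import deque

class AgentSketch:
    """Toy reconstruction of the MESOS-7744 race; not Mesos code."""

    def __init__(self):
        self.queued = deque()   # tasks queued for an already-running executor
        self.updates = []       # status updates forwarded toward the master
        self.delivered = []     # launch messages actually sent to the executor

    def launch(self, task):
        self.queued.append(task)

    def kill(self, task):
        if task in self.queued:
            # The agent answers the kill of a queued task itself...
            self.updates.append((task, "TASK_KILLED"))
            # ...but (the bug) the task is not removed from the queue.

    def flush_queue(self):
        # Queued tasks are drained to the executor.
        while self.queued:
            self.delivered.append(self.queued.popleft())
```

In this model, `kill` followed by `flush_queue` yields both a TASK_KILLED update and a delivered launch, matching the 23:22:37 log lines above.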
[jira] [Updated] (MESOS-7744) Mesos Agent Sends TASK_KILL status update to Master, and still launches task
[ https://issues.apache.org/jira/browse/MESOS-7744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sargun Dhillon updated MESOS-7744: -- Description: We sometimes launch jobs, and cancel them in ~7 seconds, if we don't get a TASK_STARTING back from the agent. Under certain conditions it can result in Mesos losing track of the task. The chunk of the logs which is interesting is here: {code} Jun 29 23:22:26 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:26.951799 5171 slave.cpp:1495] Got assigned task Titus-7590548-worker-0-4476 for framework TitusFramework Jun 29 23:22:26 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:26.952251 5171 slave.cpp:1614] Launching task Titus-7590548-worker-0-4476 for framework TitusFramework Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:37.484611 5171 slave.cpp:1853] Queuing task ‘Titus-7590548-worker-0-4476’ for executor ‘docker-executor’ of framework TitusFramework at executor(1)@100.66.11.10:17707 Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:37.487876 5171 slave.cpp:2035] Asked to kill task Titus-7590548-worker-0-4476 of framework TitusFramework Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:37.488994 5171 slave.cpp:3211] Handling status update TASK_KILLED (UUID: 898215d6-a244-4dbe-bc9c-878a22d36ea4) for task Titus-7590548-worker-0-4476 of framework TitusFramework from @0.0.0.0:0 Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:37.490603 5171 slave.cpp:2005] Sending queued task ‘Titus-7590548-worker-0-4476’ to executor ‘docker-executor’ of framework TitusFramework at executor(1)@100.66.11.10:17707{ {code} In our executor, we see that the launch message arrives after the master has already gotten the kill update. 
We then send non-terminal state updates to the agent, and yet it doesn't forward these to our framework. We're using a custom executor which is based on the older mesos-go bindings. was: We sometimes launch jobs, and cancel them in ~7 seconds, if we don't get a TASK_STARTING back from the agent. Under certain conditions it can result in Mesos losing track of the task. The chunk of the logs which is interesting is here: {code} Jun 29 23:22:26 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:26.951799 5171 slave.cpp:1495] Got assigned task Titus-7590548-worker-0-4476 for framework TitusFramework Jun 29 23:22:26 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:26.952251 5171 slave.cpp:1614] Launching task Titus-7590548-worker-0-4476 for framework TitusFramework Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:37.484611 5171 slave.cpp:1853] Queuing task ‘Titus-7590548-worker-0-4476’ for executor ‘docker-executor’ of framework TitusFramework at executor(1)@100.66.11.10:17707 Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:37.487876 5171 slave.cpp:2035] Asked to kill task Titus-7590548-worker-0-4476 of framework TitusFramework Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:37.488994 5171 slave.cpp:3211] Handling status update TASK_KILLED (UUID: 898215d6-a244-4dbe-bc9c-878a22d36ea4) for task Titus-7590548-worker-0-4476 of framework TitusFramework from @0.0.0.0:0 Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:37.490603 5171 slave.cpp:2005] Sending queued task ‘Titus-7590548-worker-0-4476’ to executor ‘docker-executor’ of framework TitusFramework at executor(1)@100.66.11.10:17707{ {code} In our executor, we see that the launch message arrives after the master has already gotten the kill update. 
We then send non-terminal state updates to the agent, and yet it doesn't forward these to our framework. > Mesos Agent Sends TASK_KILL status update to Master, and still launches task > > > Key: MESOS-7744 > URL: https://issues.apache.org/jira/browse/MESOS-7744 > Project: Mesos > Issue Type: Bug >Affects Versions: 1.0.1 >Reporter: Sargun Dhillon >Priority: Minor > > We sometimes launch jobs, and cancel them in ~7 seconds, if we don't get a > TASK_STARTING back from the agent. Under certain conditions it can result in > Mesos losing track of the task. The chunk of the logs which is interesting is > here: > {code} > Jun 29 23:22:26 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c > mesos-slave[4290]: I0629 23:22:26.951799 5171 slave.cpp:1495] Got assigned
[jira] [Commented] (MESOS-7744) Mesos Agent Sends TASK_KILL status update to Master, and still launches task
[ https://issues.apache.org/jira/browse/MESOS-7744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16069355#comment-16069355 ] Sargun Dhillon commented on MESOS-7744: --- Full log: {code} Jun 29 23:22:26 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:26.951799 5171 slave.cpp:1495] Got assigned task Titus-7590548-worker-0-4476 for framework TitusFramework Jun 29 23:22:26 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:26.952251 5171 slave.cpp:1614] Launching task Titus-7590548-worker-0-4476 for framework TitusFramework Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:37.484611 5171 slave.cpp:1853] Queuing task ‘Titus-7590548-worker-0-4476’ for executor ‘docker-executor’ of framework TitusFramework at executor(1)@100.66.11.10:17707 Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:37.487876 5171 slave.cpp:2035] Asked to kill task Titus-7590548-worker-0-4476 of framework TitusFramework Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:37.488994 5171 slave.cpp:3211] Handling status update TASK_KILLED (UUID: 898215d6-a244-4dbe-bc9c-878a22d36ea4) for task Titus-7590548-worker-0-4476 of framework TitusFramework from @0.0.0.0:0 Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:37.490603 5171 slave.cpp:2005] Sending queued task ‘Titus-7590548-worker-0-4476’ to executor ‘docker-executor’ of framework TitusFramework at executor(1)@100.66.11.10:17707 Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:37.494860 5171 slave.cpp:3211] Handling status update TASK_STARTING (UUID: d6aaed02-5d21-11e7-846c-0a0c90a7033c) for task Titus-7590548-worker-0-4476 of framework TitusFramework from executor(1)@100.66.11.10:17707 Jun 29 23:22:37 
titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:37.496829 5191 status_update_manager.cpp:320] Received status update TASK_KILLED (UUID: 898215d6-a244-4dbe-bc9c-878a22d36ea4) for task Titus-7590548-worker-0-4476 of framework TitusFramework Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:37.497530 5191 status_update_manager.cpp:825] Checkpointing UPDATE for status update TASK_KILLED (UUID: 898215d6-a244-4dbe-bc9c-878a22d36ea4) for task Titus-7590548-worker-0-4476 of framework TitusFramework Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:37.498082 5171 slave.cpp:3211] Handling status update TASK_STARTING (UUID: d6aafd3f-5d21-11e7-846c-0a0c90a7033c) for task Titus-7590548-worker-0-4476 of framework TitusFramework from executor(1)@100.66.11.10:17707 Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:37.500267 5191 status_update_manager.cpp:320] Received status update TASK_STARTING (UUID: d6aaed02-5d21-11e7-846c-0a0c90a7033c) for task Titus-7590548-worker-0-4476 of framework TitusFramework Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:37.500377 5191 status_update_manager.cpp:825] Checkpointing UPDATE for status update TASK_STARTING (UUID: d6aaed02-5d21-11e7-846c-0a0c90a7033c) for task Titus-7590548-worker-0-4476 of framework TitusFramework Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:37.500562 5189 slave.cpp:3604] Forwarding the update TASK_KILLED (UUID: 898215d6-a244-4dbe-bc9c-878a22d36ea4) for task Titus-7590548-worker-0-4476 of framework TitusFramework to master@100.66.3.213:7103 Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:37.502029 5191 status_update_manager.cpp:320] Received status update TASK_STARTING (UUID: 
d6aafd3f-5d21-11e7-846c-0a0c90a7033c) for task Titus-7590548-worker-0-4476 of framework TitusFramework Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:37.502092 5191 status_update_manager.cpp:825] Checkpointing UPDATE for status update TASK_STARTING (UUID: d6aafd3f-5d21-11e7-846c-0a0c90a7033c) for task Titus-7590548-worker-0-4476 of framework TitusFramework Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:37.502393 5189 slave.cpp:3514] Sending acknowledgement for status update TASK_STARTING (UUID: d6aaed02-5d21-11e7-846c-0a0c90a7033c) for task Titus-7590548-worker-0-4476 of framework TitusFramework to executor(1)@100.66.11.10:17707 Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:37.504465 5189 slave.cpp:3514] Sending acknowledgement for status update
[jira] [Created] (MESOS-7744) Mesos Agent Sends TASK_KILL status update to Master, and still launches task
Sargun Dhillon created MESOS-7744: - Summary: Mesos Agent Sends TASK_KILL status update to Master, and still launches task Key: MESOS-7744 URL: https://issues.apache.org/jira/browse/MESOS-7744 Project: Mesos Issue Type: Bug Affects Versions: 1.0.1 Reporter: Sargun Dhillon Priority: Minor We sometimes launch jobs, and cancel them in ~7 seconds, if we don't get a TASK_STARTING back from the agent. Under certain conditions it can result in Mesos losing track of the task. The chunk of the logs which is interesting is here: {code} Jun 29 23:22:26 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:26.951799 5171 slave.cpp:1495] Got assigned task Titus-7590548-worker-0-4476 for framework TitusFramework Jun 29 23:22:26 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:26.952251 5171 slave.cpp:1614] Launching task Titus-7590548-worker-0-4476 for framework TitusFramework Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:37.484611 5171 slave.cpp:1853] Queuing task ‘Titus-7590548-worker-0-4476’ for executor ‘docker-executor’ of framework TitusFramework at executor(1)@100.66.11.10:17707 Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:37.487876 5171 slave.cpp:2035] Asked to kill task Titus-7590548-worker-0-4476 of framework TitusFramework Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:37.488994 5171 slave.cpp:3211] Handling status update TASK_KILLED (UUID: 898215d6-a244-4dbe-bc9c-878a22d36ea4) for task Titus-7590548-worker-0-4476 of framework TitusFramework from @0.0.0.0:0 Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c mesos-slave[4290]: I0629 23:22:37.490603 5171 slave.cpp:2005] Sending queued task ‘Titus-7590548-worker-0-4476’ to executor ‘docker-executor’ of framework TitusFramework at executor(1)@100.66.11.10:17707{ {code} In our executor, 
we see that the launch message arrives after the master has already gotten the kill update. We then send non-terminal state updates to the agent, and yet it doesn't forward these to our framework. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-4891) Add a '/containers' endpoint to the agent to list all the active containers.
[ https://issues.apache.org/jira/browse/MESOS-4891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15184017#comment-15184017 ] Sargun Dhillon commented on MESOS-4891: --- Can we also have a place to list all executor PIDs that are associated with those containers? > Add a '/containers' endpoint to the agent to list all the active containers. > > > Key: MESOS-4891 > URL: https://issues.apache.org/jira/browse/MESOS-4891 > Project: Mesos > Issue Type: Improvement >Reporter: Jie Yu > > This endpoint will be similar to /monitor/statistics.json endpoint, but it'll > also contain the 'container_status' about the container (see ContainerStatus > in mesos.proto). We'll eventually deprecate the /monitor/statistics.json > endpoint. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
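A consumer of such an endpoint could look roughly like the sketch below. The response shape (a JSON array with `container_id`, `executor_id`, and a `status` object) and the `executor_pid` field are assumptions for illustration, not a confirmed schema.

```python
import json

def summarize_containers(body):
    """Parse a hypothetical agent /containers response body and return
    (container_id, executor_id, executor_pid) triples; executor_pid is
    None when the container's status does not carry one."""
    out = []
    for entry in json.loads(body):
        pid = entry.get("status", {}).get("executor_pid")
        out.append((entry.get("container_id"), entry.get("executor_id"), pid))
    return out
```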
[jira] [Commented] (MESOS-4883) Add agent ID to agent state endpoint
[ https://issues.apache.org/jira/browse/MESOS-4883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15183895#comment-15183895 ] Sargun Dhillon commented on MESOS-4883: --- We have a tool here that looks at the agent state.json and assembles a complete cluster view based on the sum of the agent JSONs. This system is soft-state. If the last state we have in memory has a bunch of tasks associated with a slave, and that slave comes back and runs for a while (10m) without us seeing any tasks, we do not know to remove those old tasks from the system. > Add agent ID to agent state endpoint > > > Key: MESOS-4883 > URL: https://issues.apache.org/jira/browse/MESOS-4883 > Project: Mesos > Issue Type: Improvement >Reporter: Sargun Dhillon >Priority: Minor > Labels: mesosphere > > I would like to have the slave ID exposed on the slave before any tasks are > running on the slave on the state.json endpoint. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
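The reconciliation gap can be made concrete with a small sketch of such a soft-state view (hypothetical code, not the actual tool): once every snapshot carries a stable agent ID, a fresh snapshot becomes authoritative for that agent and stale tasks can be pruned; without the ID, the replacement step is impossible.

```python
class ClusterView:
    """Soft-state cluster view built from per-agent state.json snapshots."""

    def __init__(self):
        self.tasks_by_agent = {}  # agent_id -> set of task ids

    def observe(self, agent_id, task_ids):
        # A fresh snapshot is authoritative for this agent: any task we
        # remembered for it that is no longer reported is dropped. Without
        # an agent ID in the snapshot there is no key to replace, and
        # stale tasks linger -- the problem described in the comment.
        self.tasks_by_agent[agent_id] = set(task_ids)

    def all_tasks(self):
        return set().union(*self.tasks_by_agent.values())
```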
[jira] [Created] (MESOS-4884) Ensure task_status timestamp is monotonic
Sargun Dhillon created MESOS-4884: - Summary: Ensure task_status timestamp is monotonic Key: MESOS-4884 URL: https://issues.apache.org/jira/browse/MESOS-4884 Project: Mesos Issue Type: Improvement Reporter: Sargun Dhillon Priority: Critical In state.json the task status has a timestamp associated with it. From my understanding, the timestamp is when the task status update was generated; the slave guarantees that the list is sorted and that the first item of the list is the newest status. This becomes a problem if someone is independently consuming the task status updates -- without the slave's sorting logic, we cannot determine the current state of the task. There exists a timestamp on the task. I would like the executor (API) to ensure that this timestamp is strictly monotonic. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
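A strictly monotonic timestamp source of the kind requested here can be sketched as follows (a hypothetical helper, not the executor API): it nudges the reported time forward whenever the wall clock stalls or steps backwards.

```python
import time

class MonotonicClock:
    """Hypothetical helper issuing strictly increasing timestamps, the
    guarantee this issue asks the executor API to provide for task_status."""

    def __init__(self, now=time.time):
        self._now = now
        self._last = float("-inf")

    def timestamp(self):
        t = self._now()
        if t <= self._last:
            # The wall clock stalled or stepped backwards: nudge the
            # reported time forward so updates stay strictly ordered.
            t = self._last + 1e-6
        self._last = t
        return t
```

With such a guarantee, a consumer can order status updates by timestamp alone, without relying on the slave's list ordering.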
[jira] [Created] (MESOS-4883) Add agent ID to agent state endpoint
Sargun Dhillon created MESOS-4883: - Summary: Add agent ID to agent state endpoint Key: MESOS-4883 URL: https://issues.apache.org/jira/browse/MESOS-4883 Project: Mesos Issue Type: Improvement Reporter: Sargun Dhillon Priority: Minor I would like to have the slave ID exposed on the slave before any tasks are running on the slave on the state.json endpoint. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4427) Ensure ip_address in state.json (from NetworkInfo) is valid
[ https://issues.apache.org/jira/browse/MESOS-4427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15180686#comment-15180686 ] Sargun Dhillon commented on MESOS-4427: --- [~jieyu] This is closed, right? > Ensure ip_address in state.json (from NetworkInfo) is valid > --- > > Key: MESOS-4427 > URL: https://issues.apache.org/jira/browse/MESOS-4427 > Project: Mesos > Issue Type: Bug >Reporter: Sargun Dhillon >Priority: Critical > Labels: mesosphere > > We have seen a master state.json where the state.json has a field that looks > similar to: > ---REDACTED--- > {code:json} > { > "container": { > "docker": { > "force_pull_image": false, > "image": "REDACTED", > "network": "HOST", > "privileged": false > }, > "type": "DOCKER" > }, > "executor_id": "", > "framework_id": "9f0e50ea-54b0-44e3-a451-c69e0c1a58fb-", > "id": "ping-as-a-service.c2d1c17a-be22-11e5-b053-002590e56e25", > "name": "ping-as-a-service", > "resources": { > "cpus": 0.1, > "disk": 0, > "mem": 64, > "ports": "[7907-7907]" > }, > "slave_id": "9f0e50ea-54b0-44e3-a451-c69e0c1a58fb-S76043", > "state": "TASK_RUNNING", > "statuses": [ > { > "container_status": { > "network_infos": [ > { > "ip_address": "", > "ip_addresses": [ > { > "ip_address": "" > } > ] > } > ] > }, > "labels": [ > { > "key": "Docker.NetworkSettings.IPAddress", > "value": "" > } > ], > "state": "TASK_RUNNING", > "timestamp": 1453149270.95511 > } > ] > } > {code} > ---REDACTED--- > This is invalid, and it mesos-core should filter it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4427) Ensure ip_address in state.json (from NetworkInfo) is valid
[ https://issues.apache.org/jira/browse/MESOS-4427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15180685#comment-15180685 ] Sargun Dhillon commented on MESOS-4427: --- No. > Ensure ip_address in state.json (from NetworkInfo) is valid > --- > > Key: MESOS-4427 > URL: https://issues.apache.org/jira/browse/MESOS-4427 > Project: Mesos > Issue Type: Bug >Reporter: Sargun Dhillon >Priority: Critical > Labels: mesosphere > > We have seen a master state.json where the state.json has a field that looks > similar to: > ---REDACTED--- > {code:json} > { > "container": { > "docker": { > "force_pull_image": false, > "image": "REDACTED", > "network": "HOST", > "privileged": false > }, > "type": "DOCKER" > }, > "executor_id": "", > "framework_id": "9f0e50ea-54b0-44e3-a451-c69e0c1a58fb-", > "id": "ping-as-a-service.c2d1c17a-be22-11e5-b053-002590e56e25", > "name": "ping-as-a-service", > "resources": { > "cpus": 0.1, > "disk": 0, > "mem": 64, > "ports": "[7907-7907]" > }, > "slave_id": "9f0e50ea-54b0-44e3-a451-c69e0c1a58fb-S76043", > "state": "TASK_RUNNING", > "statuses": [ > { > "container_status": { > "network_infos": [ > { > "ip_address": "", > "ip_addresses": [ > { > "ip_address": "" > } > ] > } > ] > }, > "labels": [ > { > "key": "Docker.NetworkSettings.IPAddress", > "value": "" > } > ], > "state": "TASK_RUNNING", > "timestamp": 1453149270.95511 > } > ] > } > {code} > ---REDACTED--- > This is invalid, and it mesos-core should filter it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4738) Expose egress bandwidth as a resource
[ https://issues.apache.org/jira/browse/MESOS-4738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15163473#comment-15163473 ] Sargun Dhillon commented on MESOS-4738: --- Yeah, we're using DRR for egress load balancing at the moment with our security stuff. > Expose egress bandwidth as a resource > - > > Key: MESOS-4738 > URL: https://issues.apache.org/jira/browse/MESOS-4738 > Project: Mesos > Issue Type: Improvement >Reporter: Sargun Dhillon >Priority: Minor > Labels: mesosphere > > Some of our users care about variable network isolation. Although we > cannot fundamentally limit ingress network bandwidth, having it as a > resource, so we can drop packets above a specific limit would be attractive. > It would be nice to expose egress and ingress bandwidth as an agent resource, > perhaps with a default of 10,000 mbps, and we can allow people to adjust as > needed. Alternatively, a more advanced design would involve generating > heuristics based on an analysis of the network MII / PHY. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4752) Expose ingress bandwidth as a resource
Sargun Dhillon created MESOS-4752: - Summary: Expose ingress bandwidth as a resource Key: MESOS-4752 URL: https://issues.apache.org/jira/browse/MESOS-4752 Project: Mesos Issue Type: Improvement Reporter: Sargun Dhillon Priority: Minor -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4738) Expose egress bandwidth as a resource
[ https://issues.apache.org/jira/browse/MESOS-4738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159756#comment-15159756 ] Sargun Dhillon commented on MESOS-4738: --- We can shape ingress bandwidth. We drop packets beyond a certain rate, and ECN-mark packets once a flow is within 80% of the rate at which it would be dropped. > Expose egress bandwidth as a resource > - > > Key: MESOS-4738 > URL: https://issues.apache.org/jira/browse/MESOS-4738 > Project: Mesos > Issue Type: Improvement >Reporter: Sargun Dhillon >Priority: Minor > Labels: mesosphere > > Some of our users care about variable network isolation. Although we > cannot fundamentally limit ingress network bandwidth, having it as a > resource, so we can drop packets above a specific limit would be attractive. > It would be nice to expose egress and ingress bandwidth as an agent resource, > perhaps with a default of 10,000 mbps, and we can allow people to adjust as > needed. Alternatively, a more advanced design would involve generating > heuristics based on an analysis of the network MII / PHY. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
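The drop/mark policy described in the comment reduces to a simple classification. This is a toy function with invented names and thresholds taken from the comment, not our shaping implementation:

```python
def classify_packet(flow_rate_bps, limit_bps):
    """Drop traffic beyond the limit; ECN-mark once a flow reaches 80%
    of the rate at which it would start being dropped; else pass."""
    if flow_rate_bps > limit_bps:
        return "drop"
    if flow_rate_bps >= 0.8 * limit_bps:
        return "ecn-mark"
    return "pass"
```

The ECN band gives well-behaved senders a chance to back off before any packets are actually lost.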
[jira] [Commented] (MESOS-4738) Make ingress and egress bandwidth a resource
[ https://issues.apache.org/jira/browse/MESOS-4738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157981#comment-15157981 ] Sargun Dhillon commented on MESOS-4738: --- We could build something to determine the bandwidth on the machine, or the kinds of NICs on the machine, to model bandwidth. I would say we should have a plan for implementing this resource, because it is in demand. > Make ingress and egress bandwidth a resource > > > Key: MESOS-4738 > URL: https://issues.apache.org/jira/browse/MESOS-4738 > Project: Mesos > Issue Type: Improvement >Reporter: Sargun Dhillon >Priority: Minor > Labels: mesosphere > > Some of our users care about variable network isolation. Although we > cannot fundamentally limit ingress network bandwidth, having it as a > resource, so we can drop packets above a specific limit would be attractive. > It would be nice to expose egress and ingress bandwidth as an agent resource, > perhaps with a default of 10,000 mbps, and we can allow people to adjust as > needed. Alternatively, a more advanced design would involve generating > heuristics based on an analysis of the network MII / PHY. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4738) Make ingress and egress bandwidth a resource
[ https://issues.apache.org/jira/browse/MESOS-4738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sargun Dhillon updated MESOS-4738: -- Description: Some of our users care about variable network isolation. Although we cannot fundamentally limit ingress network bandwidth, having it as a resource, so we can drop packets above a specific limit would be attractive. It would be nice to expose egress and ingress bandwidth as an agent resource, perhaps with a default of 10,000 mbps, and we can allow people to adjust as needed. Alternatively, a more advanced design would involve generating heuristics based on an analysis of the network MII / PHY. was: Some of our users care about variable network isolation. Although we cannot fundamentally limit ingress network bandwidth, having it as a resource, so we can drop packets above a specific limit would be attractive. > Make ingress and egress bandwidth a resource > > > Key: MESOS-4738 > URL: https://issues.apache.org/jira/browse/MESOS-4738 > Project: Mesos > Issue Type: Improvement >Reporter: Sargun Dhillon >Priority: Minor > Labels: mesosphere > > Some of our users care about variable network isolation. Although we > cannot fundamentally limit ingress network bandwidth, having it as a > resource, so we can drop packets above a specific limit would be attractive. > It would be nice to expose egress and ingress bandwidth as an agent resource, > perhaps with a default of 10,000 mbps, and we can allow people to adjust as > needed. Alternatively, a more advanced design would involve generating > heuristics based on an analysis of the network MII / PHY. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4738) Make ingress and egress bandwidth a resource
[ https://issues.apache.org/jira/browse/MESOS-4738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sargun Dhillon updated MESOS-4738: -- Labels: mesosphere (was: ) > Make ingress and egress bandwidth a resource > > > Key: MESOS-4738 > URL: https://issues.apache.org/jira/browse/MESOS-4738 > Project: Mesos > Issue Type: Improvement >Reporter: Sargun Dhillon >Priority: Minor > Labels: mesosphere > > Some of our users care about variable network isolation. Although we > cannot fundamentally limit ingress network bandwidth, having it as a > resource, so we can drop packets above a specific limit would be attractive. > It would be nice to expose egress and ingress bandwidth as an agent resource, > perhaps with a default of 10,000 mbps, and we can allow people to adjust as > needed. Alternatively, a more advanced design would involve generating > heuristics based on an analysis of the network MII / PHY. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4738) Make ingress and egress bandwidth a resource
Sargun Dhillon created MESOS-4738: - Summary: Make ingress and egress bandwidth a resource Key: MESOS-4738 URL: https://issues.apache.org/jira/browse/MESOS-4738 Project: Mesos Issue Type: Improvement Reporter: Sargun Dhillon Priority: Minor Some of our users care about variable network isolation. Although we cannot fundamentally limit ingress network bandwidth, having it as a resource, so we can drop packets above a specific limit would be attractive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
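Until bandwidth is a first-class resource, a custom scalar resource can be advertised by hand on an agent. The sketch below is a hypothetical configuration fragment: the resource names `ingress_bw`/`egress_bw` and the Mbps unit are invented for illustration, and `<master>` is a placeholder.

```shell
# Hypothetical: advertise ingress/egress bandwidth (Mbps) as custom
# scalar agent resources until first-class support exists.
mesos-agent --master=<master> \
  --resources="cpus:8;mem:31232;ingress_bw:10000;egress_bw:10000"
```

A framework would then need to subtract from these scalars itself when placing tasks, since nothing enforces the limit without an isolator.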
[jira] [Created] (MESOS-4710) Add comment about labels caveats to mesos.proto
Sargun Dhillon created MESOS-4710: - Summary: Add comment about labels caveats to mesos.proto Key: MESOS-4710 URL: https://issues.apache.org/jira/browse/MESOS-4710 Project: Mesos Issue Type: Improvement Reporter: Sargun Dhillon Priority: Trivial Right now, there exists the labels message in mesos.proto. This message may contain duplicate entries for a specific label, but doing so is not recommended, because some software treats this data structure as the basis for a key-value store. There might be some value in adding detail around the potential downsides of repeated label keys, and suggesting that users ensure each label key inside a repeated labels field is unique. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
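The caveat can be demonstrated in a few lines (illustrative only; `labels` here is a JSON-style rendering of the Labels message, not protobuf):

```python
def labels_to_dict(labels):
    """Collapse a Labels list of {key, value} entries into a dict, the way
    much downstream software does. With duplicate keys, later entries
    silently overwrite earlier ones -- the hazard this issue wants
    documented in mesos.proto."""
    return {entry["key"]: entry["value"] for entry in labels}
```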
[jira] [Created] (MESOS-4427) Ensure ip_address in state.json (from NetworkInfo) is valid
Sargun Dhillon created MESOS-4427: - Summary: Ensure ip_address in state.json (from NetworkInfo) is valid Key: MESOS-4427 URL: https://issues.apache.org/jira/browse/MESOS-4427 Project: Mesos Issue Type: Bug Reporter: Sargun Dhillon Priority: Critical We have seen a master state.json where the state.json has a field that looks similar to: ``` ---REDACTED--- { "container": { "docker": { "force_pull_image": false, "image": "REDACTED", "network": "HOST", "privileged": false }, "type": "DOCKER" }, "executor_id": "", "framework_id": "9f0e50ea-54b0-44e3-a451-c69e0c1a58fb-", "id": "ping-as-a-service.c2d1c17a-be22-11e5-b053-002590e56e25", "name": "ping-as-a-service", "resources": { "cpus": 0.1, "disk": 0, "mem": 64, "ports": "[7907-7907]" }, "slave_id": "9f0e50ea-54b0-44e3-a451-c69e0c1a58fb-S76043", "state": "TASK_RUNNING", "statuses": [ { "container_status": { "network_infos": [ { "ip_address": "", "ip_addresses": [ { "ip_address": "" } ] } ] }, "labels": [ { "key": "Docker.NetworkSettings.IPAddress", "value": "" } ], "state": "TASK_RUNNING", "timestamp": 1453149270.95511 } ] } ---REDACTED--- ``` This is invalid, and mesos-core should filter it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
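The requested filtering could look roughly like the following sketch (the input shapes mirror the JSON above; this is illustrative, not Mesos code):

```python
import ipaddress

def _is_ip(s):
    """True when s parses as an IPv4/IPv6 address; empty strings fail."""
    try:
        ipaddress.ip_address(s)
        return True
    except ValueError:
        return False

def filter_network_infos(network_infos):
    """Drop NetworkInfo entries whose ip_addresses are empty or invalid,
    keeping only entries that retain at least one valid address."""
    out = []
    for info in network_infos:
        addrs = [a for a in info.get("ip_addresses", [])
                 if _is_ip(a.get("ip_address", ""))]
        if addrs:
            out.append({"ip_addresses": addrs})
    return out
```

Applied to the redacted example above, the empty `ip_address` strings would be dropped instead of surfacing in state.json.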
[jira] [Updated] (MESOS-4427) Ensure ip_address in state.json (from NetworkInfo) is valid
[ https://issues.apache.org/jira/browse/MESOS-4427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sargun Dhillon updated MESOS-4427: -- Description: We have seen a master state.json where the state.json has a field that looks similar to: ---REDACTED--- {code:json} { "container": { "docker": { "force_pull_image": false, "image": "REDACTED", "network": "HOST", "privileged": false }, "type": "DOCKER" }, "executor_id": "", "framework_id": "9f0e50ea-54b0-44e3-a451-c69e0c1a58fb-", "id": "ping-as-a-service.c2d1c17a-be22-11e5-b053-002590e56e25", "name": "ping-as-a-service", "resources": { "cpus": 0.1, "disk": 0, "mem": 64, "ports": "[7907-7907]" }, "slave_id": "9f0e50ea-54b0-44e3-a451-c69e0c1a58fb-S76043", "state": "TASK_RUNNING", "statuses": [ { "container_status": { "network_infos": [ { "ip_address": "", "ip_addresses": [ { "ip_address": "" } ] } ] }, "labels": [ { "key": "Docker.NetworkSettings.IPAddress", "value": "" } ], "state": "TASK_RUNNING", "timestamp": 1453149270.95511 } ] } {code} ---REDACTED--- This is invalid, and mesos-core should filter it. 
was: We have seen a master state.json where the state.json has a field that looks similar to: ``` ---REDACTED--- { "container": { "docker": { "force_pull_image": false, "image": "REDACTED", "network": "HOST", "privileged": false }, "type": "DOCKER" }, "executor_id": "", "framework_id": "9f0e50ea-54b0-44e3-a451-c69e0c1a58fb-", "id": "ping-as-a-service.c2d1c17a-be22-11e5-b053-002590e56e25", "name": "ping-as-a-service", "resources": { "cpus": 0.1, "disk": 0, "mem": 64, "ports": "[7907-7907]" }, "slave_id": "9f0e50ea-54b0-44e3-a451-c69e0c1a58fb-S76043", "state": "TASK_RUNNING", "statuses": [ { "container_status": { "network_infos": [ { "ip_address": "", "ip_addresses": [ { "ip_address": "" } ] } ] }, "labels": [ { "key": "Docker.NetworkSettings.IPAddress", "value": "" } ], "state": "TASK_RUNNING", "timestamp": 1453149270.95511 } ] } ---REDACTED--- ``` This is invalid, and it mesos-core should filter it. > Ensure ip_address in state.json (from NetworkInfo) is valid > --- > > Key: MESOS-4427 > URL: https://issues.apache.org/jira/browse/MESOS-4427 > Project: Mesos > Issue Type: Bug >Reporter: Sargun Dhillon >Priority: Critical > Labels: mesosphere > > We have seen a master state.json where the state.json has a field that looks > similar to: > ---REDACTED--- > {code:json} > { > "container": { > "docker": { > "force_pull_image": false, > "image": "REDACTED", > "network": "HOST", > "privileged": false > }, > "type": "DOCKER" > }, > "executor_id": "", > "framework_id": "9f0e50ea-54b0-44e3-a451-c69e0c1a58fb-", > "id": "ping-as-a-service.c2d1c17a-be22-11e5-b053-002590e56e25", > "name": "ping-as-a-service", > "resources": { > "cpus": 0.1, > "disk": 0, > "mem": 64, > "ports": "[7907-7907]" > }, > "slave_id": "9f0e50ea-54b0-44e3-a451-c69e0c1a58fb-S76043", > "state": "TASK_RUNNING", > "statuses": [ > { > "container_status": { > "network_infos": [ > { > "ip_address": "", > "ip_addresses": [ > { > "ip_address": "" > } > ] > } > ] > }, > "labels": [ > { > "key": 
"Docker.NetworkSettings.IPAddress", > "value": "" > } > ], > "state": "TASK_RUNNING", >
[jira] [Created] (MESOS-4356) Expose Slave IP in state.json
Sargun Dhillon created MESOS-4356: - Summary: Expose Slave IP in state.json Key: MESOS-4356 URL: https://issues.apache.org/jira/browse/MESOS-4356 Project: Mesos Issue Type: Improvement Reporter: Sargun Dhillon Priority: Minor Right now, we expose the slave hostname in state.json. Unfortunately, there are many environments where DNS does not work. The slave's libprocess PID is in state.json as well, and we (Mesos-DNS, and others) are forced to parse that. That's less than optimal. If the slave's IP were in state.json directly, that would make our lives easier. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
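The PID parsing that consumers are currently forced to do can be sketched as follows (assuming the usual `name@ip:port` libprocess PID format):

```python
def slave_ip_from_pid(pid):
    """Extract the agent IP from a libprocess PID such as
    'slave(1)@10.2.3.4:5051' -- the parsing that Mesos-DNS and
    others currently have to do in lieu of a direct IP field."""
    address = pid.split("@", 1)[1]    # "10.2.3.4:5051"
    return address.rsplit(":", 1)[0]  # "10.2.3.4"
```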
[jira] [Commented] (MESOS-4120) Make DiscoveryInfo dynamically updatable
[ https://issues.apache.org/jira/browse/MESOS-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15095716#comment-15095716 ] Sargun Dhillon commented on MESOS-4120: --- Sorry, no. It can align with a future release. It is primarily to support dynamically updated VIPs with systems like K8s, and it's required for us to reach parity with their label-oriented load balancing. > Make DiscoveryInfo dynamically updatable > > > Key: MESOS-4120 > URL: https://issues.apache.org/jira/browse/MESOS-4120 > Project: Mesos > Issue Type: Improvement >Affects Versions: 0.25.0, 0.26.0 >Reporter: Sargun Dhillon >Priority: Critical > Labels: mesosphere > > K8s tasks can dynamically update what they expose to make themselves discoverable by the > cluster. Unfortunately, all DiscoveryInfo in the cluster is immutable at the > time of task start. > We would like to enable DiscoveryInfo to be dynamically updatable, so that > executors can change what they're advertising based on their internal state, > versus requiring DiscoveryInfo to be known prior to starting the tasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4120) Make DiscoveryInfo dynamically updatable
[ https://issues.apache.org/jira/browse/MESOS-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15095698#comment-15095698 ] Sargun Dhillon commented on MESOS-4120: --- No. We are restrategizing, and punting on this. > Make DiscoveryInfo dynamically updatable > > > Key: MESOS-4120 > URL: https://issues.apache.org/jira/browse/MESOS-4120 > Project: Mesos > Issue Type: Improvement >Affects Versions: 0.25.0, 0.26.0 >Reporter: Sargun Dhillon >Priority: Critical > Labels: mesosphere > > K8s tasks can dynamically update what they expose to make themselves discoverable by the > cluster. Unfortunately, all DiscoveryInfo in the cluster is immutable at the > time of task start. > We would like to enable DiscoveryInfo to be dynamically updatable, so that > executors can change what they're advertising based on their internal state, > versus requiring DiscoveryInfo to be known prior to starting the tasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4114) Add field VIP to message Port
[ https://issues.apache.org/jira/browse/MESOS-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085845#comment-15085845 ] Sargun Dhillon commented on MESOS-4114: --- I think we should just do a label on ports. In order to expose a VIP, a port can have a label named vip with a value in standard ip:port notation. So, like 1.2.3.4:4000, or [::1]:80. Let's get rid of the field VIPInfo; I think it's inflexible and complicated. > Add field VIP to message Port > - > > Key: MESOS-4114 > URL: https://issues.apache.org/jira/browse/MESOS-4114 > Project: Mesos > Issue Type: Wish >Reporter: Sargun Dhillon >Assignee: Avinash Sridharan >Priority: Trivial > Labels: mesosphere > Fix For: 0.27.0 > > > We would like to extend the Mesos protocol buffer 'Port' to include an > optional repeated string named "VIP" - to map it to a well known virtual IP, > or virtual hostname for discovery purposes. > We also want this field exposed in DiscoveryInfo in state.json. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-4114) Add field VIP to message Port
[ https://issues.apache.org/jira/browse/MESOS-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085845#comment-15085845 ] Sargun Dhillon edited comment on MESOS-4114 at 1/6/16 5:10 PM: --- I think we should just do a label on ports. In order to expose a VIP, a port can have a label named vip with a value in standard ip:port notation. So, like 1.2.3.4:4000, or [::1]:80. Let's get rid of the field VIPInfo; I think it's inflexible and complicated. Of course this info should be exposed via state.json as well. was (Author: sargun): I think we should just do a label on ports. In order to expose a VIP, a port can have a label named vip with a value of a standard ip:port notation. So, like 1.2.3.4:4000, or [::1]:80. Let's get rid of the field VIPInfo, I think it's inflexible, an complicated. > Add field VIP to message Port > - > > Key: MESOS-4114 > URL: https://issues.apache.org/jira/browse/MESOS-4114 > Project: Mesos > Issue Type: Wish >Reporter: Sargun Dhillon >Assignee: Avinash Sridharan >Priority: Trivial > Labels: mesosphere > Fix For: 0.27.0 > > > We would like to extend the Mesos protocol buffer 'Port' to include an > optional repeated string named "VIP" - to map it to a well known virtual IP, > or virtual hostname for discovery purposes. > We also want this field exposed in DiscoveryInfo in state.json. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-4114) Add field VIP to message Port
[ https://issues.apache.org/jira/browse/MESOS-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085845#comment-15085845 ] Sargun Dhillon edited comment on MESOS-4114 at 1/6/16 5:23 PM: --- I think we should just do a label on ports. In order to expose a VIP, a port can have a label named vip with a value in standard ip:port notation. So, like 1.2.3.4:4000, or [::1]:80. Let's get rid of the field VIPInfo; I think it's inflexible and complicated. The label can be repeated: if the user wants to expose a specific port multiple times, they just add two labels, both with key VIP and with different VIP values. Of course this info should be exposed via state.json as well. was (Author: sargun): I think we should just do a label on ports. In order to expose a VIP, a port can have a label named vip with a value of a standard ip:port notation. So, like 1.2.3.4:4000, or [::1]:80. Let's get rid of the field VIPInfo, I think it's inflexible, an complicated. Of course this info should be exposed via state.json as well. > Add field VIP to message Port > - > > Key: MESOS-4114 > URL: https://issues.apache.org/jira/browse/MESOS-4114 > Project: Mesos > Issue Type: Wish >Reporter: Sargun Dhillon >Assignee: Avinash Sridharan >Priority: Trivial > Labels: mesosphere > Fix For: 0.27.0 > > > We would like to extend the Mesos protocol buffer 'Port' to include an > optional repeated string named "VIP" - to map it to a well known virtual IP, > or virtual hostname for discovery purposes. > We also want this field exposed in DiscoveryInfo in state.json. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
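Parsing the proposed vip label value is straightforward but has an IPv6 wrinkle (the brackets in `[::1]:80`); a sketch, illustrative only and not part of any Mesos API:

```python
def parse_vip(value):
    """Split a VIP label value in standard ip:port notation into
    (ip, port), handling bracketed IPv6 such as '[::1]:80'."""
    if value.startswith("["):
        host, _, port = value[1:].partition("]:")  # strip the brackets
        return host, int(port)
    host, _, port = value.rpartition(":")  # rpartition: IPv4 has one colon
    return host, int(port)
```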
[jira] [Created] (MESOS-4286) Expose state(.json) as a structured protobuf
Sargun Dhillon created MESOS-4286: - Summary: Expose state(.json) as a structured protobuf Key: MESOS-4286 URL: https://issues.apache.org/jira/browse/MESOS-4286 Project: Mesos Issue Type: Wish Reporter: Sargun Dhillon Priority: Minor state.json, on both the agent and the master, exposes information about the current state of the Mesos runtime. This information is super valuable to external users such as Mesos-DNS. Unfortunately, working with state.json can at times become cumbersome in languages where dealing with JSON isn't necessarily a first-class construct. Fortunately, protocol buffers exist. If state.json were exposed as a protocol buffer, it would make the lives of software authors in the Mesos ecosystem significantly easier. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4286) Expose state(.json) as a structured protobuf
[ https://issues.apache.org/jira/browse/MESOS-4286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sargun Dhillon updated MESOS-4286: -- Labels: mesosphere (was: ) > Expose state(.json) as a structured protobuf > > > Key: MESOS-4286 > URL: https://issues.apache.org/jira/browse/MESOS-4286 > Project: Mesos > Issue Type: Wish >Reporter: Sargun Dhillon >Priority: Minor > Labels: mesosphere > > state.json, on both the agent and the master, exposes information about the > current state of the Mesos runtime. This information is super valuable to > external users such as Mesos-DNS. Unfortunately, working with state.json can > at times become cumbersome in languages where dealing with JSON isn't > necessarily a first-class construct. > Fortunately, protocol buffers exist. If state.json were exposed as a > protocol buffer, it would make the lives of software authors in the Mesos > ecosystem significantly easier. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4113) Docker Executor should not set container IP during bridged mode
[ https://issues.apache.org/jira/browse/MESOS-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15073065#comment-15073065 ] Sargun Dhillon commented on MESOS-4113: --- MESOS-4064 fundamentally solves the problem that this ticket raises, but it requires that every program that needs to determine the IP of a task implement some logic that, although trivial, can be error-prone. I'd rather Mesos implement this, rather than having to implement it in any of the quarter-dozen pieces of software that need this information. > Docker Executor should not set container IP during bridged mode > --- > > Key: MESOS-4113 > URL: https://issues.apache.org/jira/browse/MESOS-4113 > Project: Mesos > Issue Type: Bug > Components: docker >Affects Versions: 0.25.0, 0.26.0 >Reporter: Sargun Dhillon >Assignee: Artem Harutyunyan > Labels: mesosphere > > The docker executor currently sets the IP address of the container into > ContainerStatus.NetworkInfo.IPAddresses. This isn't a good thing, because > during bridged mode execution, it makes it so that that IP address is > useless, since it's behind the Docker NAT. I would like a flag that disables > filling the IP address in, and allows it to fall back to the agent IP. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4113) Docker Executor should not set container IP during bridged mode
[ https://issues.apache.org/jira/browse/MESOS-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15073008#comment-15073008 ] Sargun Dhillon commented on MESOS-4113: --- The libprocess IP of the agent, not the executor's. That'll be the externally-accessible IP for the task, assuming Docker NAT. > Docker Executor should not set container IP during bridged mode > --- > > Key: MESOS-4113 > URL: https://issues.apache.org/jira/browse/MESOS-4113 > Project: Mesos > Issue Type: Bug > Components: docker >Affects Versions: 0.25.0, 0.26.0 >Reporter: Sargun Dhillon >Assignee: Artem Harutyunyan > Labels: mesosphere > > The docker executor currently sets the IP address of the container into > ContainerStatus.NetworkInfo.IPAddresses. This isn't a good thing, because > during bridged mode execution, it makes it so that that IP address is > useless, since it's behind the Docker NAT. I would like a flag that disables > filling the IP address in, and allows it to fall back to the agent IP. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4113) Docker Executor should not set container IP during bridged mode
[ https://issues.apache.org/jira/browse/MESOS-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15069471#comment-15069471 ] Sargun Dhillon commented on MESOS-4113: --- The information that's exposed by MESOS-4064 allows for a external program to > Docker Executor should not set container IP during bridged mode > --- > > Key: MESOS-4113 > URL: https://issues.apache.org/jira/browse/MESOS-4113 > Project: Mesos > Issue Type: Bug > Components: docker >Affects Versions: 0.25.0, 0.26.0 >Reporter: Sargun Dhillon >Assignee: Artem Harutyunyan > Labels: mesosphere > > The docker executor currently sets the IP address of the container into > ContainerStatus.NetworkInfo.IPAddresses. This isn't a good thing, because > during bridged mode execution, it makes it so that that IP address is > useless, since it's behind the Docker NAT. I would like a flag that disables > filling the IP address in, and allows it to fall back to the agent IP. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-4113) Docker Executor should not set container IP during bridged mode
[ https://issues.apache.org/jira/browse/MESOS-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15069471#comment-15069471 ] Sargun Dhillon edited comment on MESOS-4113 at 12/23/15 10:22 AM: -- The information that's exposed by MESOS-4064 allows an external program to analyze state.json and determine what IP to use. Specifically, it checks whether the task / executor has a docker container in bridged mode. If it's in that mode, it uses the slaveID field to look up the relevant slave, and then parses the PID. Currently, Minuteman and Mesos-DNS both do this. I believe we should have another NetworkInfos field that actually determines the definitive IPs that external users can contact in order to connect to the task, because NetworkInfos as they are today are effectively useless, due to the behaviour under Docker containers. was (Author: sargun): The information that's exposed by MESOS-4064 allows for a external program to > Docker Executor should not set container IP during bridged mode > --- > > Key: MESOS-4113 > URL: https://issues.apache.org/jira/browse/MESOS-4113 > Project: Mesos > Issue Type: Bug > Components: docker >Affects Versions: 0.25.0, 0.26.0 >Reporter: Sargun Dhillon >Assignee: Artem Harutyunyan > Labels: mesosphere > > The docker executor currently sets the IP address of the container into > ContainerStatus.NetworkInfo.IPAddresses. This isn't a good thing, because > during bridged mode execution, it makes it so that that IP address is > useless, since it's behind the Docker NAT. I would like a flag that disables > filling the IP address in, and allows it to fall back to the agent IP. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-4113) Docker Executor should not set container IP during bridged mode
[ https://issues.apache.org/jira/browse/MESOS-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15069471#comment-15069471 ] Sargun Dhillon edited comment on MESOS-4113 at 12/23/15 10:24 AM: -- The information that's exposed by MESOS-4064 allows an external program to analyze state.json and determine what IP to use. Specifically, it checks whether the task / executor has a docker container in bridged mode. If it's in that mode, it uses the slaveID field to look up the relevant slave, and then parses the PID. Currently, Minuteman and Mesos-DNS both do this. I believe we should have another NetworkInfos field that actually determines the definitive IPs that external users can contact in order to connect to the task, because NetworkInfos as they are today are effectively useless, due to the behaviour under Docker containers. CC: [~jieyu] [~avin...@mesosphere.io] was (Author: sargun): The information that's exposed by MESOS-4064 allows for a external program to analyze the state.json and determine what IP to use. Specifically, it parses to see if the task / executor has a docker container in bridged mode. If it's in the mode, it uses the slaveID field to lookup the relevant slave, and then parses the PID. Currently, Minuteman, and Mesos-DNS both do this. I believe we should have another NetworkInfos field that actually determines the definitive IPs that external users can contact in order to connect to the task, because NetworkInfos as they are today are effectively useless, due to the behaviour under Docker containers. 
> Docker Executor should not set container IP during bridged mode > --- > > Key: MESOS-4113 > URL: https://issues.apache.org/jira/browse/MESOS-4113 > Project: Mesos > Issue Type: Bug > Components: docker >Affects Versions: 0.25.0, 0.26.0 >Reporter: Sargun Dhillon >Assignee: Artem Harutyunyan > Labels: mesosphere > > The docker executor currently sets the IP address of the container into > ContainerStatus.NetworkInfo.IPAddresses. This isn't a good thing, because > during bridged mode execution, it makes it so that that IP address is > useless, since it's behind the Docker NAT. I would like a flag that disables > filling the IP address in, and allows it to fall back to the agent IP. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
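The comment above describes the lookup logic that Minuteman and Mesos-DNS each re-implement. A hedged sketch of that logic against state.json-shaped dicts (field names follow the sample JSON earlier in this digest; the "BRIDGE" value for docker.network is an assumption):

```python
def task_ip(task, slaves_by_id):
    """Pick the IP external clients should use for a task, per the logic
    described in this thread. `task` and the slave entries are
    state.json-shaped dicts; slaves_by_id maps slave_id -> slave dict."""
    container = task.get("container", {})
    if (container.get("type") == "DOCKER"
            and container.get("docker", {}).get("network") == "BRIDGE"):
        # Bridged Docker: the container IP is behind NAT, so fall back to
        # the agent IP parsed from its libprocess PID, e.g.
        # "slave(1)@10.0.0.5:5051".
        pid = slaves_by_id[task["slave_id"]]["pid"]
        return pid.split("@", 1)[1].rsplit(":", 1)[0]
    # Otherwise trust the NetworkInfo in the latest status update.
    status = task["statuses"][-1]
    return status["container_status"]["network_infos"][0]["ip_addresses"][0]["ip_address"]
```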
[jira] [Commented] (MESOS-4113) Docker Executor should not set container IP during bridged mode
[ https://issues.apache.org/jira/browse/MESOS-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15070050#comment-15070050 ] Sargun Dhillon commented on MESOS-4113: --- Could Mesos just use the libprocess IP by default? RE: The ports, this could be ascertained using DiscoveryInfo or by looking at the ports resource. > Docker Executor should not set container IP during bridged mode > --- > > Key: MESOS-4113 > URL: https://issues.apache.org/jira/browse/MESOS-4113 > Project: Mesos > Issue Type: Bug > Components: docker >Affects Versions: 0.25.0, 0.26.0 >Reporter: Sargun Dhillon >Assignee: Artem Harutyunyan > Labels: mesosphere > > The docker executor currently sets the IP address of the container into > ContainerStatus.NetworkInfo.IPAddresses. This isn't a good thing, because > during bridged mode execution, it makes it so that that IP address is > useless, since it's behind the Docker NAT. I would like a flag that disables > filling the IP address in, and allows it to fall back to the agent IP. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4113) Docker Executor should not set container IP during bridged mode
[ https://issues.apache.org/jira/browse/MESOS-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15054493#comment-15054493 ] Sargun Dhillon commented on MESOS-4113: --- The purpose of NetworkInfo, as I see it, is to convey what IP the task is accessible at. It shouldn't be set by the frameworks (ever). It should instead be set by the executors, or isolators. > Docker Executor should not set container IP during bridged mode > --- > > Key: MESOS-4113 > URL: https://issues.apache.org/jira/browse/MESOS-4113 > Project: Mesos > Issue Type: Bug > Components: docker >Affects Versions: 0.25.0, 0.26.0 >Reporter: Sargun Dhillon > Labels: mesosphere > > The docker executor currently sets the IP address of the container into > ContainerStatus.NetworkInfo.IPAddresses. This isn't a good thing, because > during bridged mode execution, it makes it so that that IP address is > useless, since it's behind the Docker NAT. I would like a flag that disables > filling the IP address in, and allows it to fall back to the agent IP. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4140) Indicate that the task is shutting down on shutdown
Sargun Dhillon created MESOS-4140: - Summary: Indicate that the task is shutting down on shutdown Key: MESOS-4140 URL: https://issues.apache.org/jira/browse/MESOS-4140 Project: Mesos Issue Type: Improvement Reporter: Sargun Dhillon In the shutdown handler in the default executor, there is a grace period between when a SIGTERM is sent, and a SIGKILL is sent. There should be a mechanism to expose that the task is being killed. A simple mechanism would be to mark the task as unhealthy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4139) Make escalationTimeout configurable
[ https://issues.apache.org/jira/browse/MESOS-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sargun Dhillon updated MESOS-4139: -- Priority: Major (was: Critical) > Make escalationTimeout configurable > --- > > Key: MESOS-4139 > URL: https://issues.apache.org/jira/browse/MESOS-4139 > Project: Mesos > Issue Type: Improvement >Reporter: Sargun Dhillon > Labels: mesosphere > > At the moment, escalationTimeout is fixed at 3 seconds in the code. This > means that if a task is shutdown, there are only 3 seconds between the > SIGTERM, and SIGKILL. This means that if someone is running something like a > rails framework, it may be too quick to terminate the tasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4114) Add field VIP to message Port
[ https://issues.apache.org/jira/browse/MESOS-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053110#comment-15053110 ] Sargun Dhillon commented on MESOS-4114: --- Isn’t the port on DiscoveryInfo.Ports.Port the local port (the one that Marathon requested from Mesos)? Otherwise, how do you know which DiscoveryInfo name correlates with which Mesos port? Also, you may want to expose different services under different names, or IPs. > Add field VIP to message Port > - > > Key: MESOS-4114 > URL: https://issues.apache.org/jira/browse/MESOS-4114 > Project: Mesos > Issue Type: Wish >Reporter: Sargun Dhillon >Assignee: Avinash Sridharan >Priority: Trivial > Labels: mesosphere > > We would like to extend the Mesos protocol buffer 'Port' to include an > optional repeated string named "VIP" - to map it to a well known virtual IP, > or virtual hostname for discovery purposes. > We also want this field exposed in DiscoveryInfo in state.json. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4139) Make escalationTimeout configurable
Sargun Dhillon created MESOS-4139: - Summary: Make escalationTimeout configurable Key: MESOS-4139 URL: https://issues.apache.org/jira/browse/MESOS-4139 Project: Mesos Issue Type: Improvement Reporter: Sargun Dhillon Priority: Critical At the moment, escalationTimeout is fixed at 3 seconds in the code. This means that if a task is shut down, there are only 3 seconds between the SIGTERM and the SIGKILL. If someone is running something like a Rails application, that may be too quick to terminate the tasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
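The SIGTERM-then-SIGKILL escalation at issue, with the grace period made configurable rather than hard-coded at 3 seconds, can be sketched as follows (illustrative Python with POSIX signal semantics, not the actual executor code):

```python
import signal
import subprocess

def kill_with_escalation(proc, escalation_timeout=3.0):
    """Send SIGTERM, wait up to escalation_timeout seconds for a clean
    exit, then escalate to SIGKILL. `proc` is a subprocess.Popen handle
    owned by the caller. Returns True if the process exited within the
    grace period, False if it had to be SIGKILLed."""
    proc.send_signal(signal.SIGTERM)
    try:
        proc.wait(timeout=escalation_timeout)
        return True                 # exited within the grace period
    except subprocess.TimeoutExpired:
        proc.kill()                 # grace period expired: escalate
        proc.wait()                 # reap the killed process
        return False
```

A caller could then pass a per-framework or per-task timeout instead of the fixed 3 seconds.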
[jira] [Created] (MESOS-4120) Make DiscoveryInfo dynamically updatable
Sargun Dhillon created MESOS-4120: - Summary: Make DiscoveryInfo dynamically updatable Key: MESOS-4120 URL: https://issues.apache.org/jira/browse/MESOS-4120 Project: Mesos Issue Type: Improvement Reporter: Sargun Dhillon Priority: Critical K8s tasks can dynamically update what they expose to make themselves discoverable by the cluster. Unfortunately, all DiscoveryInfo in the cluster is immutable at the time of task start. We would like to enable DiscoveryInfo to be dynamically updatable, so that executors can change what they're advertising based on their internal state, versus requiring DiscoveryInfo to be known prior to starting the tasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4114) Add field VIP to message Port
[ https://issues.apache.org/jira/browse/MESOS-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sargun Dhillon updated MESOS-4114: -- Description: We would like to extend the Mesos protocol buffer 'Port' to include an optional repeated string named "VIP" - to map it to a well known virtual IP, or virtual hostname for discovery purposes. We also want this field exposed in DiscoveryInfo in state.json. was: We would like to extend the Mesos protocol buffer 'Port' to include an optional string string named "VIP" - to map it to a well known virtual IP, or virtual hostname for discovery purposes. We also want this field exposed in DiscoveryInfo in state.json. > Add field VIP to message Port > - > > Key: MESOS-4114 > URL: https://issues.apache.org/jira/browse/MESOS-4114 > Project: Mesos > Issue Type: Wish >Reporter: Sargun Dhillon >Assignee: Avinash Sridharan >Priority: Trivial > Labels: mesosphere > > We would like to extend the Mesos protocol buffer 'Port' to include an > optional repeated string named "VIP" - to map it to a well known virtual IP, > or virtual hostname for discovery purposes. > We also want this field exposed in DiscoveryInfo in state.json. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4113) Docker Executor should not set container IP during bridged mode
Sargun Dhillon created MESOS-4113: - Summary: Docker Executor should not set container IP during bridged mode Key: MESOS-4113 URL: https://issues.apache.org/jira/browse/MESOS-4113 Project: Mesos Issue Type: Bug Components: docker Affects Versions: 0.25.0 Reporter: Sargun Dhillon Priority: Minor The docker executor currently sets the IP address of the container into ContainerStatus.NetworkInfo.IPAddresses. This isn't a good thing, because during bridged mode execution, it makes it so that that IP address is useless, since it's behind the Docker NAT. I would like a flag that disables filling the IP address in, and allows it to fall back to the agent IP. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4114) Add field VIP to message Port
Sargun Dhillon created MESOS-4114: - Summary: Add field VIP to message Port Key: MESOS-4114 URL: https://issues.apache.org/jira/browse/MESOS-4114 Project: Mesos Issue Type: Wish Reporter: Sargun Dhillon Priority: Trivial We would like to extend the Mesos protocol buffer 'Port' to include an optional string string named "VIP" - to map it to a well known virtual IP, or virtual hostname for discovery purposes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4114) Add field VIP to message Port
[ https://issues.apache.org/jira/browse/MESOS-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sargun Dhillon updated MESOS-4114: -- Description: We would like to extend the Mesos protocol buffer 'Port' to include an optional string string named "VIP" - to map it to a well known virtual IP, or virtual hostname for discovery purposes. We also want this field exposed in DiscoveryInfo in state.json. was:We would like to extend the Mesos protocol buffer 'Port' to include an optional string string named "VIP" - to map it to a well known virtual IP, or virtual hostname for discovery purposes. > Add field VIP to message Port > - > > Key: MESOS-4114 > URL: https://issues.apache.org/jira/browse/MESOS-4114 > Project: Mesos > Issue Type: Wish >Reporter: Sargun Dhillon >Assignee: Avinash Sridharan >Priority: Trivial > Labels: mesosphere > > We would like to extend the Mesos protocol buffer 'Port' to include an > optional string string named "VIP" - to map it to a well known virtual IP, or > virtual hostname for discovery purposes. > We also want this field exposed in DiscoveryInfo in state.json. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4015) Expose task / executor health in master & slave state.json
Sargun Dhillon created MESOS-4015: - Summary: Expose task / executor health in master & slave state.json Key: MESOS-4015 URL: https://issues.apache.org/jira/browse/MESOS-4015 Project: Mesos Issue Type: Improvement Affects Versions: 0.25.0 Reporter: Sargun Dhillon Priority: Trivial Right now, if I specify a healthcheck for a task, the only way to get to it is via the Task Status updates that come to the framework. Unfortunately, this information isn't exposed in the state.json either in the slave or master. It'd be ideal to have that information to enable tools like Mesos-DNS to be health-aware. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-3962) Add labels to the message Port
Sargun Dhillon created MESOS-3962: - Summary: Add labels to the message Port Key: MESOS-3962 URL: https://issues.apache.org/jira/browse/MESOS-3962 Project: Mesos Issue Type: Wish Reporter: Sargun Dhillon Priority: Minor I want to add arbitrary labels to the message "Port". I have a few use cases for this: 1) I want to use it to drive isolators to install firewall rules associated with the port 2) I want to use it to drive third party components to be able to specify advertising information 3) I want to be able to use this to associate a deterministic virtual hostname with a given port Ideally, once the task is launched, these labels would be immutable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3826) Add an optional unique identifier for resource reservations
[ https://issues.apache.org/jira/browse/MESOS-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14995809#comment-14995809 ] Sargun Dhillon commented on MESOS-3826: --- Questions: 1. Will Mesos only give resourceOffers to a framework for the given principal that it is registered under, or is this further filtering that the framework author must do? 2. Can a given framework only reserve resources with its principal or can it use the principals of others? If a given framework can only operate on one principal at a time, what's the point of `ReservationInfo` at all? Can't Mesos implicitly apply the principal to all reservations, and task launches? > Add an optional unique identifier for resource reservations > --- > > Key: MESOS-3826 > URL: https://issues.apache.org/jira/browse/MESOS-3826 > Project: Mesos > Issue Type: Improvement > Components: general >Reporter: Sargun Dhillon >Assignee: Guangya Liu >Priority: Minor > Labels: mesosphere > > Thanks to the resource reservation primitives, frameworks can reserve > resources. These reservations are per role, which means multiple frameworks > can share reservations. This can get very hairy, as multiple reservations can > occur on each agent. > It would be nice to be able to optionally, uniquely identify reservations by > ID, much like persistent volumes are today. This could be done by adding a > new protobuf field, such as Resource.ReservationInfo.id, that if set upon > reservation time, would come back when the reservation is advertised.
[jira] [Comment Edited] (MESOS-3826) Add an optional unique identifier for resource reservations
[ https://issues.apache.org/jira/browse/MESOS-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14995809#comment-14995809 ] Sargun Dhillon edited comment on MESOS-3826 at 11/8/15 9:08 PM: Questions: 1. Will Mesos only give resourceOffers to a framework for the given principal that it is registered under, or is this further filtering that the framework author must do? 2. Can a given framework only reserve resources with its principal or can it use the principals of others? If a given framework can only operate on one principal at a time, what's the point of ```ReservationInfo``` at all? Can't Mesos implicitly apply the principal to all reservations, and task launches? was (Author: sargun): Questions: 1. Will Mesos only give resourceOffers to a framework for the given principal that it is registered under, or is this further filtering that the framework author must do? 2. Can a given framework only reserve resources with its principal or can it use the principals of others? If a given framework can only operate on one principal at a time, what's the point of `ReservationInfo` at all? Can't Mesos implicitly apply the principal to all reservations, and task launches? > Add an optional unique identifier for resource reservations > --- > > Key: MESOS-3826 > URL: https://issues.apache.org/jira/browse/MESOS-3826 > Project: Mesos > Issue Type: Improvement > Components: general >Reporter: Sargun Dhillon >Assignee: Guangya Liu >Priority: Minor > Labels: mesosphere > > Thanks to the resource reservation primitives, frameworks can reserve > resources. These reservations are per role, which means multiple frameworks > can share reservations. This can get very hairy, as multiple reservations can > occur on each agent. > It would be nice to be able to optionally, uniquely identify reservations by > ID, much like persistent volumes are today. 
This could be done by adding a > new protobuf field, such as Resource.ReservationInfo.id, that if set upon > reservation time, would come back when the reservation is advertised.
[jira] [Comment Edited] (MESOS-3826) Add an optional unique identifier for resource reservations
[ https://issues.apache.org/jira/browse/MESOS-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14995809#comment-14995809 ] Sargun Dhillon edited comment on MESOS-3826 at 11/8/15 9:09 PM: Questions: 1. Will Mesos only give resourceOffers to a framework for the given principal that it is registered under, or is this further filtering that the framework author must do? 2. Can a given framework only reserve resources with its principal or can it use the principals of others? If a given framework can only operate on one principal at a time, what's the point of {{ReservationInfo}} at all? Can't Mesos implicitly apply the principal to all reservations, and task launches? was (Author: sargun): Questions: 1. Will Mesos only give resourceOffers to a framework for the given principal that it is registered under, or is this further filtering that the framework author must do? 2. Can a given framework only reserve resources with its principal or can it use the principals of others? If a given framework can only operate on one principal at a time, what's the point of ```ReservationInfo``` at all? Can't Mesos implicitly apply the principal to all reservations, and task launches? > Add an optional unique identifier for resource reservations > --- > > Key: MESOS-3826 > URL: https://issues.apache.org/jira/browse/MESOS-3826 > Project: Mesos > Issue Type: Improvement > Components: general >Reporter: Sargun Dhillon >Assignee: Guangya Liu >Priority: Minor > Labels: mesosphere > > Thanks to the resource reservation primitives, frameworks can reserve > resources. These reservations are per role, which means multiple frameworks > can share reservations. This can get very hairy, as multiple reservations can > occur on each agent. > It would be nice to be able to optionally, uniquely identify reservations by > ID, much like persistent volumes are today. 
This could be done by adding a > new protobuf field, such as Resource.ReservationInfo.id, that if set upon > reservation time, would come back when the reservation is advertised.
[jira] [Commented] (MESOS-3826) Add an optional unique identifier for resource reservations
[ https://issues.apache.org/jira/browse/MESOS-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14995888#comment-14995888 ] Sargun Dhillon commented on MESOS-3826: --- The other problem here is idempotence. If I as framework A create a reservation, and for whatever reason that resource gets offered to framework B, which holds it, I am going to time out, thinking my reservation failed. I need an ID that is forced to be unique in order to get some level of idempotence, given the current asynchronous nature of reservations. > Add an optional unique identifier for resource reservations > --- > > Key: MESOS-3826 > URL: https://issues.apache.org/jira/browse/MESOS-3826 > Project: Mesos > Issue Type: Improvement > Components: general >Reporter: Sargun Dhillon >Assignee: Guangya Liu >Priority: Minor > Labels: mesosphere > > Thanks to the resource reservation primitives, frameworks can reserve > resources. These reservations are per role, which means multiple frameworks > can share reservations. This can get very hairy, as multiple reservations can > occur on each agent. > It would be nice to be able to optionally, uniquely identify reservations by > ID, much like persistent volumes are today. This could be done by adding a > new protobuf field, such as Resource.ReservationInfo.id, that if set upon > reservation time, would come back when the reservation is advertised.
[jira] [Created] (MESOS-3853) Expose Dynamic Reservations and Persistent Volumes in Master or Slave state.json
Sargun Dhillon created MESOS-3853: - Summary: Expose Dynamic Reservations and Persistent Volumes in Master or Slave state.json Key: MESOS-3853 URL: https://issues.apache.org/jira/browse/MESOS-3853 Project: Mesos Issue Type: Improvement Components: master, slave Reporter: Sargun Dhillon Priority: Minor Right now, dynamic reservations and persistent volumes aren't exposed in state.json; they appear only as generic reservations: {code} "reserved_resources": { "test": { "cpus": 0.02, "disk": 200, "mem": 0 } } {code} It would be nice to know which resources are dynamically reserved, and which principal and role hold those dynamic reservations. For each volume, it would be nice to see its principal, its size, and its volume info.
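One hypothetical shape for richer output, purely illustrative -- every field name below is an assumption for the sake of discussion, not the actual Mesos endpoint format:

```json
{
  "reserved_resources": {
    "test": {
      "cpus": 0.02,
      "disk": 200,
      "mem": 0,
      "reservations": [
        {
          "role": "test",
          "principal": "some-principal",
          "type": "DYNAMIC",
          "resources": { "cpus": 0.02, "disk": 200 }
        }
      ]
    }
  }
}
```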
[jira] [Commented] (MESOS-3826) Add an optional unique identifier for resource reservations
[ https://issues.apache.org/jira/browse/MESOS-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14992632#comment-14992632 ] Sargun Dhillon commented on MESOS-3826: --- Yeah. I think that's how roles are supposed to work. That a role would align to a department, rather than every framework having their own role. We can change that, if we want. > Add an optional unique identifier for resource reservations > --- > > Key: MESOS-3826 > URL: https://issues.apache.org/jira/browse/MESOS-3826 > Project: Mesos > Issue Type: Improvement > Components: general >Reporter: Sargun Dhillon >Assignee: Guangya Liu >Priority: Minor > Labels: mesosphere > > Thanks to the resource reservation primitives, frameworks can reserve > resources. These reservations are per role, which means multiple frameworks > can share reservations. This can get very hairy, as multiple reservations can > occur on each agent. > It would be nice to be able to optionally, uniquely identify reservations by > ID, much like persistent volumes are today. This could be done by adding a > new protobuf field, such as Resource.ReservationInfo.id, that if set upon > reservation time, would come back when the reservation is advertised.
[jira] [Created] (MESOS-3835) Expose framework principal through state.json
Sargun Dhillon created MESOS-3835: - Summary: Expose framework principal through state.json Key: MESOS-3835 URL: https://issues.apache.org/jira/browse/MESOS-3835 Project: Mesos Issue Type: Wish Components: master Reporter: Sargun Dhillon Priority: Trivial We would like to expose the framework principal through the Master /state.json. This is useful for debugging (from the operator's perspective), and it could also be used for inspection when creating or modifying ACLs.
[jira] [Created] (MESOS-3826) Add an optional unique identifier for resource reservations
Sargun Dhillon created MESOS-3826: - Summary: Add an optional unique identifier for resource reservations Key: MESOS-3826 URL: https://issues.apache.org/jira/browse/MESOS-3826 Project: Mesos Issue Type: Improvement Components: general Reporter: Sargun Dhillon Priority: Minor Thanks to the resource reservation primitives, frameworks can reserve resources. These reservations are per role, which means multiple frameworks can share reservations. This can get very hairy, as multiple reservations can occur on each agent. It would be nice to be able to optionally, uniquely identify reservations by ID, much like persistent volumes are today. This could be done by adding a new protobuf field, such as Resource.ReservationInfo.id, that, if set at reservation time, would come back when the reservation is advertised.
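The Resource.ReservationInfo.id proposal could be sketched roughly as below. The existing ReservationInfo fields and all field numbers here are approximations, not the actual mesos.proto definition.

```protobuf
// Sketch only -- existing fields and field numbers are approximate,
// not copied from the real mesos.proto.
message Resource {
  message ReservationInfo {
    optional string principal = 1;

    // Proposed addition: a framework-supplied unique identifier,
    // echoed back when the reservation is advertised in offers,
    // making reservation requests idempotent.
    optional string id = 2;
  }

  optional ReservationInfo reservation = 8;
}
```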
[jira] [Created] (MESOS-2935) Documentation Needs Clarification about Compressed artifacts
Sargun Dhillon created MESOS-2935: - Summary: Documentation Needs Clarification about Compressed artifacts Key: MESOS-2935 URL: https://issues.apache.org/jira/browse/MESOS-2935 Project: Mesos Issue Type: Documentation Components: fetcher Reporter: Sargun Dhillon Priority: Trivial Compressed artifacts get decompressed with either unzip -d or tar -C $DIR -xf. In addition, only the following file suffixes / extensions result in decompression: .tgz, .tar.gz, .tbz2, .tar.bz2, .tar.xz, .txz, .zip
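The behavior described above can be illustrated with a small sketch. This is a hypothetical helper, not Mesos code; the suffix-to-command mapping simply restates the rules from the report.

```python
# Hypothetical sketch of the fetcher's extension check -- not Mesos code.
# Only the suffixes below trigger decompression; everything else is
# left in the sandbox as-is.
ARCHIVE_SUFFIXES = {
    ".tgz": "tar -C {dir} -xf {file}",
    ".tar.gz": "tar -C {dir} -xf {file}",
    ".tbz2": "tar -C {dir} -xf {file}",
    ".tar.bz2": "tar -C {dir} -xf {file}",
    ".tar.xz": "tar -C {dir} -xf {file}",
    ".txz": "tar -C {dir} -xf {file}",
    ".zip": "unzip -d {dir} {file}",
}

def extraction_command(filename, directory):
    """Return the decompression command for a fetched artifact, or None."""
    for suffix, template in ARCHIVE_SUFFIXES.items():
        if filename.endswith(suffix):
            return template.format(dir=directory, file=filename)
    return None
```

Note that a file like app.tar would not be extracted at all, which is exactly the kind of surprise the documentation should call out.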