[jira] [Created] (MESOS-8716) Freezer controller is not returned to thaw if task termination fails

2018-03-21 Thread Sargun Dhillon (JIRA)
Sargun Dhillon created MESOS-8716:
-

 Summary: Freezer controller is not returned to thaw if task 
termination fails
 Key: MESOS-8716
 URL: https://issues.apache.org/jira/browse/MESOS-8716
 Project: Mesos
  Issue Type: Bug
  Components: agent, containerization
Affects Versions: 1.3.2
Reporter: Sargun Dhillon


This issue is related to https://issues.apache.org/jira/browse/MESOS-8004. A 
container may fail to terminate for a variety of reasons. One common case in 
our system is containers that rely on external storage: they run fsync before 
exiting (fsync on SIGTERM), which can cause the termination to time out. 

 

Even though Mesos has sent the requisite kill signals, the task will never 
terminate because the cgroup stays frozen. 

 

The intended behaviour is that, on failure to terminate, if the pids isolator 
is running, pids.max should be set to 0 to prevent further processes from being 
created; the cgroup should then be walked, each process sent SIGKILL, and the 
cgroup thawed. Once the processes thaw, the pending kill signal is delivered 
and processed, and the container finally finishes.
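
A minimal sketch of that fallback sequence, assuming a cgroup v1 hierarchy 
mounted under /sys/fs/cgroup with the freezer and pids controllers; the path 
handling below is an assumption of this sketch, not the containerizer's actual 
code:

{code}
import os
import signal

def force_kill_cgroup(cgroup):
    """Fallback when a frozen container fails to terminate."""
    pids_max = "/sys/fs/cgroup/pids/%s/pids.max" % cgroup
    freezer_procs = "/sys/fs/cgroup/freezer/%s/cgroup.procs" % cgroup
    freezer_state = "/sys/fs/cgroup/freezer/%s/freezer.state" % cgroup

    # 1. Forbid new processes in the cgroup (pids isolator).
    with open(pids_max, "w") as f:
        f.write("0")

    # 2. Walk the (still frozen) cgroup and queue a SIGKILL for every process.
    with open(freezer_procs) as f:
        for pid in f:
            os.kill(int(pid), signal.SIGKILL)

    # 3. Thaw the cgroup; the pending SIGKILLs are delivered as processes resume.
    with open(freezer_state, "w") as f:
        f.write("THAWED")
{code}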



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (MESOS-7744) Mesos Agent Sends TASK_KILL status update to Master, and still launches task

2017-07-20 Thread Sargun Dhillon (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095569#comment-16095569
 ] 

Sargun Dhillon commented on MESOS-7744:
---

[~neilc]

The task is still running. The agent and master think the task is killed. The 
framework receives TASK_KILLED. The framework "knows", due to out-of-band 
mechanisms (we have our own reconciliation mechanism outside Mesos), that the 
task is still alive, and it resends the kill, but the kill never reaches the 
executor. The executor sends TASK_RUNNING status updates to the agent, but 
these never make it to the master, nor to the framework.

It occurs if the executor is already running and the task is killed almost 
immediately after it is started; specifically, while the task is still on the 
agent's "queue".

> Mesos Agent Sends TASK_KILL status update to Master, and still launches task
> 
>
> Key: MESOS-7744
> URL: https://issues.apache.org/jira/browse/MESOS-7744
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 1.0.1
>Reporter: Sargun Dhillon
>Priority: Minor
>  Labels: reliability
>
> We sometimes launch jobs and cancel them after ~7 seconds if we don't get a 
> TASK_STARTING back from the agent. Under certain conditions this can result in 
> Mesos losing track of the task. The interesting chunk of the logs is here:
> {code}
> Jun 29 23:22:26 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
> mesos-slave[4290]: I0629 23:22:26.951799  5171 slave.cpp:1495] Got assigned 
> task Titus-7590548-worker-0-4476 for framework TitusFramework
> Jun 29 23:22:26 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
> mesos-slave[4290]: I0629 23:22:26.952251  5171 slave.cpp:1614] Launching task 
> Titus-7590548-worker-0-4476 for framework TitusFramework
> Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
> mesos-slave[4290]: I0629 23:22:37.484611  5171 slave.cpp:1853] Queuing task 
> ‘Titus-7590548-worker-0-4476’ for executor ‘docker-executor’ of framework 
> TitusFramework at executor(1)@100.66.11.10:17707
> Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
> mesos-slave[4290]: I0629 23:22:37.487876  5171 slave.cpp:2035] Asked to kill 
> task Titus-7590548-worker-0-4476 of framework TitusFramework
> Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
> mesos-slave[4290]: I0629 23:22:37.488994  5171 slave.cpp:3211] Handling 
> status update TASK_KILLED (UUID: 898215d6-a244-4dbe-bc9c-878a22d36ea4) for 
> task Titus-7590548-worker-0-4476 of framework TitusFramework from @0.0.0.0:0
> Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
> mesos-slave[4290]: I0629 23:22:37.490603  5171 slave.cpp:2005] Sending queued 
> task ‘Titus-7590548-worker-0-4476’ to executor ‘docker-executor’ of framework 
> TitusFramework at executor(1)@100.66.11.10:17707
> {code}
> In our executor, we see that the launch message arrives after the master has 
> already gotten the kill update. We then send non-terminal state updates to 
> the agent, and yet it doesn't forward these to our framework. We're using a 
> custom executor which is based on the older mesos-go bindings. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (MESOS-7744) Mesos Agent Sends TASK_KILL status update to Master, and still launches task

2017-06-29 Thread Sargun Dhillon (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sargun Dhillon updated MESOS-7744:
--
Description: 
We sometimes launch jobs and cancel them after ~7 seconds if we don't get a 
TASK_STARTING back from the agent. Under certain conditions this can result in 
Mesos losing track of the task. The interesting chunk of the logs is here:

{code}
Jun 29 23:22:26 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:26.951799  5171 slave.cpp:1495] Got assigned 
task Titus-7590548-worker-0-4476 for framework TitusFramework
Jun 29 23:22:26 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:26.952251  5171 slave.cpp:1614] Launching task 
Titus-7590548-worker-0-4476 for framework TitusFramework
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:37.484611  5171 slave.cpp:1853] Queuing task 
‘Titus-7590548-worker-0-4476’ for executor ‘docker-executor’ of framework 
TitusFramework at executor(1)@100.66.11.10:17707
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:37.487876  5171 slave.cpp:2035] Asked to kill 
task Titus-7590548-worker-0-4476 of framework TitusFramework
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:37.488994  5171 slave.cpp:3211] Handling status 
update TASK_KILLED (UUID: 898215d6-a244-4dbe-bc9c-878a22d36ea4) for task 
Titus-7590548-worker-0-4476 of framework TitusFramework from @0.0.0.0:0
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:37.490603  5171 slave.cpp:2005] Sending queued 
task ‘Titus-7590548-worker-0-4476’ to executor ‘docker-executor’ of framework 
TitusFramework at executor(1)@100.66.11.10:17707
{code}

In our executor, we see that the launch message arrives after the master has 
already gotten the kill update. We then send non-terminal state updates to the 
agent, and yet it doesn't forward these to our framework. We're using a custom 
executor which is based on the older mesos-go bindings. 

  was:
We sometimes launch jobs and cancel them after ~7 seconds if we don't get a 
TASK_STARTING back from the agent. Under certain conditions this can result in 
Mesos losing track of the task. The interesting chunk of the logs is here:

{code}
Jun 29 23:22:26 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:26.951799  5171 slave.cpp:1495] Got assigned 
task Titus-7590548-worker-0-4476 for framework TitusFramework
Jun 29 23:22:26 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:26.952251  5171 slave.cpp:1614] Launching task 
Titus-7590548-worker-0-4476 for framework TitusFramework
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:37.484611  5171 slave.cpp:1853] Queuing task 
‘Titus-7590548-worker-0-4476’ for executor ‘docker-executor’ of framework 
TitusFramework at executor(1)@100.66.11.10:17707
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:37.487876  5171 slave.cpp:2035] Asked to kill 
task Titus-7590548-worker-0-4476 of framework TitusFramework
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:37.488994  5171 slave.cpp:3211] Handling status 
update TASK_KILLED (UUID: 898215d6-a244-4dbe-bc9c-878a22d36ea4) for task 
Titus-7590548-worker-0-4476 of framework TitusFramework from @0.0.0.0:0
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:37.490603  5171 slave.cpp:2005] Sending queued 
task ‘Titus-7590548-worker-0-4476’ to executor ‘docker-executor’ of framework 
TitusFramework at executor(1)@100.66.11.10:17707
{code}

In our executor, we see that the launch message arrives after the master has 
already gotten the kill update. We then send non-terminal state updates to the 
agent, and yet it doesn't forward these to our framework. 


> Mesos Agent Sends TASK_KILL status update to Master, and still launches task
> 
>
> Key: MESOS-7744
> URL: https://issues.apache.org/jira/browse/MESOS-7744
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 1.0.1
>Reporter: Sargun Dhillon
>Priority: Minor
>
> We sometimes launch jobs and cancel them after ~7 seconds if we don't get a 
> TASK_STARTING back from the agent. Under certain conditions this can result in 
> Mesos losing track of the task. The interesting chunk of the logs is here:
> {code}
> Jun 29 23:22:26 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
> mesos-slave[4290]: I0629 23:22:26.951799  5171 slave.cpp:1495] Got assigned 

[jira] [Commented] (MESOS-7744) Mesos Agent Sends TASK_KILL status update to Master, and still launches task

2017-06-29 Thread Sargun Dhillon (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16069355#comment-16069355
 ] 

Sargun Dhillon commented on MESOS-7744:
---

Full log:
{code}
Jun 29 23:22:26 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:26.951799  5171 slave.cpp:1495] Got assigned 
task Titus-7590548-worker-0-4476 for framework TitusFramework
Jun 29 23:22:26 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:26.952251  5171 slave.cpp:1614] Launching task 
Titus-7590548-worker-0-4476 for framework TitusFramework
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:37.484611  5171 slave.cpp:1853] Queuing task 
‘Titus-7590548-worker-0-4476’ for executor ‘docker-executor’ of framework 
TitusFramework at executor(1)@100.66.11.10:17707
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:37.487876  5171 slave.cpp:2035] Asked to kill 
task Titus-7590548-worker-0-4476 of framework TitusFramework
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:37.488994  5171 slave.cpp:3211] Handling status 
update TASK_KILLED (UUID: 898215d6-a244-4dbe-bc9c-878a22d36ea4) for task 
Titus-7590548-worker-0-4476 of framework TitusFramework from @0.0.0.0:0
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:37.490603  5171 slave.cpp:2005] Sending queued 
task ‘Titus-7590548-worker-0-4476’ to executor ‘docker-executor’ of framework 
TitusFramework at executor(1)@100.66.11.10:17707
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:37.494860  5171 slave.cpp:3211] Handling status 
update TASK_STARTING (UUID: d6aaed02-5d21-11e7-846c-0a0c90a7033c) for task 
Titus-7590548-worker-0-4476 of framework TitusFramework from 
executor(1)@100.66.11.10:17707
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:37.496829  5191 status_update_manager.cpp:320] 
Received status update TASK_KILLED (UUID: 898215d6-a244-4dbe-bc9c-878a22d36ea4) 
for task Titus-7590548-worker-0-4476 of framework TitusFramework
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:37.497530  5191 status_update_manager.cpp:825] 
Checkpointing UPDATE for status update TASK_KILLED (UUID: 
898215d6-a244-4dbe-bc9c-878a22d36ea4) for task Titus-7590548-worker-0-4476 of 
framework TitusFramework
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:37.498082  5171 slave.cpp:3211] Handling status 
update TASK_STARTING (UUID: d6aafd3f-5d21-11e7-846c-0a0c90a7033c) for task 
Titus-7590548-worker-0-4476 of framework TitusFramework from 
executor(1)@100.66.11.10:17707
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:37.500267  5191 status_update_manager.cpp:320] 
Received status update TASK_STARTING (UUID: 
d6aaed02-5d21-11e7-846c-0a0c90a7033c) for task Titus-7590548-worker-0-4476 of 
framework TitusFramework
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:37.500377  5191 status_update_manager.cpp:825] 
Checkpointing UPDATE for status update TASK_STARTING (UUID: 
d6aaed02-5d21-11e7-846c-0a0c90a7033c) for task Titus-7590548-worker-0-4476 of 
framework TitusFramework
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:37.500562  5189 slave.cpp:3604] Forwarding the 
update TASK_KILLED (UUID: 898215d6-a244-4dbe-bc9c-878a22d36ea4) for task 
Titus-7590548-worker-0-4476 of framework TitusFramework to 
master@100.66.3.213:7103
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:37.502029  5191 status_update_manager.cpp:320] 
Received status update TASK_STARTING (UUID: 
d6aafd3f-5d21-11e7-846c-0a0c90a7033c) for task Titus-7590548-worker-0-4476 of 
framework TitusFramework
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:37.502092  5191 status_update_manager.cpp:825] 
Checkpointing UPDATE for status update TASK_STARTING (UUID: 
d6aafd3f-5d21-11e7-846c-0a0c90a7033c) for task Titus-7590548-worker-0-4476 of 
framework TitusFramework
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:37.502393  5189 slave.cpp:3514] Sending 
acknowledgement for status update TASK_STARTING (UUID: 
d6aaed02-5d21-11e7-846c-0a0c90a7033c) for task Titus-7590548-worker-0-4476 of 
framework TitusFramework to executor(1)@100.66.11.10:17707
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:37.504465  5189 slave.cpp:3514] Sending 
acknowledgement for status update 

[jira] [Created] (MESOS-7744) Mesos Agent Sends TASK_KILL status update to Master, and still launches task

2017-06-29 Thread Sargun Dhillon (JIRA)
Sargun Dhillon created MESOS-7744:
-

 Summary: Mesos Agent Sends TASK_KILL status update to Master, and 
still launches task
 Key: MESOS-7744
 URL: https://issues.apache.org/jira/browse/MESOS-7744
 Project: Mesos
  Issue Type: Bug
Affects Versions: 1.0.1
Reporter: Sargun Dhillon
Priority: Minor


We sometimes launch jobs and cancel them after ~7 seconds if we don't get a 
TASK_STARTING back from the agent. Under certain conditions this can result in 
Mesos losing track of the task. The interesting chunk of the logs is here:

{code}
Jun 29 23:22:26 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:26.951799  5171 slave.cpp:1495] Got assigned 
task Titus-7590548-worker-0-4476 for framework TitusFramework
Jun 29 23:22:26 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:26.952251  5171 slave.cpp:1614] Launching task 
Titus-7590548-worker-0-4476 for framework TitusFramework
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:37.484611  5171 slave.cpp:1853] Queuing task 
‘Titus-7590548-worker-0-4476’ for executor ‘docker-executor’ of framework 
TitusFramework at executor(1)@100.66.11.10:17707
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:37.487876  5171 slave.cpp:2035] Asked to kill 
task Titus-7590548-worker-0-4476 of framework TitusFramework
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:37.488994  5171 slave.cpp:3211] Handling status 
update TASK_KILLED (UUID: 898215d6-a244-4dbe-bc9c-878a22d36ea4) for task 
Titus-7590548-worker-0-4476 of framework TitusFramework from @0.0.0.0:0
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
mesos-slave[4290]: I0629 23:22:37.490603  5171 slave.cpp:2005] Sending queued 
task ‘Titus-7590548-worker-0-4476’ to executor ‘docker-executor’ of framework 
TitusFramework at executor(1)@100.66.11.10:17707
{code}

In our executor, we see that the launch message arrives after the master has 
already gotten the kill update. We then send non-terminal state updates to the 
agent, and yet it doesn't forward these to our framework. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (MESOS-4891) Add a '/containers' endpoint to the agent to list all the active containers.

2016-03-07 Thread Sargun Dhillon (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15184017#comment-15184017
 ] 

Sargun Dhillon commented on MESOS-4891:
---

Can we also have a place to list all executor PIDs that are associated with 
those containers?

> Add a '/containers' endpoint to the agent to list all the active containers.
> 
>
> Key: MESOS-4891
> URL: https://issues.apache.org/jira/browse/MESOS-4891
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Jie Yu
>
> This endpoint will be similar to /monitor/statistics.json endpoint, but it'll 
> also contain the 'container_status' about the container (see ContainerStatus 
> in mesos.proto). We'll eventually deprecate the /monitor/statistics.json 
> endpoint.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4883) Add agent ID to agent state endpoint

2016-03-07 Thread Sargun Dhillon (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15183895#comment-15183895
 ] 

Sargun Dhillon commented on MESOS-4883:
---

We have a tool that reads each agent's state.json and assembles a complete 
cluster view from the sum of the agent JSONs. This system is soft-state. If the 
last state we have in memory has a number of tasks associated with a slave, and 
that slave comes back and runs for a while (say, 10 minutes) without reporting 
any tasks, we have no way of knowing that those old tasks should be removed 
from the system.
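
A minimal sketch of that soft-state aggregation, assuming each agent serves its 
state at http://<host>:5051/state.json (the usual default endpoint and port, 
which this ticket does not specify). Without an agent ID in the response, 
entries can only be keyed by hostname, which is exactly the limitation 
described above:

{code}
import json
import time
from urllib.request import urlopen

def poll_agents(agent_hosts, view, ttl=600):
    """Merge per-agent state.json into a soft-state cluster view and expire
    entries that have not been observed for `ttl` seconds."""
    now = time.time()
    for host in agent_hosts:
        state = json.load(urlopen("http://%s:5051/state.json" % host))
        for framework in state.get("frameworks", []):
            for task in framework.get("tasks", []):
                # Ideally this would be keyed by (slave_id, task_id), but the
                # agent's own ID is not in its state.json before a task runs.
                view[(host, task["id"])] = {"task": task, "seen": now}
    for key, entry in list(view.items()):
        if now - entry["seen"] > ttl:
            del view[key]
{code}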

> Add agent ID to agent state endpoint
> 
>
> Key: MESOS-4883
> URL: https://issues.apache.org/jira/browse/MESOS-4883
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Sargun Dhillon
>Priority: Minor
>  Labels: mesosphere
>
> I would like the slave ID to be exposed on the slave's state.json endpoint 
> before any tasks are running on the slave. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4884) Ensure task_status timestamp is monotonic

2016-03-07 Thread Sargun Dhillon (JIRA)
Sargun Dhillon created MESOS-4884:
-

 Summary: Ensure task_status timestamp is monotonic
 Key: MESOS-4884
 URL: https://issues.apache.org/jira/browse/MESOS-4884
 Project: Mesos
  Issue Type: Improvement
Reporter: Sargun Dhillon
Priority: Critical


In state.json each task status has a timestamp associated with it. From my 
understanding, the timestamp is the time at which the status update was 
generated. The slave guarantees that the list is sorted and that the first item 
of the list is the newest status, but that ordering is the only signal. This 
becomes a problem if someone is consuming the task status updates 
independently: without the ordering logic in the slave, we cannot determine the 
current state of the task. 

There is a timestamp on each task status. I would like the executor (API) to 
ensure that this timestamp is strictly monotonic. 
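
A small sketch of what strictly monotonic timestamps would buy a consumer: the 
latest status could be picked by timestamp alone, with no reliance on list 
order. The JSON shape mirrors state.json; the comparison logic is an 
illustration, not existing Mesos code:

{code}
def latest_status(statuses):
    """With strictly monotonic timestamps, the newest status is simply the one
    with the largest timestamp -- no reliance on list order."""
    return max(statuses, key=lambda s: s["timestamp"])

statuses = [
    {"state": "TASK_RUNNING", "timestamp": 1453149270.95511},
    {"state": "TASK_FINISHED", "timestamp": 1453149280.12345},
]
assert latest_status(statuses)["state"] == "TASK_FINISHED"
{code}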



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4883) Add agent ID to agent state endpoint

2016-03-07 Thread Sargun Dhillon (JIRA)
Sargun Dhillon created MESOS-4883:
-

 Summary: Add agent ID to agent state endpoint
 Key: MESOS-4883
 URL: https://issues.apache.org/jira/browse/MESOS-4883
 Project: Mesos
  Issue Type: Improvement
Reporter: Sargun Dhillon
Priority: Minor


I would like the slave ID to be exposed on the slave's state.json endpoint 
before any tasks are running on the slave. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4427) Ensure ip_address in state.json (from NetworkInfo) is valid

2016-03-04 Thread Sargun Dhillon (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15180686#comment-15180686
 ] 

Sargun Dhillon commented on MESOS-4427:
---

[~jieyu] This is closed, right?

> Ensure ip_address in state.json (from NetworkInfo) is valid
> ---
>
> Key: MESOS-4427
> URL: https://issues.apache.org/jira/browse/MESOS-4427
> Project: Mesos
>  Issue Type: Bug
>Reporter: Sargun Dhillon
>Priority: Critical
>  Labels: mesosphere
>
> We have seen a master state.json that has a field that looks similar to:
> ---REDACTED---
> {code:json}
> {
> "container": {
> "docker": {
> "force_pull_image": false,
> "image": "REDACTED",
> "network": "HOST",
> "privileged": false
> },
> "type": "DOCKER"
> },
> "executor_id": "",
> "framework_id": "9f0e50ea-54b0-44e3-a451-c69e0c1a58fb-",
> "id": "ping-as-a-service.c2d1c17a-be22-11e5-b053-002590e56e25",
> "name": "ping-as-a-service",
> "resources": {
> "cpus": 0.1,
> "disk": 0,
> "mem": 64,
> "ports": "[7907-7907]"
> },
> "slave_id": "9f0e50ea-54b0-44e3-a451-c69e0c1a58fb-S76043",
> "state": "TASK_RUNNING",
> "statuses": [
> {
> "container_status": {
> "network_infos": [
> {
> "ip_address": "",
> "ip_addresses": [
> {
> "ip_address": ""
> }
> ]
> }
> ]
> },
> "labels": [
> {
> "key": "Docker.NetworkSettings.IPAddress",
> "value": ""
> }
> ],
> "state": "TASK_RUNNING",
> "timestamp": 1453149270.95511
> }
> ]
> }
> {code}
> ---REDACTED---
> This is invalid, and mesos-core should filter it. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4427) Ensure ip_address in state.json (from NetworkInfo) is valid

2016-03-04 Thread Sargun Dhillon (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15180685#comment-15180685
 ] 

Sargun Dhillon commented on MESOS-4427:
---

No.

> Ensure ip_address in state.json (from NetworkInfo) is valid
> ---
>
> Key: MESOS-4427
> URL: https://issues.apache.org/jira/browse/MESOS-4427
> Project: Mesos
>  Issue Type: Bug
>Reporter: Sargun Dhillon
>Priority: Critical
>  Labels: mesosphere
>
> We have seen a master state.json that has a field that looks similar to:
> ---REDACTED---
> {code:json}
> {
> "container": {
> "docker": {
> "force_pull_image": false,
> "image": "REDACTED",
> "network": "HOST",
> "privileged": false
> },
> "type": "DOCKER"
> },
> "executor_id": "",
> "framework_id": "9f0e50ea-54b0-44e3-a451-c69e0c1a58fb-",
> "id": "ping-as-a-service.c2d1c17a-be22-11e5-b053-002590e56e25",
> "name": "ping-as-a-service",
> "resources": {
> "cpus": 0.1,
> "disk": 0,
> "mem": 64,
> "ports": "[7907-7907]"
> },
> "slave_id": "9f0e50ea-54b0-44e3-a451-c69e0c1a58fb-S76043",
> "state": "TASK_RUNNING",
> "statuses": [
> {
> "container_status": {
> "network_infos": [
> {
> "ip_address": "",
> "ip_addresses": [
> {
> "ip_address": ""
> }
> ]
> }
> ]
> },
> "labels": [
> {
> "key": "Docker.NetworkSettings.IPAddress",
> "value": ""
> }
> ],
> "state": "TASK_RUNNING",
> "timestamp": 1453149270.95511
> }
> ]
> }
> {code}
> ---REDACTED---
> This is invalid, and mesos-core should filter it. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4738) Expose egress bandwidth as a resource

2016-02-24 Thread Sargun Dhillon (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15163473#comment-15163473
 ] 

Sargun Dhillon commented on MESOS-4738:
---

Yeah, we're using DRR for egress load balancing at the moment with our security 
stuff.

> Expose egress bandwidth as a resource
> -
>
> Key: MESOS-4738
> URL: https://issues.apache.org/jira/browse/MESOS-4738
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Sargun Dhillon
>Priority: Minor
>  Labels: mesosphere
>
> Some of our users care about variable network isolation. Although we cannot 
> fundamentally limit ingress network bandwidth, having it as a resource, so 
> that we can drop packets above a specific limit, would be attractive. 
> It would be nice to expose egress and ingress bandwidth as an agent resource, 
> perhaps with a default of 10,000 Mbps, and allow people to adjust it as 
> needed. Alternatively, a more advanced design would involve generating 
> heuristics based on an analysis of the network MII / PHY. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4752) Expose ingress bandwidth as a resource

2016-02-23 Thread Sargun Dhillon (JIRA)
Sargun Dhillon created MESOS-4752:
-

 Summary: Expose ingress bandwidth as a resource
 Key: MESOS-4752
 URL: https://issues.apache.org/jira/browse/MESOS-4752
 Project: Mesos
  Issue Type: Improvement
Reporter: Sargun Dhillon
Priority: Minor






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4738) Expose egress bandwidth as a resource

2016-02-23 Thread Sargun Dhillon (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159756#comment-15159756
 ] 

Sargun Dhillon commented on MESOS-4738:
---

We can shape ingress bandwidth. We drop packets beyond a certain rate, and mark 
ECN once a flow is within 80% of the rate at which it would be dropped.

> Expose egress bandwidth as a resource
> -
>
> Key: MESOS-4738
> URL: https://issues.apache.org/jira/browse/MESOS-4738
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Sargun Dhillon
>Priority: Minor
>  Labels: mesosphere
>
> Some of our users care about variable network isolation. Although we cannot 
> fundamentally limit ingress network bandwidth, having it as a resource, so 
> that we can drop packets above a specific limit, would be attractive. 
> It would be nice to expose egress and ingress bandwidth as an agent resource, 
> perhaps with a default of 10,000 Mbps, and allow people to adjust it as 
> needed. Alternatively, a more advanced design would involve generating 
> heuristics based on an analysis of the network MII / PHY. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4738) Make ingress and egress bandwidth a resource

2016-02-22 Thread Sargun Dhillon (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157981#comment-15157981
 ] 

Sargun Dhillon commented on MESOS-4738:
---

We could build something to determine the bandwidth of the machine, or inspect 
the kinds of NICs on the machine to model bandwidth. I would say we should have 
a plan for implementing this resource, because it is in demand.

> Make ingress and egress bandwidth a resource
> 
>
> Key: MESOS-4738
> URL: https://issues.apache.org/jira/browse/MESOS-4738
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Sargun Dhillon
>Priority: Minor
>  Labels: mesosphere
>
> Some of our users care about variable network isolation. Although we cannot 
> fundamentally limit ingress network bandwidth, having it as a resource, so 
> that we can drop packets above a specific limit, would be attractive. 
> It would be nice to expose egress and ingress bandwidth as an agent resource, 
> perhaps with a default of 10,000 Mbps, and allow people to adjust it as 
> needed. Alternatively, a more advanced design would involve generating 
> heuristics based on an analysis of the network MII / PHY. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4738) Make ingress and egress bandwidth a resource

2016-02-22 Thread Sargun Dhillon (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sargun Dhillon updated MESOS-4738:
--
Description: 
Some of our users care about variable network isolation. Although we cannot 
fundamentally limit ingress network bandwidth, having it as a resource, so that 
we can drop packets above a specific limit, would be attractive. 

It would be nice to expose egress and ingress bandwidth as an agent resource, 
perhaps with a default of 10,000 Mbps, and allow people to adjust it as needed. 
Alternatively, a more advanced design would involve generating heuristics based 
on an analysis of the network MII / PHY. 
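
As an operational sketch (an assumption for illustration, not something this 
ticket specifies), a custom scalar resource can already be advertised through 
the agent's --resources flag and would then be accounted for like any other 
resource:

{code}
# Hypothetical agent invocation advertising 10,000 (Mbps) of "bandwidth"
# alongside the built-in resources.
mesos-slave --resources="cpus:32;mem:244000;bandwidth:10000"
{code}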



  was:
Some of our users care about variable network isolation. Although we cannot 
fundamentally limit ingress network bandwidth, having it as a resource, so that 
we can drop packets above a specific limit, would be attractive. 




> Make ingress and egress bandwidth a resource
> 
>
> Key: MESOS-4738
> URL: https://issues.apache.org/jira/browse/MESOS-4738
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Sargun Dhillon
>Priority: Minor
>  Labels: mesosphere
>
> Some of our users care about variable network isolation. Although we cannot 
> fundamentally limit ingress network bandwidth, having it as a resource, so 
> that we can drop packets above a specific limit, would be attractive. 
> It would be nice to expose egress and ingress bandwidth as an agent resource, 
> perhaps with a default of 10,000 Mbps, and allow people to adjust it as 
> needed. Alternatively, a more advanced design would involve generating 
> heuristics based on an analysis of the network MII / PHY. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4738) Make ingress and egress bandwidth a resource

2016-02-22 Thread Sargun Dhillon (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sargun Dhillon updated MESOS-4738:
--
Labels: mesosphere  (was: )

> Make ingress and egress bandwidth a resource
> 
>
> Key: MESOS-4738
> URL: https://issues.apache.org/jira/browse/MESOS-4738
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Sargun Dhillon
>Priority: Minor
>  Labels: mesosphere
>
> Some of our users care about variable network isolation. Although we cannot 
> fundamentally limit ingress network bandwidth, having it as a resource, so 
> that we can drop packets above a specific limit, would be attractive. 
> It would be nice to expose egress and ingress bandwidth as an agent resource, 
> perhaps with a default of 10,000 Mbps, and allow people to adjust it as 
> needed. Alternatively, a more advanced design would involve generating 
> heuristics based on an analysis of the network MII / PHY. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4738) Make ingress and egress bandwidth a resource

2016-02-22 Thread Sargun Dhillon (JIRA)
Sargun Dhillon created MESOS-4738:
-

 Summary: Make ingress and egress bandwidth a resource
 Key: MESOS-4738
 URL: https://issues.apache.org/jira/browse/MESOS-4738
 Project: Mesos
  Issue Type: Improvement
Reporter: Sargun Dhillon
Priority: Minor


Some of our users care about variable network isolation. Although we cannot 
fundamentally limit ingress network bandwidth, having it as a resource, so that 
we can drop packets above a specific limit, would be attractive. 





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4710) Add comment about labels caveats to mesos.proto

2016-02-18 Thread Sargun Dhillon (JIRA)
Sargun Dhillon created MESOS-4710:
-

 Summary: Add comment about labels caveats to mesos.proto
 Key: MESOS-4710
 URL: https://issues.apache.org/jira/browse/MESOS-4710
 Project: Mesos
  Issue Type: Improvement
Reporter: Sargun Dhillon
Priority: Trivial


Right now, there is a Labels message in mesos.proto. This message may contain 
duplicate entries for a given label key, but doing so is not recommended, 
because some software treats this data structure as the basis for a key-value 
store. 

There might be some value in documenting the potential downsides of repeated 
label keys, and suggesting that users ensure each label key inside a repeated 
labels field is unique.
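
A small illustration of the caveat, using the JSON rendering of a Labels 
message; the collapsing behaviour shown is what naive key-value consumers 
typically do, not anything Mesos itself does:

{code}
labels = [
    {"key": "tier", "value": "frontend"},
    {"key": "tier", "value": "batch"},   # duplicate key: legal, but risky
]

# Naive consumers collapse the repeated field into a dict, silently keeping
# only the last value for a duplicated key.
as_dict = {label["key"]: label["value"] for label in labels}
assert as_dict == {"tier": "batch"}      # the "frontend" entry is lost
{code}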



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4427) Ensure ip_address in state.json (from NetworkInfo) is valid

2016-01-19 Thread Sargun Dhillon (JIRA)
Sargun Dhillon created MESOS-4427:
-

 Summary: Ensure ip_address in state.json (from NetworkInfo) is 
valid
 Key: MESOS-4427
 URL: https://issues.apache.org/jira/browse/MESOS-4427
 Project: Mesos
  Issue Type: Bug
Reporter: Sargun Dhillon
Priority: Critical


We have seen a master state.json that has a field that looks similar to:
```
---REDACTED---
{
"container": {
"docker": {
"force_pull_image": false,
"image": "REDACTED",
"network": "HOST",
"privileged": false
},
"type": "DOCKER"
},
"executor_id": "",
"framework_id": "9f0e50ea-54b0-44e3-a451-c69e0c1a58fb-",
"id": "ping-as-a-service.c2d1c17a-be22-11e5-b053-002590e56e25",
"name": "ping-as-a-service",
"resources": {
"cpus": 0.1,
"disk": 0,
"mem": 64,
"ports": "[7907-7907]"
},
"slave_id": "9f0e50ea-54b0-44e3-a451-c69e0c1a58fb-S76043",
"state": "TASK_RUNNING",
"statuses": [
{
"container_status": {
"network_infos": [
{
"ip_address": "",
"ip_addresses": [
{
"ip_address": ""
}
]
}
]
},
"labels": [
{
"key": "Docker.NetworkSettings.IPAddress",
"value": ""
}
],
"state": "TASK_RUNNING",
"timestamp": 1453149270.95511
}
]
}
---REDACTED---
```

This is invalid, and mesos-core should filter it. 
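
A minimal sketch of the filtering this ticket asks for, operating on the JSON 
shape above; the validation helper is illustrative, not proposed Mesos code:

{code}
import socket

def valid_ip(value):
    """True only for a non-empty, well-formed IPv4 or IPv6 address."""
    for family in (socket.AF_INET, socket.AF_INET6):
        try:
            socket.inet_pton(family, value)
            return True
        except (OSError, ValueError):
            pass
    return False

def filter_network_infos(network_infos):
    """Drop empty or malformed ip_address entries before exposing them."""
    for info in network_infos:
        info["ip_addresses"] = [
            addr for addr in info.get("ip_addresses", [])
            if valid_ip(addr.get("ip_address", ""))
        ]
        if not valid_ip(info.get("ip_address", "")):
            info.pop("ip_address", None)
    return [i for i in network_infos if i.get("ip_addresses") or "ip_address" in i]
{code}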



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4427) Ensure ip_address in state.json (from NetworkInfo) is valid

2016-01-19 Thread Sargun Dhillon (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sargun Dhillon updated MESOS-4427:
--
Description: 
We have seen a master state.json that has a field that looks similar to:


---REDACTED---
{code:json}
{
"container": {
"docker": {
"force_pull_image": false,
"image": "REDACTED",
"network": "HOST",
"privileged": false
},
"type": "DOCKER"
},
"executor_id": "",
"framework_id": "9f0e50ea-54b0-44e3-a451-c69e0c1a58fb-",
"id": "ping-as-a-service.c2d1c17a-be22-11e5-b053-002590e56e25",
"name": "ping-as-a-service",
"resources": {
"cpus": 0.1,
"disk": 0,
"mem": 64,
"ports": "[7907-7907]"
},
"slave_id": "9f0e50ea-54b0-44e3-a451-c69e0c1a58fb-S76043",
"state": "TASK_RUNNING",
"statuses": [
{
"container_status": {
"network_infos": [
{
"ip_address": "",
"ip_addresses": [
{
"ip_address": ""
}
]
}
]
},
"labels": [
{
"key": "Docker.NetworkSettings.IPAddress",
"value": ""
}
],
"state": "TASK_RUNNING",
"timestamp": 1453149270.95511
}
]
}
{code}
---REDACTED---


This is invalid, and mesos-core should filter it. 

  was:
We have seen a master state.json that has a field that looks similar to:

```
---REDACTED---
{
"container": {
"docker": {
"force_pull_image": false,
"image": "REDACTED",
"network": "HOST",
"privileged": false
},
"type": "DOCKER"
},
"executor_id": "",
"framework_id": "9f0e50ea-54b0-44e3-a451-c69e0c1a58fb-",
"id": "ping-as-a-service.c2d1c17a-be22-11e5-b053-002590e56e25",
"name": "ping-as-a-service",
"resources": {
"cpus": 0.1,
"disk": 0,
"mem": 64,
"ports": "[7907-7907]"
},
"slave_id": "9f0e50ea-54b0-44e3-a451-c69e0c1a58fb-S76043",
"state": "TASK_RUNNING",
"statuses": [
{
"container_status": {
"network_infos": [
{
"ip_address": "",
"ip_addresses": [
{
"ip_address": ""
}
]
}
]
},
"labels": [
{
"key": "Docker.NetworkSettings.IPAddress",
"value": ""
}
],
"state": "TASK_RUNNING",
"timestamp": 1453149270.95511
}
]
}
---REDACTED---
```

This is invalid, and mesos-core should filter it. 


> Ensure ip_address in state.json (from NetworkInfo) is valid
> ---
>
> Key: MESOS-4427
> URL: https://issues.apache.org/jira/browse/MESOS-4427
> Project: Mesos
>  Issue Type: Bug
>Reporter: Sargun Dhillon
>Priority: Critical
>  Labels: mesosphere
>
> We have seen a master state.json that has a field that looks similar to:
> ---REDACTED---
> {code:json}
> {
> "container": {
> "docker": {
> "force_pull_image": false,
> "image": "REDACTED",
> "network": "HOST",
> "privileged": false
> },
> "type": "DOCKER"
> },
> "executor_id": "",
> "framework_id": "9f0e50ea-54b0-44e3-a451-c69e0c1a58fb-",
> "id": "ping-as-a-service.c2d1c17a-be22-11e5-b053-002590e56e25",
> "name": "ping-as-a-service",
> "resources": {
> "cpus": 0.1,
> "disk": 0,
> "mem": 64,
> "ports": "[7907-7907]"
> },
> "slave_id": "9f0e50ea-54b0-44e3-a451-c69e0c1a58fb-S76043",
> "state": "TASK_RUNNING",
> "statuses": [
> {
> "container_status": {
> "network_infos": [
> {
> "ip_address": "",
> "ip_addresses": [
> {
> "ip_address": ""
> }
> ]
> }
> ]
> },
> "labels": [
> {
> "key": "Docker.NetworkSettings.IPAddress",
> "value": ""
> }
> ],
> "state": "TASK_RUNNING",
> 

[jira] [Created] (MESOS-4356) Expose Slave IP in state.json

2016-01-13 Thread Sargun Dhillon (JIRA)
Sargun Dhillon created MESOS-4356:
-

 Summary: Expose Slave IP in state.json
 Key: MESOS-4356
 URL: https://issues.apache.org/jira/browse/MESOS-4356
 Project: Mesos
  Issue Type: Improvement
Reporter: Sargun Dhillon
Priority: Minor


Right now, we expose the slave hostname in state.json. Unfortunately, there are 
many environments where DNS does not work. The slave's libprocess PID is in 
state.json as well, and we (Mesos-DNS, and others) are forced to parse it to 
recover the IP. That's less than optimal. If the slave's IP were in state.json 
directly, it would make our lives easier.
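
For reference, a sketch of the parsing that consumers currently have to do, 
assuming the usual libprocess PID form "slave(1)@10.1.2.3:5051" as it appears 
in the master's state.json; the regex and helper are illustrative:

{code}
import re

PID_RE = re.compile(r"^[^@]+@(?P<ip>[^:]+):(?P<port>\d+)$")

def slave_ip_from_pid(pid):
    """Extract the IP from a libprocess PID such as 'slave(1)@10.1.2.3:5051'."""
    match = PID_RE.match(pid)
    if match is None:
        raise ValueError("unrecognized libprocess PID: %r" % pid)
    return match.group("ip")

assert slave_ip_from_pid("slave(1)@10.1.2.3:5051") == "10.1.2.3"
{code}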



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4120) Make DiscoveryInfo dynamically updatable

2016-01-12 Thread Sargun Dhillon (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15095716#comment-15095716
 ] 

Sargun Dhillon commented on MESOS-4120:
---

Sorry, no. It can align with a future release. It is primarily to
support dynamically updated VIPs with systems like K8s, and it's
required for us to reach parity with their label-oriented load
balancing.



Sent from my iPhone



> Make DiscoveryInfo dynamically updatable
> 
>
> Key: MESOS-4120
> URL: https://issues.apache.org/jira/browse/MESOS-4120
> Project: Mesos
>  Issue Type: Improvement
>Affects Versions: 0.25.0, 0.26.0
>Reporter: Sargun Dhillon
>Priority: Critical
>  Labels: mesosphere
>
> K8s tasks can dynamically update what they expose to the cluster for 
> discovery. Unfortunately, all DiscoveryInfo in the cluster is immutable at 
> the time of task start. 
> We would like to enable DiscoveryInfo to be dynamically updatable, so that 
> executors can change what they're advertising based on their internal state, 
> versus requiring DiscoveryInfo to be known prior to starting the tasks. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4120) Make DiscoveryInfo dynamically updatable

2016-01-12 Thread Sargun Dhillon (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15095698#comment-15095698
 ] 

Sargun Dhillon commented on MESOS-4120:
---

No. We are restrategizing and punting on this.

Sent from my iPhone



> Make DiscoveryInfo dynamically updatable
> 
>
> Key: MESOS-4120
> URL: https://issues.apache.org/jira/browse/MESOS-4120
> Project: Mesos
>  Issue Type: Improvement
>Affects Versions: 0.25.0, 0.26.0
>Reporter: Sargun Dhillon
>Priority: Critical
>  Labels: mesosphere
>
> K8s tasks can dynamically update what they expose to the cluster for 
> discovery. Unfortunately, all DiscoveryInfo in the cluster is immutable at 
> the time of task start. 
> We would like to enable DiscoveryInfo to be dynamically updatable, so that 
> executors can change what they're advertising based on their internal state, 
> versus requiring DiscoveryInfo to be known prior to starting the tasks. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4114) Add field VIP to message Port

2016-01-06 Thread Sargun Dhillon (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085845#comment-15085845
 ] 

Sargun Dhillon commented on MESOS-4114:
---

I think we should just use a label on ports. To expose a VIP, a port can have a 
label named vip whose value uses standard ip:port notation, e.g. 1.2.3.4:4000 
or [::1]:80. Let's get rid of the VIPInfo field; I think it's inflexible and 
complicated. 
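
A sketch of how that could be rendered in a task's DiscoveryInfo in state.json. 
The nesting below is an assumption based on how other Labels fields render, and 
the VIP values are just examples:

{code:json}
{
  "discovery": {
    "name": "ping-as-a-service",
    "ports": {
      "ports": [
        {
          "number": 7907,
          "protocol": "tcp",
          "labels": {
            "labels": [
              {"key": "VIP", "value": "1.2.3.4:4000"},
              {"key": "VIP", "value": "[::1]:80"}
            ]
          }
        }
      ]
    }
  }
}
{code}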

> Add field VIP to message Port
> -
>
> Key: MESOS-4114
> URL: https://issues.apache.org/jira/browse/MESOS-4114
> Project: Mesos
>  Issue Type: Wish
>Reporter: Sargun Dhillon
>Assignee: Avinash Sridharan
>Priority: Trivial
>  Labels: mesosphere
> Fix For: 0.27.0
>
>
> We would like to extend the Mesos protocol buffer 'Port' to include an 
> optional repeated string named "VIP" - to map it to a well known virtual IP, 
> or virtual hostname for discovery purposes.
> We also want this field exposed in DiscoveryInfo in state.json.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (MESOS-4114) Add field VIP to message Port

2016-01-06 Thread Sargun Dhillon (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085845#comment-15085845
 ] 

Sargun Dhillon edited comment on MESOS-4114 at 1/6/16 5:10 PM:
---

I think we should just use a label on ports. To expose a VIP, a port can have a 
label named vip whose value uses standard ip:port notation, e.g. 1.2.3.4:4000 
or [::1]:80. Let's get rid of the VIPInfo field; I think it's inflexible and 
complicated. 

Of course this info should be exposed via state.json as well. 


was (Author: sargun):
I think we should just use a label on ports. To expose a VIP, a port can have a 
label named vip whose value uses standard ip:port notation, e.g. 1.2.3.4:4000 
or [::1]:80. Let's get rid of the VIPInfo field; I think it's inflexible and 
complicated. 

> Add field VIP to message Port
> -
>
> Key: MESOS-4114
> URL: https://issues.apache.org/jira/browse/MESOS-4114
> Project: Mesos
>  Issue Type: Wish
>Reporter: Sargun Dhillon
>Assignee: Avinash Sridharan
>Priority: Trivial
>  Labels: mesosphere
> Fix For: 0.27.0
>
>
> We would like to extend the Mesos protocol buffer 'Port' to include an 
> optional repeated string named "VIP" - to map it to a well known virtual IP, 
> or virtual hostname for discovery purposes.
> We also want this field exposed in DiscoveryInfo in state.json.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (MESOS-4114) Add field VIP to message Port

2016-01-06 Thread Sargun Dhillon (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085845#comment-15085845
 ] 

Sargun Dhillon edited comment on MESOS-4114 at 1/6/16 5:23 PM:
---

I think we should just use a label on ports. To expose a VIP, a port can have a 
label named vip whose value uses standard ip:port notation, e.g. 1.2.3.4:4000 
or [::1]:80. Let's get rid of the VIPInfo field; I think it's inflexible and 
complicated. The label can be repeated: if a user wants to expose a specific 
port under multiple VIPs, they just add two labels, both with key VIP and with 
different VIP values.

Of course this info should be exposed via state.json as well. 


was (Author: sargun):
I think we should just use a label on ports. To expose a VIP, a port can have a 
label named vip whose value uses standard ip:port notation, e.g. 1.2.3.4:4000 
or [::1]:80. Let's get rid of the VIPInfo field; I think it's inflexible and 
complicated. 

Of course this info should be exposed via state.json as well. 

> Add field VIP to message Port
> -
>
> Key: MESOS-4114
> URL: https://issues.apache.org/jira/browse/MESOS-4114
> Project: Mesos
>  Issue Type: Wish
>Reporter: Sargun Dhillon
>Assignee: Avinash Sridharan
>Priority: Trivial
>  Labels: mesosphere
> Fix For: 0.27.0
>
>
> We would like to extend the Mesos protocol buffer 'Port' to include an 
> optional repeated string named "VIP" - to map it to a well known virtual IP, 
> or virtual hostname for discovery purposes.
> We also want this field exposed in DiscoveryInfo in state.json.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4286) Expose state(.json) as a structured protobuf

2016-01-04 Thread Sargun Dhillon (JIRA)
Sargun Dhillon created MESOS-4286:
-

 Summary: Expose state(.json) as a structured protobuf
 Key: MESOS-4286
 URL: https://issues.apache.org/jira/browse/MESOS-4286
 Project: Mesos
  Issue Type: Wish
Reporter: Sargun Dhillon
Priority: Minor


State.json, on both the agent and the master, exposes information about the 
current state of the Mesos runtime. This information is very valuable to 
external users such as Mesos-DNS. Unfortunately, working with state.json can at 
times become cumbersome in languages where dealing with JSON isn't a 
first-class construct. 

Fortunately, protocol buffers exist. If state.json were exposed as a protocol 
buffer, it would make the lives of software authors in the Mesos ecosystem 
significantly easier. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4286) Expose state(.json) as a structured protobuf

2016-01-04 Thread Sargun Dhillon (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sargun Dhillon updated MESOS-4286:
--
Labels: mesosphere  (was: )

> Expose state(.json) as a structured protobuf
> 
>
> Key: MESOS-4286
> URL: https://issues.apache.org/jira/browse/MESOS-4286
> Project: Mesos
>  Issue Type: Wish
>Reporter: Sargun Dhillon
>Priority: Minor
>  Labels: mesosphere
>
> State.json, on both the agent and the master, exposes information about the 
> current state of the Mesos runtime. This information is very valuable to 
> external users such as Mesos-DNS. Unfortunately, working with state.json can 
> at times become cumbersome in languages where dealing with JSON isn't a 
> first-class construct. 
> Fortunately, protocol buffers exist. If state.json were exposed as a protocol 
> buffer, it would make the lives of software authors in the Mesos ecosystem 
> significantly easier. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4113) Docker Executor should not set container IP during bridged mode

2015-12-28 Thread Sargun Dhillon (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15073065#comment-15073065
 ] 

Sargun Dhillon commented on MESOS-4113:
---

MESOS-4064 fundamentally solves the problem that this ticket raises, but it 
requires that every program that needs to determine the IP of a task implement 
some logic that, although trivial, can be error-prone. I'd rather Mesos 
implement this than have it implemented separately in each of the quarter-dozen 
pieces of software that need this information. 

> Docker Executor should not set container IP during bridged mode
> ---
>
> Key: MESOS-4113
> URL: https://issues.apache.org/jira/browse/MESOS-4113
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 0.25.0, 0.26.0
>Reporter: Sargun Dhillon
>Assignee: Artem Harutyunyan
>  Labels: mesosphere
>
> The docker executor currently sets the IP address of the container into 
> ContainerStatus.NetworkInfo.IPAddresses. This isn't a good thing, because 
> during bridged-mode execution it makes that IP address useless, since it's 
> behind the Docker NAT. I would like a flag that disables filling in the IP 
> address and allows it to fall back to the agent IP. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4113) Docker Executor should not set container IP during bridged mode

2015-12-28 Thread Sargun Dhillon (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15073008#comment-15073008
 ] 

Sargun Dhillon commented on MESOS-4113:
---

The libprocess IP of the agent, not of the executor. That'll be the 
externally-accessible IP for the task, assuming Docker NAT.

> Docker Executor should not set container IP during bridged mode
> ---
>
> Key: MESOS-4113
> URL: https://issues.apache.org/jira/browse/MESOS-4113
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 0.25.0, 0.26.0
>Reporter: Sargun Dhillon
>Assignee: Artem Harutyunyan
>  Labels: mesosphere
>
> The docker executor currently sets the IP address of the container into 
> ContainerStatus.NetworkInfo.IPAddresses. This isn't a good thing, because 
> during bridged-mode execution it makes that IP address useless, since it's 
> behind the Docker NAT. I would like a flag that disables filling in the IP 
> address and allows it to fall back to the agent IP. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4113) Docker Executor should not set container IP during bridged mode

2015-12-23 Thread Sargun Dhillon (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15069471#comment-15069471
 ] 

Sargun Dhillon commented on MESOS-4113:
---

The information that's exposed by MESOS-4064 allows for a external program to

> Docker Executor should not set container IP during bridged mode
> ---
>
> Key: MESOS-4113
> URL: https://issues.apache.org/jira/browse/MESOS-4113
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 0.25.0, 0.26.0
>Reporter: Sargun Dhillon
>Assignee: Artem Harutyunyan
>  Labels: mesosphere
>
> The docker executor currently sets the IP address of the container into 
> ContainerStatus.NetworkInfo.IPAddresses. This isn't a good thing, because 
> during bridged-mode execution it makes that IP address useless, since it's 
> behind the Docker NAT. I would like a flag that disables filling in the IP 
> address and allows it to fall back to the agent IP. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (MESOS-4113) Docker Executor should not set container IP during bridged mode

2015-12-23 Thread Sargun Dhillon (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15069471#comment-15069471
 ] 

Sargun Dhillon edited comment on MESOS-4113 at 12/23/15 10:22 AM:
--

The information that's exposed by MESOS-4064 allows an external program to 
analyze state.json and determine which IP to use. Specifically, it checks 
whether the task / executor has a Docker container in bridged mode; if so, it 
uses the slaveID field to look up the relevant slave, and then parses that 
slave's PID. Currently, Minuteman and Mesos-DNS both do this.

I believe we should have another NetworkInfos field that holds the definitive 
IPs that external users can contact in order to connect to the task, because 
NetworkInfos as they are today are effectively useless, due to the behaviour 
under Docker containers.
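
A condensed sketch of that client-side logic, operating on the master's 
state.json; the field names follow state.json, but the helper is an 
illustration of what Mesos-DNS and Minuteman do, not their actual code:

{code}
import re

def task_ip(task, slaves_by_id):
    """Pick a reachable IP for a task, falling back to the agent's IP when the
    task runs in a Docker container using bridged networking."""
    docker = task.get("container", {}).get("docker", {})
    if docker.get("network") == "BRIDGE":
        # The container IP is behind the Docker NAT; use the agent's IP,
        # recovered from its libprocess PID, e.g. 'slave(1)@10.1.2.3:5051'.
        pid = slaves_by_id[task["slave_id"]]["pid"]
        return re.match(r"^[^@]+@([^:]+):\d+$", pid).group(1)
    # Otherwise use the IP reported in the task's NetworkInfo, if any
    # (ordering caveats for statuses are discussed in MESOS-4884).
    for status in reversed(task.get("statuses", [])):
        for info in status.get("container_status", {}).get("network_infos", []):
            if info.get("ip_address"):
                return info["ip_address"]
    return None
{code}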


was (Author: sargun):
The information that's exposed by MESOS-4064 allows for a external program to

> Docker Executor should not set container IP during bridged mode
> ---
>
> Key: MESOS-4113
> URL: https://issues.apache.org/jira/browse/MESOS-4113
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 0.25.0, 0.26.0
>Reporter: Sargun Dhillon
>Assignee: Artem Harutyunyan
>  Labels: mesosphere
>
> The docker executor currently sets the IP address of the container into 
> ContainerStatus.NetworkInfo.IPAddresses. This isn't a good thing, because 
> during bridged-mode execution it makes that IP address useless, since it's 
> behind the Docker NAT. I would like a flag that disables filling in the IP 
> address and allows it to fall back to the agent IP. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (MESOS-4113) Docker Executor should not set container IP during bridged mode

2015-12-23 Thread Sargun Dhillon (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15069471#comment-15069471
 ] 

Sargun Dhillon edited comment on MESOS-4113 at 12/23/15 10:24 AM:
--

The information that's exposed by MESOS-4064 allows an external program to 
analyze state.json and determine which IP to use. Specifically, it checks 
whether the task / executor has a Docker container in bridged mode; if so, it 
uses the slaveID field to look up the relevant slave, and then parses that 
slave's PID. Currently, Minuteman and Mesos-DNS both do this.

I believe we should have another NetworkInfos field that holds the definitive 
IPs that external users can contact in order to connect to the task, because 
NetworkInfos as they are today are effectively useless, due to the behaviour 
under Docker containers.

CC: [~jieyu] [~avin...@mesosphere.io]


was (Author: sargun):
The information that's exposed by MESOS-4064 allows an external program to 
analyze state.json and determine which IP to use. Specifically, it checks 
whether the task / executor has a Docker container in bridged mode; if so, it 
uses the slaveID field to look up the relevant slave, and then parses that 
slave's PID. Currently, Minuteman and Mesos-DNS both do this.

I believe we should have another NetworkInfos field that holds the definitive 
IPs that external users can contact in order to connect to the task, because 
NetworkInfos as they are today are effectively useless, due to the behaviour 
under Docker containers.

> Docker Executor should not set container IP during bridged mode
> ---
>
> Key: MESOS-4113
> URL: https://issues.apache.org/jira/browse/MESOS-4113
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 0.25.0, 0.26.0
>Reporter: Sargun Dhillon
>Assignee: Artem Harutyunyan
>  Labels: mesosphere
>
> The docker executor currently sets the IP address of the container into 
> ContainerStatus.NetworkInfo.IPAddresses. This isn't a good thing, because 
> during bridged-mode execution it makes that IP address useless, since it's 
> behind the Docker NAT. I would like a flag that disables filling in the IP 
> address and allows it to fall back to the agent IP. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4113) Docker Executor should not set container IP during bridged mode

2015-12-23 Thread Sargun Dhillon (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15070050#comment-15070050
 ] 

Sargun Dhillon commented on MESOS-4113:
---

Could Mesos just use the libprocess IP by default? Re: the ports, these could be 
ascertained using DiscoveryInfo or by looking at the ports resource.
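
For the ports half of that suggestion, a small sketch of expanding the ports 
resource string as it is rendered in state.json (e.g. "[7907-7907]", as shown 
earlier in this thread); the parser itself is illustrative:

{code}
def ports_from_resource(ranges):
    """Expand a ports resource string such as '[7907-7907, 8000-8002]' into a
    flat list of port numbers."""
    ports = []
    for chunk in ranges.strip("[] ").split(","):
        if not chunk.strip():
            continue
        begin, end = (int(part) for part in chunk.split("-"))
        ports.extend(range(begin, end + 1))
    return ports

assert ports_from_resource("[7907-7907]") == [7907]
assert ports_from_resource("[7907-7907, 8000-8002]") == [7907, 8000, 8001, 8002]
{code}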

> Docker Executor should not set container IP during bridged mode
> ---
>
> Key: MESOS-4113
> URL: https://issues.apache.org/jira/browse/MESOS-4113
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 0.25.0, 0.26.0
>Reporter: Sargun Dhillon
>Assignee: Artem Harutyunyan
>  Labels: mesosphere
>
> The docker executor currently sets the IP address of the container into 
> ContainerStatus.NetworkInfo.IPAddresses. This isn't a good thing, because 
> during bridged-mode execution it makes that IP address useless, since it's 
> behind the Docker NAT. I would like a flag that disables filling in the IP 
> address and allows it to fall back to the agent IP. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4113) Docker Executor should not set container IP during bridged mode

2015-12-12 Thread Sargun Dhillon (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15054493#comment-15054493
 ] 

Sargun Dhillon commented on MESOS-4113:
---

The purpose of NetworkInfo, as I see it, is to indicate what IP the task is 
accessible at. It shouldn't ever be set by frameworks. It should instead be 
set by the executors, or isolators.

> Docker Executor should not set container IP during bridged mode
> ---
>
> Key: MESOS-4113
> URL: https://issues.apache.org/jira/browse/MESOS-4113
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 0.25.0, 0.26.0
>Reporter: Sargun Dhillon
>  Labels: mesosphere
>
> The docker executor currently sets the IP address of the container into 
> ContainerStatus.NetworkInfo.IPAddresses. This isn't a good thing: during 
> bridged mode execution that IP address is useless, since it's behind the 
> Docker NAT. I would like a flag that disables filling the IP address in, 
> and allows it to fall back to the agent IP. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4140) Indicate that the task is shutting down on shutdown

2015-12-11 Thread Sargun Dhillon (JIRA)
Sargun Dhillon created MESOS-4140:
-

 Summary: Indicate that the task is shutting down on shutdown
 Key: MESOS-4140
 URL: https://issues.apache.org/jira/browse/MESOS-4140
 Project: Mesos
  Issue Type: Improvement
Reporter: Sargun Dhillon


In the shutdown handler of the default executor, there is a grace period 
between when a SIGTERM is sent and when a SIGKILL is sent. There should be a 
mechanism to expose that the task is being killed. A simple mechanism would be 
to mark the task as unhealthy. 
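
As a rough sketch (not the actual executor code; the status hook and the 
health flag below are stand-ins), the shutdown path could surface the kill 
like this:

{code}
import os
import signal
import time

GRACE_PERIOD = 3.0  # seconds between SIGTERM and SIGKILL

def shutdown(task_pid, send_status):
    """Sketch of a shutdown handler that surfaces the kill before escalating.
    `send_status` stands in for whatever status-update hook the executor has;
    flipping the health flag is one possible way to expose the kill."""
    send_status(state="TASK_RUNNING", healthy=False, message="shutting down")
    os.kill(task_pid, signal.SIGTERM)

    deadline = time.time() + GRACE_PERIOD
    while time.time() < deadline:
        try:
            os.kill(task_pid, 0)   # probe: is the process still around?
        except OSError:
            return                 # it exited within the grace period
        time.sleep(0.1)

    os.kill(task_pid, signal.SIGKILL)  # escalate once the grace period lapses
{code}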



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4139) Make escalationTimeout configurable

2015-12-11 Thread Sargun Dhillon (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sargun Dhillon updated MESOS-4139:
--
Priority: Major  (was: Critical)

> Make escalationTimeout configurable
> ---
>
> Key: MESOS-4139
> URL: https://issues.apache.org/jira/browse/MESOS-4139
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Sargun Dhillon
>  Labels: mesosphere
>
> At the moment, escalationTimeout is fixed at 3 seconds in the code. This 
> means that if a task is shut down, there are only 3 seconds between the 
> SIGTERM and the SIGKILL. For something like a Rails application, that may 
> be too little time to terminate the task gracefully. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4114) Add field VIP to message Port

2015-12-11 Thread Sargun Dhillon (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053110#comment-15053110
 ] 

Sargun Dhillon commented on MESOS-4114:
---

Isn't the port in DiscoveryInfo.Ports.Port the local port (the one that 
Marathon requested from Mesos)? Otherwise, how do you know which DiscoveryInfo 
name correlates with which Mesos port?

Also, you may want to expose different services under different names, or IPs.

> Add field VIP to message Port
> -
>
> Key: MESOS-4114
> URL: https://issues.apache.org/jira/browse/MESOS-4114
> Project: Mesos
>  Issue Type: Wish
>Reporter: Sargun Dhillon
>Assignee: Avinash Sridharan
>Priority: Trivial
>  Labels: mesosphere
>
> We would like to extend the Mesos protocol buffer 'Port' to include an 
> optional repeated string named "VIP" - to map it to a well known virtual IP, 
> or virtual hostname for discovery purposes.
> We also want this field exposed in DiscoveryInfo in state.json.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4139) Make escalationTimeout configurable

2015-12-11 Thread Sargun Dhillon (JIRA)
Sargun Dhillon created MESOS-4139:
-

 Summary: Make escalationTimeout configurable
 Key: MESOS-4139
 URL: https://issues.apache.org/jira/browse/MESOS-4139
 Project: Mesos
  Issue Type: Improvement
Reporter: Sargun Dhillon
Priority: Critical


At the moment, escalationTimeout is fixed at 3 seconds in the code. This means 
that if a task is shut down, there are only 3 seconds between the SIGTERM and 
the SIGKILL. For something like a Rails application, that may be too little 
time to terminate the task gracefully. 
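
For illustration, the escalation window could come from a flag instead of a 
constant (the flag name here is hypothetical):

{code}
import argparse

# Hypothetical flag; today the 3-second value is hard-coded in the executor.
parser = argparse.ArgumentParser()
parser.add_argument("--escalation_timeout", type=float, default=3.0,
                    help="seconds to wait between SIGTERM and SIGKILL")
args = parser.parse_args()

print("will escalate to SIGKILL after %.1f seconds" % args.escalation_timeout)
{code}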



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4120) Make DiscoveryInfo dynamically updatable

2015-12-10 Thread Sargun Dhillon (JIRA)
Sargun Dhillon created MESOS-4120:
-

 Summary: Make DiscoveryInfo dynamically updatable
 Key: MESOS-4120
 URL: https://issues.apache.org/jira/browse/MESOS-4120
 Project: Mesos
  Issue Type: Improvement
Reporter: Sargun Dhillon
Priority: Critical


K8s tasks can dynamically update what they expose for discovery by the 
cluster. Unfortunately, all DiscoveryInfo in the cluster is immutable once the 
task has started.

We would like DiscoveryInfo to be dynamically updatable, so that executors can 
change what they're advertising based on their internal state, rather than 
requiring DiscoveryInfo to be known prior to starting the tasks. 
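
Purely as a sketch of the idea (nothing below exists today; the call and the 
field names are made up):

{code}
# Everything here is hypothetical: there is currently no way for an
# executor to change DiscoveryInfo after the task has been launched.

def update_discovery(driver, task_id, discovery):
    """Imagined driver call that would replace the task's DiscoveryInfo."""
    raise NotImplementedError("proposal only; no such API exists")

new_discovery = {
    "visibility": "FRAMEWORK",
    "name": "my-service",
    "ports": [
        # The executor learned at runtime that it now serves on 31005.
        {"number": 31005, "name": "http", "protocol": "tcp"},
    ],
}

# update_discovery(driver, "task-123", new_discovery)
{code}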



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4114) Add field VIP to message Port

2015-12-10 Thread Sargun Dhillon (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sargun Dhillon updated MESOS-4114:
--
Description: 
We would like to extend the Mesos protocol buffer 'Port' to include an optional 
repeated string named "VIP" - to map it to a well known virtual IP, or virtual 
hostname for discovery purposes.

We also want this field exposed in DiscoveryInfo in state.json.

  was:
We would like to extend the Mesos protocol buffer 'Port' to include an optional 
string named "VIP" - to map it to a well known virtual IP, or virtual 
hostname for discovery purposes.

We also want this field exposed in DiscoveryInfo in state.json.


> Add field VIP to message Port
> -
>
> Key: MESOS-4114
> URL: https://issues.apache.org/jira/browse/MESOS-4114
> Project: Mesos
>  Issue Type: Wish
>Reporter: Sargun Dhillon
>Assignee: Avinash Sridharan
>Priority: Trivial
>  Labels: mesosphere
>
> We would like to extend the Mesos protocol buffer 'Port' to include an 
> optional repeated string named "VIP" - to map it to a well known virtual IP, 
> or virtual hostname for discovery purposes.
> We also want this field exposed in DiscoveryInfo in state.json.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4113) Docker Executor should not set container IP during bridged mode

2015-12-10 Thread Sargun Dhillon (JIRA)
Sargun Dhillon created MESOS-4113:
-

 Summary: Docker Executor should not set container IP during 
bridged mode
 Key: MESOS-4113
 URL: https://issues.apache.org/jira/browse/MESOS-4113
 Project: Mesos
  Issue Type: Bug
  Components: docker
Affects Versions: 0.25.0
Reporter: Sargun Dhillon
Priority: Minor


The docker executor currently sets the IP address of the container into 
ContainerStatus.NetworkInfo.IPAddresses. This isn't a good thing: during 
bridged mode execution that IP address is useless, since it's behind the 
Docker NAT. I would like a flag that disables filling the IP address in, and 
allows it to fall back to the agent IP. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4114) Add field VIP to message Port

2015-12-10 Thread Sargun Dhillon (JIRA)
Sargun Dhillon created MESOS-4114:
-

 Summary: Add field VIP to message Port
 Key: MESOS-4114
 URL: https://issues.apache.org/jira/browse/MESOS-4114
 Project: Mesos
  Issue Type: Wish
Reporter: Sargun Dhillon
Priority: Trivial


We would like to extend the Mesos protocol buffer 'Port' to include an optional 
string named "VIP" - to map it to a well known virtual IP, or virtual 
hostname for discovery purposes.
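
For illustration, the kind of shape we have in mind for a port entry 
(JSON-ish Python; the vip field is the proposal, not an existing field):

{code}
# Proposed shape only; the "vip" field does not exist in Mesos today.
port_with_vip = {
    "number": 31000,
    "name": "http",
    "protocol": "tcp",
    # Well known virtual IPs / virtual hostnames this port should be
    # reachable through, for discovery tooling to consume.
    "vip": ["1.2.3.4:80", "api.internal:80"],
}
{code}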



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4114) Add field VIP to message Port

2015-12-10 Thread Sargun Dhillon (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sargun Dhillon updated MESOS-4114:
--
Description: 
We would like to extend the Mesos protocol buffer 'Port' to include an optional 
string named "VIP" - to map it to a well known virtual IP, or virtual 
hostname for discovery purposes.

We also want this field exposed in DiscoveryInfo in state.json.

  was:We would like to extend the Mesos protocol buffer 'Port' to include an 
optional string named "VIP" - to map it to a well known virtual IP, or 
virtual hostname for discovery purposes.


> Add field VIP to message Port
> -
>
> Key: MESOS-4114
> URL: https://issues.apache.org/jira/browse/MESOS-4114
> Project: Mesos
>  Issue Type: Wish
>Reporter: Sargun Dhillon
>Assignee: Avinash Sridharan
>Priority: Trivial
>  Labels: mesosphere
>
> We would like to extend the Mesos protocol buffer 'Port' to include an 
> optional string named "VIP" - to map it to a well known virtual IP, or 
> virtual hostname for discovery purposes.
> We also want this field exposed in DiscoveryInfo in state.json.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4015) Expose task / executor health in master & slave state.json

2015-11-25 Thread Sargun Dhillon (JIRA)
Sargun Dhillon created MESOS-4015:
-

 Summary: Expose task / executor health in master & slave state.json
 Key: MESOS-4015
 URL: https://issues.apache.org/jira/browse/MESOS-4015
 Project: Mesos
  Issue Type: Improvement
Affects Versions: 0.25.0
Reporter: Sargun Dhillon
Priority: Trivial


Right now, if I specify a health check for a task, the only way to see its 
result is via the TaskStatus updates that come to the framework. Unfortunately, 
this information isn't exposed in state.json on either the slave or the master. 
It would be ideal to have that information available, to enable tools like 
Mesos-DNS to be health-aware.
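
For example, a task entry in state.json could carry the latest health-check 
result (the shape below is only a suggestion):

{code}
# Suggested shape only; state.json does not expose health today.
task_entry = {
    "id": "webapp.3",
    "state": "TASK_RUNNING",
    "statuses": [
        {"state": "TASK_RUNNING",
         "timestamp": 1448000000.0,
         # Result of the most recent health check, if one is defined.
         "healthy": True},
    ],
}
{code}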



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-3962) Add labels to the message Port

2015-11-19 Thread Sargun Dhillon (JIRA)
Sargun Dhillon created MESOS-3962:
-

 Summary: Add labels to the message Port
 Key: MESOS-3962
 URL: https://issues.apache.org/jira/browse/MESOS-3962
 Project: Mesos
  Issue Type: Wish
Reporter: Sargun Dhillon
Priority: Minor


I want to add arbitrary labels to the message "Port". I have a few use cases 
for this:
1) I want to use it to drive isolators to install firewall rules associated 
with the port
2) I want to use it to drive third-party components to be able to specify 
advertising information
3) I want to use this to associate a deterministic virtual hostname with a 
given port

Ideally, once the task is launched, these labels would be immutable.
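
A sketch of what such a labelled port might look like (the keys and values 
here are illustrative only):

{code}
# Illustrative only: labels on Port are the proposal here, and the
# specific keys and values are made up.
port_with_labels = {
    "number": 31080,
    "protocol": "tcp",
    "labels": {"labels": [
        {"key": "firewall", "value": "allow-vpc-only"},       # use case 1
        {"key": "advertise", "value": "public"},              # use case 2
        {"key": "vhost", "value": "web-1.service.internal"},  # use case 3
    ]},
}
{code}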



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3826) Add an optional unique identifier for resource reservations

2015-11-08 Thread Sargun Dhillon (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14995809#comment-14995809
 ] 

Sargun Dhillon commented on MESOS-3826:
---

Questions: 
1. Will Mesos only give resourceOffers to a framework for the given principal 
that it is registered under, or is this further filtering that the framework 
author must do? 
2. Can a given framework only reserve resources with its principal or can it 
use the principals of others?

If a given framework can only operate on one principal at a time, what's the 
point of `ReservationInfo` at all? Can't Mesos implicitly apply the principal 
to all reservations, and task launches? 

> Add an optional unique identifier for resource reservations
> ---
>
> Key: MESOS-3826
> URL: https://issues.apache.org/jira/browse/MESOS-3826
> Project: Mesos
>  Issue Type: Improvement
>  Components: general
>Reporter: Sargun Dhillon
>Assignee: Guangya Liu
>Priority: Minor
>  Labels: mesosphere
>
> Thanks to the resource reservation primitives, frameworks can reserve 
> resources. These reservations are per role, which means multiple frameworks 
> can share reservations. This can get very hairy, as multiple reservations can 
> occur on each agent. 
> It would be nice to be able to optionally, uniquely identify reservations by 
> ID, much like persistent volumes are today. This could be done by adding a 
> new protobuf field, such as Resource.ReservationInfo.id, that if set upon 
> reservation time, would come back when the reservation is advertised.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (MESOS-3826) Add an optional unique identifier for resource reservations

2015-11-08 Thread Sargun Dhillon (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14995809#comment-14995809
 ] 

Sargun Dhillon edited comment on MESOS-3826 at 11/8/15 9:08 PM:


Questions: 
1. Will Mesos only give resourceOffers to a framework for the given principal 
that it is registered under, or is this further filtering that the framework 
author must do? 
2. Can a given framework only reserve resources with its principal or can it 
use the principals of others?

If a given framework can only operate on one principal at a time, what's the 
point of ```ReservationInfo``` at all? Can't Mesos implicitly apply the 
principal to all reservations, and task launches? 


was (Author: sargun):
Questions: 
1. Will Mesos only give resourceOffers to a framework for the given principal 
that it is registered under, or is this further filtering that the framework 
author must do? 
2. Can a given framework only reserve resources with its principal or can it 
use the principals of others?

If a given framework can only operate on one principal at a time, what's the 
point of `ReservationInfo` at all? Can't Mesos implicitly apply the principal 
to all reservations, and task launches? 

> Add an optional unique identifier for resource reservations
> ---
>
> Key: MESOS-3826
> URL: https://issues.apache.org/jira/browse/MESOS-3826
> Project: Mesos
>  Issue Type: Improvement
>  Components: general
>Reporter: Sargun Dhillon
>Assignee: Guangya Liu
>Priority: Minor
>  Labels: mesosphere
>
> Thanks to the resource reservation primitives, frameworks can reserve 
> resources. These reservations are per role, which means multiple frameworks 
> can share reservations. This can get very hairy, as multiple reservations can 
> occur on each agent. 
> It would be nice to be able to optionally, uniquely identify reservations by 
> ID, much like persistent volumes are today. This could be done by adding a 
> new protobuf field, such as Resource.ReservationInfo.id, that if set upon 
> reservation time, would come back when the reservation is advertised.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (MESOS-3826) Add an optional unique identifier for resource reservations

2015-11-08 Thread Sargun Dhillon (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14995809#comment-14995809
 ] 

Sargun Dhillon edited comment on MESOS-3826 at 11/8/15 9:09 PM:


Questions: 
1. Will Mesos only give resourceOffers to a framework for the given principal 
that it is registered under, or is this further filtering that the framework 
author must do? 
2. Can a given framework only reserve resources with its principal or can it 
use the principals of others?

If a given framework can only operate on one principal at a time, what's the 
point of {{ReservationInfo}} at all? Can't Mesos implicitly apply the principal 
to all reservations, and task launches? 


was (Author: sargun):
Questions: 
1. Will Mesos only give resourceOffers to a framework for the given principal 
that it is registered under, or is this further filtering that the framework 
author must do? 
2. Can a given framework only reserve resources with its principal or can it 
use the principals of others?

If a given framework can only operate on one principal at a time, what's the 
point of ```ReservationInfo``` at all? Can't Mesos implicitly apply the 
principal to all reservations, and task launches? 

> Add an optional unique identifier for resource reservations
> ---
>
> Key: MESOS-3826
> URL: https://issues.apache.org/jira/browse/MESOS-3826
> Project: Mesos
>  Issue Type: Improvement
>  Components: general
>Reporter: Sargun Dhillon
>Assignee: Guangya Liu
>Priority: Minor
>  Labels: mesosphere
>
> Thanks to the resource reservation primitives, frameworks can reserve 
> resources. These reservations are per role, which means multiple frameworks 
> can share reservations. This can get very hairy, as multiple reservations can 
> occur on each agent. 
> It would be nice to be able to optionally, uniquely identify reservations by 
> ID, much like persistent volumes are today. This could be done by adding a 
> new protobuf field, such as Resource.ReservationInfo.id, that if set upon 
> reservation time, would come back when the reservation is advertised.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3826) Add an optional unique identifier for resource reservations

2015-11-08 Thread Sargun Dhillon (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14995888#comment-14995888
 ] 

Sargun Dhillon commented on MESOS-3826:
---

The other problem here is idempotence. If I, as framework A, create a 
reservation, and for whatever reason that resource gets offered to framework B, 
which holds it, I am going to time out and think my reservation failed. I need 
an ID that is guaranteed to be unique in order to get some level of 
idempotence, given the current asynchronous nature of reservations.
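
A rough sketch of the idempotent flow such an ID would enable (the "id" field 
under ReservationInfo is the proposal, not an existing field):

{code}
import uuid

# Sketch: the framework tags the RESERVE with its own ID and later
# recognizes the reservation in offers by that ID, instead of guessing
# by resource shape and timing.
reservation_id = str(uuid.uuid4())

reserve_operation = {
    "type": "RESERVE",
    "reserve": {"resources": [{
        "name": "cpus",
        "type": "SCALAR",
        "scalar": {"value": 2.0},
        "role": "analytics",
        "reservation": {
            "principal": "framework-a",
            "id": reservation_id,   # proposed field, does not exist today
        },
    }]},
}

def reservation_confirmed(offer, wanted_id):
    """True if any resource in the offer carries our reservation ID."""
    return any(r.get("reservation", {}).get("id") == wanted_id
               for r in offer.get("resources", []))
{code}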

> Add an optional unique identifier for resource reservations
> ---
>
> Key: MESOS-3826
> URL: https://issues.apache.org/jira/browse/MESOS-3826
> Project: Mesos
>  Issue Type: Improvement
>  Components: general
>Reporter: Sargun Dhillon
>Assignee: Guangya Liu
>Priority: Minor
>  Labels: mesosphere
>
> Thanks to the resource reservation primitives, frameworks can reserve 
> resources. These reservations are per role, which means multiple frameworks 
> can share reservations. This can get very hairy, as multiple reservations can 
> occur on each agent. 
> It would be nice to be able to optionally, uniquely identify reservations by 
> ID, much like persistent volumes are today. This could be done by adding a 
> new protobuf field, such as Resource.ReservationInfo.id, that if set upon 
> reservation time, would come back when the reservation is advertised.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-3853) Expose Dynamic Reservations and Persistent Volumes in Master or Slave state.json

2015-11-08 Thread Sargun Dhillon (JIRA)
Sargun Dhillon created MESOS-3853:
-

 Summary: Expose Dynamic Reservations and Persistent Volumes in 
Master or Slave state.json
 Key: MESOS-3853
 URL: https://issues.apache.org/jira/browse/MESOS-3853
 Project: Mesos
  Issue Type: Improvement
  Components: master, slave
Reporter: Sargun Dhillon
Priority: Minor


Right now dynamic reservations and persistent volumes aren't distinguishable 
in state.json; they are exposed as just generic reservations:
{code}
"reserved_resources": {
  "test": {
"cpus": 0.02,
"disk": 200,
"mem": 0
  }
}
{code}

It would be nice to get information about which resources are dynamically 
reserved, and which principal and role hold those dynamic reservations. For 
volumes, it would also be good to see the principal, the size, and the volume 
info for each volume.
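
Something along these lines would already help (the shape is only a 
suggestion):

{code}
# Suggested shape only; today state.json collapses all of this into the
# flat per-role totals shown above.
reserved_resources_full = {
    "test": [{
        "name": "disk",
        "type": "SCALAR",
        "scalar": {"value": 200},
        "reservation": {"principal": "framework-a"},    # who reserved it
        "disk": {
            "persistence": {"id": "volume-1"},           # persistent volume
            "volume": {"container_path": "data", "mode": "RW"},
        },
    }],
}
{code}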



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3826) Add an optional unique identifier for resource reservations

2015-11-05 Thread Sargun Dhillon (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14992632#comment-14992632
 ] 

Sargun Dhillon commented on MESOS-3826:
---

Yeah. I think that's how roles are supposed to work. That a role would align to 
a department, rather than every framework having their own role. We can change 
that, if we want.

> Add an optional unique identifier for resource reservations
> ---
>
> Key: MESOS-3826
> URL: https://issues.apache.org/jira/browse/MESOS-3826
> Project: Mesos
>  Issue Type: Improvement
>  Components: general
>Reporter: Sargun Dhillon
>Assignee: Guangya Liu
>Priority: Minor
>  Labels: mesosphere
>
> Thanks to the resource reservation primitives, frameworks can reserve 
> resources. These reservations are per role, which means multiple frameworks 
> can share reservations. This can get very hairy, as multiple reservations can 
> occur on each agent. 
> It would be nice to be able to optionally, uniquely identify reservations by 
> ID, much like persistent volumes are today. This could be done by adding a 
> new protobuf field, such as Resource.ReservationInfo.id, that if set upon 
> reservation time, would come back when the reservation is advertised.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-3835) Expose framework principal through state.json

2015-11-05 Thread Sargun Dhillon (JIRA)
Sargun Dhillon created MESOS-3835:
-

 Summary: Expose framework principal through state.json
 Key: MESOS-3835
 URL: https://issues.apache.org/jira/browse/MESOS-3835
 Project: Mesos
  Issue Type: Wish
  Components: master
Reporter: Sargun Dhillon
Priority: Trivial


We would like to expose the framework principal through the master's 
/state.json. This is useful both for debugging (from the operator's 
perspective) and for inspection while creating or modifying ACLs. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-3826) Add an optional unique identifier for resource reservations

2015-11-03 Thread Sargun Dhillon (JIRA)
Sargun Dhillon created MESOS-3826:
-

 Summary: Add an optional unique identifier for resource 
reservations
 Key: MESOS-3826
 URL: https://issues.apache.org/jira/browse/MESOS-3826
 Project: Mesos
  Issue Type: Improvement
  Components: general
Reporter: Sargun Dhillon
Priority: Minor


Thanks to the resource reservation primitives, frameworks can reserve 
resources. These reservations are per role, which means multiple frameworks can 
share reservations. This can get very hairy, as multiple reservations can occur 
on each agent. 

It would be nice to be able to optionally, uniquely identify reservations by 
ID, much like persistent volumes are today. This could be done by adding a new 
protobuf field, such as Resource.ReservationInfo.id, that if set upon 
reservation time, would come back when the reservation is advertised.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-2935) Documentation Needs Clarification about Compressed artifacts

2015-06-25 Thread Sargun Dhillon (JIRA)
Sargun Dhillon created MESOS-2935:
-

 Summary: Documentation Needs Clarification about Compressed 
artifacts
 Key: MESOS-2935
 URL: https://issues.apache.org/jira/browse/MESOS-2935
 Project: Mesos
  Issue Type: Documentation
  Components: fetcher
Reporter: Sargun Dhillon
Priority: Trivial


Compressed artifacts get decompressed with either unzip -d or tar -C $DIR 
-xf 

In addition, only the following file suffixes / extensions result in 
decompression:
- .tgz
- .tar.gz
- .tbz2
- .tar.bz2
- .tar.xz
- .txz
- .zip
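
To illustrate the behaviour described above (a sketch that mirrors the 
documented rules, not the fetcher's actual C++ code):

{code}
import subprocess

# Suffixes handled with tar, per the description above.
TAR_SUFFIXES = (".tgz", ".tar.gz", ".tbz2", ".tar.bz2", ".tar.xz", ".txz")

def extract(artifact, directory):
    """Mimic the fetcher's documented extraction rules for an artifact."""
    if artifact.endswith(".zip"):
        subprocess.check_call(["unzip", "-d", directory, artifact])
    elif artifact.endswith(TAR_SUFFIXES):
        subprocess.check_call(["tar", "-C", directory, "-xf", artifact])
    # Anything else is left untouched.
{code}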





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)