[jira] [Commented] (MESOS-5148) Supporting Container Images in Mesos Containerizer doesn't work by using marathon api

2016-04-07 Thread wangqun (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231661#comment-15231661
 ] 

wangqun commented on MESOS-5148:


[~jieyu] please check it. Thanks.

> Supporting Container Images in Mesos Containerizer doesn't work by using 
> marathon api
> -
>
> Key: MESOS-5148
> URL: https://issues.apache.org/jira/browse/MESOS-5148
> Project: Mesos
>  Issue Type: Bug
>Reporter: wangqun
>
> Hi,
> I am using the Marathon API to create tasks to test "Supporting Container 
> Images in Mesos Containerizer".
> My steps are as follows:
> 1) Run the master process on the master node.
> sudo /usr/sbin/mesos-master --zk=zk://10.0.0.4:2181/mesos --port=5050 
> --log_dir=/var/log/mesos --cluster=mesosbay --hostname=10.0.0.4 --ip=10.0.0.4 
> --quorum=1 --work_dir=/var/lib/mesos
> 2) Run the slave process on the slave node.
> sudo /usr/sbin/mesos-slave --master=zk://10.0.0.4:2181/mesos 
> --log_dir=/var/log/mesos --containerizers=docker,mesos 
> --executor_registration_timeout=5mins --hostname=10.0.0.5 --ip=10.0.0.5 
> --isolation=docker/runtime,filesystem/linux --work_dir=/tmp/mesos/slave 
> --image_providers=docker --executor_environment_variables="{}"
> 3) Create a JSON file that specifies the container to be managed by Mesos.
> sudo touch mesos.json
> sudo vim mesos.json
> {
>   "container": {
>     "type": "MESOS",
>     "mesos": {
>       "image": "library/redis"
>     }
>   },
>   "id": "ubuntumesos",
>   "instances": 1,
>   "cpus": 0.5,
>   "mem": 512,
>   "uris": [],
>   "cmd": "ping 8.8.8.8"
> }
> 4) sudo curl -X POST -H "Content-Type: application/json" 
> localhost:8080/v2/apps -d @mesos.json
> 5) sudo curl http://localhost:8080/v2/tasks
> {"tasks":[{"id":"ubuntumesos.fc1879be-fc9f-11e5-81e0-024294de4967","host":"10.0.0.5","ipAddresses":[],"ports":[31597],"startedAt":"2016-04-07T09:06:24.900Z","stagedAt":"2016-04-07T09:06:16.611Z","version":"2016-04-07T09:06:14.354Z","slaveId":"058fb5a7-9273-4bfa-83bb-8cb091621e19-S1","appId":"/ubuntumesos","servicePorts":[1]}]}
> 6) sudo docker run -ti --net=host redis redis-cli  
> Could not connect to Redis at 127.0.0.1:6379: Connection refused
> not connected> 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-5148) Supporting Container Images in Mesos Containerizer doesn't work by using marathon api

2016-04-07 Thread wangqun (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231613#comment-15231613
 ] 

wangqun commented on MESOS-5148:


[~osallou] please check it. Thanks.

> Supporting Container Images in Mesos Containerizer doesn't work by using 
> marathon api
> -
>
> Key: MESOS-5148
> URL: https://issues.apache.org/jira/browse/MESOS-5148
> Project: Mesos
>  Issue Type: Bug
>Reporter: wangqun





[jira] [Commented] (MESOS-5148) Supporting Container Images in Mesos Containerizer doesn't work by using marathon api

2016-04-07 Thread wangqun (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231608#comment-15231608
 ] 

wangqun commented on MESOS-5148:


[~bmahler] please check it. Thanks.

> Supporting Container Images in Mesos Containerizer doesn't work by using 
> marathon api
> -
>
> Key: MESOS-5148
> URL: https://issues.apache.org/jira/browse/MESOS-5148
> Project: Mesos
>  Issue Type: Bug
>Reporter: wangqun





[jira] [Updated] (MESOS-5148) Supporting Container Images in Mesos Containerizer doesn't work by using marathon api

2016-04-07 Thread wangqun (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangqun updated MESOS-5148:
---
Environment: (was: Hi
I use the marathon api to create tasks to test Supporting Container Images 
in Mesos Containerizer .
My steps is the following:
1) to run the process in master node.
sudo /usr/sbin/mesos-master --zk=zk://10.0.0.4:2181/mesos --port=5050 
--log_dir=/var/log/mesos --cluster=mesosbay --hostname=10.0.0.4 --ip=10.0.0.4 
--quorum=1 --work_dir=/var/lib/mesos
2) to run the process in slave node.
sudo /usr/sbin/mesos-slave --master=zk://10.0.0.4:2181/mesos 
--log_dir=/var/log/mesos --containerizers=docker,mesos 
--executor_registration_timeout=5mins --hostname=10.0.0.5 --ip=10.0.0.5 
--isolation=docker/runtime,filesystem/linux --work_dir=/tmp/mesos/slave 
--image_providers=docker --executor_environment_variables="{}"
3) to create one json file to specify the container to be managed by mesos.
sudo  touch mesos.json
sudo vim  mesos.json
{
  "container": {
"type": "MESOS",
"mesos": {
  "image": "library/redis"
}
  },
  "id": "ubuntumesos",
  "instances": 1,
  "cpus": 0.5,
  "mem": 512,
  "uris": [],
  "cmd": "ping 8.8.8.8"
}
4)sudo curl -X POST -H "Content-Type: application/json" localhost:8080/v2/apps 
-d...@mesos.json
5)sudo  curl http://localhost:8080/v2/tasks
{"tasks":[{"id":"ubuntumesos.fc1879be-fc9f-11e5-81e0-024294de4967","host":"10.0.0.5","ipAddresses":[],"ports":[31597],"startedAt":"2016-04-07T09:06:24.900Z","stagedAt":"2016-04-07T09:06:16.611Z","version":"2016-04-07T09:06:14.354Z","slaveId":"058fb5a7-9273-4bfa-83bb-8cb091621e19-S1","appId":"/ubuntumesos","servicePorts":[1]}]}
6) sudo docker run -ti --net=host redis redis-cli  
Could not connect to Redis at 127.0.0.1:6379: Connection refused
not connected> 
)
Description: 
Hi,
I am using the Marathon API to create tasks to test "Supporting Container 
Images in Mesos Containerizer".
My steps are as follows:
1) Run the master process on the master node.
sudo /usr/sbin/mesos-master --zk=zk://10.0.0.4:2181/mesos --port=5050 
--log_dir=/var/log/mesos --cluster=mesosbay --hostname=10.0.0.4 --ip=10.0.0.4 
--quorum=1 --work_dir=/var/lib/mesos
2) Run the slave process on the slave node.
sudo /usr/sbin/mesos-slave --master=zk://10.0.0.4:2181/mesos 
--log_dir=/var/log/mesos --containerizers=docker,mesos 
--executor_registration_timeout=5mins --hostname=10.0.0.5 --ip=10.0.0.5 
--isolation=docker/runtime,filesystem/linux --work_dir=/tmp/mesos/slave 
--image_providers=docker --executor_environment_variables="{}"
3) Create a JSON file that specifies the container to be managed by Mesos.
sudo touch mesos.json
sudo vim mesos.json
{
  "container": {
    "type": "MESOS",
    "mesos": {
      "image": "library/redis"
    }
  },
  "id": "ubuntumesos",
  "instances": 1,
  "cpus": 0.5,
  "mem": 512,
  "uris": [],
  "cmd": "ping 8.8.8.8"
}
4) sudo curl -X POST -H "Content-Type: application/json" localhost:8080/v2/apps 
-d @mesos.json
5) sudo curl http://localhost:8080/v2/tasks
{"tasks":[{"id":"ubuntumesos.fc1879be-fc9f-11e5-81e0-024294de4967","host":"10.0.0.5","ipAddresses":[],"ports":[31597],"startedAt":"2016-04-07T09:06:24.900Z","stagedAt":"2016-04-07T09:06:16.611Z","version":"2016-04-07T09:06:14.354Z","slaveId":"058fb5a7-9273-4bfa-83bb-8cb091621e19-S1","appId":"/ubuntumesos","servicePorts":[1]}]}
6) sudo docker run -ti --net=host redis redis-cli  
Could not connect to Redis at 127.0.0.1:6379: Connection refused
not connected> 


> Supporting Container Images in Mesos Containerizer doesn't work by using 
> marathon api
> -
>
> Key: MESOS-5148
> URL: https://issues.apache.org/jira/browse/MESOS-5148
> Project: Mesos
>  Issue Type: Bug
>Reporter: wangqun

[jira] [Created] (MESOS-5148) Supporting Container Images in Mesos Containerizer doesn't work by using marathon api

2016-04-07 Thread wangqun (JIRA)
wangqun created MESOS-5148:
--

 Summary: Supporting Container Images in Mesos Containerizer 
doesn't work by using marathon api
 Key: MESOS-5148
 URL: https://issues.apache.org/jira/browse/MESOS-5148
 Project: Mesos
  Issue Type: Bug
 Environment: Hi,
I am using the Marathon API to create tasks to test "Supporting Container 
Images in Mesos Containerizer".
My steps are as follows:
1) Run the master process on the master node.
sudo /usr/sbin/mesos-master --zk=zk://10.0.0.4:2181/mesos --port=5050 
--log_dir=/var/log/mesos --cluster=mesosbay --hostname=10.0.0.4 --ip=10.0.0.4 
--quorum=1 --work_dir=/var/lib/mesos
2) Run the slave process on the slave node.
sudo /usr/sbin/mesos-slave --master=zk://10.0.0.4:2181/mesos 
--log_dir=/var/log/mesos --containerizers=docker,mesos 
--executor_registration_timeout=5mins --hostname=10.0.0.5 --ip=10.0.0.5 
--isolation=docker/runtime,filesystem/linux --work_dir=/tmp/mesos/slave 
--image_providers=docker --executor_environment_variables="{}"
3) Create a JSON file that specifies the container to be managed by Mesos.
sudo touch mesos.json
sudo vim mesos.json
{
  "container": {
    "type": "MESOS",
    "mesos": {
      "image": "library/redis"
    }
  },
  "id": "ubuntumesos",
  "instances": 1,
  "cpus": 0.5,
  "mem": 512,
  "uris": [],
  "cmd": "ping 8.8.8.8"
}
4) sudo curl -X POST -H "Content-Type: application/json" localhost:8080/v2/apps 
-d @mesos.json
5) sudo curl http://localhost:8080/v2/tasks
{"tasks":[{"id":"ubuntumesos.fc1879be-fc9f-11e5-81e0-024294de4967","host":"10.0.0.5","ipAddresses":[],"ports":[31597],"startedAt":"2016-04-07T09:06:24.900Z","stagedAt":"2016-04-07T09:06:16.611Z","version":"2016-04-07T09:06:14.354Z","slaveId":"058fb5a7-9273-4bfa-83bb-8cb091621e19-S1","appId":"/ubuntumesos","servicePorts":[1]}]}
6) sudo docker run -ti --net=host redis redis-cli  
Could not connect to Redis at 127.0.0.1:6379: Connection refused
not connected> 

Reporter: wangqun
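
As a rough aid for reading the output of step 5, the `/v2/tasks` response can be inspected programmatically. The sketch below parses a copy of the response quoted above (trimmed to the fields it uses) and lists the host and port Marathon assigned to each task. The helper name `task_endpoints` is hypothetical and not part of Marathon. In the report the task was placed on 10.0.0.5 with port 31597, whereas `redis-cli` in step 6 tried 127.0.0.1:6379.

```python
import json

# Response body from step 5 of the report, trimmed to the fields used below.
TASKS_RESPONSE = """
{"tasks": [{"id": "ubuntumesos.fc1879be-fc9f-11e5-81e0-024294de4967",
            "host": "10.0.0.5",
            "ports": [31597],
            "appId": "/ubuntumesos"}]}
"""


def task_endpoints(body):
    """Return (app id, host, port) tuples for every port of every task."""
    endpoints = []
    for task in json.loads(body)["tasks"]:
        for port in task.get("ports", []):
            endpoints.append((task["appId"], task["host"], port))
    return endpoints


if __name__ == "__main__":
    for app_id, host, port in task_endpoints(TASKS_RESPONSE):
        print(f"{app_id} -> {host}:{port}")
```

The same function could be pointed at a live response body fetched from `http://localhost:8080/v2/tasks`.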








[jira] [Created] (MESOS-5147) UnsupportedClassVersionError: mesosphere/marathon/Main : Unsupported major.minor version 52.0

2016-04-07 Thread wangqun (JIRA)
wangqun created MESOS-5147:
--

 Summary: UnsupportedClassVersionError: mesosphere/marathon/Main : 
Unsupported major.minor version 52.0
 Key: MESOS-5147
 URL: https://issues.apache.org/jira/browse/MESOS-5147
 Project: Mesos
  Issue Type: Bug
Affects Versions: 0.28.0
Reporter: wangqun


Hi,
I want to use the Marathon API to create apps and tasks, so I installed 
Marathon according to https://mesosphere.github.io/marathon/docs/.
1) curl -O 
http://downloads.mesosphere.com/marathon/v1.0.0-RC1/marathon-1.0.0-RC1.tgz
2) tar xzf marathon-1.0.0-RC1.tgz
3) cd marathon-1.0.0-RC1
4) ./bin/start --master zk://zk1.foo.bar:2181,zk2.foo.bar:2181/mesos --zk 
zk://zk1.foo.bar:2181,zk2.foo.bar:2181/marathon
I got the following error:
MESOS_NATIVE_JAVA_LIBRARY is not set. Searching in /usr/lib /usr/local/lib.
MESOS_NATIVE_LIBRARY, MESOS_NATIVE_JAVA_LIBRARY set to 
'/usr/local/lib/libmesos.so'
Exception in thread "main" java.lang.UnsupportedClassVersionError: 
mesosphere/marathon/Main : Unsupported major.minor version 52.0
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:803)
at 
java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:48
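
For context on the error above: class-file version 52.0 corresponds to Java 8, so this message typically means the class was compiled for Java 8 but is being loaded by an older JVM. The sketch below shows how the version can be read from the first eight bytes of a `.class` file. The function names are hypothetical, and the header is synthetic rather than taken from the Marathon archive.

```python
import struct


def class_file_version(header):
    """Return (major, minor) from the first 8 bytes of a Java class file."""
    # Layout: u4 magic, u2 minor_version, u2 major_version (big-endian).
    magic, minor, major = struct.unpack(">IHH", header[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    return major, minor


def java_release(major):
    """Map a class-file major version to a Java release (holds for major >= 49)."""
    return major - 44


if __name__ == "__main__":
    # Synthetic header for illustration: magic, minor 0, major 52.
    header = struct.pack(">IHH", 0xCAFEBABE, 0, 52)
    major, minor = class_file_version(header)
    print(f"class version {major}.{minor} needs Java {java_release(major)} or newer")
```

To check a real class, the first eight bytes of the extracted `.class` file could be passed to `class_file_version` instead of the synthetic header.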






[jira] [Comment Edited] (MESOS-5027) Enable authenticated login in the webui

2016-04-07 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231545#comment-15231545
 ] 

haosdent edited comment on MESOS-5027 at 4/8/16 2:56 AM:
-

[~js84] Do we have a plan to make the login window more user-friendly? I mean 
showing a login page instead of popping up the form.


was (Author: haosd...@gmail.com):
[~js84] Do you have plan to make the login window more user friendly? I mean, 
we show the login page instead of popup the form.

> Enable authenticated login in the webui
> ---
>
> Key: MESOS-5027
> URL: https://issues.apache.org/jira/browse/MESOS-5027
> Project: Mesos
>  Issue Type: Improvement
>  Components: master, security, webui
>Reporter: Greg Mann
>Assignee: Joerg Schad
>  Labels: mesosphere, security
> Fix For: 0.29.0
>
> Attachments: Screen Shot 2016-04-07 at 21.02.45.png
>
>
> The webui hits a number of endpoints to get the data that it displays: 
> {{/state}}, {{/metrics/snapshot}}, {{/files/browse}}, {{/files/read}}, and 
> maybe others? Once authentication is enabled on these endpoints, we need to 
> add a login prompt to the webui so that users can provide credentials.
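
To illustrate what a client has to send once these endpoints are protected, here is a minimal sketch that assumes HTTP Basic authentication is the scheme in use. The URL, principal, and secret are placeholders, and `authenticated_request` is a hypothetical helper. The request is only constructed, not sent.

```python
import base64
import urllib.request


def authenticated_request(url, principal, secret):
    """Build a request carrying an HTTP Basic Authorization header."""
    raw = f"{principal}:{secret}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


if __name__ == "__main__":
    # Placeholder master address and credentials, for illustration only.
    req = authenticated_request("http://localhost:5050/state", "user", "secret")
    print(req.full_url)
    print(req.get_header("Authorization"))
```

A webui login prompt would collect the same two values and attach the equivalent header to each call it makes to `/state`, `/metrics/snapshot`, and the `/files` endpoints.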





[jira] [Commented] (MESOS-5027) Enable authenticated login in the webui

2016-04-07 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231545#comment-15231545
 ] 

haosdent commented on MESOS-5027:
-

[~js84] Do you have a plan to make the login window more user-friendly? I mean 
showing a login page instead of popping up the form.

> Enable authenticated login in the webui
> ---
>
> Key: MESOS-5027
> URL: https://issues.apache.org/jira/browse/MESOS-5027
> Project: Mesos
>  Issue Type: Improvement
>  Components: master, security, webui
>Reporter: Greg Mann
>Assignee: Joerg Schad
>  Labels: mesosphere, security
> Fix For: 0.29.0
>
> Attachments: Screen Shot 2016-04-07 at 21.02.45.png
>
>
> The webui hits a number of endpoints to get the data that it displays: 
> {{/state}}, {{/metrics/snapshot}}, {{/files/browse}}, {{/files/read}}, and 
> maybe others? Once authentication is enabled on these endpoints, we need to 
> add a login prompt to the webui so that users can provide credentials.





[jira] [Commented] (MESOS-4728) Left panel of WebUI is small than content.

2016-04-07 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231523#comment-15231523
 ] 

haosdent commented on MESOS-4728:
-

Hi, [~klaus1982]. Sorry, I forgot to reply to you last time. Yes, when the 
screen is wider than 1200px, it looks as you see. We need to update it to

{code}
diff --git a/src/webui/master/static/home.html 
b/src/webui/master/static/home.html
index a691084..6229d66 100644
--- a/src/webui/master/static/home.html
+++ b/src/webui/master/static/home.html
@@ -6,7 +6,7 @@
 

 
-  
+  
 
   
 Cluster:
@@ -134,7 +134,7 @@
 
   

-  
+  
 
   
{code} 

[~vinodkone] [~bmahler] Could you shepherd this ticket? Then I can post this 
simple fix shortly. :-)

> Left panel of WebUI is small than content.
> --
>
> Key: MESOS-4728
> URL: https://issues.apache.org/jira/browse/MESOS-4728
> Project: Mesos
>  Issue Type: Bug
>  Components: webui
> Environment: Safari Version 9.0.3 (11601.4.4)
>Reporter: Klaus Ma
>Priority: Minor
> Attachments: webui.png
>
>
> The left panel of the WebUI is smaller than the content. Refer to the 
> attachment for details.





[jira] [Commented] (MESOS-5145) protobuf vendored but its depencencies are not

2016-04-07 Thread Chen Zhiwei (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231480#comment-15231480
 ] 

Chen Zhiwei commented on MESOS-5145:


I am confused, since enabling Python and Java needs access to the Internet (or 
a local mirror of PyPI and Maven).

And there is already a series of PRs for the mesos-3rdparty repo 
(https://github.com/3rdparty/mesos-3rdparty/pulls/chenzhiwei).

> protobuf vendored but its depencencies are not
> --
>
> Key: MESOS-5145
> URL: https://issues.apache.org/jira/browse/MESOS-5145
> Project: Mesos
>  Issue Type: Bug
>  Components: build
>Reporter: David Robinson
>
> Updating [protobuf from 2.5 to 
> 2.6.1|https://github.com/apache/mesos/commit/51872fba7f94d80e55c9cc9b46f96780a938f626]
>  has caused Mesos builds to fail if pypi.python.org is unreachable. 
> Protobuf-2.6.1 requires 
> [google-apputils|https://pypi.python.org/pypi/google-apputils] and if it's 
> not available the build process will attempt to download it from pypi.
> Prior to this change it was possible to build Mesos without Internet access. 
> If the build process reaches out to arbitrary things on the Internet it's 
> impossible to guarantee build reproducibility.
> {noformat:title=snippet from setup.py in protobuf-2.6.1.tar.gz}
>   setup(name = 'protobuf',
> version = '2.6.1',
> ...
> setup_requires = ['google-apputils'],
> ...
> )
> {noformat}
> {noformat:title=snippet from build log}
> 08:20:49 DEBUG: Building protobuf Python egg ...
> 08:20:49 DEBUG: cd ../3rdparty/libprocess/3rdparty/protobuf-2.6.1/python &&   
> \
> 08:20:49 DEBUG: CC="gcc"  \
> 08:20:49 DEBUG: CXX="g++" \
> 08:20:49 DEBUG: CFLAGS="-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 
> -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic 
> -Wno-unused-local-typedefs"   \
> 08:20:49 DEBUG: CXXFLAGS="-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 
> -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic 
> -Wno-unused-local-typedefs -Wno-maybe-uninitialized -std=c++11"   
>   \
> 08:20:49 DEBUG: 
> PYTHONPATH=/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26  
> \
> 08:20:49 DEBUG: /usr/bin/python2.7 setup.py build bdist_egg
> 08:20:49 DEBUG: Download error on 
> http://pypi.python.org/simple/google-apputils/: [Errno 111] Connection 
> refused -- Some packages may not be found!
> 08:20:49 DEBUG: Download error on 
> http://pypi.python.org/simple/google-apputils/: [Errno 111] Connection 
> refused -- Some packages may not be found!
> 08:20:49 DEBUG: Couldn't find index page for 'google-apputils' (maybe 
> misspelled?)
> 08:20:49 DEBUG: Download error on http://pypi.python.org/simple/: [Errno 111] 
> Connection refused -- Some packages may not be found!
> 08:20:49 DEBUG: No local packages or download links found for google-apputils
> 08:20:49 DEBUG: Traceback (most recent call last):
> 08:20:49 DEBUG:   File "setup.py", line 200, in <module>
> 08:20:49 DEBUG: "Protocol Buffers are Google's data interchange format.",
> 08:20:49 DEBUG:   File "/usr/lib64/python2.7/distutils/core.py", line 111, in 
> setup
> 08:20:49 DEBUG: _setup_distribution = dist = klass(attrs)
> 08:20:49 DEBUG:   File 
> "/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26/setuptools/dist.py",
>  line 221, in __init__
> 08:20:49 DEBUG: self.fetch_build_eggs(attrs.pop('setup_requires'))
> 08:20:49 DEBUG:   File 
> "/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26/setuptools/dist.py",
>  line 245, in fetch_build_eggs
> 08:20:49 DEBUG: parse_requirements(requires), 
> installer=self.fetch_build_egg
> 08:20:49 DEBUG:   File 
> "/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26/pkg_resources.py",
>  line 580, in resolve
> 08:20:49 DEBUG: dist = best[req.key] = env.best_match(req, self, 
> installer)
> 08:20:49 DEBUG:   File 
> "/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26/pkg_resources.py",
>  line 825, in best_match
> 08:20:49 DEBUG: return self.obtain(req, installer) # try and 
> download/install
> 08:20:49 DEBUG:   File 
> "/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26/pkg_resources.py",
>  line 837, in obtain
> 08:20:49 DEBUG: return installer(requirement)
> 08:20:49 DEBUG:   File 
> "/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26/setuptools/dist.py",
>  line 294, in fetch_build_egg
> 08:20:49 DEBUG: return cmd.easy_install(req)
> 08:20:49 DEBUG:   File 
> "/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26/setuptools/command/easy_install.py",
>  line 584, in easy_install
> 08:20:49 DEBUG: raise 
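
The `setup_requires` entry quoted in the description is what makes setuptools try to fetch `google-apputils` at build time. As a sketch of how vendored packages could be audited for such entries before an offline build, the code below statically extracts `setup_requires` from a `setup.py` source string. The `SETUP_PY` text is an abridged stand-in for the quoted snippet, and the function name is hypothetical.

```python
import ast

# Abridged stand-in for the setup.py snippet quoted in the issue.
SETUP_PY = """
from setuptools import setup

setup(name = 'protobuf',
      version = '2.6.1',
      setup_requires = ['google-apputils'],
      )
"""


def setup_requires(source):
    """Return the literal setup_requires list passed to setup(), or []."""
    for node in ast.walk(ast.parse(source)):
        # Match plain calls of the form setup(...); the file is never executed.
        if isinstance(node, ast.Call) and getattr(node.func, "id", None) == "setup":
            for keyword in node.keywords:
                if keyword.arg == "setup_requires":
                    return ast.literal_eval(keyword.value)
    return []


if __name__ == "__main__":
    print(setup_requires(SETUP_PY))
```

This only handles a literal list passed directly to `setup()`; a value computed at runtime would need a different approach.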

[jira] [Commented] (MESOS-5038) Added a any mechanism for futures

2016-04-07 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231476#comment-15231476
 ] 

haosdent commented on MESOS-5038:
-

Compared with using {{Future<T> any(const std::list<Future<T>>& futures)}}, the 
problem with adding a mode to {{collect}} is that it always returns a {{list}}, 
although we may just want a single result when using {{ANY}} mode.

> Added a any mechanism for futures
> -
>
> Key: MESOS-5038
> URL: https://issues.apache.org/jira/browse/MESOS-5038
> Project: Mesos
>  Issue Type: Improvement
>  Components: libprocess
>Reporter: haosdent
>Assignee: haosdent
>
> We already have the {{collect}} and {{await}} mechanisms, which wait for 
> a list of {{Future}}s. However, we would like to return immediately when any 
> {{Future}} in the list completes, instead of waiting for the whole list to 
> finish as {{collect}} does. The interface of this any mechanism could be
> {code}
> template <typename T>
> Future<T> any(const std::list<Future<T>>& futures);
> {code}
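
The "first one wins" semantics proposed above can be illustrated outside libprocess. The sketch below is a Python analogy built on the standard `concurrent.futures` module, not the libprocess API. `any_of` and `slow` are hypothetical names, and unlike the proposed `any`, this version blocks the caller instead of returning a future.

```python
import concurrent.futures
import time


def any_of(futures):
    """Block until at least one future completes and return its result."""
    done, _pending = concurrent.futures.wait(
        futures, return_when=concurrent.futures.FIRST_COMPLETED)
    return next(iter(done)).result()


def slow(value, delay):
    """Return value after sleeping for delay seconds."""
    time.sleep(delay)
    return value


if __name__ == "__main__":
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(slow, "fast", 0.05), pool.submit(slow, "slow", 0.5)]
        # Returns as soon as the quicker task is done; the executor still
        # waits for the remaining task when the with-block exits.
        print(any_of(futures))
```

The result is a single value rather than a list, which is the distinction the issue draws between an `any` interface and `collect`.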





[jira] [Assigned] (MESOS-5146) MasterAllocatorTest/1.RebalancedForUpdatedWeights is flaky

2016-04-07 Thread Yongqiao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongqiao Wang reassigned MESOS-5146:


Assignee: Yongqiao Wang

> MasterAllocatorTest/1.RebalancedForUpdatedWeights is flaky
> --
>
> Key: MESOS-5146
> URL: https://issues.apache.org/jira/browse/MESOS-5146
> Project: Mesos
>  Issue Type: Bug
>  Components: allocation, tests
>Affects Versions: 0.28.0
> Environment: Ubuntu 14.04 using clang, without libevent or SSL
>Reporter: Greg Mann
>Assignee: Yongqiao Wang
>  Labels: mesosphere
>
> Observed on the ASF CI:
> {code}
> [ RUN  ] MasterAllocatorTest/1.RebalancedForUpdatedWeights
> I0407 22:34:10.330394 29278 cluster.cpp:149] Creating default 'local' 
> authorizer
> I0407 22:34:10.466182 29278 leveldb.cpp:174] Opened db in 135.608207ms
> I0407 22:34:10.516398 29278 leveldb.cpp:181] Compacted db in 50.159558ms
> I0407 22:34:10.516464 29278 leveldb.cpp:196] Created db iterator in 34959ns
> I0407 22:34:10.516484 29278 leveldb.cpp:202] Seeked to beginning of db in 
> 10195ns
> I0407 22:34:10.516496 29278 leveldb.cpp:271] Iterated through 0 keys in the 
> db in 7324ns
> I0407 22:34:10.516547 29278 replica.cpp:779] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I0407 22:34:10.517277 29298 recover.cpp:447] Starting replica recovery
> I0407 22:34:10.517693 29300 recover.cpp:473] Replica is in EMPTY status
> I0407 22:34:10.520251 29310 replica.cpp:673] Replica in EMPTY status received 
> a broadcasted recover request from (4775)@172.17.0.3:35855
> I0407 22:34:10.520611 29311 recover.cpp:193] Received a recover response from 
> a replica in EMPTY status
> I0407 22:34:10.521164 29299 recover.cpp:564] Updating replica status to 
> STARTING
> I0407 22:34:10.523435 29298 master.cpp:382] Master 
> f59f9057-a5c7-43e1-b129-96862e640a12 (129e11060069) started on 
> 172.17.0.3:35855
> I0407 22:34:10.523473 29298 master.cpp:384] Flags at startup: --acls="" 
> --allocation_interval="1secs" --allocator="HierarchicalDRF" 
> --authenticate="true" --authenticate_http="true" --authenticate_slaves="true" 
> --authenticators="crammd5" --authorizers="local" 
> --credentials="/tmp/3rZY8C/credentials" --framework_sorter="drf" 
> --help="false" --hostname_lookup="true" --http_authenticators="basic" 
> --initialize_driver_logging="true" --log_auto_initialize="true" 
> --logbufsecs="0" --logging_level="INFO" --max_completed_frameworks="50" 
> --max_completed_tasks_per_framework="1000" --max_slave_ping_timeouts="5" 
> --quiet="false" --recovery_slave_removal_limit="100%" 
> --registry="replicated_log" --registry_fetch_timeout="1mins" 
> --registry_store_timeout="100secs" --registry_strict="true" 
> --root_submissions="true" --slave_ping_timeout="15secs" 
> --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" 
> --webui_dir="/mesos/mesos-0.29.0/_inst/share/mesos/webui" 
> --work_dir="/tmp/3rZY8C/master" --zk_session_timeout="10secs"
> I0407 22:34:10.523885 29298 master.cpp:433] Master only allowing 
> authenticated frameworks to register
> I0407 22:34:10.523901 29298 master.cpp:438] Master only allowing 
> authenticated agents to register
> I0407 22:34:10.523913 29298 credentials.hpp:37] Loading credentials for 
> authentication from '/tmp/3rZY8C/credentials'
> I0407 22:34:10.524298 29298 master.cpp:480] Using default 'crammd5' 
> authenticator
> I0407 22:34:10.524441 29298 master.cpp:551] Using default 'basic' HTTP 
> authenticator
> I0407 22:34:10.524564 29298 master.cpp:589] Authorization enabled
> I0407 22:34:10.525269 29305 hierarchical.cpp:145] Initialized hierarchical 
> allocator process
> I0407 22:34:10.525333 29305 whitelist_watcher.cpp:77] No whitelist given
> I0407 22:34:10.527331 29298 master.cpp:1832] The newly elected leader is 
> master@172.17.0.3:35855 with id f59f9057-a5c7-43e1-b129-96862e640a12
> I0407 22:34:10.527441 29298 master.cpp:1845] Elected as the leading master!
> I0407 22:34:10.527545 29298 master.cpp:1532] Recovering from registrar
> I0407 22:34:10.527889 29298 registrar.cpp:331] Recovering registrar
> I0407 22:34:10.549734 29299 leveldb.cpp:304] Persisting metadata (8 bytes) to 
> leveldb took 28.25177ms
> I0407 22:34:10.549782 29299 replica.cpp:320] Persisted replica status to 
> STARTING
> I0407 22:34:10.550010 29299 recover.cpp:473] Replica is in STARTING status
> I0407 22:34:10.551352 29299 replica.cpp:673] Replica in STARTING status 
> received a broadcasted recover request from (4777)@172.17.0.3:35855
> I0407 22:34:10.551676 29299 recover.cpp:193] Received a recover response from 
> a replica in STARTING status
> I0407 22:34:10.552315 29308 recover.cpp:564] Updating replica status to VOTING
> I0407 22:34:10.574865 29308 leveldb.cpp:304] Persisting metadata (8 bytes) to 
> 

[jira] [Commented] (MESOS-5038) Added a any mechanism for futures

2016-04-07 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231474#comment-15231474
 ] 

haosdent commented on MESOS-5038:
-

[~gilbert] I have already changed it to use
{code}
Future<std::list<T>> collect(
    const std::list<Future<T>>& futures,
    const collect::Mode& mode);
{code}

May I have your review? Thanks in advance.

> Added a any mechanism for futures
> -
>
> Key: MESOS-5038
> URL: https://issues.apache.org/jira/browse/MESOS-5038
> Project: Mesos
>  Issue Type: Improvement
>  Components: libprocess
>Reporter: haosdent
>Assignee: haosdent
>
> We already have the {{collect}} and {{await}} mechanisms, which wait for 
> a list of {{Future}}s. However, we would like to return immediately when any 
> {{Future}} in the list completes, instead of waiting for the whole list to 
> finish as {{collect}} does. The interface of this any mechanism could be
> {code}
> template <typename T>
> Future<T> any(const std::list<Future<T>>& futures);
> {code}





[jira] [Commented] (MESOS-5141) Framework (CodeFuturesExampleFramework-1) at scheduler already subscribed, resending acknowledgement

2016-04-07 Thread inred (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231473#comment-15231473
 ] 

inred commented on MESOS-5141:
--

My laptop is behind a router, and its external IP is still the old IP.

> Framework  (CodeFuturesExampleFramework-1) at scheduler  already subscribed, 
> resending acknowledgement
> --
>
> Key: MESOS-5141
> URL: https://issues.apache.org/jira/browse/MESOS-5141
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.28.0
> Environment: ubuntu140.04
>  mesos master and slave are 0.28
>Reporter: inred
>
> I have a master running on 192.168.60.103 and slave1 running on 
> 192.168.60.102.
> I wrote a framework on my laptop (192.168.13.159); it could register and 
> launch tasks successfully.
> But after my laptop's IP changed from 192.168.13.159 to 192.168.1.103, 
> the framework can't register anymore.
> The master log:
> I0407 15:01:37.085171  1819 master.cpp:2231] Received SUBSCRIBE call for 
> framework 'CodeFuturesExampleFramework-1' at 
> scheduler-78b13c90-8b55-4ddd-a42e-1c8343532ba8@192.168.13.159:37072
> I0407 15:01:37.085450  1819 master.cpp:2302] Subscribing framework 
> CodeFuturesExampleFramework-1 with checkpointing disabled and capabilities [  
> ]
> I0407 15:01:37.085548  1819 master.cpp:2312] Framework 
> 8a791189-e940-4e2f-9c1e-2fb66a50191c- (CodeFuturesExampleFramework-1) at 
> scheduler-78b13c90-8b55-4ddd-a42e-1c8343532ba8@192.168.13.159:37072 already 
> subscribed, resending acknowledgement
> I0407 15:01:37.086802  1825 master.cpp:2231] Received SUBSCRIBE call for 
> framework 'CodeFuturesExampleFramework-1' at 
> scheduler-78b13c90-8b55-4ddd-a42e-1c8343532ba8@192.168.13.159:37072
> I0407 15:01:37.087117  1825 master.cpp:2302] Subscribing framework 
> CodeFuturesExampleFramework-1 with checkpointing disabled and capabilities [  
> ]
> I0407 15:01:37.087219  1825 master.cpp:2312] Framework 
> 8a791189-e940-4e2f-9c1e-2fb66a50191c- (CodeFuturesExampleFramework-1) at 
> scheduler-78b13c90-8b55-4ddd-a42e-1c8343532ba8@192.168.13.159:37072 already 
> subscribed, resending acknowledgement
> I0407 15:01:37.088713  1811 master.cpp:2231] Received SUBSCRIBE call for 
> framework 'CodeFuturesExampleFramework-1' at 
> scheduler-78b13c90-8b55-4ddd-a42e-1c8343532ba8@192.168.13.159:37072





[jira] [Commented] (MESOS-5141) Framework (CodeFuturesExampleFramework-1) at scheduler already subscribed, resending acknowledgement

2016-04-07 Thread inred (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231407#comment-15231407
 ] 

inred commented on MESOS-5141:
--

I have restarted my laptop many times to restart the framework; it hangs at:

I0408 08:16:46.207515  3892 sched.cpp:222] Version: 0.28.0
I0408 08:16:46.238123  3910 sched.cpp:326] New master detected at 
master@192.168.60.103:5050
I0408 08:16:46.241624  3910 sched.cpp:336] No credentials provided. Attempting 
to register without authentication




> Framework  (CodeFuturesExampleFramework-1) at scheduler  already subscribed, 
> resending acknowledgement
> --
>
> Key: MESOS-5141
> URL: https://issues.apache.org/jira/browse/MESOS-5141
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.28.0
> Environment: ubuntu140.04
>  mesos master and slave are 0.28
>Reporter: inred
>
> I have a master running on 192.168.60.103 and slave1 running on 
> 192.168.60.102.
> I wrote a framework on my laptop (192.168.13.159); it could register and 
> launch tasks successfully.
> But after my laptop's IP changed from 192.168.13.159 to 192.168.1.103, 
> the framework can't register anymore.
> The master log:
> I0407 15:01:37.085171  1819 master.cpp:2231] Received SUBSCRIBE call for 
> framework 'CodeFuturesExampleFramework-1' at 
> scheduler-78b13c90-8b55-4ddd-a42e-1c8343532ba8@192.168.13.159:37072
> I0407 15:01:37.085450  1819 master.cpp:2302] Subscribing framework 
> CodeFuturesExampleFramework-1 with checkpointing disabled and capabilities [  
> ]
> I0407 15:01:37.085548  1819 master.cpp:2312] Framework 
> 8a791189-e940-4e2f-9c1e-2fb66a50191c- (CodeFuturesExampleFramework-1) at 
> scheduler-78b13c90-8b55-4ddd-a42e-1c8343532ba8@192.168.13.159:37072 already 
> subscribed, resending acknowledgement
> I0407 15:01:37.086802  1825 master.cpp:2231] Received SUBSCRIBE call for 
> framework 'CodeFuturesExampleFramework-1' at 
> scheduler-78b13c90-8b55-4ddd-a42e-1c8343532ba8@192.168.13.159:37072
> I0407 15:01:37.087117  1825 master.cpp:2302] Subscribing framework 
> CodeFuturesExampleFramework-1 with checkpointing disabled and capabilities [  
> ]
> I0407 15:01:37.087219  1825 master.cpp:2312] Framework 
> 8a791189-e940-4e2f-9c1e-2fb66a50191c- (CodeFuturesExampleFramework-1) at 
> scheduler-78b13c90-8b55-4ddd-a42e-1c8343532ba8@192.168.13.159:37072 already 
> subscribed, resending acknowledgement
> I0407 15:01:37.088713  1811 master.cpp:2231] Received SUBSCRIBE call for 
> framework 'CodeFuturesExampleFramework-1' at 
> scheduler-78b13c90-8b55-4ddd-a42e-1c8343532ba8@192.168.13.159:37072





[jira] [Commented] (MESOS-5146) MasterAllocatorTest/1.RebalancedForUpdatedWeights is flaky

2016-04-07 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231389#comment-15231389
 ] 

Adam B commented on MESOS-5146:
---

cc: [~gradywang]

> MasterAllocatorTest/1.RebalancedForUpdatedWeights is flaky
> --
>
> Key: MESOS-5146
> URL: https://issues.apache.org/jira/browse/MESOS-5146
> Project: Mesos
>  Issue Type: Bug
>  Components: allocation, tests
>Affects Versions: 0.28.0
> Environment: Ubuntu 14.04 using clang, without libevent or SSL
>Reporter: Greg Mann
>  Labels: mesosphere
>
> Observed on the ASF CI:
> {code}
> [ RUN  ] MasterAllocatorTest/1.RebalancedForUpdatedWeights
> I0407 22:34:10.330394 29278 cluster.cpp:149] Creating default 'local' 
> authorizer
> I0407 22:34:10.466182 29278 leveldb.cpp:174] Opened db in 135.608207ms
> I0407 22:34:10.516398 29278 leveldb.cpp:181] Compacted db in 50.159558ms
> I0407 22:34:10.516464 29278 leveldb.cpp:196] Created db iterator in 34959ns
> I0407 22:34:10.516484 29278 leveldb.cpp:202] Seeked to beginning of db in 
> 10195ns
> I0407 22:34:10.516496 29278 leveldb.cpp:271] Iterated through 0 keys in the 
> db in 7324ns
> I0407 22:34:10.516547 29278 replica.cpp:779] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I0407 22:34:10.517277 29298 recover.cpp:447] Starting replica recovery
> I0407 22:34:10.517693 29300 recover.cpp:473] Replica is in EMPTY status
> I0407 22:34:10.520251 29310 replica.cpp:673] Replica in EMPTY status received 
> a broadcasted recover request from (4775)@172.17.0.3:35855
> I0407 22:34:10.520611 29311 recover.cpp:193] Received a recover response from 
> a replica in EMPTY status
> I0407 22:34:10.521164 29299 recover.cpp:564] Updating replica status to 
> STARTING
> I0407 22:34:10.523435 29298 master.cpp:382] Master 
> f59f9057-a5c7-43e1-b129-96862e640a12 (129e11060069) started on 
> 172.17.0.3:35855
> I0407 22:34:10.523473 29298 master.cpp:384] Flags at startup: --acls="" 
> --allocation_interval="1secs" --allocator="HierarchicalDRF" 
> --authenticate="true" --authenticate_http="true" --authenticate_slaves="true" 
> --authenticators="crammd5" --authorizers="local" 
> --credentials="/tmp/3rZY8C/credentials" --framework_sorter="drf" 
> --help="false" --hostname_lookup="true" --http_authenticators="basic" 
> --initialize_driver_logging="true" --log_auto_initialize="true" 
> --logbufsecs="0" --logging_level="INFO" --max_completed_frameworks="50" 
> --max_completed_tasks_per_framework="1000" --max_slave_ping_timeouts="5" 
> --quiet="false" --recovery_slave_removal_limit="100%" 
> --registry="replicated_log" --registry_fetch_timeout="1mins" 
> --registry_store_timeout="100secs" --registry_strict="true" 
> --root_submissions="true" --slave_ping_timeout="15secs" 
> --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" 
> --webui_dir="/mesos/mesos-0.29.0/_inst/share/mesos/webui" 
> --work_dir="/tmp/3rZY8C/master" --zk_session_timeout="10secs"
> I0407 22:34:10.523885 29298 master.cpp:433] Master only allowing 
> authenticated frameworks to register
> I0407 22:34:10.523901 29298 master.cpp:438] Master only allowing 
> authenticated agents to register
> I0407 22:34:10.523913 29298 credentials.hpp:37] Loading credentials for 
> authentication from '/tmp/3rZY8C/credentials'
> I0407 22:34:10.524298 29298 master.cpp:480] Using default 'crammd5' 
> authenticator
> I0407 22:34:10.524441 29298 master.cpp:551] Using default 'basic' HTTP 
> authenticator
> I0407 22:34:10.524564 29298 master.cpp:589] Authorization enabled
> I0407 22:34:10.525269 29305 hierarchical.cpp:145] Initialized hierarchical 
> allocator process
> I0407 22:34:10.525333 29305 whitelist_watcher.cpp:77] No whitelist given
> I0407 22:34:10.527331 29298 master.cpp:1832] The newly elected leader is 
> master@172.17.0.3:35855 with id f59f9057-a5c7-43e1-b129-96862e640a12
> I0407 22:34:10.527441 29298 master.cpp:1845] Elected as the leading master!
> I0407 22:34:10.527545 29298 master.cpp:1532] Recovering from registrar
> I0407 22:34:10.527889 29298 registrar.cpp:331] Recovering registrar
> I0407 22:34:10.549734 29299 leveldb.cpp:304] Persisting metadata (8 bytes) to 
> leveldb took 28.25177ms
> I0407 22:34:10.549782 29299 replica.cpp:320] Persisted replica status to 
> STARTING
> I0407 22:34:10.550010 29299 recover.cpp:473] Replica is in STARTING status
> I0407 22:34:10.551352 29299 replica.cpp:673] Replica in STARTING status 
> received a broadcasted recover request from (4777)@172.17.0.3:35855
> I0407 22:34:10.551676 29299 recover.cpp:193] Received a recover response from 
> a replica in STARTING status
> I0407 22:34:10.552315 29308 recover.cpp:564] Updating replica status to VOTING
> I0407 22:34:10.574865 29308 leveldb.cpp:304] Persisting metadata (8 bytes) to 
> leveldb took 22.413614ms
> I0407 

[jira] [Created] (MESOS-5146) MasterAllocatorTest/1.RebalancedForUpdatedWeights is flaky

2016-04-07 Thread Greg Mann (JIRA)
Greg Mann created MESOS-5146:


 Summary: MasterAllocatorTest/1.RebalancedForUpdatedWeights is flaky
 Key: MESOS-5146
 URL: https://issues.apache.org/jira/browse/MESOS-5146
 Project: Mesos
  Issue Type: Bug
  Components: allocation, tests
Affects Versions: 0.28.0
 Environment: Ubuntu 14.04 using clang, without libevent or SSL
Reporter: Greg Mann


Observed on the ASF CI:

{code}
[ RUN  ] MasterAllocatorTest/1.RebalancedForUpdatedWeights
I0407 22:34:10.330394 29278 cluster.cpp:149] Creating default 'local' authorizer
I0407 22:34:10.466182 29278 leveldb.cpp:174] Opened db in 135.608207ms
I0407 22:34:10.516398 29278 leveldb.cpp:181] Compacted db in 50.159558ms
I0407 22:34:10.516464 29278 leveldb.cpp:196] Created db iterator in 34959ns
I0407 22:34:10.516484 29278 leveldb.cpp:202] Seeked to beginning of db in 
10195ns
I0407 22:34:10.516496 29278 leveldb.cpp:271] Iterated through 0 keys in the db 
in 7324ns
I0407 22:34:10.516547 29278 replica.cpp:779] Replica recovered with log 
positions 0 -> 0 with 1 holes and 0 unlearned
I0407 22:34:10.517277 29298 recover.cpp:447] Starting replica recovery
I0407 22:34:10.517693 29300 recover.cpp:473] Replica is in EMPTY status
I0407 22:34:10.520251 29310 replica.cpp:673] Replica in EMPTY status received a 
broadcasted recover request from (4775)@172.17.0.3:35855
I0407 22:34:10.520611 29311 recover.cpp:193] Received a recover response from a 
replica in EMPTY status
I0407 22:34:10.521164 29299 recover.cpp:564] Updating replica status to STARTING
I0407 22:34:10.523435 29298 master.cpp:382] Master 
f59f9057-a5c7-43e1-b129-96862e640a12 (129e11060069) started on 172.17.0.3:35855
I0407 22:34:10.523473 29298 master.cpp:384] Flags at startup: --acls="" 
--allocation_interval="1secs" --allocator="HierarchicalDRF" 
--authenticate="true" --authenticate_http="true" --authenticate_slaves="true" 
--authenticators="crammd5" --authorizers="local" 
--credentials="/tmp/3rZY8C/credentials" --framework_sorter="drf" --help="false" 
--hostname_lookup="true" --http_authenticators="basic" 
--initialize_driver_logging="true" --log_auto_initialize="true" 
--logbufsecs="0" --logging_level="INFO" --max_completed_frameworks="50" 
--max_completed_tasks_per_framework="1000" --max_slave_ping_timeouts="5" 
--quiet="false" --recovery_slave_removal_limit="100%" 
--registry="replicated_log" --registry_fetch_timeout="1mins" 
--registry_store_timeout="100secs" --registry_strict="true" 
--root_submissions="true" --slave_ping_timeout="15secs" 
--slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" 
--webui_dir="/mesos/mesos-0.29.0/_inst/share/mesos/webui" 
--work_dir="/tmp/3rZY8C/master" --zk_session_timeout="10secs"
I0407 22:34:10.523885 29298 master.cpp:433] Master only allowing authenticated 
frameworks to register
I0407 22:34:10.523901 29298 master.cpp:438] Master only allowing authenticated 
agents to register
I0407 22:34:10.523913 29298 credentials.hpp:37] Loading credentials for 
authentication from '/tmp/3rZY8C/credentials'
I0407 22:34:10.524298 29298 master.cpp:480] Using default 'crammd5' 
authenticator
I0407 22:34:10.524441 29298 master.cpp:551] Using default 'basic' HTTP 
authenticator
I0407 22:34:10.524564 29298 master.cpp:589] Authorization enabled
I0407 22:34:10.525269 29305 hierarchical.cpp:145] Initialized hierarchical 
allocator process
I0407 22:34:10.525333 29305 whitelist_watcher.cpp:77] No whitelist given
I0407 22:34:10.527331 29298 master.cpp:1832] The newly elected leader is 
master@172.17.0.3:35855 with id f59f9057-a5c7-43e1-b129-96862e640a12
I0407 22:34:10.527441 29298 master.cpp:1845] Elected as the leading master!
I0407 22:34:10.527545 29298 master.cpp:1532] Recovering from registrar
I0407 22:34:10.527889 29298 registrar.cpp:331] Recovering registrar
I0407 22:34:10.549734 29299 leveldb.cpp:304] Persisting metadata (8 bytes) to 
leveldb took 28.25177ms
I0407 22:34:10.549782 29299 replica.cpp:320] Persisted replica status to 
STARTING
I0407 22:34:10.550010 29299 recover.cpp:473] Replica is in STARTING status
I0407 22:34:10.551352 29299 replica.cpp:673] Replica in STARTING status 
received a broadcasted recover request from (4777)@172.17.0.3:35855
I0407 22:34:10.551676 29299 recover.cpp:193] Received a recover response from a 
replica in STARTING status
I0407 22:34:10.552315 29308 recover.cpp:564] Updating replica status to VOTING
I0407 22:34:10.574865 29308 leveldb.cpp:304] Persisting metadata (8 bytes) to 
leveldb took 22.413614ms
I0407 22:34:10.574928 29308 replica.cpp:320] Persisted replica status to VOTING
I0407 22:34:10.575103 29308 recover.cpp:578] Successfully joined the Paxos group
I0407 22:34:10.575346 29308 recover.cpp:462] Recover process terminated
I0407 22:34:10.575913 29308 log.cpp:659] Attempting to start the writer
I0407 22:34:10.577512 29308 replica.cpp:493] Replica received implicit promise 
request from 

[jira] [Updated] (MESOS-4325) Offer shareable resources to frameworks only if it is opted in

2016-04-07 Thread Anindya Sinha (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anindya Sinha updated MESOS-4325:
-
Shepherd: Yan Xu  (was: Adam B)

> Offer shareable resources to frameworks only if it is opted in
> --
>
> Key: MESOS-4325
> URL: https://issues.apache.org/jira/browse/MESOS-4325
> Project: Mesos
>  Issue Type: Improvement
>  Components: general
>Affects Versions: 0.25.0
>Reporter: Anindya Sinha
>Assignee: Anindya Sinha
>Priority: Minor
>  Labels: external-volumes, persistent-volumes
>
> Added a new capability SHAREABLE_RESOURCES that frameworks need to opt in if 
> they are interested in receiving shared resources in their offers.





[jira] [Updated] (MESOS-4324) Allow access to shared persistent volumes as read only or read write by tasks

2016-04-07 Thread Anindya Sinha (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anindya Sinha updated MESOS-4324:
-
Shepherd: Yan Xu  (was: Adam B)

> Allow access to shared persistent volumes as read only or read write by tasks
> -
>
> Key: MESOS-4324
> URL: https://issues.apache.org/jira/browse/MESOS-4324
> Project: Mesos
>  Issue Type: Improvement
>  Components: general
>Affects Versions: 0.25.0
>Reporter: Anindya Sinha
>Assignee: Anindya Sinha
>Priority: Minor
>  Labels: external-volumes, persistent-volumes
>
> Allow the task to specify the access to a shared persistent volume to be 
> read-only or read-write. Note that the persistent volume is always created as 
> read-write.





[jira] [Updated] (MESOS-4892) Support arithmetic operations for shared resources with consumer counts

2016-04-07 Thread Anindya Sinha (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anindya Sinha updated MESOS-4892:
-
Shepherd: Yan Xu

> Support arithmetic operations for shared resources with consumer counts
> ---
>
> Key: MESOS-4892
> URL: https://issues.apache.org/jira/browse/MESOS-4892
> Project: Mesos
>  Issue Type: Improvement
>  Components: general
>Reporter: Anindya Sinha
>Assignee: Anindya Sinha
>  Labels: external-volum, persistent-volumes, resource
>
> With the introduction of shared resources, we need to add support for 
> arithmetic operations on Resources which perform such operations on shared 
> resources. Shared resources need to be handled differently so as to account 
> for incrementing/decrementing consumer counts maintained by each Resources 
> object.
> Case 1:
> Resources total += shared_resource;
> If shared_resource exists in total, this would imply that the consumer count 
> is incremented. If shared_resource does not exist in total, this would imply 
> we start tracking consumers for this shared resource initialized to 0 
> consumers.
> Case 2:
> Resources total -= shared_resource;
> If shared_resource exists in total, this would imply that the consumer count 
> is decremented. However, the shared_resource is removed from total if the 
> consumer count is originally 0 in total.
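
The two cases above can be sketched as follows. This is only an illustration, not the Mesos Resources class: SharedTotals, contains and count are hypothetical names, shared resources are identified by a plain string, and subtracting a resource that is not tracked is assumed to be a no-op because the description does not cover that case.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a Resources object that tracks only the
// consumer counts of shared resources, keyed by a resource name.
class SharedTotals
{
public:
  // Case 1: total += shared_resource.
  SharedTotals& operator+=(const std::string& shared)
  {
    auto it = consumers_.find(shared);
    if (it == consumers_.end()) {
      consumers_[shared] = 0;  // Start tracking with 0 consumers.
    } else {
      ++it->second;            // Already tracked: one more consumer.
    }
    return *this;
  }

  // Case 2: total -= shared_resource.
  SharedTotals& operator-=(const std::string& shared)
  {
    auto it = consumers_.find(shared);
    if (it == consumers_.end()) {
      return *this;            // Assumption: untracked resource is a no-op.
    }
    if (it->second == 0) {
      consumers_.erase(it);    // Count was 0: remove the resource itself.
    } else {
      --it->second;            // Otherwise: one consumer fewer.
    }
    return *this;
  }

  bool contains(const std::string& shared) const
  {
    return consumers_.count(shared) > 0;
  }

  // Returns the consumer count, or 0 if the resource is not tracked.
  int count(const std::string& shared) const
  {
    auto it = consumers_.find(shared);
    return it == consumers_.end() ? 0 : it->second;
  }

private:
  std::map<std::string, int> consumers_;
};
```

With this sketch, adding the same shared resource twice leaves it tracked with one consumer, and subtracting it twice removes it from the total again.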





[jira] [Commented] (MESOS-4325) Offer shareable resources to frameworks only if it is opted in

2016-04-07 Thread Anindya Sinha (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231349#comment-15231349
 ] 

Anindya Sinha commented on MESOS-4325:
--

Discarding the above RRs based on modification in protobuf and tracking of 
consumer count internally for shared resources. Updated RRs shall be published.


> Offer shareable resources to frameworks only if it is opted in
> --
>
> Key: MESOS-4325
> URL: https://issues.apache.org/jira/browse/MESOS-4325
> Project: Mesos
>  Issue Type: Improvement
>  Components: general
>Affects Versions: 0.25.0
>Reporter: Anindya Sinha
>Assignee: Anindya Sinha
>Priority: Minor
>  Labels: external-volumes, persistent-volumes
>
> Added a new capability SHAREABLE_RESOURCES that frameworks need to opt in if 
> they are interested in receiving shared resources in their offers.





[jira] [Commented] (MESOS-4324) Allow access to shared persistent volumes as read only or read write by tasks

2016-04-07 Thread Anindya Sinha (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231347#comment-15231347
 ] 

Anindya Sinha commented on MESOS-4324:
--

Discarding the above RRs based on modification in protobuf and tracking of 
consumer count internally for shared resources. Updated RRs shall be published.

> Allow access to shared persistent volumes as read only or read write by tasks
> -
>
> Key: MESOS-4324
> URL: https://issues.apache.org/jira/browse/MESOS-4324
> Project: Mesos
>  Issue Type: Improvement
>  Components: general
>Affects Versions: 0.25.0
>Reporter: Anindya Sinha
>Assignee: Anindya Sinha
>Priority: Minor
>  Labels: external-volumes, persistent-volumes
>
> Allow the task to specify the access to a shared persistent volume to be 
> read-only or read-write. Note that the persistent volume is always created as 
> read-write.





[jira] [Comment Edited] (MESOS-4431) Sharing of persistent volumes via reference counting

2016-04-07 Thread Anindya Sinha (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231341#comment-15231341
 ] 

Anindya Sinha edited comment on MESOS-4431 at 4/7/16 11:40 PM:
---

Discarding the above RRs based on modification in protobuf and tracking of 
consumer count internally for shared resources. Updated RRs shall be published.


was (Author: anindya.sinha):
Discarding the above RRs based on modification in design. Updated RRs shall be 
published.

> Sharing of persistent volumes via reference counting
> 
>
> Key: MESOS-4431
> URL: https://issues.apache.org/jira/browse/MESOS-4431
> Project: Mesos
>  Issue Type: Improvement
>  Components: general
>Affects Versions: 0.25.0
>Reporter: Anindya Sinha
>Assignee: Anindya Sinha
>  Labels: external-volumes, persistent-volumes
>
> Add capability for specific resources to be shared amongst tasks within or 
> across frameworks/roles. Enable this functionality for persistent volumes.





[jira] [Commented] (MESOS-4431) Sharing of persistent volumes via reference counting

2016-04-07 Thread Anindya Sinha (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231341#comment-15231341
 ] 

Anindya Sinha commented on MESOS-4431:
--

Discarding the above RRs based on modification in design. Updated RRs shall be 
published.

> Sharing of persistent volumes via reference counting
> 
>
> Key: MESOS-4431
> URL: https://issues.apache.org/jira/browse/MESOS-4431
> Project: Mesos
>  Issue Type: Improvement
>  Components: general
>Affects Versions: 0.25.0
>Reporter: Anindya Sinha
>Assignee: Anindya Sinha
>  Labels: external-volumes, persistent-volumes
>
> Add capability for specific resources to be shared amongst tasks within or 
> across frameworks/roles. Enable this functionality for persistent volumes.





[jira] [Comment Edited] (MESOS-5064) Document avoiding using `/tmp` as agent’s work directory in production

2016-04-07 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231295#comment-15231295
 ] 

Greg Mann edited comment on MESOS-5064 at 4/7/16 10:59 PM:
---

After discussing a bit more with Jie, Joris, and BenM, we're going to change 
the agent's {{\-\-work_dir}} flag to have no default value, but remain a 
required field. This means that running the agent binary without specifying 
{{\-\-work_dir}} will fail. While this may break some deployments, we think 
this is acceptable since it's quite unadvisable to be running in production 
with the default agent work directory; best to fail hard now and force users to 
specify a suitable directory that is accessible on their system. I'll send an 
email out to the dev/user lists announcing this plan.


was (Author: greggomann):
After discussing a bit more with Jie, Joris, and BenM, we're going to change 
the agent's {{--work_dir}} flag to have no default value, but remain a required 
field. This means that running the agent binary without specifying 
{{--work_dir}} will fail. While this may break some deployments, we think this 
is acceptable since it's quite unadvisable to be running in production with the 
default agent work directory; best to fail hard now and force users to specify 
a suitable directory that is accessible on their system. I'll send an email out 
to the dev/user lists announcing this plan.

> Document avoiding using `/tmp` as agent’s work directory in production
> --
>
> Key: MESOS-5064
> URL: https://issues.apache.org/jira/browse/MESOS-5064
> Project: Mesos
>  Issue Type: Bug
>Reporter: Artem Harutyunyan
>Assignee: Greg Mann
>
> Following a crash report from the user we need to be more explicit about the 
> dangers of using {{/tmp}} as agent {{work_dir}}





[jira] [Commented] (MESOS-5145) protobuf vendored but its depencencies are not

2016-04-07 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231287#comment-15231287
 ] 

Vinod Kone commented on MESOS-5145:
---

cc [~chenzhiwei]

Can Mesos be built without Internet access today? IIRC our CMake build 
explicitly reaches out to 3rd party repos to download deps. Also I thought we 
required access to maven during build time. cc [~hausdorff] [~karya]

> protobuf vendored but its depencencies are not
> --
>
> Key: MESOS-5145
> URL: https://issues.apache.org/jira/browse/MESOS-5145
> Project: Mesos
>  Issue Type: Bug
>  Components: build
>Reporter: David Robinson
>
> Updating [protobuf from 2.5 to 
> 2.6.1|https://github.com/apache/mesos/commit/51872fba7f94d80e55c9cc9b46f96780a938f626]
>  has caused Mesos builds to fail if pypi.python.org is unreachable. 
> Protobuf-2.6.1 requires 
> [google-apputils|https://pypi.python.org/pypi/google-apputils] and if it's 
> not available the build process will attempt to download it from pypi.
> Prior to this change it was possible to build Mesos without Internet access. 
> If the build process reaches out to arbitrary things on the Internet it's 
> impossible to guarantee build reproducibility.
> {noformat:title=snippet from setup.py in protobuf-2.6.1.tar.gz}
>   setup(name = 'protobuf',
> version = '2.6.1',
> ...
> setup_requires = ['google-apputils'],
> ...
> )
> {noformat}
> {noformat:title=snippet from build log}
> 08:20:49 DEBUG: Building protobuf Python egg ...
> 08:20:49 DEBUG: cd ../3rdparty/libprocess/3rdparty/protobuf-2.6.1/python &&   
> \
> 08:20:49 DEBUG: CC="gcc"  \
> 08:20:49 DEBUG: CXX="g++" \
> 08:20:49 DEBUG: CFLAGS="-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 
> -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic 
> -Wno-unused-local-typedefs"   \
> 08:20:49 DEBUG: CXXFLAGS="-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 
> -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic 
> -Wno-unused-local-typedefs -Wno-maybe-uninitialized -std=c++11"   
>   \
> 08:20:49 DEBUG: 
> PYTHONPATH=/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26  
> \
> 08:20:49 DEBUG: /usr/bin/python2.7 setup.py build bdist_egg
> 08:20:49 DEBUG: Download error on 
> http://pypi.python.org/simple/google-apputils/: [Errno 111] Connection 
> refused -- Some packages may not be found!
> 08:20:49 DEBUG: Download error on 
> http://pypi.python.org/simple/google-apputils/: [Errno 111] Connection 
> refused -- Some packages may not be found!
> 08:20:49 DEBUG: Couldn't find index page for 'google-apputils' (maybe 
> misspelled?)
> 08:20:49 DEBUG: Download error on http://pypi.python.org/simple/: [Errno 111] 
> Connection refused -- Some packages may not be found!
> 08:20:49 DEBUG: No local packages or download links found for google-apputils
> 08:20:49 DEBUG: Traceback (most recent call last):
> 08:20:49 DEBUG:   File "setup.py", line 200, in 
> 08:20:49 DEBUG: "Protocol Buffers are Google's data interchange format.",
> 08:20:49 DEBUG:   File "/usr/lib64/python2.7/distutils/core.py", line 111, in 
> setup
> 08:20:49 DEBUG: _setup_distribution = dist = klass(attrs)
> 08:20:49 DEBUG:   File 
> "/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26/setuptools/dist.py",
>  line 221, in __init__
> 08:20:49 DEBUG: self.fetch_build_eggs(attrs.pop('setup_requires'))
> 08:20:49 DEBUG:   File 
> "/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26/setuptools/dist.py",
>  line 245, in fetch_build_eggs
> 08:20:49 DEBUG: parse_requirements(requires), 
> installer=self.fetch_build_egg
> 08:20:49 DEBUG:   File 
> "/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26/pkg_resources.py",
>  line 580, in resolve
> 08:20:49 DEBUG: dist = best[req.key] = env.best_match(req, self, 
> installer)
> 08:20:49 DEBUG:   File 
> "/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26/pkg_resources.py",
>  line 825, in best_match
> 08:20:49 DEBUG: return self.obtain(req, installer) # try and 
> download/install
> 08:20:49 DEBUG:   File 
> "/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26/pkg_resources.py",
>  line 837, in obtain
> 08:20:49 DEBUG: return installer(requirement)
> 08:20:49 DEBUG:   File 
> "/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26/setuptools/dist.py",
>  line 294, in fetch_build_egg
> 08:20:49 DEBUG: return cmd.easy_install(req)
> 08:20:49 DEBUG:   File 
> "/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26/setuptools/command/easy_install.py",
>  line 584, in easy_install
> 08:20:49 DEBUG: raise 

[jira] [Commented] (MESOS-5064) Document avoiding using `/tmp` as agent’s work directory in production

2016-04-07 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231295#comment-15231295
 ] 

Greg Mann commented on MESOS-5064:
--

After discussing a bit more with Jie, Joris, and BenM, we're going to change 
the agent's {{--work_dir}} flag to have no default value, but remain a required 
field. This means that running the agent binary without specifying 
{{--work_dir}} will fail. While this may break some deployments, we think this 
is acceptable since it's quite unadvisable to be running in production with the 
default agent work directory; best to fail hard now and force users to specify 
a suitable directory that is accessible on their system. I'll send an email out 
to the dev/user lists announcing this plan.
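
As a rough sketch of what "required, with no default value" means in practice: the helper below is hypothetical and does not use the Mesos flags library. It assumes the flag is passed in the form --work_dir=<path> and treats an empty value as missing.

```cpp
#include <string>
#include <vector>

// Hypothetical parser: looks for "--work_dir=<path>" in the arguments.
// Returns true and stores the path if the flag is present and non-empty;
// returns false otherwise, so the caller can refuse to start.
bool parse_work_dir(
    const std::vector<std::string>& args,
    std::string* work_dir)
{
  const std::string prefix = "--work_dir=";
  for (const auto& arg : args) {
    // Match the flag name and require a non-empty value after '='.
    if (arg.size() > prefix.size() &&
        arg.compare(0, prefix.size(), prefix) == 0) {
      *work_dir = arg.substr(prefix.size());
      return true;
    }
  }
  // No fallback such as /tmp: a missing flag is reported as an error.
  return false;
}
```

A caller would print an error and exit with a non-zero status when this returns false, rather than silently falling back to a default directory.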

> Document avoiding using `/tmp` as agent’s work directory in production
> --
>
> Key: MESOS-5064
> URL: https://issues.apache.org/jira/browse/MESOS-5064
> Project: Mesos
>  Issue Type: Bug
>Reporter: Artem Harutyunyan
>Assignee: Greg Mann
>
> Following a crash report from the user we need to be more explicit about the 
> dangers of using {{/tmp}} as agent {{work_dir}}





[jira] [Commented] (MESOS-3573) Mesos does not kill orphaned docker containers

2016-04-07 Thread Anand Mazumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231282#comment-15231282
 ] 

Anand Mazumdar commented on MESOS-3573:
---

{code}
commit 1fa6340f8c8723b8d23934898f6e1599b9ba13c1
Author: Anand Mazumdar 
Date:   Wed Apr 6 11:10:59 2016 -0700

Added test for recovering orphaned docker containers.

Review: https://reviews.apache.org/r/45455/

commit 54926ad18d3ef90ad452a8e216ff7b1cd465df0a
Author: Anand Mazumdar 
Date:   Tue Apr 5 09:35:45 2016 -0700

Cleaned up orphaned docker containers owned by previous agent instance.

This change modifies the docker containerizer to cleanup docker
containers left from another agent instance. The containers can
become orphans due to any of the scenarios mentioned here:
http://bit.ly/1RxCpPl

This change modifies the logic to invoke docker `ps` on all
containers on the agent instead of limiting itself to the
current slaveID. This change also means that running multiple
agent instances on the same host might not work well for docker
containers from now on i.e. another agent instance might
cleanup the docker containers that belong to another instance.
The cgroup isolators/linux launcher for the Mesos containerizer
anyways don't recommend running multiple instances of the agent
on the same host.

In case one still wants to run multiple agent instances on a
test cluster using the docker containerizer, we can use the
`--no-docker_kill_orphans` flag and then kill the docker
containers manually using a script.

Review: https://reviews.apache.org/r/45454/

commit ca747406574b51b17cdcce8ced2ac5d4dfaa091a
Author: Anand Mazumdar 
Date:   Tue Apr 5 09:35:19 2016 -0700

Fixed minor spacing cleanups in docker containerizer.

Review: https://reviews.apache.org/r/45453/
{code}
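
The change of scope described in the second commit can be sketched as follows. 
This is an illustration, not the containerizer's code: the helper name 
{{find_orphans}} is hypothetical, and the {{mesos-<slaveId>.<containerId>}} 
naming scheme is an assumption about how the docker containerizer names its 
containers.

```python
# Assumed name prefix for containers launched by the docker containerizer.
DOCKER_NAME_PREFIX = "mesos-"


def find_orphans(container_names, known_container_ids):
    """Return names of Mesos-launched containers this agent does not know.

    Every container on the host is considered, whatever slave ID is embedded
    in its name, which is the widened scope the commit message describes.
    """
    orphans = []
    for name in container_names:
        if not name.startswith(DOCKER_NAME_PREFIX):
            continue  # Not launched by Mesos; leave it alone.

        # Assumed layout after the prefix: <slaveId>.<containerId>
        rest = name[len(DOCKER_NAME_PREFIX):]
        _slave_id, separator, container_id = rest.rpartition(".")
        if not separator:
            continue  # Does not match the assumed layout.

        if container_id not in known_container_ids:
            orphans.append(name)
    return orphans


names = ["mesos-S1.c1", "mesos-S0.c2", "redis"]
print(find_orphans(names, known_container_ids={"c1"}))
```

Here the container carrying the older slave ID {{S0}} is reported even though 
the current agent would be {{S1}}, which is also why two agents on one host 
would clean up each other's containers.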

> Mesos does not kill orphaned docker containers
> --
>
> Key: MESOS-3573
> URL: https://issues.apache.org/jira/browse/MESOS-3573
> Project: Mesos
>  Issue Type: Bug
>  Components: docker, slave
>Reporter: Ian Babrou
>Assignee: Anand Mazumdar
>  Labels: mesosphere
>
> After upgrade to 0.24.0 we noticed hanging containers appearing. Looks like 
> there were changes between 0.23.0 and 0.24.0 that broke cleanup.
> Here's how to trigger this bug:
> 1. Deploy app in docker container.
> 2. Kill corresponding mesos-docker-executor process
> 3. Observe hanging container
> Here are the logs after kill:
> {noformat}
> slave_1| I1002 12:12:59.362002  7791 docker.cpp:1576] Executor for 
> container 'f083aaa2-d5c3-43c1-b6ba-342de8829fa8' has exited
> slave_1| I1002 12:12:59.362284  7791 docker.cpp:1374] Destroying 
> container 'f083aaa2-d5c3-43c1-b6ba-342de8829fa8'
> slave_1| I1002 12:12:59.363404  7791 docker.cpp:1478] Running docker stop 
> on container 'f083aaa2-d5c3-43c1-b6ba-342de8829fa8'
> slave_1| I1002 12:12:59.363876  7791 slave.cpp:3399] Executor 
> 'sleepy.87eb6191-68fe-11e5-9444-8eb895523b9c' of framework 
> 20150923-122130-2153451692-5050-1- terminated with signal Terminated
> slave_1| I1002 12:12:59.367570  7791 slave.cpp:2696] Handling status 
> update TASK_FAILED (UUID: 4a1b2387-a469-4f01-bfcb-0d1cccbde550) for task 
> sleepy.87eb6191-68fe-11e5-9444-8eb895523b9c of framework 
> 20150923-122130-2153451692-5050-1- from @0.0.0.0:0
> slave_1| I1002 12:12:59.367842  7791 slave.cpp:5094] Terminating task 
> sleepy.87eb6191-68fe-11e5-9444-8eb895523b9c
> slave_1| W1002 12:12:59.368484  7791 docker.cpp:986] Ignoring updating 
> unknown container: f083aaa2-d5c3-43c1-b6ba-342de8829fa8
> slave_1| I1002 12:12:59.368671  7791 status_update_manager.cpp:322] 
> Received status update TASK_FAILED (UUID: 
> 4a1b2387-a469-4f01-bfcb-0d1cccbde550) for task 
> sleepy.87eb6191-68fe-11e5-9444-8eb895523b9c of framework 
> 20150923-122130-2153451692-5050-1-
> slave_1| I1002 12:12:59.368741  7791 status_update_manager.cpp:826] 
> Checkpointing UPDATE for status update TASK_FAILED (UUID: 
> 4a1b2387-a469-4f01-bfcb-0d1cccbde550) for task 
> sleepy.87eb6191-68fe-11e5-9444-8eb895523b9c of framework 
> 20150923-122130-2153451692-5050-1-
> slave_1| I1002 12:12:59.370636  7791 status_update_manager.cpp:376] 
> Forwarding update TASK_FAILED (UUID: 4a1b2387-a469-4f01-bfcb-0d1cccbde550) 
> for task sleepy.87eb6191-68fe-11e5-9444-8eb895523b9c of framework 
> 20150923-122130-2153451692-5050-1- to the slave
> slave_1| I1002 12:12:59.371335  7791 slave.cpp:2975] Forwarding the 
> update TASK_FAILED (UUID: 4a1b2387-a469-4f01-bfcb-0d1cccbde550) for task 
> sleepy.87eb6191-68fe-11e5-9444-8eb895523b9c of framework 
> 20150923-122130-2153451692-5050-1- to 

[jira] [Created] (MESOS-5145) protobuf vendored but its dependencies are not

2016-04-07 Thread David Robinson (JIRA)
David Robinson created MESOS-5145:
-

 Summary: protobuf vendored but its dependencies are not
 Key: MESOS-5145
 URL: https://issues.apache.org/jira/browse/MESOS-5145
 Project: Mesos
  Issue Type: Bug
  Components: build
Reporter: David Robinson


Updating [protobuf from 2.5 to 
2.6.1|https://github.com/apache/mesos/commit/51872fba7f94d80e55c9cc9b46f96780a938f626]
 has caused Mesos builds to fail if pypi.python.org is unreachable. 
Protobuf-2.6.1 requires 
[google-apputils|https://pypi.python.org/pypi/google-apputils] and if it's not 
available the build process will attempt to download it from pypi.

Prior to this change it was possible to build Mesos without Internet access. If 
the build process reaches out to arbitrary hosts on the Internet, it is 
impossible to guarantee build reproducibility.

{noformat:title=snippet from setup.py in protobuf-2.6.1.tar.gz}
  setup(name = 'protobuf',
        version = '2.6.1',
        ...
        setup_requires = ['google-apputils'],
        ...
        )
{noformat}
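
One way to spot this kind of hidden network dependency before building offline 
is to scan a vendored {{setup.py}} for {{setup_requires}} entries. A minimal 
sketch using Python's standard {{ast}} module; the helper name 
{{find_setup_requires}} is hypothetical, and it assumes that the file parses 
under the interpreter running the scan and that the list is a plain literal:

```python
import ast


def find_setup_requires(source):
    """Return the setup_requires entries passed to setup() in `source`.

    The source is only parsed, never executed, so nothing is downloaded.
    """
    found = []
    for node in ast.walk(ast.parse(source)):
        # Match plain `setup(...)` calls; `module.setup(...)` is ignored here.
        if isinstance(node, ast.Call) and getattr(node.func, "id", None) == "setup":
            for keyword in node.keywords:
                if keyword.arg == "setup_requires":
                    # literal_eval accepts an AST node holding a literal list.
                    found.extend(ast.literal_eval(keyword.value))
    return found


snippet = """
setup(name = 'protobuf',
      version = '2.6.1',
      setup_requires = ['google-apputils'],
      )
"""
print(find_setup_requires(snippet))
```

Any non-empty result names packages that setuptools may try to fetch at build 
time if they are not already installed.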

{noformat:title=snippet from build log}
08:20:49 DEBUG: Building protobuf Python egg ...
08:20:49 DEBUG: cd ../3rdparty/libprocess/3rdparty/protobuf-2.6.1/python && 
\
08:20:49 DEBUG:   CC="gcc"  \
08:20:49 DEBUG:   CXX="g++" \
08:20:49 DEBUG:   CFLAGS="-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 
-fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic 
-Wno-unused-local-typedefs"   \
08:20:49 DEBUG:   CXXFLAGS="-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 
-fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic 
-Wno-unused-local-typedefs -Wno-maybe-uninitialized -std=c++11" 
\
08:20:49 DEBUG:   
PYTHONPATH=/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26  \
08:20:49 DEBUG:   /usr/bin/python2.7 setup.py build bdist_egg
08:20:49 DEBUG: Download error on 
http://pypi.python.org/simple/google-apputils/: [Errno 111] Connection refused 
-- Some packages may not be found!
08:20:49 DEBUG: Download error on 
http://pypi.python.org/simple/google-apputils/: [Errno 111] Connection refused 
-- Some packages may not be found!
08:20:49 DEBUG: Couldn't find index page for 'google-apputils' (maybe 
misspelled?)
08:20:49 DEBUG: Download error on http://pypi.python.org/simple/: [Errno 111] 
Connection refused -- Some packages may not be found!
08:20:49 DEBUG: No local packages or download links found for google-apputils
08:20:49 DEBUG: Traceback (most recent call last):
08:20:49 DEBUG:   File "setup.py", line 200, in <module>
08:20:49 DEBUG: "Protocol Buffers are Google's data interchange format.",
08:20:49 DEBUG:   File "/usr/lib64/python2.7/distutils/core.py", line 111, in 
setup
08:20:49 DEBUG: _setup_distribution = dist = klass(attrs)
08:20:49 DEBUG:   File 
"/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26/setuptools/dist.py",
 line 221, in __init__
08:20:49 DEBUG: self.fetch_build_eggs(attrs.pop('setup_requires'))
08:20:49 DEBUG:   File 
"/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26/setuptools/dist.py",
 line 245, in fetch_build_eggs
08:20:49 DEBUG: parse_requirements(requires), installer=self.fetch_build_egg
08:20:49 DEBUG:   File 
"/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26/pkg_resources.py",
 line 580, in resolve
08:20:49 DEBUG: dist = best[req.key] = env.best_match(req, self, installer)
08:20:49 DEBUG:   File 
"/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26/pkg_resources.py",
 line 825, in best_match
08:20:49 DEBUG: return self.obtain(req, installer) # try and 
download/install
08:20:49 DEBUG:   File 
"/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26/pkg_resources.py",
 line 837, in obtain
08:20:49 DEBUG: return installer(requirement)
08:20:49 DEBUG:   File 
"/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26/setuptools/dist.py",
 line 294, in fetch_build_egg
08:20:49 DEBUG: return cmd.easy_install(req)
08:20:49 DEBUG:   File 
"/builddir/build/BUILD/mesos-0.29.0/3rdparty/distribute-0.6.26/setuptools/command/easy_install.py",
 line 584, in easy_install
08:20:49 DEBUG: raise DistutilsError(msg)
08:20:49 DEBUG: distutils.errors.DistutilsError: Could not find suitable 
distribution for Requirement.parse('google-apputils')
{noformat}





[jira] [Updated] (MESOS-5144) Cleanup memory leaks in libprocess finalize()

2016-04-07 Thread Neil Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway updated MESOS-5144:
---
Attachment: libprocess_finalize_mem_leaks-1.patch

Attached is a quick patch to fix most of the current leaks. Will polish for 
submission later.

> Cleanup memory leaks in libprocess finalize()
> -
>
> Key: MESOS-5144
> URL: https://issues.apache.org/jira/browse/MESOS-5144
> Project: Mesos
>  Issue Type: Task
>  Components: libprocess
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: mesosphere, tech-debt
> Attachments: libprocess_finalize_mem_leaks-1.patch
>
>
> libprocess's {{finalize}} function currently leaks memory for a few different 
> reasons. Cleaning up the {{SocketManager}} will be somewhat involved 
> (MESOS-3910), but the remaining memory leaks should be fairly easy to address.





[jira] [Created] (MESOS-5144) Cleanup memory leaks in libprocess finalize()

2016-04-07 Thread Neil Conway (JIRA)
Neil Conway created MESOS-5144:
--

 Summary: Cleanup memory leaks in libprocess finalize()
 Key: MESOS-5144
 URL: https://issues.apache.org/jira/browse/MESOS-5144
 Project: Mesos
  Issue Type: Task
  Components: libprocess
Reporter: Neil Conway
Assignee: Neil Conway


libprocess's {{finalize}} function currently leaks memory for a few different 
reasons. Cleaning up the {{SocketManager}} will be somewhat involved 
(MESOS-3910), but the remaining memory leaks should be fairly easy to address.





[jira] [Issue Comment Deleted] (MESOS-3235) FetcherCacheHttpTest.HttpCachedSerialized and FetcherCacheHttpTest.HttpCachedConcurrent are flaky

2016-04-07 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-3235:
-
Comment: was deleted

(was: Logs from the Mesosphere CI on CentOS 6, with libevent and SSL enabled)

> FetcherCacheHttpTest.HttpCachedSerialized and 
> FetcherCacheHttpTest.HttpCachedConcurrent are flaky
> -
>
> Key: MESOS-3235
> URL: https://issues.apache.org/jira/browse/MESOS-3235
> Project: Mesos
>  Issue Type: Bug
>  Components: fetcher, tests
>Affects Versions: 0.23.0
>Reporter: Joseph Wu
>Assignee: Bernd Mathiske
>  Labels: mesosphere
> Fix For: 0.27.0
>
> Attachments: fetchercache_log_centos_6.txt
>
>
> On OSX, {{make clean && make -j8 V=0 check}}:
> {code}
> [--] 3 tests from FetcherCacheHttpTest
> [ RUN  ] FetcherCacheHttpTest.HttpCachedSerialized
> HTTP/1.1 200 OK
> Date: Fri, 07 Aug 2015 17:23:05 GMT
> Content-Length: 30
> I0807 10:23:05.673596 2085372672 exec.cpp:133] Version: 0.24.0
> E0807 10:23:05.675884 184373248 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> I0807 10:23:05.675897 182226944 exec.cpp:207] Executor registered on slave 
> 20150807-102305-139395082-52338-52313-S0
> E0807 10:23:05.683980 184373248 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> Registered executor on 10.0.79.8
> Starting task 0
> Forked command at 54363
> sh -c './mesos-fetcher-test-cmd 0'
> E0807 10:23:05.694953 184373248 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> Command exited with status 0 (pid: 54363)
> E0807 10:23:05.793927 184373248 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> I0807 10:23:06.590008 2085372672 exec.cpp:133] Version: 0.24.0
> E0807 10:23:06.592244 355938304 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> I0807 10:23:06.592243 353255424 exec.cpp:207] Executor registered on slave 
> 20150807-102305-139395082-52338-52313-S0
> E0807 10:23:06.597995 355938304 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> Registered executor on 10.0.79.8
> Starting task 1
> Forked command at 54411
> sh -c './mesos-fetcher-test-cmd 1'
> E0807 10:23:06.608708 355938304 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> Command exited with status 0 (pid: 54411)
> E0807 10:23:06.707649 355938304 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> ../../src/tests/fetcher_cache_tests.cpp:860: Failure
> Failed to wait 15secs for awaitFinished(task.get())
> *** Aborted at 1438968214 (unix time) try "date -d @1438968214" if you are 
> using GNU date ***
> [  FAILED  ] FetcherCacheHttpTest.HttpCachedSerialized (28685 ms)
> [ RUN  ] FetcherCacheHttpTest.HttpCachedConcurrent
> PC: @0x113723618 process::Owned<>::get()
> *** SIGSEGV (@0x0) received by PID 52313 (TID 0x118d59000) stack trace: ***
> @ 0x7fff8fcacf1a _sigtramp
> @ 0x7f9bc3109710 (unknown)
> @0x1136f07e2 mesos::internal::slave::Fetcher::fetch()
> @0x113862f9d 
> mesos::internal::slave::MesosContainerizerProcess::fetch()
> @0x1138f1b5d 
> _ZZN7process8dispatchI7NothingN5mesos8internal5slave25MesosContainerizerProcessERKNS2_11ContainerIDERKNS2_11CommandInfoERKNSt3__112basic_stringIcNSC_11char_traitsIcEENSC_9allocatorIcRK6OptionISI_ERKNS2_7SlaveIDES6_S9_SI_SM_SP_EENS_6FutureIT_EERKNS_3PIDIT0_EEMSW_FSU_T1_T2_T3_T4_T5_ET6_T7_T8_T9_T10_ENKUlPNS_11ProcessBaseEE_clES1D_
> @0x1138f18cf 
> _ZNSt3__110__function6__funcIZN7process8dispatchI7NothingN5mesos8internal5slave25MesosContainerizerProcessERKNS5_11ContainerIDERKNS5_11CommandInfoERKNS_12basic_stringIcNS_11char_traitsIcEENS_9allocatorIcRK6OptionISK_ERKNS5_7SlaveIDES9_SC_SK_SO_SR_EENS2_6FutureIT_EERKNS2_3PIDIT0_EEMSY_FSW_T1_T2_T3_T4_T5_ET6_T7_T8_T9_T10_EUlPNS2_11ProcessBaseEE_NSI_IS1G_EEFvS1F_EEclEOS1F_
> @0x1143768cf std::__1::function<>::operator()()
> @0x11435ca7f process::ProcessBase::visit()
> @0x1143ed6fe process::DispatchEvent::visit()
> @0x11271 process::ProcessBase::serve()
> @0x114343b4e process::ProcessManager::resume()
> @0x1143431ca process::internal::schedule()
> @0x1143da646 _ZNSt3__114__thread_proxyINS_5tupleIJPFvvEEPvS5_
> @ 0x7fff95090268 _pthread_body
> @ 0x7fff950901e5 _pthread_start
> @ 0x7fff9508e41d thread_start
> Failed to synchronize with slave (it's probably exited)
> make[3]: *** [check-local] Segmentation fault: 11
> make[2]: *** [check-am] Error 2
> make[1]: *** [check] Error 2
> make: *** [check-recursive] Error 1
> {code}
> This was encountered just once out 

[jira] [Updated] (MESOS-3235) FetcherCacheHttpTest.HttpCachedSerialized and FetcherCacheHttpTest.HttpCachedConcurrent are flaky

2016-04-07 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-3235:
-
Attachment: fetchercache_log_centos_6.txt

Logs from the Mesosphere CI on CentOS 6, with libevent and SSL enabled

> FetcherCacheHttpTest.HttpCachedSerialized and 
> FetcherCacheHttpTest.HttpCachedConcurrent are flaky
> -
>
> Key: MESOS-3235
> URL: https://issues.apache.org/jira/browse/MESOS-3235
> Project: Mesos
>  Issue Type: Bug
>  Components: fetcher, tests
>Affects Versions: 0.23.0
>Reporter: Joseph Wu
>Assignee: Bernd Mathiske
>  Labels: mesosphere
> Fix For: 0.27.0
>
> Attachments: fetchercache_log_centos_6.txt
>
>

[jira] [Commented] (MESOS-3235) FetcherCacheHttpTest.HttpCachedSerialized and FetcherCacheHttpTest.HttpCachedConcurrent are flaky

2016-04-07 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231054#comment-15231054
 ] 

Greg Mann commented on MESOS-3235:
--

[~bernd-mesos], it looks like this is still flaky - see the attached log file, 
which was observed today on the internal Mesosphere CI on CentOS 6 with 
libevent and SSL enabled. It appears to be the same future that is failing.

> FetcherCacheHttpTest.HttpCachedSerialized and 
> FetcherCacheHttpTest.HttpCachedConcurrent are flaky
> -
>
> Key: MESOS-3235
> URL: https://issues.apache.org/jira/browse/MESOS-3235
> Project: Mesos
>  Issue Type: Bug
>  Components: fetcher, tests
>Affects Versions: 0.23.0
>Reporter: Joseph Wu
>Assignee: Bernd Mathiske
>  Labels: mesosphere
> Fix For: 0.27.0
>
>

[jira] [Created] (MESOS-5143) LostSlaveMessage should not be broadcasted.

2016-04-07 Thread Yan Xu (JIRA)
Yan Xu created MESOS-5143:
-

 Summary: LostSlaveMessage should not be broadcasted.
 Key: MESOS-5143
 URL: https://issues.apache.org/jira/browse/MESOS-5143
 Project: Mesos
  Issue Type: Bug
  Components: master
Reporter: Yan Xu


Currently a {{LostSlaveMessage}} (in v1 it's a type of {{Event::Failure}}) is 
broadcast to all registered frameworks in the cluster whenever a slave is 
lost.

This is unnecessary and kind of breaks the Mesos abstraction: frameworks are 
given a slice of the cluster, not the entirety. They know about the slice when 
offers are extended to them, so we shouldn't inform all of them whenever any 
agent goes away.

This message should instead be narrowcast to the frameworks that have a stake 
in this agent: running tasks, pending offers, reservations, persistent volumes, 
etc.
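
A minimal sketch of the narrowcast selection, assuming a hypothetical 
in-memory view of each framework's stakes. The names {{frameworks_to_notify}}, 
{{tasks}}, {{offers}}, {{reservations}} and {{volumes}} are illustrative and 
are not Mesos master data structures:

```python
# Kinds of stake named in the description; the dictionary keys are assumptions.
STAKES = ("tasks", "offers", "reservations", "volumes")


def frameworks_to_notify(agent_id, frameworks):
    """Return the sorted ids of frameworks with any stake in the lost agent.

    `frameworks` maps a framework id to a dict whose values are sets of
    agent ids, one set per kind of stake.
    """
    return sorted(
        framework_id
        for framework_id, stakes in frameworks.items()
        if any(agent_id in stakes.get(kind, set()) for kind in STAKES)
    )


frameworks = {
    "fw-a": {"tasks": {"S1"}, "offers": {"S2"}},
    "fw-b": {"reservations": {"S2"}},
    "fw-c": {"tasks": {"S3"}},
}
print(frameworks_to_notify("S2", frameworks))
```

In this example only {{fw-a}} and {{fw-b}} would hear about the loss of 
{{S2}}; {{fw-c}} has no stake in it and stays uninformed.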





[jira] [Updated] (MESOS-4982) Update example long running to use v1 API.

2016-04-07 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-4982:
--
  Sprint: Mesosphere Sprint 32
Story Points: 3

> Update example long running to use v1 API.
> --
>
> Key: MESOS-4982
> URL: https://issues.apache.org/jira/browse/MESOS-4982
> Project: Mesos
>  Issue Type: Task
>Reporter: Anand Mazumdar
>Assignee: Anand Mazumdar
>  Labels: mesosphere
> Fix For: 0.29.0
>
>
> We need to modify the long running test framework similar to 
> {{src/examples/long_lived_framework.cpp}} to use the v1 API.
> This would allow us to vet the v1 API and the scheduler library in test 
> clusters.





[jira] [Assigned] (MESOS-5027) Enable authenticated login in the webui

2016-04-07 Thread Joerg Schad (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joerg Schad reassigned MESOS-5027:
--

Assignee: Joerg Schad

> Enable authenticated login in the webui
> ---
>
> Key: MESOS-5027
> URL: https://issues.apache.org/jira/browse/MESOS-5027
> Project: Mesos
>  Issue Type: Improvement
>  Components: master, security, webui
>Reporter: Greg Mann
>Assignee: Joerg Schad
>  Labels: mesosphere, security
> Fix For: 0.29.0
>
> Attachments: Screen Shot 2016-04-07 at 21.02.45.png
>
>
> The webui hits a number of endpoints to get the data that it displays: 
> {{/state}}, {{/metrics/snapshot}}, {{/files/browse}}, {{/files/read}}, and 
> maybe others? Once authentication is enabled on these endpoints, we need to 
> add a login prompt to the webui so that users can provide credentials.





[jira] [Comment Edited] (MESOS-5027) Enable authenticated login in the webui

2016-04-07 Thread Joerg Schad (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230921#comment-15230921
 ] 

Joerg Schad edited comment on MESOS-5027 at 4/7/16 8:30 PM:


I tested a number of browsers (Chrome, Firefox, Safari) and all of them 
generate a default login popup automatically (see screenshot).
After you enter your credentials, the browser reuses them for subsequent 
requests, so you only have to log in once per session (or until the 
browser-specific timeout > 15min is up).
Note: When accessing different agents, we have to authenticate with each agent 
individually.

Please note this only holds for Basic Auth.
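
For reference, the credentials a browser resends under Basic Auth amount to a 
single request header. A minimal sketch, assuming a hypothetical helper 
{{basic_auth_header}} (not part of Mesos or the webui) and made-up 
credentials:

```python
import base64


def basic_auth_header(user, password):
    """Build the Authorization header used by HTTP Basic authentication."""
    # Basic Auth sends base64("user:password"); it is encoded, not encrypted.
    token = base64.b64encode(
        "{}:{}".format(user, password).encode("utf-8")
    ).decode("ascii")
    return {"Authorization": "Basic " + token}


print(basic_auth_header("Aladdin", "open sesame"))
```

Because the header is tied to the host it was entered for, a separate prompt 
per agent is consistent with the note above.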



was (Author: js84):
I tested a number of browser (chrome, firefox, safari) and all of them generate 
a default login popup automatically (see screenshot).
After adding your credentials, the browser will reuse that for subsequent 
requests so you only have to login once for a session (or until the browser 
specific timeout > 15min is up).
Please note this only holds for Basic Auth.


> Enable authenticated login in the webui
> ---
>
> Key: MESOS-5027
> URL: https://issues.apache.org/jira/browse/MESOS-5027
> Project: Mesos
>  Issue Type: Improvement
>  Components: master, security, webui
>Reporter: Greg Mann
>  Labels: mesosphere, security
> Fix For: 0.29.0
>
> Attachments: Screen Shot 2016-04-07 at 21.02.45.png
>
>
> The webui hits a number of endpoints to get the data that it displays: 
> {{/state}}, {{/metrics/snapshot}}, {{/files/browse}}, {{/files/read}}, and 
> maybe others? Once authentication is enabled on these endpoints, we need to 
> add a login prompt to the webui so that users can provide credentials.





[jira] [Comment Edited] (MESOS-5027) Enable authenticated login in the webui

2016-04-07 Thread Joerg Schad (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230921#comment-15230921
 ] 

Joerg Schad edited comment on MESOS-5027 at 4/7/16 7:53 PM:


I tested a number of browsers (Chrome, Firefox, Safari) and all of them 
generate a default login popup automatically (see screenshot).
After you enter your credentials, the browser reuses them for subsequent 
requests, so you only have to log in once per session (or until the 
browser-specific timeout> 15min is up).
Please note this only holds for Basic Auth.



was (Author: js84):
I tested a number of browser (chrome, firefox, safari) and all of them generate 
a default login popup automatically (see screenshot).
After adding your credentials, the browser will reuse that for subsequent 
requests so you only have to login once for a session (or until the timeout 
-browser specific > 15min- is up).
Please note this only holds for Basic Auth.


> Enable authenticated login in the webui
> ---
>
> Key: MESOS-5027
> URL: https://issues.apache.org/jira/browse/MESOS-5027
> Project: Mesos
>  Issue Type: Improvement
>  Components: master, security, webui
>Reporter: Greg Mann
>  Labels: mesosphere, security
> Fix For: 0.29.0
>
> Attachments: Screen Shot 2016-04-07 at 21.02.45.png
>
>
> The webui hits a number of endpoints to get the data that it displays: 
> {{/state}}, {{/metrics/snapshot}}, {{/files/browse}}, {{/files/read}}, and 
> maybe others? Once authentication is enabled on these endpoints, we need to 
> add a login prompt to the webui so that users can provide credentials.





[jira] [Comment Edited] (MESOS-5027) Enable authenticated login in the webui

2016-04-07 Thread Joerg Schad (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230921#comment-15230921
 ] 

Joerg Schad edited comment on MESOS-5027 at 4/7/16 7:53 PM:


I tested a number of browsers (Chrome, Firefox, Safari) and all of them 
generate a default login popup automatically (see screenshot).
After you enter your credentials, the browser reuses them for subsequent 
requests, so you only have to log in once per session (or until the 
browser-specific timeout > 15min is up).
Please note this only holds for Basic Auth.



was (Author: js84):
I tested a number of browser (chrome, firefox, safari) and all of them generate 
a default login popup automatically (see screenshot).
After adding your credentials, the browser will reuse that for subsequent 
requests so you only have to login once for a session (or until the browser 
specific timeout> 15min is up).
Please note this only holds for Basic Auth.


> Enable authenticated login in the webui
> ---
>
> Key: MESOS-5027
> URL: https://issues.apache.org/jira/browse/MESOS-5027
> Project: Mesos
>  Issue Type: Improvement
>  Components: master, security, webui
>Reporter: Greg Mann
>  Labels: mesosphere, security
> Fix For: 0.29.0
>
> Attachments: Screen Shot 2016-04-07 at 21.02.45.png
>
>
> The webui hits a number of endpoints to get the data that it displays: 
> {{/state}}, {{/metrics/snapshot}}, {{/files/browse}}, {{/files/read}}, and 
> maybe others? Once authentication is enabled on these endpoints, we need to 
> add a login prompt to the webui so that users can provide credentials.





[jira] [Comment Edited] (MESOS-5027) Enable authenticated login in the webui

2016-04-07 Thread Joerg Schad (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230921#comment-15230921
 ] 

Joerg Schad edited comment on MESOS-5027 at 4/7/16 7:53 PM:


I tested a number of browsers (Chrome, Firefox, Safari), and all of them 
generate a default login popup automatically (see screenshot).
After you enter your credentials, the browser reuses them for subsequent 
requests, so you only have to log in once per session (or until the timeout, 
which is browser-specific and > 15min, is up).
Please note this only holds for Basic Auth.



was (Author: js84):
I tested a number of browsers (Chrome, Firefox, Safari), and all of them 
generate a default login popup automatically (see screenshot).
After you enter your credentials, the browser reuses them for subsequent 
requests, so you only have to log in once per session (or until the timeout, 
which is browser-specific and > 15min, is up).


> Enable authenticated login in the webui
> ---
>
> Key: MESOS-5027
> URL: https://issues.apache.org/jira/browse/MESOS-5027
> Project: Mesos
>  Issue Type: Improvement
>  Components: master, security, webui
>Reporter: Greg Mann
>  Labels: mesosphere, security
> Fix For: 0.29.0
>
> Attachments: Screen Shot 2016-04-07 at 21.02.45.png
>
>
> The webui hits a number of endpoints to get the data that it displays: 
> {{/state}}, {{/metrics/snapshot}}, {{/files/browse}}, {{/files/read}}, and 
> maybe others? Once authentication is enabled on these endpoints, we need to 
> add a login prompt to the webui so that users can provide credentials.





[jira] [Commented] (MESOS-5027) Enable authenticated login in the webui

2016-04-07 Thread Joerg Schad (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230921#comment-15230921
 ] 

Joerg Schad commented on MESOS-5027:


I tested a number of browsers (Chrome, Firefox, Safari), and all of them 
generate a default login popup automatically (see screenshot).
After you enter your credentials, the browser reuses them for subsequent 
requests, so you only have to log in once per session (or until the timeout, 
which is browser-specific and > 15min, is up).


> Enable authenticated login in the webui
> ---
>
> Key: MESOS-5027
> URL: https://issues.apache.org/jira/browse/MESOS-5027
> Project: Mesos
>  Issue Type: Improvement
>  Components: master, security, webui
>Reporter: Greg Mann
>  Labels: mesosphere, security
> Fix For: 0.29.0
>
> Attachments: Screen Shot 2016-04-07 at 21.02.45.png
>
>
> The webui hits a number of endpoints to get the data that it displays: 
> {{/state}}, {{/metrics/snapshot}}, {{/files/browse}}, {{/files/read}}, and 
> maybe others? Once authentication is enabled on these endpoints, we need to 
> add a login prompt to the webui so that users can provide credentials.





[jira] [Updated] (MESOS-5027) Enable authenticated login in the webui

2016-04-07 Thread Joerg Schad (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joerg Schad updated MESOS-5027:
---
Attachment: Screen Shot 2016-04-07 at 21.02.45.png

Screenshot of automatic login window.

> Enable authenticated login in the webui
> ---
>
> Key: MESOS-5027
> URL: https://issues.apache.org/jira/browse/MESOS-5027
> Project: Mesos
>  Issue Type: Improvement
>  Components: master, security, webui
>Reporter: Greg Mann
>  Labels: mesosphere, security
> Fix For: 0.29.0
>
> Attachments: Screen Shot 2016-04-07 at 21.02.45.png
>
>
> The webui hits a number of endpoints to get the data that it displays: 
> {{/state}}, {{/metrics/snapshot}}, {{/files/browse}}, {{/files/read}}, and 
> maybe others? Once authentication is enabled on these endpoints, we need to 
> add a login prompt to the webui so that users can provide credentials.





[jira] [Commented] (MESOS-4760) Expose metrics and gauges for fetcher cache usage and hit rate

2016-04-07 Thread Bernd Mathiske (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230864#comment-15230864
 ] 

Bernd Mathiske commented on MESOS-4760:
---

Alright - let's do this! :-) Thanks!

> Expose metrics and gauges for fetcher cache usage and hit rate
> --
>
> Key: MESOS-4760
> URL: https://issues.apache.org/jira/browse/MESOS-4760
> Project: Mesos
>  Issue Type: Improvement
>  Components: fetcher, statistics
>Reporter: Michael Browning
>Priority: Minor
>  Labels: features, fetcher, statistics, uber
>
> To evaluate the fetcher cache and calibrate the value of the 
> fetcher_cache_size flag, it would be useful to have metrics and gauges on 
> agents that expose operational statistics like cache hit rate, occupied cache 
> size, and time spent downloading resources that were not present.
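
As a rough illustration of the statistics named in the description (hit rate, occupied cache size, time spent downloading on a miss), here is a small Python sketch. The class and field names are hypothetical and do not correspond to existing Mesos metric keys:

```python
from dataclasses import dataclass


@dataclass
class FetcherCacheStats:
    """Hypothetical counters; the names are illustrative, not Mesos metrics."""
    hits: int = 0
    misses: int = 0
    occupied_bytes: int = 0
    download_seconds: float = 0.0

    def record_hit(self) -> None:
        self.hits += 1

    def record_miss(self, size_bytes: int, seconds: float) -> None:
        # A miss means the resource was downloaded and then added to the cache.
        self.misses += 1
        self.occupied_bytes += size_bytes
        self.download_seconds += seconds

    def hit_rate(self) -> float:
        # Fraction of lookups served from the cache; 0.0 before any lookup.
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

Comparing occupied_bytes against the configured fetcher_cache_size alongside the hit rate is the kind of signal that would help calibrate the flag.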





[jira] [Commented] (MESOS-5141) Framework (CodeFuturesExampleFramework-1) at scheduler already subscribed, resending acknowledgement

2016-04-07 Thread Joseph Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230830#comment-15230830
 ] 

Joseph Wu commented on MESOS-5141:
--

In your case, not being able to register after the framework's IP has changed 
is expected.  I'm guessing this is what happened:
1) Your framework started, and libprocess detected the framework's IP as 
{{192.168.13.159}}.
2) {{192.168.13.159}} was sent to the Mesos master, which opens up a persistent 
socket.  This socket is how the master communicates with your framework.
3) Your framework's network changes to {{192.168.1.103}}.  The framework didn't 
restart, so it still thinks its IP is {{192.168.13.159}}.  (In general, we 
don't expect the hostname or IP to change while something is running.)
4) The persistent socket in #2 is broken because of the network change.  The 
Mesos master interprets this as your framework disconnecting.  Hence, the 
framework needs to register again to continue operating.
5) Your framework attempts to register with the old IP.  The master tries to 
respond to the old IP, but can't.

You can fix this by restarting your framework.
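
The mismatch in steps 3 to 5 can be checked from the scheduler address that appears in the master log ("...@192.168.13.159:37072"). Below is a minimal Python sketch, assuming the "id@ip:port" form seen in that log; scheduler_address and needs_restart are hypothetical helper names, not part of Mesos or libprocess:

```python
def scheduler_address(upid: str) -> tuple:
    """Split an 'id@ip:port' string, as seen in the master log, into (ip, port)."""
    _, _, address = upid.rpartition("@")
    ip, _, port = address.rpartition(":")
    return ip, int(port)


def needs_restart(upid: str, current_ip: str) -> bool:
    """True when the IP the framework advertised no longer matches its current IP."""
    advertised_ip, _ = scheduler_address(upid)
    return advertised_ip != current_ip
```

With the address from the log and the laptop's new IP (192.168.1.103), needs_restart returns True, which matches the advice above.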

> Framework  (CodeFuturesExampleFramework-1) at scheduler  already subscribed, 
> resending acknowledgement
> --
>
> Key: MESOS-5141
> URL: https://issues.apache.org/jira/browse/MESOS-5141
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.28.0
> Environment: ubuntu140.04
>  mesos master and slave are 0.28
>Reporter: inred
>
> I have a master running on 192.168.60.103 and slave1 running on 192.168.60.102.
> I wrote a framework on my laptop (192.168.13.159); it could register and 
> launch tasks successfully.
> But after my laptop's IP changed from 192.168.13.159 to 192.168.1.103,
> the framework can't register anymore.
> The master log:
> 0407 15:01:37.085171  1819 master.cpp:2231] Received SUBSCRIBE call for 
> framework 'CodeFuturesExampleFramework-1' at 
> scheduler-78b13c90-8b55-4ddd-a42e-1c8343532ba8@192.168.13.159:37072
> I0407 15:01:37.085450  1819 master.cpp:2302] Subscribing framework 
> CodeFuturesExampleFramework-1 with checkpointing disabled and capabilities [  
> ]
> I0407 15:01:37.085548  1819 master.cpp:2312] Framework 
> 8a791189-e940-4e2f-9c1e-2fb66a50191c- (CodeFuturesExampleFramework-1) at 
> scheduler-78b13c90-8b55-4ddd-a42e-1c8343532ba8@192.168.13.159:37072 already 
> subscribed, resending acknowledgement
> I0407 15:01:37.086802  1825 master.cpp:2231] Received SUBSCRIBE call for 
> framework 'CodeFuturesExampleFramework-1' at 
> scheduler-78b13c90-8b55-4ddd-a42e-1c8343532ba8@192.168.13.159:37072
> I0407 15:01:37.087117  1825 master.cpp:2302] Subscribing framework 
> CodeFuturesExampleFramework-1 with checkpointing disabled and capabilities [  
> ]
> I0407 15:01:37.087219  1825 master.cpp:2312] Framework 
> 8a791189-e940-4e2f-9c1e-2fb66a50191c- (CodeFuturesExampleFramework-1) at 
> scheduler-78b13c90-8b55-4ddd-a42e-1c8343532ba8@192.168.13.159:37072 already 
> subscribed, resending acknowledgement
> I0407 15:01:37.088713  1811 master.cpp:2231] Received SUBSCRIBE call for 
> framework 'CodeFuturesExampleFramework-1' at 
> scheduler-78b13c90-8b55-4ddd-a42e-1c8343532ba8@192.168.13.159:37072





[jira] [Commented] (MESOS-4760) Expose metrics and gauges for fetcher cache usage and hit rate

2016-04-07 Thread Michael Browning (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230831#comment-15230831
 ] 

Michael Browning commented on MESOS-4760:
-

Well, low-to-moderate -- I think it's difficult for users to make informed 
choices about fetcher cache configuration without any visibility into what it's 
buying them. FWIW, I've made some headway on this already, would you be able to 
shepherd?

> Expose metrics and gauges for fetcher cache usage and hit rate
> --
>
> Key: MESOS-4760
> URL: https://issues.apache.org/jira/browse/MESOS-4760
> Project: Mesos
>  Issue Type: Improvement
>  Components: fetcher, statistics
>Reporter: Michael Browning
>Priority: Minor
>  Labels: features, fetcher, statistics, uber
>
> To evaluate the fetcher cache and calibrate the value of the 
> fetcher_cache_size flag, it would be useful to have metrics and gauges on 
> agents that expose operational statistics like cache hit rate, occupied cache 
> size, and time spent downloading resources that were not present.





[jira] [Updated] (MESOS-3435) Add containerizer support for hyper

2016-04-07 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated MESOS-3435:

Description: As secure as a hypervisor, as fast and easy to use as Docker. This 
is Hyper. https://docs.hyper.sh/Introduction/what_is_hyper_.html We could 
implement this as a module once MESOS-3709 is finished.  (was: Hyper is a 
hypervisor-agnostic Docker engine; I hope Marathon can support 
it (https://github.com/mesosphere/marathon/issues/1815).
https://hyper.sh/

In an earlier discussion with Tim Chen about a possible implementation, he 
suggested first implementing the engine along the lines of 
mesos-src/docker/docker.hpp

)

> Add containerizer support for hyper
> ---
>
> Key: MESOS-3435
> URL: https://issues.apache.org/jira/browse/MESOS-3435
> Project: Mesos
>  Issue Type: Story
>Reporter: Deshi Xiao
>Assignee: haosdent
>
> As secure as a hypervisor, as fast and easy to use as Docker. This is Hyper. 
> https://docs.hyper.sh/Introduction/what_is_hyper_.html We could implement 
> this as a module once MESOS-3709 is finished.





[jira] [Updated] (MESOS-3435) Add containerizer support for hyper

2016-04-07 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated MESOS-3435:

Summary: Add containerizer support for hyper  (was: Add Hyper as Mesos 
Docker alike support)

> Add containerizer support for hyper
> ---
>
> Key: MESOS-3435
> URL: https://issues.apache.org/jira/browse/MESOS-3435
> Project: Mesos
>  Issue Type: Story
>Reporter: Deshi Xiao
>Assignee: haosdent
>
> Hyper is a hypervisor-agnostic Docker engine; I hope Marathon can support 
> it (https://github.com/mesosphere/marathon/issues/1815).
> https://hyper.sh/
> In an earlier discussion with Tim Chen about a possible implementation, he 
> suggested first implementing the engine along the lines of 
> mesos-src/docker/docker.hpp





[jira] [Commented] (MESOS-4891) Add a '/containers' endpoint to the agent to list all the active containers.

2016-04-07 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230547#comment-15230547
 ] 

Jie Yu commented on MESOS-4891:
---

Thanks for the reminder! [~guoger] Can you follow up on that? Thx!

> Add a '/containers' endpoint to the agent to list all the active containers.
> 
>
> Key: MESOS-4891
> URL: https://issues.apache.org/jira/browse/MESOS-4891
> Project: Mesos
>  Issue Type: Improvement
>  Components: slave
>Reporter: Jie Yu
>Assignee: Jay Guo
>  Labels: mesosphere
>
> This endpoint will be similar to /monitor/statistics.json endpoint, but it'll 
> also contain the 'container_status' about the container (see ContainerStatus 
> in mesos.proto). We'll eventually deprecate the /monitor/statistics.json 
> endpoint.





[jira] [Commented] (MESOS-4891) Add a '/containers' endpoint to the agent to list all the active containers.

2016-04-07 Thread Joerg Schad (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230528#comment-15230528
 ] 

Joerg Schad commented on MESOS-4891:


[~guoger] [~jieyu]: I believe you should also update the endpoint 
documentation, see 
https://github.com/apache/mesos/blob/master/support/generate-endpoint-help.py

> Add a '/containers' endpoint to the agent to list all the active containers.
> 
>
> Key: MESOS-4891
> URL: https://issues.apache.org/jira/browse/MESOS-4891
> Project: Mesos
>  Issue Type: Improvement
>  Components: slave
>Reporter: Jie Yu
>Assignee: Jay Guo
>  Labels: mesosphere
>
> This endpoint will be similar to /monitor/statistics.json endpoint, but it'll 
> also contain the 'container_status' about the container (see ContainerStatus 
> in mesos.proto). We'll eventually deprecate the /monitor/statistics.json 
> endpoint.





[jira] [Updated] (MESOS-3335) FlagsBase copy-ctor leads to dangling pointer

2016-04-07 Thread Neil Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway updated MESOS-3335:
---
Description: 
Per [#3328], ubsan detects the following problem:

[ RUN ] FaultToleranceTest.ReregisterCompletedFrameworks
/mesos/3rdparty/libprocess/3rdparty/stout/include/stout/flags/flags.hpp:303:25: 
runtime error: load of value 33, which is not a valid value for type 'bool'

I believe what is going on here is the following:
* The test calls StartMaster(), which does MesosTest::CreateMasterFlags()
* MesosTest::CreateMasterFlags() allocates a new master::Flags on the stack, 
which is subsequently copy-constructed back to StartMaster()
* The FlagsBase constructor is:
bq. {{FlagsBase() { add(&help, "help", "...", false); }}}
where "help" is a member variable -- i.e., it is allocated on the stack in this 
case.
* {{FlagsBase()::add}} captures {{&help}}, e.g.:
{noformat}
flag.stringify = [t1](const FlagsBase&) -> Option<std::string> {
  return stringify(*t1);
};
{noformat}
* The implicit copy constructor for FlagsBase is just going to copy the lambda 
above, i.e., the result of the copy constructor will have a lambda that points 
into MesosTest::CreateMasterFlags()'s stack frame, which is bad news.

Not sure the right fix -- comments welcome. You could define a copy-ctor for 
FlagsBase that does something gross (basically remove the old help flag and 
define a new one that points into the target of the copy), but that seems, 
well, gross.

Probably not a pressing-problem to fix -- AFAICS worst symptom is that we end 
up reading one byte from some random stack location when serving 
{{state.json}}, for example.
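
The capture problem described in the bullets above can be mimicked in Python, with the caveat that this is only an analogy: the class below is a toy stand-in, not the stout implementation, and Python keeps the original object alive, so the copy reads a stale value rather than a dead stack slot as in C++.

```python
import copy


class FlagsBase:
    """Toy stand-in for the C++ class; not the stout implementation."""

    def __init__(self):
        self.help = False
        # The closure captures `self`, i.e. the object being constructed,
        # much like the C++ lambda captures a pointer to the `help` member.
        self.stringify = lambda: str(self.help)


original = FlagsBase()
shallow = copy.copy(original)  # copies the closure as-is, like the implicit copy-ctor
shallow.help = True            # change the copy's own member

stale = shallow.stringify()    # still reads original.help, not shallow.help
```

Here stale is "False" even though shallow.help is True. In the C++ case the original lives in a stack frame that has already returned, which is why ubsan reports an invalid bool value.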

  was:
Per [#3328], ubsan detects the following problem:

[ RUN ] FaultToleranceTest.ReregisterCompletedFrameworks
/mesos/3rdparty/libprocess/3rdparty/stout/include/stout/flags/flags.hpp:303:25: 
runtime error: load of value 33, which is not a valid value for type 'bool'

I believe what is going on here is the following:
* The test calls StartMaster(), which does MesosTest::CreateMasterFlags()
* MesosTest::CreateMasterFlags() allocates a new master::Flags on the stack, 
which is subsequently copy-constructed back to StartMaster()
* The FlagsBase constructor is:
bq. {{FlagsBase() { add(&help, "help", "...", false); }}}
where "help" is a member variable -- i.e., it is allocated on the stack in this 
case.
* {{FlagsBase()::add}} captures {{&help}}, e.g.:
{noformat}
flag.stringify = [t1](const FlagsBase&) -> Option<std::string> {
  return stringify(*t1);
};
{noformat}
* The implicit copy constructor for FlagsBase is just going to copy the lambda 
above, i.e., the result of the copy constructor will have a lambda that points 
into MesosTest::CreateMasterFlags()'s stack frame, which is bad news.

Not sure the right fix -- comments welcome. You could define a copy-ctor for 
FlagsBase that does something gross (basically remove the old help flag and 
define a new one that points into the target of the copy), but that seems less, 
well, gross.

Probably not a pressing-problem to fix -- AFAICS worst symptom is that we end 
up reading one byte from some random stack location when serving 
{{state.json}}, for example.


> FlagsBase copy-ctor leads to dangling pointer
> -
>
> Key: MESOS-3335
> URL: https://issues.apache.org/jira/browse/MESOS-3335
> Project: Mesos
>  Issue Type: Bug
>Reporter: Neil Conway
>Priority: Minor
> Attachments: lambda_capture_bug.cpp
>
>
> Per [#3328], ubsan detects the following problem:
> [ RUN ] FaultToleranceTest.ReregisterCompletedFrameworks
> /mesos/3rdparty/libprocess/3rdparty/stout/include/stout/flags/flags.hpp:303:25:
>  runtime error: load of value 33, which is not a valid value for type 'bool'
> I believe what is going on here is the following:
> * The test calls StartMaster(), which does MesosTest::CreateMasterFlags()
> * MesosTest::CreateMasterFlags() allocates a new master::Flags on the stack, 
> which is subsequently copy-constructed back to StartMaster()
> * The FlagsBase constructor is:
> bq. {{FlagsBase() { add(&help, "help", "...", false); }}}
> where "help" is a member variable -- i.e., it is allocated on the stack in 
> this case.
> * {{FlagsBase()::add}} captures {{&help}}, e.g.:
> {noformat}
> flag.stringify = [t1](const FlagsBase&) -> Option<std::string> {
>   return stringify(*t1);
> };
> {noformat}
> * The implicit copy constructor for FlagsBase is just going to copy the 
> lambda above, i.e., the result of the copy constructor will have a lambda 
> that points into MesosTest::CreateMasterFlags()'s stack frame, which is bad 
> news.
> Not sure the right fix -- comments welcome. You could define a copy-ctor for 
> FlagsBase that does something gross (basically remove the old help flag and 
> define a new one that points into the target of the copy), but that seems, 
> well, gross.
> Probably not 

[jira] [Updated] (MESOS-5073) Mesos allocator leaks role sorter and quota role sorters

2016-04-07 Thread Benjamin Bannier (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Bannier updated MESOS-5073:

Sprint: Mesosphere Sprint 32

> Mesos allocator leaks role sorter and quota role sorters
> 
>
> Key: MESOS-5073
> URL: https://issues.apache.org/jira/browse/MESOS-5073
> Project: Mesos
>  Issue Type: Improvement
>  Components: allocation
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>  Labels: tech-debt
> Fix For: 0.29.0
>
>
> The Mesos allocator {{internal::HierarchicalAllocatorProcess}} owns two raw 
> pointer members {{roleSorter}} and {{quotaRoleSorter}}, but fails to properly 
> manage their lifetime; they are e.g., not cleaned up in the allocator process 
> destructor.
> Since currently we do not recreate an existing allocator in production code 
> they seem to be unaffected by these leaks; they do affect tests though where 
> we create allocators multiple times.





[jira] [Commented] (MESOS-2717) Qemu/KVM containerizer

2016-04-07 Thread Abhishek Dasgupta (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230160#comment-15230160
 ] 

Abhishek Dasgupta commented on MESOS-2717:
--

Hi,

This is a draft design document for qemu/kvm containerizer: 
https://docs.google.com/document/d/1_VuFiJqxjlH_CA1BCMknl3sadlTZ69FuDe7qasDIOk0/edit?usp=sharing

Please feel free to comment on it.

ping [~tnachen][~idownes][~tillt]

> Qemu/KVM containerizer
> --
>
> Key: MESOS-2717
> URL: https://issues.apache.org/jira/browse/MESOS-2717
> Project: Mesos
>  Issue Type: Wish
>  Components: containerization
>Reporter: Pierre-Yves Ritschard
>Assignee: Abhishek Dasgupta
>
> I think it would make sense for Mesos to have the ability to treat 
> hypervisors as containerizers and the most sensible one to start with would 
> probably be Qemu/KVM.
> There are a few workloads that can require full-fledged VMs (the most obvious 
> one being Windows workloads).
> The containerization code is well decoupled and seems simple enough; I can 
> definitely take a shot at it. VMs do bring some questions with them; here is 
> my take on them:
> 1. Routing, network strategy
> ==
> The simplest approach here might very well be to go for bridged networks
> and leave the setup and inter slave routing up to the administrator
> 2. IP Address assignment
> 
> At first, it can be up to the Frameworks to deal with IP assignment.
> The simplest way to address this could be to have an executor running
> on slaves providing the qemu/kvm containerizer which would instrument a DHCP 
> server and collect IP + Mac address resources from slaves. While it may be up 
> to the frameworks to provide this, an example should most likely be provided.
> 3. VM Templates
> ==
> VM templates should probably leverage the fetcher and could thus be copied 
> locally or fetched from HTTP(S) / HDFS.
> 4. Resource limiting
> 
> Mapping resource constraints to the qemu command line is probably the easiest 
> part. Additional command-line arguments should also be fetchable. For Unix 
> VMs, the sandbox could show the output of the serial console.
> 5. Libvirt / plain Qemu
> =
> I tend to favor limiting the amount of necessary hoops to jump through and 
> would thus investigate working directly with Qemu, maintaining an open 
> connection to the monitor to assert status.
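
Point 4 (mapping resource constraints to the qemu command line) could look roughly like the Python sketch below. It is an assumption-laden illustration and not part of the proposal or the design document: build_qemu_args is a hypothetical helper, the rounding policy for fractional CPUs is a guess, and only the well-known qemu options -enable-kvm, -smp, -m, -drive and -nographic are used.

```python
import math


def build_qemu_args(cpus: float, mem_mb: int, disk_image: str) -> list:
    """Translate task resources into a qemu command line (illustrative only)."""
    # qemu needs a whole number of vCPUs, so fractional cpus are rounded up.
    vcpus = max(1, math.ceil(cpus))
    return [
        "qemu-system-x86_64",
        "-enable-kvm",             # use KVM acceleration
        "-smp", str(vcpus),        # number of virtual CPUs
        "-m", str(mem_mb),         # guest memory, in MiB by default
        "-drive", f"file={disk_image},format=qcow2",
        "-nographic",              # no graphical output; serial console on stdio
    ]
```

Rounding up means a task asking for 0.5 cpus still gets one full vCPU, so enforcing the fractional share would need a separate mechanism such as cgroups on the qemu process.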





[jira] [Updated] (MESOS-5129) Supporting Container Images in Mesos Containerizer doesn't work

2016-04-07 Thread wangqun (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangqun updated MESOS-5129:
---
Description: 
Hi
I am trying to test the Supporting Container Images in Mesos Containerizer 
feature according to 
https://github.com/apache/mesos/blob/master/docs/container-image.md#test-it-out
but it doesn't work. 

I use Mesos version 0.29.
These are my steps:
1) sudo bin/mesos-master.sh --log_dir=/var/log/mesos --ip=9.5.124.139 
--work_dir=/tmp/mesos/master
2) sudo bin/mesos-slave.sh --master=9.5.124.139:5050 --ip=9.5.124.139 
--hostname=mesos --isolation=docker/runtime,filesystem/linux  
--work_dir=/tmp/mesos/slave --log_dir=/var/log/mesos --image_providers=docker 
--executor_environment_variables="{}"
3)sudo src/mesos-execute --master=9.5.124.139:5050 --name=test 
--docker_image=library/redis  --shell=false
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0406 03:33:05.730432  5886 scheduler.cpp:157] 
**
Scheduler driver bound to loopback interface! Cannot communicate with remote 
master(s). You might want to set 'LIBPROCESS_IP' environment variable to use a 
routable IP address.
**
I0406 03:33:05.730623  5886 scheduler.cpp:172] Version: 0.29.0
Subscribed with ID '79b6ed58-46a9-4760-a589-a28061f4f1e9-
task test submitted to agent 7184bc3a-243c-4ca7-8897-c98e81836ed6-S1
Received status update TASK_RUNNING for task test

4) sudo vim lt-mesos-slave.mesos.root.log.ERROR
Command 'hadoop version 2>&1' failed; this is the output:
sh: 1: hadoop: not found
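
The scheduler warning in step 3 suggests setting LIBPROCESS_IP. A minimal Python sketch of launching the same command with that variable set is below; mesos_execute_invocation is a hypothetical helper, and using 9.5.124.139 as the routable address is an assumption based on the --ip value in the steps above:

```python
import os


def mesos_execute_invocation(master: str, routable_ip: str) -> tuple:
    """Return (argv, env) for mesos-execute with LIBPROCESS_IP set."""
    env = dict(os.environ)
    # The warning suggests LIBPROCESS_IP so that the scheduler driver
    # does not bind to the loopback interface.
    env["LIBPROCESS_IP"] = routable_ip
    argv = [
        "src/mesos-execute",
        f"--master={master}",
        "--name=test",
        "--docker_image=library/redis",
        "--shell=false",
    ]
    return argv, env
```

The pair can be passed to subprocess.run(argv, env=env). When the command is run through sudo, the environment may be reset depending on the sudoers configuration, so the variable may have to be set on the sudo command line instead.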




  was:
Hi
I am trying to test the Supporting Container Images in Mesos Containerizer 
feature according to 
https://github.com/apache/mesos/blob/master/docs/container-image.md#test-it-out
but it doesn't work. 

I use Mesos version 0.29.
These are my steps:
1) sudo bin/mesos-master.sh --log_dir=/var/log/mesos --ip=9.5.124.139 
--work_dir=/tmp/mesos/master
2) sudo bin/mesos-slave.sh --master=9.5.124.139:5050 --ip=9.5.124.139 
--hostname=mesos --isolation=docker/runtime,filesystem/linux  
--work_dir=/tmp/mesos/slave --log_dir=/var/log/mesos --image_providers=docker 
--executor_environment_variables="{}"
3)sudo src/mesos-execute --master=9.5.124.139:5050 --name=test 
--docker_image=library/redis  --shell=false
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0406 03:33:05.730432  5886 scheduler.cpp:157] 
**
Scheduler driver bound to loopback interface! Cannot communicate with remote 
master(s). You might want to set 'LIBPROCESS_IP' environment variable to use a 
routable IP address.
**
I0406 03:33:05.730623  5886 scheduler.cpp:172] Version: 0.29.0
Subscribed with ID '79b6ed58-46a9-4760-a589-a28061f4f1e9-
task test submitted to agent 7184bc3a-243c-4ca7-8897-c98e81836ed6-S1
Received status update TASK_RUNNING for task test

4) sudo vim lt-mesos-slave.mesos.root.log.ERROR
Command 'hadoop version 2>&1' failed; this is the output:
sh: 1: hadoop: not found





> Supporting Container Images in Mesos Containerizer doesn't work
> ---
>
> Key: MESOS-5129
> URL: https://issues.apache.org/jira/browse/MESOS-5129
> Project: Mesos
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.29.0
>Reporter: wangqun
>
> Hi
> I am trying to test the Supporting Container Images in Mesos Containerizer 
> feature according to 
> https://github.com/apache/mesos/blob/master/docs/container-image.md#test-it-out
> but it doesn't work. 
> I use Mesos version 0.29.
> These are my steps:
> 1) sudo bin/mesos-master.sh --log_dir=/var/log/mesos --ip=9.5.124.139 
> --work_dir=/tmp/mesos/master
> 2) sudo bin/mesos-slave.sh --master=9.5.124.139:5050 --ip=9.5.124.139 
> --hostname=mesos --isolation=docker/runtime,filesystem/linux  
> --work_dir=/tmp/mesos/slave --log_dir=/var/log/mesos --image_providers=docker 
> --executor_environment_variables="{}"
> 3)sudo src/mesos-execute --master=9.5.124.139:5050 --name=test 
> --docker_image=library/redis  --shell=false
> WARNING: Logging before InitGoogleLogging() is written to STDERR
> W0406 03:33:05.730432  5886 scheduler.cpp:157] 
> **
> Scheduler driver bound to loopback interface! Cannot communicate with remote 
> master(s). You might want to set 'LIBPROCESS_IP' environment variable to use 
> a routable IP address.
> **
> I0406 03:33:05.730623  5886 scheduler.cpp:172] Version: 0.29.0
> Subscribed with ID '79b6ed58-46a9-4760-a589-a28061f4f1e9-
> task test submitted to agent 7184bc3a-243c-4ca7-8897-c98e81836ed6-S1
> Received status update 

[jira] [Created] (MESOS-5142) Add agent flags for HTTP authorization

2016-04-07 Thread Jan Schlicht (JIRA)
Jan Schlicht created MESOS-5142:
---

 Summary: Add agent flags for HTTP authorization
 Key: MESOS-5142
 URL: https://issues.apache.org/jira/browse/MESOS-5142
 Project: Mesos
  Issue Type: Bug
  Components: security, slave
Reporter: Jan Schlicht
Assignee: Jan Schlicht


Flags should be added to the agent to:
1. Enable authorization ({{--authorizers}})
2. Provide ACLs ({{--acls}})





[jira] [Updated] (MESOS-4902) Add authentication to remaining agent endpoints

2016-04-07 Thread Jan Schlicht (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Schlicht updated MESOS-4902:

Assignee: (was: Jan Schlicht)

> Add authentication to remaining agent endpoints
> ---
>
> Key: MESOS-4902
> URL: https://issues.apache.org/jira/browse/MESOS-4902
> Project: Mesos
>  Issue Type: Improvement
>  Components: HTTP API
>Reporter: Greg Mann
>  Labels: authentication, http, mesosphere, security
>
> In addition to the endpoints addressed by MESOS-4850 and MESOS-4951, the 
> following endpoints would also benefit from HTTP authentication:
> * {{/profiler/*}}
> * {{/logging/toggle}}
> * {{/metrics/snapshot}}
> * {{/monitor/statistics}}
> * {{/system/stats.json}}
> Adding HTTP authentication to these endpoints is a bit more complicated: some 
> endpoints are defined at the libprocess level, while others are defined in 
> code that is shared by the master and agent.
> While working on MESOS-4850, it became apparent that since our tests use the 
> same instance of libprocess for both master and agent, different default 
> authentication realms must be used for master/agent so that HTTP 
> authentication can be independently enabled/disabled for each.
> We should establish a mechanism for making an endpoint authenticated that 
> allows us to:
> 1) Install an endpoint like {{/files}}, whose code is shared by the master 
> and agent, with different authentication realms for the master and agent
> 2) Avoid hard-coding a default authentication realm into libprocess, to 
> permit the use of different authentication realms for the master and agent 
> and to keep application-level concerns from leaking into libprocess
> Another option would be to use a single default authentication realm and 
> always enable or disable HTTP authentication for *both* the master and agent 
> in tests. However, this wouldn't allow us to test scenarios where HTTP 
> authentication is enabled on one but disabled on the other.





[jira] [Commented] (MESOS-5141) Framework (CodeFuturesExampleFramework-1) at scheduler already subscribed, resending acknowledgement

2016-04-07 Thread inred (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15229837#comment-15229837
 ] 

inred commented on MESOS-5141:
--

ano@master:~$ env |grep MESOS
MESOS_ip=192.168.60.103
MESOS_log_dir=/var/log/mesos/master
MESOS_work_dir=/var/lib/mesos/master


ano@slave1:~$ env|grep MESOS
MESOS_isolation=cgroups/cpu,cgroups/mem
MESOS_ip=192.168.60.102
MESOS_container_logger=org_apache_mesos_LogrotateContainerLogger
MESOS_containerizers=docker,mesos
MESOS_log_dir=/var/log/mesos/slave
MESOS_logrotate_path=/var/log/mesos/rotate
MESOS_hostname=slave1
MESOS_master=master:5050
MESOS_work_dir=/var/lib/mesos/slave

> Framework  (CodeFuturesExampleFramework-1) at scheduler  already subscribed, 
> resending acknowledgement
> --
>
> Key: MESOS-5141
> URL: https://issues.apache.org/jira/browse/MESOS-5141
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.28.0
> Environment: ubuntu140.04
>  mesos master and slave are 0.28
>Reporter: inred
>
> I have a master running on 192.168.60.103 and slave1 running on 192.168.60.102.
> I wrote a framework on my laptop (192.168.13.159); it could register and 
> launch tasks successfully.
> But after my laptop's IP changed from 192.168.13.159 to 192.168.1.103,
> the framework can't register anymore.
> The master log:
> 0407 15:01:37.085171  1819 master.cpp:2231] Received SUBSCRIBE call for 
> framework 'CodeFuturesExampleFramework-1' at 
> scheduler-78b13c90-8b55-4ddd-a42e-1c8343532ba8@192.168.13.159:37072
> I0407 15:01:37.085450  1819 master.cpp:2302] Subscribing framework 
> CodeFuturesExampleFramework-1 with checkpointing disabled and capabilities [  
> ]
> I0407 15:01:37.085548  1819 master.cpp:2312] Framework 
> 8a791189-e940-4e2f-9c1e-2fb66a50191c- (CodeFuturesExampleFramework-1) at 
> scheduler-78b13c90-8b55-4ddd-a42e-1c8343532ba8@192.168.13.159:37072 already 
> subscribed, resending acknowledgement
> I0407 15:01:37.086802  1825 master.cpp:2231] Received SUBSCRIBE call for 
> framework 'CodeFuturesExampleFramework-1' at 
> scheduler-78b13c90-8b55-4ddd-a42e-1c8343532ba8@192.168.13.159:37072
> I0407 15:01:37.087117  1825 master.cpp:2302] Subscribing framework 
> CodeFuturesExampleFramework-1 with checkpointing disabled and capabilities [  
> ]
> I0407 15:01:37.087219  1825 master.cpp:2312] Framework 
> 8a791189-e940-4e2f-9c1e-2fb66a50191c- (CodeFuturesExampleFramework-1) at 
> scheduler-78b13c90-8b55-4ddd-a42e-1c8343532ba8@192.168.13.159:37072 already 
> subscribed, resending acknowledgement
> I0407 15:01:37.088713  1811 master.cpp:2231] Received SUBSCRIBE call for 
> framework 'CodeFuturesExampleFramework-1' at 
> scheduler-78b13c90-8b55-4ddd-a42e-1c8343532ba8@192.168.13.159:37072





[jira] [Created] (MESOS-5141) Framework (CodeFuturesExampleFramework-1) at scheduler already subscribed, resending acknowledgement

2016-04-07 Thread inred (JIRA)
inred created MESOS-5141:


         Summary: Framework (CodeFuturesExampleFramework-1) at scheduler already subscribed, resending acknowledgement
             Key: MESOS-5141
             URL: https://issues.apache.org/jira/browse/MESOS-5141
         Project: Mesos
      Issue Type: Bug
      Components: master
Affects Versions: 0.28.0
     Environment: Ubuntu 14.04; Mesos master and slave are both 0.28
        Reporter: inred


I have a master running on 192.168.60.103 and slave1 running on 192.168.60.102.

I wrote a framework on my laptop (192.168.13.159); it registered and launched 
tasks successfully.

But after my laptop's IP changed from 192.168.13.159 to 192.168.1.103, the 
framework can no longer register.

The master log:

0407 15:01:37.085171  1819 master.cpp:2231] Received SUBSCRIBE call for 
framework 'CodeFuturesExampleFramework-1' at 
scheduler-78b13c90-8b55-4ddd-a42e-1c8343532ba8@192.168.13.159:37072
I0407 15:01:37.085450  1819 master.cpp:2302] Subscribing framework 
CodeFuturesExampleFramework-1 with checkpointing disabled and capabilities [  ]
I0407 15:01:37.085548  1819 master.cpp:2312] Framework 
8a791189-e940-4e2f-9c1e-2fb66a50191c- (CodeFuturesExampleFramework-1) at 
scheduler-78b13c90-8b55-4ddd-a42e-1c8343532ba8@192.168.13.159:37072 already 
subscribed, resending acknowledgement
I0407 15:01:37.086802  1825 master.cpp:2231] Received SUBSCRIBE call for 
framework 'CodeFuturesExampleFramework-1' at 
scheduler-78b13c90-8b55-4ddd-a42e-1c8343532ba8@192.168.13.159:37072
I0407 15:01:37.087117  1825 master.cpp:2302] Subscribing framework 
CodeFuturesExampleFramework-1 with checkpointing disabled and capabilities [  ]
I0407 15:01:37.087219  1825 master.cpp:2312] Framework 
8a791189-e940-4e2f-9c1e-2fb66a50191c- (CodeFuturesExampleFramework-1) at 
scheduler-78b13c90-8b55-4ddd-a42e-1c8343532ba8@192.168.13.159:37072 already 
subscribed, resending acknowledgement
I0407 15:01:37.088713  1811 master.cpp:2231] Received SUBSCRIBE call for 
framework 'CodeFuturesExampleFramework-1' at 
scheduler-78b13c90-8b55-4ddd-a42e-1c8343532ba8@192.168.13.159:37072
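
The log excerpt shows the scheduler at 192.168.13.159, while the report says the laptop has moved to 192.168.1.103. One way to check which addresses a master log actually records is to extract the address part of each scheduler PID. The following is a minimal sketch under that assumption; the function names and the regular expression are illustrative, and the thread does not establish that an address mismatch is the cause.

```python
import re

# Matches the address at the end of a scheduler PID, for example
# "scheduler-<uuid>@192.168.13.159:37072".
SCHEDULER_PID = re.compile(
    r"scheduler-[0-9a-f-]+@(\d{1,3}(?:\.\d{1,3}){3}):(\d+)"
)


def scheduler_addresses(log_text):
    """Return the set of (ip, port) pairs found in scheduler PIDs."""
    return {(ip, int(port)) for ip, port in SCHEDULER_PID.findall(log_text)}


def stale_addresses(log_text, current_ip):
    """Return logged scheduler addresses whose IP differs from current_ip."""
    return sorted(
        addr for addr in scheduler_addresses(log_text) if addr[0] != current_ip
    )


sample_log = (
    "I0407 15:01:37.086802  1825 master.cpp:2231] Received SUBSCRIBE call for "
    "framework 'CodeFuturesExampleFramework-1' at "
    "scheduler-78b13c90-8b55-4ddd-a42e-1c8343532ba8@192.168.13.159:37072\n"
)

print(stale_addresses(sample_log, current_ip="192.168.1.103"))
```

An empty result would mean every scheduler PID in the log already carries the current IP.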





[jira] [Commented] (MESOS-1837) failed to determine cgroup for the 'cpu' subsystem

2016-04-07 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15229832#comment-15229832
 ] 

haosdent commented on MESOS-1837:
-

Looks like it should be Docker. Could you show the output of:
{code}
$ cat /proc/self/mountinfo
{code}
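
For anyone reading that output, the lines of interest are the ones whose filesystem type is `cgroup`. The following is a minimal sketch that lists where a given cgroup v1 subsystem is mounted, assuming the standard mountinfo layout in which a ` - ` separator precedes the filesystem type, mount source, and super options. The sample text is made up for illustration and is not taken from the reporter's machine.

```python
def subsystem_mount_points(mountinfo_text, subsystem):
    """Return mount points of cgroup filesystems that carry `subsystem`.

    Each mountinfo line has the form:
      ID PARENT MAJ:MIN ROOT MOUNTPOINT OPTIONS [optional...] - FSTYPE SOURCE SUPEROPTS
    """
    points = []
    for line in mountinfo_text.splitlines():
        left, sep, right = line.partition(" - ")
        if not sep:
            continue  # not a well-formed mountinfo line
        left_fields = left.split()
        right_fields = right.split()
        if len(left_fields) < 5 or len(right_fields) < 3:
            continue
        fstype, super_options = right_fields[0], right_fields[2]
        if fstype == "cgroup" and subsystem in super_options.split(","):
            points.append(left_fields[4])
    return points


# Hypothetical sample, shaped like typical mountinfo lines.
sample = """\
19 1 8:1 / / rw,relatime shared:1 - ext4 /dev/sda1 rw,data=ordered
25 20 0:21 / /sys/fs/cgroup/cpu,cpuacct rw,nosuid,nodev,noexec,relatime shared:10 - cgroup cgroup rw,cpu,cpuacct
26 20 0:22 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:11 - cgroup cgroup rw,memory
"""

print(subsystem_mount_points(sample, "cpu"))
```

To run it against a live system, pass the contents of `/proc/self/mountinfo` instead of `sample`.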

> failed to determine cgroup for the 'cpu' subsystem
> --
>
> Key: MESOS-1837
> URL: https://issues.apache.org/jira/browse/MESOS-1837
> Project: Mesos
> Issue Type: Bug
> Components: docker
> Affects Versions: 0.20.1
> Environment: Ubuntu 14.04
> Reporter: Chris Fortier
> Assignee: Timothy Chen
>
> Attempting to launch Docker container with Marathon. Container is launched 
> then fails. 
> A search of /var/log/syslog reveals:
> Sep 27 03:01:43 vagrant-ubuntu-trusty-64 mesos-slave[1409]: E0927 
> 03:01:43.546957  1463 slave.cpp:2205] Failed to update resources for 
> container 8c2429d9-f090-4443-8108-0206ca37f3fd of executor 
> hello-world.970dbe74-45f2-11e4-8b1d-56847afe9799 running task 
> hello-world.970dbe74-45f2-11e4-8b1d-56847afe9799 on status update for 
> terminal task, destroying container: Failed to determine cgroup for the 'cpu' 
> subsystem: Failed to read /proc/9792/cgroup: Failed to open file 
> '/proc/9792/cgroup': No such file or directory
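
The error above comes from reading `/proc/<pid>/cgroup`, whose lines have the form `hierarchy-ID:controller-list:path`. The following is a minimal sketch of such a lookup that returns `None` when the file is missing, which would be expected if the process has already exited. It is an illustration only, not the code path Mesos uses, and the function names are hypothetical.

```python
def parse_proc_cgroup(text):
    """Map each controller name to its cgroup path.

    Input lines look like "4:cpu,cpuacct:/some/path".
    """
    result = {}
    for line in text.splitlines():
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue  # skip malformed lines
        _hierarchy_id, controllers, path = parts
        for controller in controllers.split(","):
            if controller:
                result[controller] = path
    return result


def cgroup_for(pid, subsystem):
    """Return the cgroup path of `pid` for `subsystem`, or None.

    None covers both "process is gone" and "subsystem not listed".
    """
    try:
        with open(f"/proc/{pid}/cgroup") as handle:
            text = handle.read()
    except (FileNotFoundError, ProcessLookupError):
        return None
    return parse_proc_cgroup(text).get(subsystem)


# Hypothetical file contents for illustration.
sample = "4:cpu,cpuacct:/docker/abc\n3:memory:/docker/abc\n"
print(parse_proc_cgroup(sample))
```

A caller that treats `None` as "container already gone" can skip the resource update rather than report a failure; whether that is the right behavior for the slave is a question for the issue itself.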





[jira] [Commented] (MESOS-1837) failed to determine cgroup for the 'cpu' subsystem

2016-04-07 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15229826#comment-15229826
 ] 

haosdent commented on MESOS-1837:
-

Hi, are you using Docker or the MesosContainerizer?






[jira] [Created] (MESOS-5140) Update CHANGELOG for XFS disk isolator

2016-04-07 Thread Yan Xu (JIRA)
Yan Xu created MESOS-5140:
-

         Summary: Update CHANGELOG for XFS disk isolator
             Key: MESOS-5140
             URL: https://issues.apache.org/jira/browse/MESOS-5140
         Project: Mesos
      Issue Type: Task
        Reporter: Yan Xu





