[jira] [Created] (MESOS-5450) Make authentication pluggable

2016-05-24 Thread Alex Clemmer (JIRA)
Alex Clemmer created MESOS-5450:
---

 Summary: Make authentication pluggable
 Key: MESOS-5450
 URL: https://issues.apache.org/jira/browse/MESOS-5450
 Project: Mesos
  Issue Type: Bug
  Components: slave
Reporter: Alex Clemmer
Assignee: Alex Clemmer


Right now there is a hard dependency on SASL, which probably won't work well on 
Windows (at least) in the near future for our use cases.

In the future, it would be nice to have a pluggable authentication layer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-5380) Killing a queued task can cause the corresponding command executor to never terminate.

2016-05-24 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299446#comment-15299446
 ] 

Vinod Kone commented on MESOS-5380:
---

Committed phase 2.

commit 8be9b5b5decd9ec2bcad547b1dff29b777cbc438
Author: Vinod Kone 
Date:   Sun May 15 12:31:31 2016 -0700

Fixed agent to properly handle killTask during agent restart.

If the agent restarts after handling killTask but before sending
shutdown message to the executor, we ensure the executor terminates.

Review: https://reviews.apache.org/r/47402


> Killing a queued task can cause the corresponding command executor to never 
> terminate.
> --
>
> Key: MESOS-5380
> URL: https://issues.apache.org/jira/browse/MESOS-5380
> Project: Mesos
>  Issue Type: Bug
>  Components: slave
>Affects Versions: 0.28.0, 0.28.1
>Reporter: Jie Yu
>Assignee: Vinod Kone
>Priority: Blocker
>  Labels: mesosphere
> Fix For: 0.29.0, 0.28.2
>
>
> We observed this in our testing environment. Sequence of events:
> 1) A command task is queued since the executor has not registered yet.
> 2) The framework issues a killTask.
> 3) Since executor is in REGISTERING state, agent calls 
> `statusUpdate(TASK_KILLED, UPID())`
> 4) `statusUpdate` now will call `containerizer->status()` before calling 
> `executor->terminateTask(status.task_id(), status);` which will remove the 
> queued task. (Introduced in this patch: https://reviews.apache.org/r/43258).
> 5) Since the above is async, it's possible that the task is still in the queued 
> tasks when we check whether we need to kill the unregistered executor in 
> `killTask`:
> {code}
>   // TODO(jieyu): Here, we kill the executor if it no longer has
>   // any task to run and has not yet registered. This is a
>   // workaround for those single task executors that do not have a
>   // proper self terminating logic when they haven't received the
>   // task within a timeout.
>   if (executor->queuedTasks.empty()) {
>     CHECK(executor->launchedTasks.empty())
>       << " Unregistered executor '" << executor->id
>       << "' has launched tasks";
>     LOG(WARNING) << "Killing the unregistered executor " << *executor
>                  << " because it has no tasks";
>     executor->state = Executor::TERMINATING;
>     containerizer->destroy(executor->containerId);
>   }
> {code}
> 6) Consequently, the executor will never be terminated by Mesos.
> Attaching the relevant agent log:
> {noformat}
> May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
> I0513 15:36:13.640527  1342 slave.cpp:1361] Got assigned task 
> mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6 for framework 
> a3ad8418-cb77-4705-b353-4b514ceca52c-
> May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
> I0513 15:36:13.641034  1342 slave.cpp:1480] Launching task 
> mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6 for framework 
> a3ad8418-cb77-4705-b353-4b514ceca52c-
> May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
> I0513 15:36:13.641440  1342 paths.cpp:528] Trying to chown 
> '/var/lib/mesos/slave/slaves/a3ad8418-cb77-4705-b353-4b514ceca52c-S0/frameworks/a3ad8418-cb77-4705-b353-4b514ceca52c-/executors/mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6/runs/24762d43-2134-475e-b724-caa72110497a'
>  to user 'root'
> May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
> I0513 15:36:13.644664  1342 slave.cpp:5389] Launching executor 
> mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6 of framework 
> a3ad8418-cb77-4705-b353-4b514ceca52c- with resources cpus(*):0.1; 
> mem(*):32 in work directory 
> '/var/lib/mesos/slave/slaves/a3ad8418-cb77-4705-b353-4b514ceca52c-S0/frameworks/a3ad8418-cb77-4705-b353-4b514ceca52c-/executors/mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6/runs/24762d43-2134-475e-b724-caa72110497a'
> May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
> I0513 15:36:13.645195  1342 slave.cpp:1698] Queuing task 
> 'mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6' for executor 
> 'mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6' of framework 
> a3ad8418-cb77-4705-b353-4b514ceca52c-
> May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
> I0513 15:36:13.645491  1338 containerizer.cpp:671] Starting container 
> '24762d43-2134-475e-b724-caa72110497a' for executor 
> 'mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6' of framework 
> 'a3ad8418-cb77-4705-b353-4b514ceca52c-'
> May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
> I0513 15:36:13.647897  1345 cpushare.cpp:389] Updated 'cpu.shares' to 1126 
> 

[jira] [Commented] (MESOS-5449) Memory leak in SchedulerProcess.declineOffer

2016-05-24 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299439#comment-15299439
 ] 

Vinod Kone commented on MESOS-5449:
---

commit 927bec15d94e40928180769300b239a7e6bb9d6f
Author: Dario Rexin 
Date:   Tue May 24 21:07:20 2016 -0700

Fixed a memory leak in SchedulerProcess.decline.

MesosScheduler.declineOffers has been changed ~6 months ago to send a
Decline message instead of calling acceptOffers with an empty list of
task infos. The changed version of declineOffer however did not remove
the offerId from the savedOffers map, causing a memory leak.

Review: https://reviews.apache.org/r/47804/


> Memory leak in SchedulerProcess.declineOffer
> 
>
> Key: MESOS-5449
> URL: https://issues.apache.org/jira/browse/MESOS-5449
> Project: Mesos
>  Issue Type: Bug
>  Components: scheduler driver
>Affects Versions: 0.26.0, 0.27.0, 0.27.1, 0.28.0, 0.26.1, 0.28.1
>Reporter: Dario Rexin
>Assignee: Dario Rexin
>Priority: Blocker
> Fix For: 0.29.0, 0.27.3, 0.28.2, 0.26.2
>
>
> MesosScheduler.declineOffers has been changed ~6 months ago to send a Decline 
> message instead of calling acceptOffers with an empty list of task infos. The 
> changed version of declineOffer however did not remove the offerId from the 
> savedOffers map, causing a memory leak.





[jira] [Updated] (MESOS-5449) Memory leak in SchedulerProcess.declineOffer

2016-05-24 Thread Joseph Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Wu updated MESOS-5449:
-
Shepherd: Vinod Kone

> Memory leak in SchedulerProcess.declineOffer
> 
>
> Key: MESOS-5449
> URL: https://issues.apache.org/jira/browse/MESOS-5449
> Project: Mesos
>  Issue Type: Bug
>  Components: scheduler driver
>Affects Versions: 0.26.0, 0.27.0, 0.27.1, 0.28.0, 0.26.1, 0.28.1
>Reporter: Dario Rexin
>Assignee: Dario Rexin
>Priority: Blocker
> Fix For: 0.29.0, 0.27.3, 0.28.2, 0.26.2
>
>
> MesosScheduler.declineOffers has been changed ~6 months ago to send a Decline 
> message instead of calling acceptOffers with an empty list of task infos. The 
> changed version of declineOffer however did not remove the offerId from the 
> savedOffers map, causing a memory leak.





[jira] [Updated] (MESOS-5449) Memory leak in SchedulerProcess.declineOffer

2016-05-24 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-5449:
--
Fix Version/s: 0.26.2
   0.28.2
   0.27.3
   0.29.0

> Memory leak in SchedulerProcess.declineOffer
> 
>
> Key: MESOS-5449
> URL: https://issues.apache.org/jira/browse/MESOS-5449
> Project: Mesos
>  Issue Type: Bug
>  Components: scheduler driver
>Affects Versions: 0.26.0, 0.27.0, 0.27.1, 0.28.0, 0.26.1, 0.28.1
>Reporter: Dario Rexin
>Assignee: Dario Rexin
>Priority: Blocker
> Fix For: 0.29.0, 0.27.3, 0.28.2, 0.26.2
>
>
> MesosScheduler.declineOffers has been changed ~6 months ago to send a Decline 
> message instead of calling acceptOffers with an empty list of task infos. The 
> changed version of declineOffer however did not remove the offerId from the 
> savedOffers map, causing a memory leak.





[jira] [Commented] (MESOS-5449) Memory leak in SchedulerProcess.declineOffer

2016-05-24 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299269#comment-15299269
 ] 

Jie Yu commented on MESOS-5449:
---

[~vinodkone] Can you take a look?

> Memory leak in SchedulerProcess.declineOffer
> 
>
> Key: MESOS-5449
> URL: https://issues.apache.org/jira/browse/MESOS-5449
> Project: Mesos
>  Issue Type: Bug
>  Components: scheduler driver
>Affects Versions: 0.26.0, 0.27.0, 0.27.1, 0.28.0, 0.26.1, 0.28.1
>Reporter: Dario Rexin
>Assignee: Dario Rexin
>Priority: Blocker
>
> MesosScheduler.declineOffers has been changed ~6 months ago to send a Decline 
> message instead of calling acceptOffers with an empty list of task infos. The 
> changed version of declineOffer however did not remove the offerId from the 
> savedOffers map, causing a memory leak.





[jira] [Commented] (MESOS-5449) Memory leak in SchedulerProcess.declineOffer

2016-05-24 Thread Dario Rexin (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299256#comment-15299256
 ] 

Dario Rexin commented on MESOS-5449:


Review request is here: https://reviews.apache.org/r/47804/

> Memory leak in SchedulerProcess.declineOffer
> 
>
> Key: MESOS-5449
> URL: https://issues.apache.org/jira/browse/MESOS-5449
> Project: Mesos
>  Issue Type: Bug
>  Components: scheduler driver
>Affects Versions: 0.26.0, 0.27.0, 0.27.1, 0.28.0, 0.26.1, 0.28.1
>Reporter: Dario Rexin
>Assignee: Dario Rexin
>Priority: Blocker
>
> MesosScheduler.declineOffers has been changed ~6 months ago to send a Decline 
> message instead of calling acceptOffers with an empty list of task infos. The 
> changed version of declineOffer however did not remove the offerId from the 
> savedOffers map, causing a memory leak.





[jira] [Created] (MESOS-5449) Memory leak in SchedulerProcess.declineOffer

2016-05-24 Thread Dario Rexin (JIRA)
Dario Rexin created MESOS-5449:
--

 Summary: Memory leak in SchedulerProcess.declineOffer
 Key: MESOS-5449
 URL: https://issues.apache.org/jira/browse/MESOS-5449
 Project: Mesos
  Issue Type: Bug
  Components: scheduler driver
Affects Versions: 0.28.1, 0.26.1, 0.28.0, 0.27.1, 0.27.0, 0.26.0
Reporter: Dario Rexin
Assignee: Dario Rexin
Priority: Blocker


MesosScheduler.declineOffers was changed ~6 months ago to send a Decline 
message instead of calling acceptOffers with an empty list of task infos. 
However, the changed version of declineOffer did not remove the offerId from the 
savedOffers map, causing a memory leak.
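
The leak described above can be illustrated with a minimal, self-contained sketch. This is not the Mesos scheduler driver code: `OfferTracker`, `resourceOffer`, `leakyDeclineOffer`, and `saved` are hypothetical names, and the map is simplified to plain strings. Only the idea that an entry in `savedOffers` must be erased when an offer is declined comes from the report.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the bookkeeping a scheduler driver keeps per offer.
class OfferTracker {
public:
  // Remember which agent an offer came from when the offer arrives.
  void resourceOffer(const std::string& offerId, const std::string& agent) {
    savedOffers[offerId] = agent;
  }

  // Buggy variant: the Decline message would be sent here, but the saved
  // entry is never erased, so the map grows with every declined offer.
  void leakyDeclineOffer(const std::string& /*offerId*/) {}

  // Fixed variant: erase the entry once the offer has been declined.
  void declineOffer(const std::string& offerId) {
    savedOffers.erase(offerId);
  }

  // Number of offers still tracked.
  std::size_t saved() const { return savedOffers.size(); }

private:
  std::unordered_map<std::string, std::string> savedOffers;
};
```

With the leaky variant, `saved()` keeps growing for a framework that declines most offers; with the fixed variant it returns to zero once every outstanding offer has been declined.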





[jira] [Commented] (MESOS-5153) Sandboxes contents should be protected from unauthorized users

2016-05-24 Thread Alexander Rojas (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299028#comment-15299028
 ] 

Alexander Rojas commented on MESOS-5153:


[r/47794/|https://reviews.apache.org/r/47794/]: Added authorization support for 
{{mesos::internal::Files}}.
[r/47795/|https://reviews.apache.org/r/47795/]: Enabled authorization for 
sandboxes.


> Sandboxes contents should be protected from unauthorized users
> --
>
> Key: MESOS-5153
> URL: https://issues.apache.org/jira/browse/MESOS-5153
> Project: Mesos
>  Issue Type: Bug
>  Components: security, slave
>Reporter: Alexander Rojas
>Assignee: Alexander Rojas
>  Labels: mesosphere, security
> Fix For: 0.29.0
>
>
> MESOS-4956 introduced authentication support for the sandboxes. However, 
> authentication can only go as far as to tell whether a user is known to 
> Mesos or not. An additional step is necessary to verify whether the 
> known user is allowed to execute the requested operation on the sandbox 
> (browse, read, download, debug).
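
The distinction between the two steps can be sketched as follows. This is an illustration only, not the Mesos authorizer API: `SandboxAction`, `Acl`, `isAuthenticated`, and `isAuthorized` are hypothetical names, and the only detail taken from the issue is the list of sandbox operations.

```cpp
#include <map>
#include <set>
#include <string>

// The sandbox operations named in the issue description.
enum class SandboxAction { Browse, Read, Download, Debug };

// Hypothetical ACL: principal -> set of operations that principal may perform.
using Acl = std::map<std::string, std::set<SandboxAction>>;

// Authentication only answers "is this principal known?".
bool isAuthenticated(const std::set<std::string>& knownPrincipals,
                     const std::string& principal) {
  return knownPrincipals.count(principal) > 0;
}

// Authorization answers "may this known principal perform this operation?".
bool isAuthorized(const Acl& acl,
                  const std::string& principal,
                  SandboxAction action) {
  auto entry = acl.find(principal);
  return entry != acl.end() && entry->second.count(action) > 0;
}
```

A principal can pass the first check and still fail the second, which is the gap the issue asks to close.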





[jira] [Created] (MESOS-5448) Persistent volume deletion on the agent should survive slave restart

2016-05-24 Thread Anindya Sinha (JIRA)
Anindya Sinha created MESOS-5448:


 Summary: Persistent volume deletion on the agent should survive 
slave restart
 Key: MESOS-5448
 URL: https://issues.apache.org/jira/browse/MESOS-5448
 Project: Mesos
  Issue Type: Bug
  Components: general
Reporter: Anindya Sinha
Assignee: Anindya Sinha


When the master sends a CheckpointResourcesMessage to the agent, the agent 
attempts to rmdir the persistent volume (if it existed before, and is no longer 
in the updated checkpoint in CheckpointResourcesMessage).

If the slave restarts before the operation finishes, the disk space can be 
leaked because the rmdir is not reattempted.
Subsequently, a CREATE on the same path could leak the data to 
another framework (since the directory was not removed).
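
One way to make the deletion survive a restart is to recompute, during recovery, which volume directories exist on disk but are absent from the checkpointed resources, and retry the removal for those. The sketch below covers only that comparison. `orphanedVolumes` is a hypothetical helper, not an existing Mesos function, and it assumes both inputs are already available as sets of directory paths.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Returns the volume directories that are present on disk but no longer
// referenced by the checkpoint, i.e. the ones whose rmdir should be retried.
std::vector<std::string> orphanedVolumes(
    const std::set<std::string>& onDisk,
    const std::set<std::string>& checkpointed) {
  std::vector<std::string> orphans;
  // std::set iterates in sorted order, which std::set_difference requires.
  std::set_difference(onDisk.begin(), onDisk.end(),
                      checkpointed.begin(), checkpointed.end(),
                      std::back_inserter(orphans));
  return orphans;
}
```

Running this on every agent start would make the cleanup idempotent: a removal that was interrupted by a restart is simply picked up again on the next recovery.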





[jira] [Commented] (MESOS-4757) Mesos containerizer should get uid/gids before pivot_root.

2016-05-24 Thread Gilbert Song (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298975#comment-15298975
 ] 

Gilbert Song commented on MESOS-4757:
-

[~idownes], Kevin proposed a solution for host user -> container user around 
two months ago via mailing list. Could you take a look at it to see whether it 
may break your cases? Thanks! :)

https://docs.google.com/document/d/1ENNJKyPrqqm8OsYV8-dDoHTiRmqtuVbcdzNWj1nURsQ/edit#heading=h.j9cu8f69ljik

> Mesos containerizer should get uid/gids before pivot_root.
> --
>
> Key: MESOS-4757
> URL: https://issues.apache.org/jira/browse/MESOS-4757
> Project: Mesos
>  Issue Type: Bug
>Reporter: Jie Yu
>Assignee: Jie Yu
>
> Currently, we call os::su(user) after pivot_root. This is problematic because 
> /etc/passwd and /etc/group might be missing in the container's root filesystem. 
> Instead, we should get the uid/gids before pivot_root, and call 
> setuid/setgroups after pivot_root.
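
The proposed ordering can be sketched as two steps: resolve the numeric IDs while the host's /etc/passwd and /etc/group are still visible, then apply them after the root has been switched. This is a minimal sketch assuming Linux/glibc and C++17, not the containerizer's actual code; `Credentials`, `lookupCredentials`, and `applyCredentials` are hypothetical names, and the pivot_root call itself is left out.

```cpp
#include <grp.h>
#include <pwd.h>
#include <sys/types.h>
#include <unistd.h>

#include <optional>
#include <string>
#include <vector>

// Numeric identity resolved before the root filesystem changes.
struct Credentials {
  uid_t uid;
  gid_t gid;
  std::vector<gid_t> groups;
};

// Step 1 (before pivot_root): look up uid, gid, and supplementary groups.
// Note that getpwnam is not thread-safe.
std::optional<Credentials> lookupCredentials(const std::string& user) {
  const passwd* entry = ::getpwnam(user.c_str());
  if (entry == nullptr) {
    return std::nullopt;
  }

  Credentials result{entry->pw_uid, entry->pw_gid, {}};

  int count = 32;  // Initial guess; getgrouplist reports the real size.
  result.groups.resize(count);
  if (::getgrouplist(user.c_str(), result.gid,
                     result.groups.data(), &count) == -1) {
    // Buffer was too small: 'count' now holds the required size.
    result.groups.resize(count);
    if (::getgrouplist(user.c_str(), result.gid,
                       result.groups.data(), &count) == -1) {
      return std::nullopt;
    }
  }
  result.groups.resize(count);
  return result;
}

// Step 2 (after pivot_root): only numeric IDs are needed, so no user
// database has to exist in the new root. Groups must be set before the
// uid is dropped, otherwise the process lacks the privilege to do so.
bool applyCredentials(const Credentials& credentials) {
  return ::setgroups(credentials.groups.size(),
                     credentials.groups.data()) == 0 &&
         ::setgid(credentials.gid) == 0 &&
         ::setuid(credentials.uid) == 0;
}
```

Calling `applyCredentials` requires sufficient privileges (typically root), so it is only meaningful inside the container launch path.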





[jira] [Issue Comment Deleted] (MESOS-3085) Make failed on Ubuntu 14.04 ppc64le

2016-05-24 Thread Tomasz Janiszewski (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomasz Janiszewski updated MESOS-3085:
--
Comment: was deleted

(was: Duplicate: MESOS-5263)

> Make failed on Ubuntu 14.04 ppc64le
> ---
>
> Key: MESOS-3085
> URL: https://issues.apache.org/jira/browse/MESOS-3085
> Project: Mesos
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.23.0, 0.24.0
> Environment: Ubuntu 14.04 ppc64le
>Reporter: Jihun Kang
>Assignee: Jihun Kang
> Fix For: 0.29.0
>
>
> When trying to compile linux/fs.cpp, make failed with the following message.
> {noformat}
> /bin/bash ../libtool  --tag=CXX   --mode=compile g++ -DPACKAGE_NAME=\"mesos\" 
> -DPACKAGE_TARNAME=\"mesos\" -DPACKAGE_VERSION=\"0.24.0\" 
> -DPACKAGE_STRING=\"mesos\ 0.24.0\" -DPACKAGE_BUGREPORT=\"\" 
> -DPACKAGE_URL=\"\" -DPACKAGE=\"mesos\" -DVERSION=\"0.24.0\" -DSTDC_HEADERS=1 
> -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 
> -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 
> -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\" 
> -DHAVE_PTHREAD_PRIO_INHERIT=1 -DHAVE_PTHREAD=1 -DHAVE_LIBZ=1 -DHAVE_LIBCURL=1 
> -DHAVE_APR_POOLS_H=1 -DHAVE_LIBAPR_1=1 -DHAVE_SVN_VERSION_H=1 
> -DHAVE_LIBSVN_SUBR_1=1 -DHAVE_SVN_DELTA_H=1 -DHAVE_LIBSVN_DELTA_1=1 
> -DHAVE_LIBSASL2=1 -DMESOS_HAS_JAVA=1 -DHAVE_PYTHON=\"2.7\" 
> -DMESOS_HAS_PYTHON=1 -I. -I../../src   -Wall -Werror 
> -DLIBDIR=\"/usr/local/lib\" -DPKGLIBEXECDIR=\"/usr/local/libexec/mesos\" 
> -DPKGDATADIR=\"/usr/local/share/mesos\" -I../../include 
> -I../../3rdparty/libprocess/include 
> -I../../3rdparty/libprocess/3rdparty/stout/include -I../include 
> -I../include/mesos -I../3rdparty/libprocess/3rdparty/boost-1.53.0 
> -I../3rdparty/libprocess/3rdparty/picojson-4f93734 
> -I../3rdparty/libprocess/3rdparty/protobuf-2.5.0/src 
> -I../3rdparty/libprocess/3rdparty/glog-0.3.3/src 
> -I../3rdparty/libprocess/3rdparty/glog-0.3.3/src 
> -I../3rdparty/leveldb/include -I../3rdparty/zookeeper-3.4.5/src/c/include 
> -I../3rdparty/zookeeper-3.4.5/src/c/generated 
> -I../3rdparty/libprocess/3rdparty/protobuf-2.5.0/src 
> -I/usr/include/subversion-1 -I/usr/include/apr-1 -I/usr/include/apr-1.0  
> -pthread -g1 -O0 -Wno-unused-local-typedefs -std=c++11 -MT 
> linux/libmesos_no_3rdparty_la-fs.lo -MD -MP -MF 
> linux/.deps/libmesos_no_3rdparty_la-fs.Tpo -c -o 
> linux/libmesos_no_3rdparty_la-fs.lo `test -f 'linux/fs.cpp' || echo 
> '../../src/'`linux/fs.cpp
> libtool: compile:  g++ -DPACKAGE_NAME=\"mesos\" -DPACKAGE_TARNAME=\"mesos\" 
> -DPACKAGE_VERSION=\"0.24.0\" "-DPACKAGE_STRING=\"mesos 0.24.0\"" 
> -DPACKAGE_BUGREPORT=\"\" -DPACKAGE_URL=\"\" -DPACKAGE=\"mesos\" 
> -DVERSION=\"0.24.0\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 
> -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 
> -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 
> -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\" -DHAVE_PTHREAD_PRIO_INHERIT=1 
> -DHAVE_PTHREAD=1 -DHAVE_LIBZ=1 -DHAVE_LIBCURL=1 -DHAVE_APR_POOLS_H=1 
> -DHAVE_LIBAPR_1=1 -DHAVE_SVN_VERSION_H=1 -DHAVE_LIBSVN_SUBR_1=1 
> -DHAVE_SVN_DELTA_H=1 -DHAVE_LIBSVN_DELTA_1=1 -DHAVE_LIBSASL2=1 
> -DMESOS_HAS_JAVA=1 -DHAVE_PYTHON=\"2.7\" -DMESOS_HAS_PYTHON=1 -I. -I../../src 
> -Wall -Werror -DLIBDIR=\"/usr/local/lib\" 
> -DPKGLIBEXECDIR=\"/usr/local/libexec/mesos\" 
> -DPKGDATADIR=\"/usr/local/share/mesos\" -I../../include 
> -I../../3rdparty/libprocess/include 
> -I../../3rdparty/libprocess/3rdparty/stout/include -I../include 
> -I../include/mesos -I../3rdparty/libprocess/3rdparty/boost-1.53.0 
> -I../3rdparty/libprocess/3rdparty/picojson-4f93734 
> -I../3rdparty/libprocess/3rdparty/protobuf-2.5.0/src 
> -I../3rdparty/libprocess/3rdparty/glog-0.3.3/src 
> -I../3rdparty/libprocess/3rdparty/glog-0.3.3/src 
> -I../3rdparty/leveldb/include -I../3rdparty/zookeeper-3.4.5/src/c/include 
> -I../3rdparty/zookeeper-3.4.5/src/c/generated 
> -I../3rdparty/libprocess/3rdparty/protobuf-2.5.0/src 
> -I/usr/include/subversion-1 -I/usr/include/apr-1 -I/usr/include/apr-1.0 
> -pthread -g1 -O0 -Wno-unused-local-typedefs -std=c++11 -MT 
> linux/libmesos_no_3rdparty_la-fs.lo -MD -MP -MF 
> linux/.deps/libmesos_no_3rdparty_la-fs.Tpo -c ../../src/linux/fs.cpp  -fPIC 
> -DPIC -o linux/.libs/libmesos_no_3rdparty_la-fs.o
> ../../src/linux/fs.cpp:346:2: error: #error "pivot_root is not available"
>  #error "pivot_root is not available"
>   ^
> ../../src/linux/fs.cpp: In function 'Try 
> mesos::internal::fs::pivot_root(const string&, const string&)':
> ../../src/linux/fs.cpp:348:7: error: 'ret' was not declared in this scope
>if (ret == -1) {
>^
> make[2]: *** [linux/libmesos_no_3rdparty_la-fs.lo] Error 1
> make[2]: *** Waiting for unfinished jobs
> {noformat}




[jira] [Commented] (MESOS-970) Upgrade bundled leveldb to 1.18

2016-05-24 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298963#comment-15298963
 ] 

Vinod Kone commented on MESOS-970:
--

Looks great. Thanks for sharing!

> Upgrade bundled leveldb to 1.18
> ---
>
> Key: MESOS-970
> URL: https://issues.apache.org/jira/browse/MESOS-970
> Project: Mesos
>  Issue Type: Improvement
>  Components: replicated log
>Reporter: Benjamin Mahler
>Assignee: Tomasz Janiszewski
>
> We currently bundle leveldb 1.4, and the latest version is leveldb 1.18.
> Upgrading to 1.18 could solve the problems seen when building Mesos on some 
> non-x86 CPU architectures.





[jira] [Updated] (MESOS-5237) The windows version of `os::access` has differing behavior than the POSIX version.

2016-05-24 Thread Michael Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Park updated MESOS-5237:

Description: 
The POSIX version of {{os::access}} looks like this:

{code}
inline Try<bool> access(const std::string& path, int how)
{
  if (::access(path.c_str(), how) < 0) {
    if (errno == EACCES) {
      return false;
    } else {
      return ErrnoError();
    }
  }
  return true;
}
{code}

Compare this to the Windows version of {{os::access}} which looks like the 
following:

{code}
inline Try<bool> access(const std::string& fileName, int how)
{
  if (::_access(fileName.c_str(), how) != 0) {
    return ErrnoError("access: Could not access path '" + fileName + "'");
  }

  return true;
}
{code}

As we can see, the case where {{errno}} is set to {{EACCES}} is handled 
differently between the 2 functions.

We can actually consolidate the 2 functions by simply using the POSIX version. 
The challenge is that on POSIX, we should use {{::access}} and {{::_access}} on 
Windows. Note however, that this problem is already solved, as we have an 
implementation of {{::access}} for Windows in 
{{3rdparty/libprocess/3rdparty/stout/include/stout/windows.hpp}} which simply 
defers to {{::_access}}.

Thus, I propose to simply consolidate the 2 implementations.

  was:
The POSIX version of {{os::access}} looks like this:

{code}
inline Try<bool> access(const std::string& path, int how)
{
  if (::access(path.c_str(), how) < 0) {
    if (errno == EACCES) {
      return false;
    } else {
      return ErrnoError();
    }
  }
  return true;
}
{code}

Compare this to the Windows version of {{os::access}} which looks like the 
following:

{code}
inline Try<bool> access(const std::string& fileName, int how)
{
  if (::_access(fileName.c_str(), how) != 0) {
    return ErrnoError("access: Could not access path '" + fileName + "'");
  }

  return true;
}
{code}

As we can see, the case where {{errno}} is set to {{EACCES}} is handled 
differently between the 2 functions.

We can actually consolidate the 2 functions by simply using the POSIX version. 
The challenge is that on POSIX, we should use {{::access}} and {{_::access}} on 
Windows. Note however, that this problem is already solved, as we have an 
implementation of {{::access}} for Windows in 
{{3rdparty/libprocess/3rdparty/stout/include/stout/windows.hpp}} which simply 
defers to {{::_access}}.

Thus, I propose to simply consolidate the 2 implementations.


> The windows version of `os::access` has differing behavior than the POSIX 
> version.
> --
>
> Key: MESOS-5237
> URL: https://issues.apache.org/jira/browse/MESOS-5237
> Project: Mesos
>  Issue Type: Bug
>  Components: stout
>Reporter: Michael Park
>Assignee: Michael Park
>  Labels: mesosphere,, windows
>
> The POSIX version of {{os::access}} looks like this:
> {code}
> inline Try<bool> access(const std::string& path, int how)
> {
>   if (::access(path.c_str(), how) < 0) {
>     if (errno == EACCES) {
>       return false;
>     } else {
>       return ErrnoError();
>     }
>   }
>   return true;
> }
> {code}
> Compare this to the Windows version of {{os::access}} which looks like the 
> following:
> {code}
> inline Try<bool> access(const std::string& fileName, int how)
> {
>   if (::_access(fileName.c_str(), how) != 0) {
>     return ErrnoError("access: Could not access path '" + fileName + "'");
>   }
>   return true;
> }
> {code}
> As we can see, the case where {{errno}} is set to {{EACCES}} is handled 
> differently between the 2 functions.
> We can actually consolidate the 2 functions by simply using the POSIX 
> version. The challenge is that on POSIX, we should use {{::access}} and 
> {{::_access}} on Windows. Note however, that this problem is already solved, 
> as we have an implementation of {{::access}} for Windows in 
> {{3rdparty/libprocess/3rdparty/stout/include/stout/windows.hpp}} which simply 
> defers to {{::_access}}.
> Thus, I propose to simply consolidate the 2 implementations.
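
The behavioral difference comes down to three outcomes: access allowed, access denied ({{EACCES}}), and any other failure. The sketch below reproduces the POSIX-style handling without depending on stout. `AccessResult` and `checkAccess` are hypothetical stand-ins for `Try<bool>` and `os::access`, and the code assumes a POSIX system where `::access` is declared in `<unistd.h>`.

```cpp
#include <cerrno>
#include <cstring>
#include <string>

#include <unistd.h>

// Hypothetical stand-in for Try<bool>: either an error message or a verdict.
struct AccessResult {
  bool isError;         // True when the check itself failed.
  bool allowed;         // Meaningful only when isError is false.
  std::string message;  // Error description when isError is true.
};

AccessResult checkAccess(const std::string& path, int how) {
  if (::access(path.c_str(), how) < 0) {
    if (errno == EACCES) {
      // Permission denied is a valid answer, not an error.
      return {false, false, ""};
    }
    // Anything else (for example a missing path) is reported as an error.
    return {true, false, std::strerror(errno)};
  }
  return {false, true, ""};
}
```

Under this handling a caller can tell "not permitted" apart from "could not check", which the Windows version quoted above collapses into a single error.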





[jira] [Updated] (MESOS-5447) Configurable agent memory reservation

2016-05-24 Thread Karl Isenberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Isenberg updated MESOS-5447:
-
Issue Type: Improvement  (was: Story)

> Configurable agent memory reservation
> -
>
> Key: MESOS-5447
> URL: https://issues.apache.org/jira/browse/MESOS-5447
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Karl Isenberg
>
> When deciding what memory to make available to tasks (if not explicitly 
> configured), mesos agents currently reserve 1GB or 1/2 of system memory. 
> I'd really like to be able to configure the reservation amount so that I 
> don't have to reproduce the system memory lookup and math in order to 
> configure "mem=(system memory - reservation)".
> Ideally this would be configurable with a command line argument and 
> environment variable like "--reserved-memory=512".





[jira] [Created] (MESOS-5447) Configurable agent memory reservation

2016-05-24 Thread Karl Isenberg (JIRA)
Karl Isenberg created MESOS-5447:


 Summary: Configurable agent memory reservation
 Key: MESOS-5447
 URL: https://issues.apache.org/jira/browse/MESOS-5447
 Project: Mesos
  Issue Type: Story
Reporter: Karl Isenberg


When deciding what memory to make available to tasks (if not explicitly 
configured), mesos agents currently reserve 1GB or 1/2 of system memory. 

I'd really like to be able to configure the reservation amount so that I don't 
have to reproduce the system memory lookup and math in order to configure 
"mem=(system memory - reservation)".

Ideally this would be configurable with a command line argument and environment 
variable like "--reserved-memory=512".
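
The arithmetic in question can be sketched as below. This is an illustration under stated assumptions, not the agent's code: the function names are hypothetical, the 2GB threshold at which the default switches from "half of memory" to "all but 1GB" is an assumption, and so is the choice to clamp at zero when the reservation exceeds system memory.

```cpp
#include <cstdint>

constexpr std::uint64_t kMB = 1024ULL * 1024ULL;
constexpr std::uint64_t kGB = 1024ULL * kMB;

// Default described in the issue: hold back 1GB, or half of memory on small
// machines. The 2GB cut-over point is an assumption of this sketch.
std::uint64_t defaultTaskMemory(std::uint64_t systemBytes) {
  return systemBytes > 2 * kGB ? systemBytes - kGB : systemBytes / 2;
}

// Requested behavior: "mem = system memory - reservation", with the
// reservation supplied by the operator. Clamps at zero (assumption).
std::uint64_t taskMemoryWithReservation(std::uint64_t systemBytes,
                                        std::uint64_t reservedBytes) {
  return reservedBytes >= systemBytes ? 0 : systemBytes - reservedBytes;
}
```

For example, on an 8GB machine the default leaves 7GB for tasks, while a 512MB reservation would leave 7.5GB without the operator having to look up system memory first.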





[jira] [Commented] (MESOS-5294) Status updates after a health check are incomplete or invalid

2016-05-24 Thread Dmitry Fedorov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298731#comment-15298731
 ] 

Dmitry Fedorov commented on MESOS-5294:
---

I've hit the same issue. It occurs when we use custom networking in 
Docker, as [~thegner] described.

I've managed to "fix" it by duplicating 

{code}
inspect = docker->inspect(containerName, DOCKER_INSPECT_DELAY)
  .then(defer(self(), [=](const Docker::Container& container) {
    if (!killed) {
      TaskStatus status;
      status.mutable_task_id()->CopyFrom(taskId.get());
      status.set_state(TASK_RUNNING);
      status.set_data(container.output);
      if (container.ipAddress.isSome()) {
        // TODO(karya): Deprecated -- Remove after 0.25.0 has shipped.
        Label* label = status.mutable_labels()->add_labels();
        label->set_key("Docker.NetworkSettings.IPAddress");
        label->set_value(container.ipAddress.get());

        NetworkInfo* networkInfo =
          status.mutable_container_status()->add_network_infos();

        // TODO(CD): Deprecated -- Remove after 0.27.0.
        networkInfo->set_ip_address(container.ipAddress.get());

        NetworkInfo::IPAddress* ipAddress =
          networkInfo->add_ip_addresses();
        ipAddress->set_ip_address(container.ipAddress.get());
      }
{code}
from the `launchTask` method in src/docker/executor.cpp to the `taskHealthUpdated` method.
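
Rather than duplicating the block, both code paths could share one helper that attaches the network information. The sketch below shows that shape with plain structs. It does not use the Mesos protobuf types: `NetworkInfo`, `TaskStatus`, `addNetworkInfo`, `runningStatus`, and `healthStatus` are simplified, hypothetical stand-ins.

```cpp
#include <optional>
#include <string>
#include <vector>

// Simplified stand-ins for the protobuf messages.
struct NetworkInfo {
  std::vector<std::string> ipAddresses;
};

struct TaskStatus {
  std::string taskId;
  std::string state;
  std::optional<bool> healthy;
  std::vector<NetworkInfo> networkInfos;
};

// Shared helper: attach the container IP to a status update if it is known.
void addNetworkInfo(TaskStatus& status,
                    const std::optional<std::string>& ipAddress) {
  if (!ipAddress) {
    return;
  }
  NetworkInfo info;
  info.ipAddresses.push_back(*ipAddress);
  status.networkInfos.push_back(info);
}

// Status sent when the task is launched.
TaskStatus runningStatus(const std::string& taskId,
                         const std::optional<std::string>& ipAddress) {
  TaskStatus status{taskId, "TASK_RUNNING", std::nullopt, {}};
  addNetworkInfo(status, ipAddress);
  return status;
}

// Status sent after a health check: it goes through the same helper, so the
// IP addresses are not dropped from the update.
TaskStatus healthStatus(const std::string& taskId,
                        bool healthy,
                        const std::optional<std::string>& ipAddress) {
  TaskStatus status{taskId, "TASK_RUNNING", healthy, {}};
  addNetworkInfo(status, ipAddress);
  return status;
}
```

The point of the design is that any consumer reading the latest status update sees the same network information regardless of which path produced it.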

> Status updates after a health check are incomplete or invalid
> -
>
> Key: MESOS-5294
> URL: https://issues.apache.org/jira/browse/MESOS-5294
> Project: Mesos
>  Issue Type: Bug
> Environment: mesos 0.28.0, docker 1.11, marathon 0.15.3, mesos-dns, 
> ubuntu 14.04
>Reporter: Travis Hegner
>Assignee: Travis Hegner
>
> With command health checks enabled via marathon, mesos-dns will resolve the 
> task correctly until the task is reported as "healthy". At that point, 
> mesos-dns stops resolving the task correctly.
> -Digging through src/docker/executor.cpp, I found that in the 
> {{taskHealthUpdated()}} function is attempting to copy the taskID to the new 
> status instance with-
> {code}status.mutable_task_id()->CopyFrom(taskID);{code}
> -but other instances of status updates have a similar line-
> {code}status.mutable_task_id()->CopyFrom(taskID.get());{code}
> -My assumption is that this difference is causing the status update after a 
> health check to not have a proper taskID, which in turn is causing an 
> incorrect state.json output.-
> -I'll try to get a patch together soon.-
> UPDATE:
> None of the above assumptions are correct. Something else is causing the issue.





[jira] [Commented] (MESOS-5421) Mesos Docker executor taskHealthUpdated removes information about job ipAddresses

2016-05-24 Thread Dmitry Fedorov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298715#comment-15298715
 ] 

Dmitry Fedorov commented on MESOS-5421:
---

[~kaysoky], [~jieyu] It seems that my issue is a duplicate of MESOS-5294. I'll 
add more information there.


> Mesos Docker executor taskHealthUpdated removes information about job 
> ipAddresses
> -
>
> Key: MESOS-5421
> URL: https://issues.apache.org/jira/browse/MESOS-5421
> Project: Mesos
>  Issue Type: Bug
>  Components: slave
>Affects Versions: 0.28.1
>Reporter: Dmitry Fedorov
>Priority: Minor
> Fix For: 0.29.0
>
>
> When you create a job with a command health check, the status is correct 
> right after the job is launched, and the ipAddresses field is present in it. 
> But after the health status is updated, the ipAddresses field is missing.





[jira] [Commented] (MESOS-5430) Design the improvement of the home page of mesos.apache.org

2016-05-24 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298655#comment-15298655
 ] 

haosdent commented on MESOS-5430:
-

Thank you very much for your sketch file, let me update to match your design.

> Design the improvement of the home page of mesos.apache.org
> ---
>
> Key: MESOS-5430
> URL: https://issues.apache.org/jira/browse/MESOS-5430
> Project: Mesos
>  Issue Type: Improvement
>  Components: project website
>Reporter: Vinod Kone
>Assignee: Jonathan Manalus
>
> The idea is to come up with a minimal improvement for the design of the home 
> page of mesos.apache.org.
> Proposed Redesign: https://invis.io/CV7DZF1JW#/159898819_Mesos-apache-org





[jira] [Commented] (MESOS-970) Upgrade bundled leveldb to 1.18

2016-05-24 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298583#comment-15298583
 ] 

haosdent commented on MESOS-970:


I created a document that describes my tests: 
https://docs.google.com/document/d/1fv2OMvH6hVm6waacOejSrTJwUuDQeXlqqPDZjBmbcKU/edit?usp=sharing
 

I have finished {{Compatible test (Single node mode)}} and am now running 
{{Compatible test (Zookeeper mode)}}. Feel free to add comments to the document 
if you have any questions or suggestions.

> Upgrade bundled leveldb to 1.18
> ---
>
> Key: MESOS-970
> URL: https://issues.apache.org/jira/browse/MESOS-970
> Project: Mesos
>  Issue Type: Improvement
>  Components: replicated log
>Reporter: Benjamin Mahler
>Assignee: Tomasz Janiszewski
>
> We currently bundle leveldb 1.4, and the latest version is leveldb 1.18.
> Upgrading to 1.18 could solve the problems encountered when building Mesos on 
> some non-x86 CPU architectures.





[jira] [Commented] (MESOS-5430) Design the improvement of the home page of mesos.apache.org

2016-05-24 Thread Jonathan Manalus (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298534#comment-15298534
 ] 

Jonathan Manalus commented on MESOS-5430:
-

[~vinodkone] - [~haosd...@gmail.com] The first iteration looks great. It just needs 
small tweaks to match the sketch file that I posted above, and I can help 
answer any questions about it. 

[~vinodkone] Can you work with Amr Abdelrazik @Mesosphere on the copy for the 
homepage, and work with Ben Hindman to figure out the launch date? 

> Design the improvement of the home page of mesos.apache.org
> ---
>
> Key: MESOS-5430
> URL: https://issues.apache.org/jira/browse/MESOS-5430
> Project: Mesos
>  Issue Type: Improvement
>  Components: project website
>Reporter: Vinod Kone
>Assignee: Jonathan Manalus
>
> The idea is to come up with a minimal improvement for the design of the home 
> page of mesos.apache.org.
> Proposed Redesign: https://invis.io/CV7DZF1JW#/159898819_Mesos-apache-org





[jira] [Commented] (MESOS-5430) Design the improvement of the home page of mesos.apache.org

2016-05-24 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298524#comment-15298524
 ] 

Vinod Kone commented on MESOS-5430:
---

I thought someone at Mesosphere was going to implement your design. Is that not 
true [~jmanalus]? Don't want [~haosd...@gmail.com] to spend cycles on 
implementation if someone else is going to do it.

> Design the improvement of the home page of mesos.apache.org
> ---
>
> Key: MESOS-5430
> URL: https://issues.apache.org/jira/browse/MESOS-5430
> Project: Mesos
>  Issue Type: Improvement
>  Components: project website
>Reporter: Vinod Kone
>Assignee: Jonathan Manalus
>
> The idea is to come up with a minimal improvement for the design of the home 
> page of mesos.apache.org.
> Proposed Redesign: https://invis.io/CV7DZF1JW#/159898819_Mesos-apache-org





[jira] [Updated] (MESOS-5435) Add default implementations to all Isolator virtual functions

2016-05-24 Thread Kevin Klues (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Klues updated MESOS-5435:
---
Sprint: Mesosphere Sprint 35

> Add default implementations to all Isolator virtual functions
> -
>
> Key: MESOS-5435
> URL: https://issues.apache.org/jira/browse/MESOS-5435
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Kevin Klues
>Assignee: Kevin Klues
> Fix For: 0.29.0
>
>
> Currently, all of the virtual functions in `mesos::slave::Isolator` are pure 
> virtual (except status()). For many isolators, however, it doesn't make sense 
> to implement all of these virtual functions. Each isolator has to provide its 
> own default implementation of these functions even if they aren't really 
> relying on them. This adds unnecessary extra code to many isolators that 
> don't need them.
> Moreover, the `MesosIsolatorProcess` has the same problem for each of its 
> virtual functions.
> We should provide defaults for these instead of making each and every 
> isolator implement even in cases when it doesn't make sense.
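
The proposal above can be illustrated with a minimal sketch. It assumes a
heavily simplified, hypothetical interface: the real `mesos::slave::Isolator`
methods return futures and take Mesos types, so the class, method signatures,
and `CleanupOnlyIsolator` below are illustrative stand-ins only.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for an isolator interface.
class Isolator
{
public:
  virtual ~Isolator() = default;

  // Each hook has a default no-op implementation that reports success,
  // so subclasses override only the hooks they actually rely on.
  virtual bool prepare(const std::string& /* containerId */) { return true; }
  virtual bool isolate(const std::string& /* containerId */, int /* pid */)
  {
    return true;
  }
  virtual bool cleanup(const std::string& /* containerId */) { return true; }
};

// An isolator that only cares about cleanup overrides just that one hook.
class CleanupOnlyIsolator : public Isolator
{
public:
  bool cleanup(const std::string& containerId) override
  {
    assert(!containerId.empty());
    cleaned.push_back(containerId);
    return true;
  }

  // Container IDs for which cleanup() has been called.
  std::vector<std::string> cleaned;
};
```

With defaults in the base class, `CleanupOnlyIsolator` carries no boilerplate
for `prepare()` or `isolate()`, which is the code reduction the ticket asks for.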





[jira] [Updated] (MESOS-5435) Add default implementations to all Isolator virtual functions

2016-05-24 Thread Kevin Klues (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Klues updated MESOS-5435:
---
Story Points: 1

> Add default implementations to all Isolator virtual functions
> -
>
> Key: MESOS-5435
> URL: https://issues.apache.org/jira/browse/MESOS-5435
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Kevin Klues
>Assignee: Kevin Klues
> Fix For: 0.29.0
>
>
> Currently, all of the virtual functions in `mesos::slave::Isolator` are pure 
> virtual (except status()). For many isolators, however, it doesn't make sense 
> to implement all of these virtual functions. Each isolator has to provide its 
> own default implementation of these functions even if they aren't really 
> relying on them. This adds unnecessary extra code to many isolators that 
> don't need them.
> Moreover, the `MesosIsolatorProcess` has the same problem for each of its 
> virtual functions.
> We should provide defaults for these instead of making each and every 
> isolator implement even in cases when it doesn't make sense.





[jira] [Commented] (MESOS-5167) Add tests for `network/cni` isolator

2016-05-24 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298506#comment-15298506
 ] 

Jie Yu commented on MESOS-5167:
---

commit d6846f952b55c47d33a926e4d4da38d058e6dcf3
Author: Qian Zhang 
Date:   Tue May 24 09:50:57 2016 -0700

Added the test "CniIsolatorTest.ROOT_SlaveRecovery".

Review: https://reviews.apache.org/r/46438/

commit d02cb7848fb02d2faf660cb6664578c0554f7aca
Author: Qian Zhang 
Date:   Mon May 23 14:43:54 2016 -0700

Added the test "CniIsolatorTest.ROOT_FailedPlugin".

Review: https://reviews.apache.org/r/46436/

commit 999bb7168a9ea78e7978bdc17e64672c32dc1ab7
Author: Qian Zhang 
Date:   Mon May 23 14:39:03 2016 -0700

Added the test "CniIsolatorTest.ROOT_VerifyCheckpointedInfo".

Review: https://reviews.apache.org/r/46435/

commit 8c10513ef50185dfe7d477525799826cc7a1b056
Author: Qian Zhang 
Date:   Mon May 23 14:25:41 2016 -0700

Added the test "CniIsolatorTest.ROOT_LaunchCommandTask".

Review: https://reviews.apache.org/r/46097/

> Add tests for `network/cni` isolator
> 
>
> Key: MESOS-5167
> URL: https://issues.apache.org/jira/browse/MESOS-5167
> Project: Mesos
>  Issue Type: Task
>  Components: test
>Reporter: Qian Zhang
>Assignee: Qian Zhang
>  Labels: mesosphere
>
> We need to add tests to verify the functionality of `network/cni` isolator.





[jira] [Updated] (MESOS-5256) Add support for per-containerizer resource enumeration

2016-05-24 Thread Kevin Klues (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Klues updated MESOS-5256:
---
Sprint: Mesosphere Sprint 35

> Add support for per-containerizer resource enumeration
> --
>
> Key: MESOS-5256
> URL: https://issues.apache.org/jira/browse/MESOS-5256
> Project: Mesos
>  Issue Type: Task
>  Components: isolation
>Reporter: Kevin Klues
>Assignee: Kevin Klues
>  Labels: containerizer
>
> Currently the top level containerizer includes a static function for 
> enumerating the resources available on a given agent. Ideally, this 
> functionality should be the responsibility of individual containerizers (and 
> specifically the responsibility of each isolator used to control access to 
> those resources).
> Adding support for this will involve making the `Containerizer::resources()` 
> function virtual instead of static and then implementing it on a 
> per-containerizer basis.  We should consider providing a default to make this 
> easier in cases where there is only really one good way of enumerating a 
> given set of resources.
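
A rough sketch of the change described above, under stated assumptions:
`Resources` is modeled as a plain name-to-amount map, the default enumeration
returns a placeholder value rather than probing the host, and
`ExampleContainerizer` and its `example_units` resource are hypothetical names
that do not exist in Mesos.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the Mesos `Resources` type: name -> amount.
using Resources = std::map<std::string, double>;

class Containerizer
{
public:
  virtual ~Containerizer() = default;

  // Virtual instead of static. The default covers resources for which
  // there is only one sensible way to enumerate them.
  virtual Resources resources() const
  {
    Resources result;
    result["cpus"] = 1.0; // Placeholder value, not a real host probe.
    return result;
  }
};

// A containerizer that extends the default enumeration with its own resource.
class ExampleContainerizer : public Containerizer
{
public:
  explicit ExampleContainerizer(double units) : units_(units)
  {
    assert(units >= 0.0);
  }

  Resources resources() const override
  {
    Resources result = Containerizer::resources();
    result["example_units"] = units_;
    return result;
  }

private:
  double units_;
};
```

Calling `resources()` through a `Containerizer` reference then dispatches to
the concrete containerizer, which a static function cannot do.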





[jira] [Commented] (MESOS-5430) Design the improvement of the home page of mesos.apache.org

2016-05-24 Thread Jonathan Manalus (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298476#comment-15298476
 ] 

Jonathan Manalus commented on MESOS-5430:
-

[~haosd...@gmail.com] - You can download the sketch file here 
http://cl.ly/27393B403U2p

The demo looks great!

> Design the improvement of the home page of mesos.apache.org
> ---
>
> Key: MESOS-5430
> URL: https://issues.apache.org/jira/browse/MESOS-5430
> Project: Mesos
>  Issue Type: Improvement
>  Components: project website
>Reporter: Vinod Kone
>Assignee: Jonathan Manalus
>
> The idea is to come up with a minimal improvement for the design of the home 
> page of mesos.apache.org.
> Proposed Redesign: https://invis.io/CV7DZF1JW#/159898819_Mesos-apache-org





[jira] [Commented] (MESOS-5421) Mesos Docker executor taskHealthUpdated removes information about job ipAddresses

2016-05-24 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298472#comment-15298472
 ] 

Jie Yu commented on MESOS-5421:
---

[~dfedorov] Can you provide more debug information (e.g., agent/executor logs, 
job description, what networking you are using)? It's hard for us to triage the 
issue without that information.

Also, since it's not clear what the issue is, I'll mark the fix version as 0.29 
as we are cutting 0.28.2. If this turns out to be an issue, we can backport the 
fix to 0.28.3.

> Mesos Docker executor taskHealthUpdated removes information about job 
> ipAddresses
> -
>
> Key: MESOS-5421
> URL: https://issues.apache.org/jira/browse/MESOS-5421
> Project: Mesos
>  Issue Type: Bug
>  Components: slave
>Affects Versions: 0.28.1
>Reporter: Dmitry Fedorov
>Priority: Minor
> Fix For: 0.29.0
>
>
> When you create a job with a command health check, the status is correct right 
> after the job is launched and the ipAddresses field is present in it. 
> But after the health status is updated, the ipAddresses field is missing.





[jira] [Updated] (MESOS-5421) Mesos Docker executor taskHealthUpdated removes information about job ipAddresses

2016-05-24 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-5421:
--
Fix Version/s: (was: 0.28.2)
   0.29.0

> Mesos Docker executor taskHealthUpdated removes information about job 
> ipAddresses
> -
>
> Key: MESOS-5421
> URL: https://issues.apache.org/jira/browse/MESOS-5421
> Project: Mesos
>  Issue Type: Bug
>  Components: slave
>Affects Versions: 0.28.1
>Reporter: Dmitry Fedorov
>Priority: Minor
> Fix For: 0.29.0
>
>
> When you create a job with a command health check, the status is correct right 
> after the job is launched and the ipAddresses field is present in it. 
> But after the health status is updated, the ipAddresses field is missing.





[jira] [Commented] (MESOS-5359) The scheduler library should have a delay before initiating a connection with master.

2016-05-24 Thread Anand Mazumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298436#comment-15298436
 ] 

Anand Mazumdar commented on MESOS-5359:
---

Ideally, yes. The delay should be configurable by a flag.

Have a look at how we are already doing so for the old driver based interface 
{{src/sched/flags.hpp}}. The only difference here is that we just need a single 
maximum connection backoff variable e.g., 
{{MESOS_CONNECTION_BACKOFF_MAX=~500ms}}. The scheduler library can then do a 
linear backoff after picking a random delay between 0 and maxBackoff for 
initiating the connection with the master.
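
As a minimal sketch of the random initial delay described above: the helper
name `initialConnectionDelay` is hypothetical, and the code covers only the
"pick a random delay between 0 and maxBackoff" step, not the subsequent
backoff or how the flag would be parsed.

```cpp
#include <cassert>
#include <chrono>
#include <random>

// Hypothetical helper: picks a uniformly random delay in [0, maxBackoff].
// Spreading the first connection attempt over this window keeps many
// schedulers from hitting the master at the same instant.
inline std::chrono::milliseconds initialConnectionDelay(
    std::chrono::milliseconds maxBackoff,
    std::mt19937& generator)
{
  assert(maxBackoff.count() >= 0);

  std::uniform_int_distribution<std::chrono::milliseconds::rep> distribution(
      0, maxBackoff.count());

  return std::chrono::milliseconds(distribution(generator));
}
```

A caller would seed the generator once (for example from
`std::random_device`) and wait for the returned duration before opening the
connection to the master.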



> The scheduler library should have a delay before initiating a connection with 
> master.
> -
>
> Key: MESOS-5359
> URL: https://issues.apache.org/jira/browse/MESOS-5359
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.29.0
>Reporter: Anand Mazumdar
>Assignee: José Guilherme Vanz
>  Labels: mesosphere
>
> Currently, the scheduler library {{src/scheduler/scheduler.cpp}} does not have an 
> artificially induced delay when trying to initially establish a connection 
> with the master. In the event of a master failover or ZK disconnect, a large 
> number of frameworks can get disconnected and then thereby overwhelm the 
> master with TCP SYN requests. 
> On a large cluster with many agents, the master is already overwhelmed with 
> handling connection requests from the agents. This compounds the issue 
> further on the master.





[jira] [Commented] (MESOS-3085) Make failed on Ubuntu 14.04 ppc64le

2016-05-24 Thread Tomasz Janiszewski (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298362#comment-15298362
 ] 

Tomasz Janiszewski commented on MESOS-3085:
---

Duplicate: MESOS-5263

> Make failed on Ubuntu 14.04 ppc64le
> ---
>
> Key: MESOS-3085
> URL: https://issues.apache.org/jira/browse/MESOS-3085
> Project: Mesos
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.23.0, 0.24.0
> Environment: Ubuntu 14.04 ppc64le
>Reporter: Jihun Kang
>Assignee: Jihun Kang
>
> When trying to compile linux/fs.cpp, make failed with the following message.
> {noformat}
> /bin/bash ../libtool  --tag=CXX   --mode=compile g++ -DPACKAGE_NAME=\"mesos\" 
> -DPACKAGE_TARNAME=\"mesos\" -DPACKAGE_VERSION=\"0.24.0\" 
> -DPACKAGE_STRING=\"mesos\ 0.24.0\" -DPACKAGE_BUGREPORT=\"\" 
> -DPACKAGE_URL=\"\" -DPACKAGE=\"mesos\" -DVERSION=\"0.24.0\" -DSTDC_HEADERS=1 
> -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 
> -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 
> -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\" 
> -DHAVE_PTHREAD_PRIO_INHERIT=1 -DHAVE_PTHREAD=1 -DHAVE_LIBZ=1 -DHAVE_LIBCURL=1 
> -DHAVE_APR_POOLS_H=1 -DHAVE_LIBAPR_1=1 -DHAVE_SVN_VERSION_H=1 
> -DHAVE_LIBSVN_SUBR_1=1 -DHAVE_SVN_DELTA_H=1 -DHAVE_LIBSVN_DELTA_1=1 
> -DHAVE_LIBSASL2=1 -DMESOS_HAS_JAVA=1 -DHAVE_PYTHON=\"2.7\" 
> -DMESOS_HAS_PYTHON=1 -I. -I../../src   -Wall -Werror 
> -DLIBDIR=\"/usr/local/lib\" -DPKGLIBEXECDIR=\"/usr/local/libexec/mesos\" 
> -DPKGDATADIR=\"/usr/local/share/mesos\" -I../../include 
> -I../../3rdparty/libprocess/include 
> -I../../3rdparty/libprocess/3rdparty/stout/include -I../include 
> -I../include/mesos -I../3rdparty/libprocess/3rdparty/boost-1.53.0 
> -I../3rdparty/libprocess/3rdparty/picojson-4f93734 
> -I../3rdparty/libprocess/3rdparty/protobuf-2.5.0/src 
> -I../3rdparty/libprocess/3rdparty/glog-0.3.3/src 
> -I../3rdparty/libprocess/3rdparty/glog-0.3.3/src 
> -I../3rdparty/leveldb/include -I../3rdparty/zookeeper-3.4.5/src/c/include 
> -I../3rdparty/zookeeper-3.4.5/src/c/generated 
> -I../3rdparty/libprocess/3rdparty/protobuf-2.5.0/src 
> -I/usr/include/subversion-1 -I/usr/include/apr-1 -I/usr/include/apr-1.0  
> -pthread -g1 -O0 -Wno-unused-local-typedefs -std=c++11 -MT 
> linux/libmesos_no_3rdparty_la-fs.lo -MD -MP -MF 
> linux/.deps/libmesos_no_3rdparty_la-fs.Tpo -c -o 
> linux/libmesos_no_3rdparty_la-fs.lo `test -f 'linux/fs.cpp' || echo 
> '../../src/'`linux/fs.cpp
> libtool: compile:  g++ -DPACKAGE_NAME=\"mesos\" -DPACKAGE_TARNAME=\"mesos\" 
> -DPACKAGE_VERSION=\"0.24.0\" "-DPACKAGE_STRING=\"mesos 0.24.0\"" 
> -DPACKAGE_BUGREPORT=\"\" -DPACKAGE_URL=\"\" -DPACKAGE=\"mesos\" 
> -DVERSION=\"0.24.0\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 
> -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 
> -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 
> -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\" -DHAVE_PTHREAD_PRIO_INHERIT=1 
> -DHAVE_PTHREAD=1 -DHAVE_LIBZ=1 -DHAVE_LIBCURL=1 -DHAVE_APR_POOLS_H=1 
> -DHAVE_LIBAPR_1=1 -DHAVE_SVN_VERSION_H=1 -DHAVE_LIBSVN_SUBR_1=1 
> -DHAVE_SVN_DELTA_H=1 -DHAVE_LIBSVN_DELTA_1=1 -DHAVE_LIBSASL2=1 
> -DMESOS_HAS_JAVA=1 -DHAVE_PYTHON=\"2.7\" -DMESOS_HAS_PYTHON=1 -I. -I../../src 
> -Wall -Werror -DLIBDIR=\"/usr/local/lib\" 
> -DPKGLIBEXECDIR=\"/usr/local/libexec/mesos\" 
> -DPKGDATADIR=\"/usr/local/share/mesos\" -I../../include 
> -I../../3rdparty/libprocess/include 
> -I../../3rdparty/libprocess/3rdparty/stout/include -I../include 
> -I../include/mesos -I../3rdparty/libprocess/3rdparty/boost-1.53.0 
> -I../3rdparty/libprocess/3rdparty/picojson-4f93734 
> -I../3rdparty/libprocess/3rdparty/protobuf-2.5.0/src 
> -I../3rdparty/libprocess/3rdparty/glog-0.3.3/src 
> -I../3rdparty/libprocess/3rdparty/glog-0.3.3/src 
> -I../3rdparty/leveldb/include -I../3rdparty/zookeeper-3.4.5/src/c/include 
> -I../3rdparty/zookeeper-3.4.5/src/c/generated 
> -I../3rdparty/libprocess/3rdparty/protobuf-2.5.0/src 
> -I/usr/include/subversion-1 -I/usr/include/apr-1 -I/usr/include/apr-1.0 
> -pthread -g1 -O0 -Wno-unused-local-typedefs -std=c++11 -MT 
> linux/libmesos_no_3rdparty_la-fs.lo -MD -MP -MF 
> linux/.deps/libmesos_no_3rdparty_la-fs.Tpo -c ../../src/linux/fs.cpp  -fPIC 
> -DPIC -o linux/.libs/libmesos_no_3rdparty_la-fs.o
> ../../src/linux/fs.cpp:346:2: error: #error "pivot_root is not available"
>  #error "pivot_root is not available"
>   ^
> ../../src/linux/fs.cpp: In function 'Try 
> mesos::internal::fs::pivot_root(const string&, const string&)':
> ../../src/linux/fs.cpp:348:7: error: 'ret' was not declared in this scope
>if (ret == -1) {
>^
> make[2]: *** [linux/libmesos_no_3rdparty_la-fs.lo] Error 1
> make[2]: *** Waiting for unfinished jobs
> {noformat}



--
This message was sent by 

[jira] [Commented] (MESOS-5446) NsTest.ROOT_setns and NsTest.ROOT_getns failed in Linux 4.6

2016-05-24 Thread Abhishek Dasgupta (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298352#comment-15298352
 ] 

Abhishek Dasgupta commented on MESOS-5446:
--

Even after disabling lxcfs on Ubuntu 16.04, these tests are still failing.

> NsTest.ROOT_setns and NsTest.ROOT_getns failed in Linux 4.6
> ---
>
> Key: MESOS-5446
> URL: https://issues.apache.org/jira/browse/MESOS-5446
> Project: Mesos
>  Issue Type: Bug
>Reporter: haosdent
>Assignee: haosdent
>Priority: Minor
>
> From [~nthakkar%40us.ibm.com]
> {quote}
> Currently because "cgroup" namespace is not supported, following two 
> test-case are failing:
> 1. NsTest.ROOT_setns
> 2. NsTest.ROOT_getns
> The error observed is : "nstype: Unknown namespace 'cgroup'"
> This is because the contents of the directory "/proc/self/ns" has been 
> changed in kernel version 4.6 (cgroup is added).
> {quote}





[jira] [Commented] (MESOS-5430) Design the improvement of the home page of mesos.apache.org

2016-05-24 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298260#comment-15298260
 ] 

haosdent commented on MESOS-5430:
-

Or you could use sketch plugin https://github.com/utom/sketch-measure to export 
a measure html page.

> Design the improvement of the home page of mesos.apache.org
> ---
>
> Key: MESOS-5430
> URL: https://issues.apache.org/jira/browse/MESOS-5430
> Project: Mesos
>  Issue Type: Improvement
>  Components: project website
>Reporter: Vinod Kone
>Assignee: Jonathan Manalus
>
> The idea is to come up with a minimal improvement for the design of the home 
> page of mesos.apache.org.
> Proposed Redesign: https://invis.io/CV7DZF1JW#/159898819_Mesos-apache-org





[jira] [Commented] (MESOS-5406) Validate ACLs on creating an instance of local authorizer.

2016-05-24 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298130#comment-15298130
 ] 

Jay Guo commented on MESOS-5406:


Some more thoughts:
# Should we sort ACLs and apply some mechanism like longest-prefix matching in a 
routing table, instead of relying on the order in which the user specified them?
# Should we also aggregate ACLs for a given action? I saw this TODO in the codebase: 
TODO(vinod): Do aggregation of ACLs when possible.

> Validate ACLs on creating an instance of local authorizer.
> --
>
> Key: MESOS-5406
> URL: https://issues.apache.org/jira/browse/MESOS-5406
> Project: Mesos
>  Issue Type: Improvement
>  Components: security
>Reporter: Alexander Rukletsov
>Assignee: Jay Guo
>  Labels: mesosphere, security
>
> Some combinations of ACLs are not allowed, for example, specifying both 
> {{SetQuota}} and {{UpdateQuota}}. We should capture such issues and error out 
> early. 
> This ticket aims to add as many validations as possible to a dedicated 
> {{validate()}} routine, instead of having them implicitly in the codebase.
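
A minimal sketch of such a dedicated routine, assuming a heavily simplified
model: the `ACLs` struct, its field names, and `createLocalAuthorizer` are
hypothetical stand-ins, not the Mesos protobuf messages or the authorizer API.
Only the one rule named in the ticket is checked.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the ACLs configuration.
struct ACLs
{
  std::vector<std::string> set_quotas;
  std::vector<std::string> update_quotas;
};

// Returns an empty string if `acls` is valid, otherwise an error message.
inline std::string validate(const ACLs& acls)
{
  if (!acls.set_quotas.empty() && !acls.update_quotas.empty()) {
    return "ACLs cannot specify both 'SetQuota' and 'UpdateQuota'";
  }

  return "";
}

// Sketch of erroring out early: validation runs once at creation time
// instead of being scattered across later authorization calls.
inline bool createLocalAuthorizer(const ACLs& acls, std::string* error)
{
  assert(error != nullptr);
  *error = validate(acls);
  return error->empty();
}
```

Further checks would be added to `validate()` as additional branches, keeping
every rule in one place.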





[jira] [Commented] (MESOS-5410) Support cgroup namespace in unified container

2016-05-24 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297786#comment-15297786
 ] 

haosdent commented on MESOS-5410:
-

Cool! Could you send an email to the dev mailing list to become a contributor in 
JIRA, so that I can change the assignee of MESOS-5446 to you?

> Support cgroup namespace in unified container
> -
>
> Key: MESOS-5410
> URL: https://issues.apache.org/jira/browse/MESOS-5410
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Qian Zhang
>Assignee: Qian Zhang
>
> In the Linux 4.6 kernel, a new namespace (cgroup namespace) was introduced so 
> that a process can be created in its own cgroup namespace and the global 
> cgroup hierarchy is not leaked to the process. See the following link 
> for more details about this namespace:
> http://man7.org/linux/man-pages/man7/cgroup_namespaces.7.html
> We need to support this namespace in unified container to provide better 
> isolation for the containers created by Mesos.





[jira] [Commented] (MESOS-5410) Support cgroup namespace in unified container

2016-05-24 Thread Nirav (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297764#comment-15297764
 ] 

Nirav commented on MESOS-5410:
--

Hi,
Alternatively, we can add a macro to the file. I tried adding it and it worked well. 
It would also help in the future.

#ifndef CLONE_NEWCGROUP
#define CLONE_NEWCGROUP 0x02000000
#endif

and

nstypes["cgroup"] = CLONE_NEWCGROUP;

I can submit the required patch.
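
For illustration, the macro-plus-lookup idea can be written as a
self-contained sketch. The flag values are those in the kernel's
`linux/sched.h` (`CLONE_NEWCGROUP` is `0x02000000`); the functions
`namespaceTypes()` and `nstype()` below are hypothetical helpers and only
approximate how the Mesos namespace code is organized.

```cpp
#include <sched.h>

#include <cassert>
#include <map>
#include <string>

// Fallback definitions for build hosts whose headers predate these flags.
// The values match the kernel's <linux/sched.h>.
#ifndef CLONE_NEWNS
#define CLONE_NEWNS 0x00020000
#endif

#ifndef CLONE_NEWCGROUP
#define CLONE_NEWCGROUP 0x02000000
#endif

// Hypothetical table from a /proc/<pid>/ns entry name to its clone flag.
inline std::map<std::string, int> namespaceTypes()
{
  std::map<std::string, int> nstypes;
  nstypes["mnt"] = CLONE_NEWNS;
  nstypes["cgroup"] = CLONE_NEWCGROUP;
  return nstypes;
}

// Returns the clone flag for `ns`, or -1 if the namespace is unknown.
inline int nstype(const std::string& ns)
{
  assert(!ns.empty());

  static const std::map<std::string, int> nstypes = namespaceTypes();

  std::map<std::string, int>::const_iterator it = nstypes.find(ns);
  return it == nstypes.end() ? -1 : it->second;
}
```

The `#ifndef` guard lets the same source build against both old and new
headers, while a 4.6 kernel's `cgroup` entry resolves to a known flag at run
time.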

> Support cgroup namespace in unified container
> -
>
> Key: MESOS-5410
> URL: https://issues.apache.org/jira/browse/MESOS-5410
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Qian Zhang
>Assignee: Qian Zhang
>
> In the Linux 4.6 kernel, a new namespace (cgroup namespace) was introduced so 
> that a process can be created in its own cgroup namespace and the global 
> cgroup hierarchy is not leaked to the process. See the following link 
> for more details about this namespace:
> http://man7.org/linux/man-pages/man7/cgroup_namespaces.7.html
> We need to support this namespace in unified container to provide better 
> isolation for the containers created by Mesos.





[jira] [Commented] (MESOS-4565) slave recovers and attempt to destroy executor's child containers, then begins rejecting task status updates

2016-05-24 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297752#comment-15297752
 ] 

haosdent commented on MESOS-4565:
-

[~giaosuddau] Did you encounter the following error?
{code}
E0130 02:22:21.009094 12686 containerizer.cpp:553] Failed to clean up an 
isolator when destroying orphan container kube-proxy: Failed to remove cgroup 
'/sys/fs/cgroup/memory/mesos/1d965a20-849c-40d8-9446-27cb723220a9/kube-proxy': 
Device or resource busy
{code}

A quick workaround is to unmount it manually so that the agent can recover successfully. 

> slave recovers and attempt to destroy executor's child containers, then 
> begins rejecting task status updates
> 
>
> Key: MESOS-4565
> URL: https://issues.apache.org/jira/browse/MESOS-4565
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 0.26.0
>Reporter: James DeFelice
>  Labels: mesosphere
>
> AFAICT the slave is doing this:
> 1) recovering from some kind of failure
> 2) checking the containers that it pulled from its state store
> 3) complaining about cgroup children hanging off of executor containers
> 4) rejecting task status updates related to the executor container, the first 
> of which in the logs is:
> {code}
> E0130 02:22:21.979852 12683 slave.cpp:2963] Failed to update resources for 
> container 1d965a20-849c-40d8-9446-27cb723220a9 of executor 
> 'd701ab48a0c0f13_k8sm-executor' running task 
> pod.f2dc2c43-c6f7-11e5-ad28-0ad18c5e6c7f on status update for terminal task, 
> destroying container: Container '1d965a20-849c-40d8-9446-27cb723220a9' not 
> found
> {code}
> To be fair, I don't believe that my custom executor is re-registering 
> properly with the slave prior to attempting to send these (failing) status 
> updates. But the slave doesn't complain about that .. it complains that it 
> can't find the **container**.
> slave log here:
> https://gist.github.com/jdef/265663461156b7a7ed4e



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (MESOS-5446) NsTest.ROOT_setns and NsTest.ROOT_getns failed in Linux 4.6

2016-05-24 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent reassigned MESOS-5446:
---

Assignee: haosdent

> NsTest.ROOT_setns and NsTest.ROOT_getns failed in Linux 4.6
> ---
>
> Key: MESOS-5446
> URL: https://issues.apache.org/jira/browse/MESOS-5446
> Project: Mesos
>  Issue Type: Bug
>Reporter: haosdent
>Assignee: haosdent
>Priority: Minor
>
> From [~nthakkar%40us.ibm.com]
> {quote}
> Currently because "cgroup" namespace is not supported, following two 
> test-case are failing:
> 1. NsTest.ROOT_setns
> 2. NsTest.ROOT_getns
> The error observed is : "nstype: Unknown namespace 'cgroup'"
> This is because the contents of the directory "/proc/self/ns" has been 
> changed in kernel version 4.6 (cgroup is added).
> {quote}





[jira] [Created] (MESOS-5446) NsTest.ROOT_setns and NsTest.ROOT_getns failed in Linux 4.6

2016-05-24 Thread haosdent (JIRA)
haosdent created MESOS-5446:
---

 Summary: NsTest.ROOT_setns and NsTest.ROOT_getns failed in Linux 
4.6
 Key: MESOS-5446
 URL: https://issues.apache.org/jira/browse/MESOS-5446
 Project: Mesos
  Issue Type: Bug
Reporter: haosdent
Priority: Minor


From [~nthakkar%40us.ibm.com]
{quote}
Currently because "cgroup" namespace is not supported, following two test-case 
are failing:
1. NsTest.ROOT_setns
2. NsTest.ROOT_getns
The error observed is : "nstype: Unknown namespace 'cgroup'"
This is because the contents of the directory "/proc/self/ns" has been changed 
in kernel version 4.6 (cgroup is added).
{quote}





[jira] [Commented] (MESOS-5410) Support cgroup namespace in unified container

2016-05-24 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297740#comment-15297740
 ] 

haosdent commented on MESOS-5410:
-

I think we could add
{code}
namespaces.erase("cgroup");
{code}
as a workaround. Let me file a jira for this.
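
A small sketch of what that workaround amounts to, assuming the set of
namespace names was read from `/proc/self/ns` on a 4.6 kernel. The helper
name `supportedNamespaces` is hypothetical and not taken from the Mesos
sources.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical helper: drops the namespace that the library cannot yet map
// to a clone flag, so callers iterating over the set never ask for it.
inline std::set<std::string> supportedNamespaces(
    std::set<std::string> namespaces)
{
  // Erasing a key that is absent is a no-op, so this is safe on older
  // kernels that do not expose a "cgroup" entry.
  namespaces.erase("cgroup");

  assert(namespaces.count("cgroup") == 0);
  return namespaces;
}
```

This hides the new namespace rather than supporting it, which is why it is
only a stopgap until the cgroup namespace is handled properly.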

> Support cgroup namespace in unified container
> -
>
> Key: MESOS-5410
> URL: https://issues.apache.org/jira/browse/MESOS-5410
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Qian Zhang
>Assignee: Qian Zhang
>
> In the Linux 4.6 kernel, a new namespace (cgroup namespace) was introduced so 
> that a process can be created in its own cgroup namespace and the global 
> cgroup hierarchy is not leaked to the process. See the following link 
> for more details about this namespace:
> http://man7.org/linux/man-pages/man7/cgroup_namespaces.7.html
> We need to support this namespace in unified container to provide better 
> isolation for the containers created by Mesos.


