[jira] [Created] (MESOS-5450) Make authentication pluggable
Alex Clemmer created MESOS-5450: --- Summary: Make authentication pluggable Key: MESOS-5450 URL: https://issues.apache.org/jira/browse/MESOS-5450 Project: Mesos Issue Type: Bug Components: slave Reporter: Alex Clemmer Assignee: Alex Clemmer Right now there is a hard dependency on SASL, which probably won't work well on Windows (at least) in the near future for our use cases. In the future, it would be nice to have a pluggable authentication layer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5380) Killing a queued task can cause the corresponding command executor to never terminate.
[ https://issues.apache.org/jira/browse/MESOS-5380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299446#comment-15299446 ] Vinod Kone commented on MESOS-5380: --- Committed phase 2. commit 8be9b5b5decd9ec2bcad547b1dff29b777cbc438 Author: Vinod Kone
Date: Sun May 15 12:31:31 2016 -0700 Fixed agent to properly handle killTask during agent restart. If the agent restarts after handling killTask but before sending shutdown message to the executor, we ensure the executor terminates. Review: https://reviews.apache.org/r/47402 > Killing a queued task can cause the corresponding command executor to never > terminate. > -- > > Key: MESOS-5380 > URL: https://issues.apache.org/jira/browse/MESOS-5380 > Project: Mesos > Issue Type: Bug > Components: slave >Affects Versions: 0.28.0, 0.28.1 >Reporter: Jie Yu >Assignee: Vinod Kone >Priority: Blocker > Labels: mesosphere > Fix For: 0.29.0, 0.28.2 > > > We observed this in our testing environment. Sequence of events: > 1) A command task is queued since the executor has not registered yet. > 2) The framework issues a killTask. > 3) Since the executor is in REGISTERING state, the agent calls > `statusUpdate(TASK_KILLED, UPID())` > 4) `statusUpdate` now will call `containerizer->status()` before calling > `executor->terminateTask(status.task_id(), status);` which will remove the > queued task. (Introduced in this patch: https://reviews.apache.org/r/43258). > 5) Since the above is async, it's possible that the task is still in the queued > tasks when we try to see whether we need to kill the unregistered executor in > `killTask`: > {code} > // TODO(jieyu): Here, we kill the executor if it no longer has > // any task to run and has not yet registered. This is a > // workaround for those single task executors that do not have a > // proper self terminating logic when they haven't received the > // task within a timeout. 
> if (executor->queuedTasks.empty()) {
>   CHECK(executor->launchedTasks.empty())
>     << " Unregistered executor '" << executor->id
>     << "' has launched tasks";
>   LOG(WARNING) << "Killing the unregistered executor " << *executor
>                << " because it has no tasks";
>   executor->state = Executor::TERMINATING;
>   containerizer->destroy(executor->containerId);
> }
> {code}
> 6) Consequently, the executor will never be terminated by Mesos.
> Attaching the relevant agent log:
> {noformat}
> May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: I0513 15:36:13.640527 1342 slave.cpp:1361] Got assigned task mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6 for framework a3ad8418-cb77-4705-b353-4b514ceca52c-
> May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: I0513 15:36:13.641034 1342 slave.cpp:1480] Launching task mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6 for framework a3ad8418-cb77-4705-b353-4b514ceca52c-
> May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: I0513 15:36:13.641440 1342 paths.cpp:528] Trying to chown '/var/lib/mesos/slave/slaves/a3ad8418-cb77-4705-b353-4b514ceca52c-S0/frameworks/a3ad8418-cb77-4705-b353-4b514ceca52c-/executors/mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6/runs/24762d43-2134-475e-b724-caa72110497a' to user 'root'
> May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: I0513 15:36:13.644664 1342 slave.cpp:5389] Launching executor mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6 of framework a3ad8418-cb77-4705-b353-4b514ceca52c- with resources cpus(*):0.1; mem(*):32 in work directory '/var/lib/mesos/slave/slaves/a3ad8418-cb77-4705-b353-4b514ceca52c-S0/frameworks/a3ad8418-cb77-4705-b353-4b514ceca52c-/executors/mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6/runs/24762d43-2134-475e-b724-caa72110497a'
> May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: I0513 15:36:13.645195 1342 slave.cpp:1698] Queuing task 'mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6' for executor 'mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6' of framework a3ad8418-cb77-4705-b353-4b514ceca52c-
> May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: I0513 15:36:13.645491 1338 containerizer.cpp:671] Starting container '24762d43-2134-475e-b724-caa72110497a' for executor 'mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6' of framework 'a3ad8418-cb77-4705-b353-4b514ceca52c-'
> May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: I0513 15:36:13.647897 1345 cpushare.cpp:389] Updated 'cpu.shares' to 1126
>
[jira] [Commented] (MESOS-5449) Memory leak in SchedulerProcess.declineOffer
[ https://issues.apache.org/jira/browse/MESOS-5449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299439#comment-15299439 ] Vinod Kone commented on MESOS-5449: --- commit 927bec15d94e40928180769300b239a7e6bb9d6f Author: Dario Rexin
Date: Tue May 24 21:07:20 2016 -0700 Fixed a memory leak in SchedulerProcess.decline. MesosScheduler.declineOffers has been changed ~6 months ago to send a Decline message instead of calling acceptOffers with an empty list of task infos. The changed version of declineOffer however did not remove the offerId from the savedOffers map, causing a memory leak. Review: https://reviews.apache.org/r/47804/ > Memory leak in SchedulerProcess.declineOffer > > > Key: MESOS-5449 > URL: https://issues.apache.org/jira/browse/MESOS-5449 > Project: Mesos > Issue Type: Bug > Components: scheduler driver >Affects Versions: 0.26.0, 0.27.0, 0.27.1, 0.28.0, 0.26.1, 0.28.1 >Reporter: Dario Rexin >Assignee: Dario Rexin >Priority: Blocker > Fix For: 0.29.0, 0.27.3, 0.28.2, 0.26.2 > > > MesosScheduler.declineOffers has been changed ~6 months ago to send a Decline > message instead of calling acceptOffers with an empty list of task infos. The > changed version of declineOffer however did not remove the offerId from the > savedOffers map, causing a memory leak. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5449) Memory leak in SchedulerProcess.declineOffer
[ https://issues.apache.org/jira/browse/MESOS-5449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Wu updated MESOS-5449: - Shepherd: Vinod Kone > Memory leak in SchedulerProcess.declineOffer > > > Key: MESOS-5449 > URL: https://issues.apache.org/jira/browse/MESOS-5449 > Project: Mesos > Issue Type: Bug > Components: scheduler driver >Affects Versions: 0.26.0, 0.27.0, 0.27.1, 0.28.0, 0.26.1, 0.28.1 >Reporter: Dario Rexin >Assignee: Dario Rexin >Priority: Blocker > Fix For: 0.29.0, 0.27.3, 0.28.2, 0.26.2 > > > MesosScheduler.declineOffers has been changed ~6 months ago to send a Decline > message instead of calling acceptOffers with an empty list of task infos. The > changed version of declineOffer however did not remove the offerId from the > savedOffers map, causing a memory leak. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5449) Memory leak in SchedulerProcess.declineOffer
[ https://issues.apache.org/jira/browse/MESOS-5449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-5449: -- Fix Version/s: 0.26.2 0.28.2 0.27.3 0.29.0 > Memory leak in SchedulerProcess.declineOffer > > > Key: MESOS-5449 > URL: https://issues.apache.org/jira/browse/MESOS-5449 > Project: Mesos > Issue Type: Bug > Components: scheduler driver >Affects Versions: 0.26.0, 0.27.0, 0.27.1, 0.28.0, 0.26.1, 0.28.1 >Reporter: Dario Rexin >Assignee: Dario Rexin >Priority: Blocker > Fix For: 0.29.0, 0.27.3, 0.28.2, 0.26.2 > > > MesosScheduler.declineOffers has been changed ~6 months ago to send a Decline > message instead of calling acceptOffers with an empty list of task infos. The > changed version of declineOffer however did not remove the offerId from the > savedOffers map, causing a memory leak. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5449) Memory leak in SchedulerProcess.declineOffer
[ https://issues.apache.org/jira/browse/MESOS-5449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299269#comment-15299269 ] Jie Yu commented on MESOS-5449: --- [~vinodkone] Can you take a look? > Memory leak in SchedulerProcess.declineOffer > > > Key: MESOS-5449 > URL: https://issues.apache.org/jira/browse/MESOS-5449 > Project: Mesos > Issue Type: Bug > Components: scheduler driver >Affects Versions: 0.26.0, 0.27.0, 0.27.1, 0.28.0, 0.26.1, 0.28.1 >Reporter: Dario Rexin >Assignee: Dario Rexin >Priority: Blocker > > MesosScheduler.declineOffers has been changed ~6 months ago to send a Decline > message instead of calling acceptOffers with an empty list of task infos. The > changed version of declineOffer however did not remove the offerId from the > savedOffers map, causing a memory leak. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5449) Memory leak in SchedulerProcess.declineOffer
[ https://issues.apache.org/jira/browse/MESOS-5449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299256#comment-15299256 ] Dario Rexin commented on MESOS-5449: Review request is here: https://reviews.apache.org/r/47804/ > Memory leak in SchedulerProcess.declineOffer > > > Key: MESOS-5449 > URL: https://issues.apache.org/jira/browse/MESOS-5449 > Project: Mesos > Issue Type: Bug > Components: scheduler driver >Affects Versions: 0.26.0, 0.27.0, 0.27.1, 0.28.0, 0.26.1, 0.28.1 >Reporter: Dario Rexin >Assignee: Dario Rexin >Priority: Blocker > > MesosScheduler.declineOffers has been changed ~6 months ago to send a Decline > message instead of calling acceptOffers with an empty list of task infos. The > changed version of declineOffer however did not remove the offerId from the > savedOffers map, causing a memory leak. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-5449) Memory leak in SchedulerProcess.declineOffer
Dario Rexin created MESOS-5449: -- Summary: Memory leak in SchedulerProcess.declineOffer Key: MESOS-5449 URL: https://issues.apache.org/jira/browse/MESOS-5449 Project: Mesos Issue Type: Bug Components: scheduler driver Affects Versions: 0.28.1, 0.26.1, 0.28.0, 0.27.1, 0.27.0, 0.26.0 Reporter: Dario Rexin Assignee: Dario Rexin Priority: Blocker MesosScheduler.declineOffers has been changed ~6 months ago to send a Decline message instead of calling acceptOffers with an empty list of task infos. The changed version of declineOffer however did not remove the offerId from the savedOffers map, causing a memory leak. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5153) Sandboxes contents should be protected from unauthorized users
[ https://issues.apache.org/jira/browse/MESOS-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299028#comment-15299028 ] Alexander Rojas commented on MESOS-5153: [r/47794/|https://reviews.apache.org/r/47794/]: Added authorization support for {{mesos::internal::Files}}. [r/47795/|https://reviews.apache.org/r/47795/]: Enabled authorization for sandboxes. > Sandboxes contents should be protected from unauthorized users > -- > > Key: MESOS-5153 > URL: https://issues.apache.org/jira/browse/MESOS-5153 > Project: Mesos > Issue Type: Bug > Components: security, slave >Reporter: Alexander Rojas >Assignee: Alexander Rojas > Labels: mesosphere, security > Fix For: 0.29.0 > > > MESOS-4956 introduced authentication support for the sandboxes. However, > authentication can only go as far as to tell whether a user is known to > mesos or not. An additional step is necessary to verify whether the > known user is allowed to execute the requested operation on the sandbox > (browse, read, download, debug). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-5448) Persistent volume deletion on the agent should survive slave restart
Anindya Sinha created MESOS-5448: Summary: Persistent volume deletion on the agent should survive slave restart Key: MESOS-5448 URL: https://issues.apache.org/jira/browse/MESOS-5448 Project: Mesos Issue Type: Bug Components: general Reporter: Anindya Sinha Assignee: Anindya Sinha When the master sends a CheckpointResourcesMessage to the agent, the agent attempts to rmdir the persistent volume (if it existed before, and is no longer in the updated checkpoint in CheckpointResourcesMessage). If the slave restarts before the operation finishes, the disk space can be leaked because the rmdir is not reattempted. Subsequently, a CREATE on the same path could leak the data to another framework (since the directory was not rm-ed). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4757) Mesos containerizer should get uid/gids before pivot_root.
[ https://issues.apache.org/jira/browse/MESOS-4757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298975#comment-15298975 ] Gilbert Song commented on MESOS-4757: - [~idownes], Kevin proposed a solution for host user -> container user around two months ago via mailing list. Could you take a look at it to see whether it may break your cases? Thanks! :) https://docs.google.com/document/d/1ENNJKyPrqqm8OsYV8-dDoHTiRmqtuVbcdzNWj1nURsQ/edit#heading=h.j9cu8f69ljik > Mesos containerizer should get uid/gids before pivot_root. > -- > > Key: MESOS-4757 > URL: https://issues.apache.org/jira/browse/MESOS-4757 > Project: Mesos > Issue Type: Bug >Reporter: Jie Yu >Assignee: Jie Yu > > Currently, we call os::su(user) after pivot_root. This is problematic because > /etc/passwd and /etc/group might be missing in container's root filesystem. > We should instead, get the uid/gids before pivot_root, and call > setuid/setgroups after pivot_root. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (MESOS-3085) Make failed on Ubuntu 14.04 ppc64le
[ https://issues.apache.org/jira/browse/MESOS-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomasz Janiszewski updated MESOS-3085: -- Comment: was deleted (was: Duplicate: MESOS-5263) > Make failed on Ubuntu 14.04 ppc64le > --- > > Key: MESOS-3085 > URL: https://issues.apache.org/jira/browse/MESOS-3085 > Project: Mesos > Issue Type: Bug > Components: build >Affects Versions: 0.23.0, 0.24.0 > Environment: Ubuntu 14.04 ppc64le >Reporter: Jihun Kang >Assignee: Jihun Kang > Fix For: 0.29.0 > > > When trying to compile linux/fs.cpp, make failed with the following message. > {noformat} > /bin/bash ../libtool --tag=CXX --mode=compile g++ -DPACKAGE_NAME=\"mesos\" > -DPACKAGE_TARNAME=\"mesos\" -DPACKAGE_VERSION=\"0.24.0\" > -DPACKAGE_STRING=\"mesos\ 0.24.0\" -DPACKAGE_BUGREPORT=\"\" > -DPACKAGE_URL=\"\" -DPACKAGE=\"mesos\" -DVERSION=\"0.24.0\" -DSTDC_HEADERS=1 > -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 > -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 > -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\" > -DHAVE_PTHREAD_PRIO_INHERIT=1 -DHAVE_PTHREAD=1 -DHAVE_LIBZ=1 -DHAVE_LIBCURL=1 > -DHAVE_APR_POOLS_H=1 -DHAVE_LIBAPR_1=1 -DHAVE_SVN_VERSION_H=1 > -DHAVE_LIBSVN_SUBR_1=1 -DHAVE_SVN_DELTA_H=1 -DHAVE_LIBSVN_DELTA_1=1 > -DHAVE_LIBSASL2=1 -DMESOS_HAS_JAVA=1 -DHAVE_PYTHON=\"2.7\" > -DMESOS_HAS_PYTHON=1 -I. 
-I../../src -Wall -Werror > -DLIBDIR=\"/usr/local/lib\" -DPKGLIBEXECDIR=\"/usr/local/libexec/mesos\" > -DPKGDATADIR=\"/usr/local/share/mesos\" -I../../include > -I../../3rdparty/libprocess/include > -I../../3rdparty/libprocess/3rdparty/stout/include -I../include > -I../include/mesos -I../3rdparty/libprocess/3rdparty/boost-1.53.0 > -I../3rdparty/libprocess/3rdparty/picojson-4f93734 > -I../3rdparty/libprocess/3rdparty/protobuf-2.5.0/src > -I../3rdparty/libprocess/3rdparty/glog-0.3.3/src > -I../3rdparty/libprocess/3rdparty/glog-0.3.3/src > -I../3rdparty/leveldb/include -I../3rdparty/zookeeper-3.4.5/src/c/include > -I../3rdparty/zookeeper-3.4.5/src/c/generated > -I../3rdparty/libprocess/3rdparty/protobuf-2.5.0/src > -I/usr/include/subversion-1 -I/usr/include/apr-1 -I/usr/include/apr-1.0 > -pthread -g1 -O0 -Wno-unused-local-typedefs -std=c++11 -MT > linux/libmesos_no_3rdparty_la-fs.lo -MD -MP -MF > linux/.deps/libmesos_no_3rdparty_la-fs.Tpo -c -o > linux/libmesos_no_3rdparty_la-fs.lo `test -f 'linux/fs.cpp' || echo > '../../src/'`linux/fs.cpp > libtool: compile: g++ -DPACKAGE_NAME=\"mesos\" -DPACKAGE_TARNAME=\"mesos\" > -DPACKAGE_VERSION=\"0.24.0\" "-DPACKAGE_STRING=\"mesos 0.24.0\"" > -DPACKAGE_BUGREPORT=\"\" -DPACKAGE_URL=\"\" -DPACKAGE=\"mesos\" > -DVERSION=\"0.24.0\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 > -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 > -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 > -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\" -DHAVE_PTHREAD_PRIO_INHERIT=1 > -DHAVE_PTHREAD=1 -DHAVE_LIBZ=1 -DHAVE_LIBCURL=1 -DHAVE_APR_POOLS_H=1 > -DHAVE_LIBAPR_1=1 -DHAVE_SVN_VERSION_H=1 -DHAVE_LIBSVN_SUBR_1=1 > -DHAVE_SVN_DELTA_H=1 -DHAVE_LIBSVN_DELTA_1=1 -DHAVE_LIBSASL2=1 > -DMESOS_HAS_JAVA=1 -DHAVE_PYTHON=\"2.7\" -DMESOS_HAS_PYTHON=1 -I. 
-I../../src > -Wall -Werror -DLIBDIR=\"/usr/local/lib\" > -DPKGLIBEXECDIR=\"/usr/local/libexec/mesos\" > -DPKGDATADIR=\"/usr/local/share/mesos\" -I../../include > -I../../3rdparty/libprocess/include > -I../../3rdparty/libprocess/3rdparty/stout/include -I../include > -I../include/mesos -I../3rdparty/libprocess/3rdparty/boost-1.53.0 > -I../3rdparty/libprocess/3rdparty/picojson-4f93734 > -I../3rdparty/libprocess/3rdparty/protobuf-2.5.0/src > -I../3rdparty/libprocess/3rdparty/glog-0.3.3/src > -I../3rdparty/libprocess/3rdparty/glog-0.3.3/src > -I../3rdparty/leveldb/include -I../3rdparty/zookeeper-3.4.5/src/c/include > -I../3rdparty/zookeeper-3.4.5/src/c/generated > -I../3rdparty/libprocess/3rdparty/protobuf-2.5.0/src > -I/usr/include/subversion-1 -I/usr/include/apr-1 -I/usr/include/apr-1.0 > -pthread -g1 -O0 -Wno-unused-local-typedefs -std=c++11 -MT > linux/libmesos_no_3rdparty_la-fs.lo -MD -MP -MF > linux/.deps/libmesos_no_3rdparty_la-fs.Tpo -c ../../src/linux/fs.cpp -fPIC > -DPIC -o linux/.libs/libmesos_no_3rdparty_la-fs.o > ../../src/linux/fs.cpp:346:2: error: #error "pivot_root is not available" > #error "pivot_root is not available" > ^ > ../../src/linux/fs.cpp: In function 'Try > mesos::internal::fs::pivot_root(const string&, const string&)': > ../../src/linux/fs.cpp:348:7: error: 'ret' was not declared in this scope >if (ret == -1) { >^ > make[2]: *** [linux/libmesos_no_3rdparty_la-fs.lo] Error 1 > make[2]: *** Waiting for unfinished jobs > {noformat} -- This
[jira] [Commented] (MESOS-970) Upgrade bundled leveldb to 1.18
[ https://issues.apache.org/jira/browse/MESOS-970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298963#comment-15298963 ] Vinod Kone commented on MESOS-970: -- Looks great. Thanks for sharing! > Upgrade bundled leveldb to 1.18 > --- > > Key: MESOS-970 > URL: https://issues.apache.org/jira/browse/MESOS-970 > Project: Mesos > Issue Type: Improvement > Components: replicated log >Reporter: Benjamin Mahler >Assignee: Tomasz Janiszewski > > We currently bundle leveldb 1.4, and the latest version is leveldb 1.18. > Upgrade to 1.18 could solve the problems when build Mesos in some non-x86 > architecture CPU. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5237) The windows version of `os::access` has differing behavior than the POSIX version.
[ https://issues.apache.org/jira/browse/MESOS-5237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Park updated MESOS-5237: - Description: The POSIX version of {{os::access}} looks like this:
{code}
inline Try<bool> access(const std::string& path, int how)
{
  if (::access(path.c_str(), how) < 0) {
    if (errno == EACCES) {
      return false;
    } else {
      return ErrnoError();
    }
  }
  return true;
}
{code}
Compare this to the Windows version of {{os::access}}, which looks like the following:
{code}
inline Try<bool> access(const std::string& fileName, int how)
{
  if (::_access(fileName.c_str(), how) != 0) {
    return ErrnoError("access: Could not access path '" + fileName + "'");
  }
  return true;
}
{code}
As we can see, the case where {{errno}} is set to {{EACCES}} is handled differently between the 2 functions. We can actually consolidate the 2 functions by simply using the POSIX version. The challenge is that we should use {{::access}} on POSIX and {{::_access}} on Windows. Note however, that this problem is already solved, as we have an implementation of {{::access}} for Windows in {{3rdparty/libprocess/3rdparty/stout/include/stout/windows.hpp}} which simply defers to {{::_access}}. Thus, I propose to simply consolidate the 2 implementations. was: The POSIX version of {{os::access}} looks like this:
{code}
inline Try<bool> access(const std::string& path, int how)
{
  if (::access(path.c_str(), how) < 0) {
    if (errno == EACCES) {
      return false;
    } else {
      return ErrnoError();
    }
  }
  return true;
}
{code}
Compare this to the Windows version of {{os::access}}, which looks like the following:
{code}
inline Try<bool> access(const std::string& fileName, int how)
{
  if (::_access(fileName.c_str(), how) != 0) {
    return ErrnoError("access: Could not access path '" + fileName + "'");
  }
  return true;
}
{code}
As we can see, the case where {{errno}} is set to {{EACCES}} is handled differently between the 2 functions. We can actually consolidate the 2 functions by simply using the POSIX version. The challenge is that on POSIX, we should use {{::access}} and {{_::access}} on Windows. Note however, that this problem is already solved, as we have an implementation of {{::access}} for Windows in {{3rdparty/libprocess/3rdparty/stout/include/stout/windows.hpp}} which simply defers to {{::_access}}. Thus, I propose to simply consolidate the 2 implementations.
> The windows version of `os::access` has differing behavior than the POSIX
> version.
> --
>
> Key: MESOS-5237
> URL: https://issues.apache.org/jira/browse/MESOS-5237
> Project: Mesos
> Issue Type: Bug
> Components: stout
>Reporter: Michael Park
>Assignee: Michael Park
> Labels: mesosphere, windows
>
> The POSIX version of {{os::access}} looks like this:
> {code}
> inline Try<bool> access(const std::string& path, int how)
> {
>   if (::access(path.c_str(), how) < 0) {
>     if (errno == EACCES) {
>       return false;
>     } else {
>       return ErrnoError();
>     }
>   }
>   return true;
> }
> {code}
> Compare this to the Windows version of {{os::access}}, which looks like the following:
> {code}
> inline Try<bool> access(const std::string& fileName, int how)
> {
>   if (::_access(fileName.c_str(), how) != 0) {
>     return ErrnoError("access: Could not access path '" + fileName + "'");
>   }
>   return true;
> }
> {code}
> As we can see, the case where {{errno}} is set to {{EACCES}} is handled
> differently between the 2 functions.
> We can actually consolidate the 2 functions by simply using the POSIX
> version. The challenge is that we should use {{::access}} on POSIX and
> {{::_access}} on Windows. Note however, that this problem is already solved,
> as we have an implementation of {{::access}} for Windows in
> {{3rdparty/libprocess/3rdparty/stout/include/stout/windows.hpp}} which simply
> defers to {{::_access}}.
> Thus, I propose to simply consolidate the 2 implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5447) Configurable agent memory reservation
[ https://issues.apache.org/jira/browse/MESOS-5447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Isenberg updated MESOS-5447: - Issue Type: Improvement (was: Story) > Configurable agent memory reservation > - > > Key: MESOS-5447 > URL: https://issues.apache.org/jira/browse/MESOS-5447 > Project: Mesos > Issue Type: Improvement >Reporter: Karl Isenberg > > When deciding what memory to make available to tasks (if not explicitly > configured), mesos agents currently reserve 1GB or 1/2 of system memory. > I'd really like to be able to configure the reservation amount so that I > don't have to reproduce the system memory lookup and math in order to > configure "mem=(system memory - reservation)". > Ideally this would be configurable with a command line argument and > environment variable like "--reserved-memory=512". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-5447) Configurable agent memory reservation
Karl Isenberg created MESOS-5447: Summary: Configurable agent memory reservation Key: MESOS-5447 URL: https://issues.apache.org/jira/browse/MESOS-5447 Project: Mesos Issue Type: Story Reporter: Karl Isenberg When deciding what memory to make available to tasks (if not explicitly configured), mesos agents currently reserve 1GB or 1/2 of system memory. I'd really like to be able to configure the reservation amount so that I don't have to reproduce the system memory lookup and math in order to configure "mem=(system memory - reservation)". Ideally this would be configurable with a command line argument and environment variable like "--reserved-memory=512". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5294) Status updates after a health check are incomplete or invalid
[ https://issues.apache.org/jira/browse/MESOS-5294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298731#comment-15298731 ] Dmitry Fedorov commented on MESOS-5294: --- I've hit the same issue; it occurs when we are using custom networking in Docker, as [~thegner] described. I've managed to "fix" it by duplicating
{code}
inspect = docker->inspect(containerName, DOCKER_INSPECT_DELAY)
  .then(defer(self(), [=](const Docker::Container& container) {
    if (!killed) {
      TaskStatus status;
      status.mutable_task_id()->CopyFrom(taskId.get());
      status.set_state(TASK_RUNNING);
      status.set_data(container.output);
      if (container.ipAddress.isSome()) {
        // TODO(karya): Deprecated -- Remove after 0.25.0 has shipped.
        Label* label = status.mutable_labels()->add_labels();
        label->set_key("Docker.NetworkSettings.IPAddress");
        label->set_value(container.ipAddress.get());

        NetworkInfo* networkInfo =
          status.mutable_container_status()->add_network_infos();

        // TODO(CD): Deprecated -- Remove after 0.27.0.
        networkInfo->set_ip_address(container.ipAddress.get());

        NetworkInfo::IPAddress* ipAddress =
          networkInfo->add_ip_addresses();
        ipAddress->set_ip_address(container.ipAddress.get());
      }
{code}
from the `launchTask` method in src/docker/executor.cpp into the `taskHealthUpdated` method. > Status updates after a health check are incomplete or invalid > - > > Key: MESOS-5294 > URL: https://issues.apache.org/jira/browse/MESOS-5294 > Project: Mesos > Issue Type: Bug > Environment: mesos 0.28.0, docker 1.11, marathon 0.15.3, mesos-dns, > ubuntu 14.04 >Reporter: Travis Hegner >Assignee: Travis Hegner > > With command health checks enabled via marathon, mesos-dns will resolve the > task correctly until the task is reported as "healthy". At that point, > mesos-dns stops resolving the task correctly. 
> -Digging through src/docker/executor.cpp, I found that the > {{taskHealthUpdated()}} function is attempting to copy the taskID to the new > status instance with- > {code}status.mutable_task_id()->CopyFrom(taskID);{code} > -but other instances of status updates have a similar line- > {code}status.mutable_task_id()->CopyFrom(taskID.get());{code} > -My assumption is that this difference is causing the status update after a > health check to not have a proper taskID, which in turn is causing an > incorrect state.json output.- > -I'll try to get a patch together soon.- > UPDATE: > None of the above assumptions are correct. Something else is causing the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5421) Mesos Docker executor taskHealthUpdated removes information about job ipAddresses
[ https://issues.apache.org/jira/browse/MESOS-5421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298715#comment-15298715 ] Dmitry Fedorov commented on MESOS-5421: --- [~kaysoky], [~jieyu] It seems that my issue is a duplicate of MESOS-5294; I'll add more information there. > Mesos Docker executor taskHealthUpdated removes information about job > ipAddresses > - > > Key: MESOS-5421 > URL: https://issues.apache.org/jira/browse/MESOS-5421 > Project: Mesos > Issue Type: Bug > Components: slave >Affects Versions: 0.28.1 >Reporter: Dmitry Fedorov >Priority: Minor > Fix For: 0.29.0 > > > When you create a job with a command health check, right after the job is launched > the status is correct and the ipAddresses field is present in it. > But after the health status is updated, the ipAddresses field is missing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5430) Design the improvement of the home page of mesos.apache.org
[ https://issues.apache.org/jira/browse/MESOS-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298655#comment-15298655 ] haosdent commented on MESOS-5430: - Thank you very much for your sketch file; let me update the page to match your design. > Design the improvement of the home page of mesos.apache.org > --- > > Key: MESOS-5430 > URL: https://issues.apache.org/jira/browse/MESOS-5430 > Project: Mesos > Issue Type: Improvement > Components: project website >Reporter: Vinod Kone >Assignee: Jonathan Manalus > > The idea is to come up with a minimal improvement for the design of the home > page of mesos.apache.org. > Proposed Redesign: https://invis.io/CV7DZF1JW#/159898819_Mesos-apache-org -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-970) Upgrade bundled leveldb to 1.18
[ https://issues.apache.org/jira/browse/MESOS-970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298583#comment-15298583 ] haosdent commented on MESOS-970: I created a document to describe my tests: https://docs.google.com/document/d/1fv2OMvH6hVm6waacOejSrTJwUuDQeXlqqPDZjBmbcKU/edit?usp=sharing I have now finished {{Compatible test (Single node mode)}} and am testing {{Compatible test (Zookeeper mode)}}; feel free to add comments on it if you have any questions or suggestions. > Upgrade bundled leveldb to 1.18 > --- > > Key: MESOS-970 > URL: https://issues.apache.org/jira/browse/MESOS-970 > Project: Mesos > Issue Type: Improvement > Components: replicated log >Reporter: Benjamin Mahler >Assignee: Tomasz Janiszewski > > We currently bundle leveldb 1.4, and the latest version is leveldb 1.18. > Upgrading to 1.18 could solve the problems when building Mesos on some non-x86 > CPU architectures. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5430) Design the improvement of the home page of mesos.apache.org
[ https://issues.apache.org/jira/browse/MESOS-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298534#comment-15298534 ] Jonathan Manalus commented on MESOS-5430: - [~vinodkone] - [~haosd...@gmail.com] the first iteration looks great. It just needs small tweaks to match the sketch file that I posted above; I can help answer any questions. [~vinodkone], can you work with Amr Abdelrazik @Mesosphere on the copy of the homepage, and with Ben Hindman to figure out the launch date? > Design the improvement of the home page of mesos.apache.org > --- > > Key: MESOS-5430 > URL: https://issues.apache.org/jira/browse/MESOS-5430 > Project: Mesos > Issue Type: Improvement > Components: project website >Reporter: Vinod Kone >Assignee: Jonathan Manalus > > The idea is to come up with a minimal improvement for the design of the home > page of mesos.apache.org. > Proposed Redesign: https://invis.io/CV7DZF1JW#/159898819_Mesos-apache-org -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5430) Design the improvement of the home page of mesos.apache.org
[ https://issues.apache.org/jira/browse/MESOS-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298524#comment-15298524 ] Vinod Kone commented on MESOS-5430: --- I thought someone at Mesosphere was going to implement your design. Is that not true [~jmanalus]? Don't want [~haosd...@gmail.com] to spend cycles on implementation if someone else is going to do it. > Design the improvement of the home page of mesos.apache.org > --- > > Key: MESOS-5430 > URL: https://issues.apache.org/jira/browse/MESOS-5430 > Project: Mesos > Issue Type: Improvement > Components: project website >Reporter: Vinod Kone >Assignee: Jonathan Manalus > > The idea is to come up with a minimal improvement for the design of the home > page of mesos.apache.org. > Proposed Redesign: https://invis.io/CV7DZF1JW#/159898819_Mesos-apache-org -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5435) Add default implementations to all Isolator virtual functions
[ https://issues.apache.org/jira/browse/MESOS-5435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Klues updated MESOS-5435: --- Sprint: Mesosphere Sprint 35 > Add default implementations to all Isolator virtual functions > - > > Key: MESOS-5435 > URL: https://issues.apache.org/jira/browse/MESOS-5435 > Project: Mesos > Issue Type: Improvement >Reporter: Kevin Klues >Assignee: Kevin Klues > Fix For: 0.29.0 > > > Currently, all of the virtual functions in `mesos::slave::Isolator` are pure > virtual (except `status()`). For many isolators, however, it doesn't make sense > to implement all of these virtual functions, yet each isolator has to provide its > own default implementation of these functions even if it isn't really > relying on them. This adds unnecessary extra code to many isolators that > don't need it. > Moreover, the `MesosIsolatorProcess` has the same problem for each of its > virtual functions. > We should provide defaults for these instead of making each and every > isolator implement them, even in cases when it doesn't make sense. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5435) Add default implementations to all Isolator virtual functions
[ https://issues.apache.org/jira/browse/MESOS-5435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Klues updated MESOS-5435: --- Story Points: 1 > Add default implementations to all Isolator virtual functions > - > > Key: MESOS-5435 > URL: https://issues.apache.org/jira/browse/MESOS-5435 > Project: Mesos > Issue Type: Improvement >Reporter: Kevin Klues >Assignee: Kevin Klues > Fix For: 0.29.0 > > > Currently, all of the virtual functions in `mesos::slave::Isolator` are pure > virtual (except `status()`). For many isolators, however, it doesn't make sense > to implement all of these virtual functions, yet each isolator has to provide its > own default implementation of these functions even if it isn't really > relying on them. This adds unnecessary extra code to many isolators that > don't need it. > Moreover, the `MesosIsolatorProcess` has the same problem for each of its > virtual functions. > We should provide defaults for these instead of making each and every > isolator implement them, even in cases when it doesn't make sense. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5167) Add tests for `network/cni` isolator
[ https://issues.apache.org/jira/browse/MESOS-5167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298506#comment-15298506 ] Jie Yu commented on MESOS-5167: --- commit d6846f952b55c47d33a926e4d4da38d058e6dcf3 Author: Qian Zhang Date: Tue May 24 09:50:57 2016 -0700 Added the test "CniIsolatorTest.ROOT_SlaveRecovery". Review: https://reviews.apache.org/r/46438/ commit d02cb7848fb02d2faf660cb6664578c0554f7aca Author: Qian Zhang Date: Mon May 23 14:43:54 2016 -0700 Added the test "CniIsolatorTest.ROOT_FailedPlugin". Review: https://reviews.apache.org/r/46436/ commit 999bb7168a9ea78e7978bdc17e64672c32dc1ab7 Author: Qian Zhang Date: Mon May 23 14:39:03 2016 -0700 Added the test "CniIsolatorTest.ROOT_VerifyCheckpointedInfo". Review: https://reviews.apache.org/r/46435/ commit 8c10513ef50185dfe7d477525799826cc7a1b056 Author: Qian Zhang Date: Mon May 23 14:25:41 2016 -0700 Added the test "CniIsolatorTest.ROOT_LaunchCommandTask". Review: https://reviews.apache.org/r/46097/ > Add tests for `network/cni` isolator > > > Key: MESOS-5167 > URL: https://issues.apache.org/jira/browse/MESOS-5167 > Project: Mesos > Issue Type: Task > Components: test >Reporter: Qian Zhang >Assignee: Qian Zhang > Labels: mesosphere > > We need to add tests to verify the functionality of `network/cni` isolator. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5256) Add support for per-containerizer resource enumeration
[ https://issues.apache.org/jira/browse/MESOS-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Klues updated MESOS-5256: --- Sprint: Mesosphere Sprint 35 > Add support for per-containerizer resource enumeration > -- > > Key: MESOS-5256 > URL: https://issues.apache.org/jira/browse/MESOS-5256 > Project: Mesos > Issue Type: Task > Components: isolation >Reporter: Kevin Klues >Assignee: Kevin Klues > Labels: containerizer > > Currently the top level containerizer includes a static function for > enumerating the resources available on a given agent. Ideally, this > functionality should be the responsibility of individual containerizers (and > specifically the responsibility of each isolator used to control access to > those resources). > Adding support for this will involve making the `Containerizer::resources()` > function virtual instead of static and then implementing it on a > per-containerizer basis. We should consider providing a default to make this > easier in cases where there is only really one good way of enumerating a > given set of resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5430) Design the improvement of the home page of mesos.apache.org
[ https://issues.apache.org/jira/browse/MESOS-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298476#comment-15298476 ] Jonathan Manalus commented on MESOS-5430: - [~haosd...@gmail.com] - You can download the sketch file here http://cl.ly/27393B403U2p The demo looks great! > Design the improvement of the home page of mesos.apache.org > --- > > Key: MESOS-5430 > URL: https://issues.apache.org/jira/browse/MESOS-5430 > Project: Mesos > Issue Type: Improvement > Components: project website >Reporter: Vinod Kone >Assignee: Jonathan Manalus > > The idea is to come up with a minimal improvement for the design of the home > page of mesos.apache.org. > Proposed Redesign: https://invis.io/CV7DZF1JW#/159898819_Mesos-apache-org -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5421) Mesos Docker executor taskHealthUpdated removes information about job ipAddresses
[ https://issues.apache.org/jira/browse/MESOS-5421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298472#comment-15298472 ] Jie Yu commented on MESOS-5421: --- [~dfedorov] Can you provide more debug information (e.g., agent/executor logs, the job description, what networking you are using)? It's hard for us to triage the issue without that information. Also, since it's not clear what the issue is, I'll mark the fix version as 0.29 as we are cutting 0.28.2. If this turns out to be an issue, we can backport the fix into 0.28.3. > Mesos Docker executor taskHealthUpdated removes information about job > ipAddresses > - > > Key: MESOS-5421 > URL: https://issues.apache.org/jira/browse/MESOS-5421 > Project: Mesos > Issue Type: Bug > Components: slave >Affects Versions: 0.28.1 >Reporter: Dmitry Fedorov >Priority: Minor > Fix For: 0.29.0 > > > When you create a job with a command health check, right after the job is launched > its status is correct and the ipAddresses field is present in it. > But after the health status is updated, the ipAddresses field is missing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5421) Mesos Docker executor taskHealthUpdated removes information about job ipAddresses
[ https://issues.apache.org/jira/browse/MESOS-5421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-5421: -- Fix Version/s: (was: 0.28.2) 0.29.0 > Mesos Docker executor taskHealthUpdated removes information about job > ipAddresses > - > > Key: MESOS-5421 > URL: https://issues.apache.org/jira/browse/MESOS-5421 > Project: Mesos > Issue Type: Bug > Components: slave >Affects Versions: 0.28.1 >Reporter: Dmitry Fedorov >Priority: Minor > Fix For: 0.29.0 > > > When you create a job with a command health check, right after the job is launched > its status is correct and the ipAddresses field is present in it. > But after the health status is updated, the ipAddresses field is missing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5359) The scheduler library should have a delay before initiating a connection with master.
[ https://issues.apache.org/jira/browse/MESOS-5359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298436#comment-15298436 ] Anand Mazumdar commented on MESOS-5359: --- Ideally, yes. The delay should be configurable by a flag. Have a look at how we are already doing so for the old driver-based interface in {{src/sched/flags.hpp}}. The only difference here is that we just need a single maximum connection backoff variable, e.g., {{MESOS_CONNECTION_BACKOFF_MAX=~500ms}}. The scheduler library can then do a linear backoff after picking a random delay between 0 and maxBackoff for initiating the connection with the master. > The scheduler library should have a delay before initiating a connection with > master. > - > > Key: MESOS-5359 > URL: https://issues.apache.org/jira/browse/MESOS-5359 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.29.0 >Reporter: Anand Mazumdar >Assignee: José Guilherme Vanz > Labels: mesosphere > > Currently, the scheduler library {{src/scheduler/scheduler.cpp}} does not have an > artificially induced delay when trying to initially establish a connection > with the master. In the event of a master failover or ZK disconnect, a large > number of frameworks can get disconnected and thereby overwhelm the > master with TCP SYN requests. > On a large cluster with many agents, the master is already overwhelmed with > handling connection requests from the agents. This compounds the issue > further on the master. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3085) Make failed on Ubuntu 14.04 ppc64le
[ https://issues.apache.org/jira/browse/MESOS-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298362#comment-15298362 ] Tomasz Janiszewski commented on MESOS-3085: --- Duplicate: MESOS-5263 > Make failed on Ubuntu 14.04 ppc64le > --- > > Key: MESOS-3085 > URL: https://issues.apache.org/jira/browse/MESOS-3085 > Project: Mesos > Issue Type: Bug > Components: build >Affects Versions: 0.23.0, 0.24.0 > Environment: Ubuntu 14.04 ppc64le >Reporter: Jihun Kang >Assignee: Jihun Kang > > When trying to compile linux/fs.cpp, make failed with a following message. > {noformat} > /bin/bash ../libtool --tag=CXX --mode=compile g++ -DPACKAGE_NAME=\"mesos\" > -DPACKAGE_TARNAME=\"mesos\" -DPACKAGE_VERSION=\"0.24.0\" > -DPACKAGE_STRING=\"mesos\ 0.24.0\" -DPACKAGE_BUGREPORT=\"\" > -DPACKAGE_URL=\"\" -DPACKAGE=\"mesos\" -DVERSION=\"0.24.0\" -DSTDC_HEADERS=1 > -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 > -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 > -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\" > -DHAVE_PTHREAD_PRIO_INHERIT=1 -DHAVE_PTHREAD=1 -DHAVE_LIBZ=1 -DHAVE_LIBCURL=1 > -DHAVE_APR_POOLS_H=1 -DHAVE_LIBAPR_1=1 -DHAVE_SVN_VERSION_H=1 > -DHAVE_LIBSVN_SUBR_1=1 -DHAVE_SVN_DELTA_H=1 -DHAVE_LIBSVN_DELTA_1=1 > -DHAVE_LIBSASL2=1 -DMESOS_HAS_JAVA=1 -DHAVE_PYTHON=\"2.7\" > -DMESOS_HAS_PYTHON=1 -I. 
-I../../src -Wall -Werror > -DLIBDIR=\"/usr/local/lib\" -DPKGLIBEXECDIR=\"/usr/local/libexec/mesos\" > -DPKGDATADIR=\"/usr/local/share/mesos\" -I../../include > -I../../3rdparty/libprocess/include > -I../../3rdparty/libprocess/3rdparty/stout/include -I../include > -I../include/mesos -I../3rdparty/libprocess/3rdparty/boost-1.53.0 > -I../3rdparty/libprocess/3rdparty/picojson-4f93734 > -I../3rdparty/libprocess/3rdparty/protobuf-2.5.0/src > -I../3rdparty/libprocess/3rdparty/glog-0.3.3/src > -I../3rdparty/libprocess/3rdparty/glog-0.3.3/src > -I../3rdparty/leveldb/include -I../3rdparty/zookeeper-3.4.5/src/c/include > -I../3rdparty/zookeeper-3.4.5/src/c/generated > -I../3rdparty/libprocess/3rdparty/protobuf-2.5.0/src > -I/usr/include/subversion-1 -I/usr/include/apr-1 -I/usr/include/apr-1.0 > -pthread -g1 -O0 -Wno-unused-local-typedefs -std=c++11 -MT > linux/libmesos_no_3rdparty_la-fs.lo -MD -MP -MF > linux/.deps/libmesos_no_3rdparty_la-fs.Tpo -c -o > linux/libmesos_no_3rdparty_la-fs.lo `test -f 'linux/fs.cpp' || echo > '../../src/'`linux/fs.cpp > libtool: compile: g++ -DPACKAGE_NAME=\"mesos\" -DPACKAGE_TARNAME=\"mesos\" > -DPACKAGE_VERSION=\"0.24.0\" "-DPACKAGE_STRING=\"mesos 0.24.0\"" > -DPACKAGE_BUGREPORT=\"\" -DPACKAGE_URL=\"\" -DPACKAGE=\"mesos\" > -DVERSION=\"0.24.0\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 > -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 > -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 > -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\" -DHAVE_PTHREAD_PRIO_INHERIT=1 > -DHAVE_PTHREAD=1 -DHAVE_LIBZ=1 -DHAVE_LIBCURL=1 -DHAVE_APR_POOLS_H=1 > -DHAVE_LIBAPR_1=1 -DHAVE_SVN_VERSION_H=1 -DHAVE_LIBSVN_SUBR_1=1 > -DHAVE_SVN_DELTA_H=1 -DHAVE_LIBSVN_DELTA_1=1 -DHAVE_LIBSASL2=1 > -DMESOS_HAS_JAVA=1 -DHAVE_PYTHON=\"2.7\" -DMESOS_HAS_PYTHON=1 -I. 
-I../../src > -Wall -Werror -DLIBDIR=\"/usr/local/lib\" > -DPKGLIBEXECDIR=\"/usr/local/libexec/mesos\" > -DPKGDATADIR=\"/usr/local/share/mesos\" -I../../include > -I../../3rdparty/libprocess/include > -I../../3rdparty/libprocess/3rdparty/stout/include -I../include > -I../include/mesos -I../3rdparty/libprocess/3rdparty/boost-1.53.0 > -I../3rdparty/libprocess/3rdparty/picojson-4f93734 > -I../3rdparty/libprocess/3rdparty/protobuf-2.5.0/src > -I../3rdparty/libprocess/3rdparty/glog-0.3.3/src > -I../3rdparty/libprocess/3rdparty/glog-0.3.3/src > -I../3rdparty/leveldb/include -I../3rdparty/zookeeper-3.4.5/src/c/include > -I../3rdparty/zookeeper-3.4.5/src/c/generated > -I../3rdparty/libprocess/3rdparty/protobuf-2.5.0/src > -I/usr/include/subversion-1 -I/usr/include/apr-1 -I/usr/include/apr-1.0 > -pthread -g1 -O0 -Wno-unused-local-typedefs -std=c++11 -MT > linux/libmesos_no_3rdparty_la-fs.lo -MD -MP -MF > linux/.deps/libmesos_no_3rdparty_la-fs.Tpo -c ../../src/linux/fs.cpp -fPIC > -DPIC -o linux/.libs/libmesos_no_3rdparty_la-fs.o > ../../src/linux/fs.cpp:346:2: error: #error "pivot_root is not available" > #error "pivot_root is not available" > ^ > ../../src/linux/fs.cpp: In function 'Try > mesos::internal::fs::pivot_root(const string&, const string&)': > ../../src/linux/fs.cpp:348:7: error: 'ret' was not declared in this scope >if (ret == -1) { >^ > make[2]: *** [linux/libmesos_no_3rdparty_la-fs.lo] Error 1 > make[2]: *** Waiting for unfinished jobs > {noformat} -- This message was sent by
[jira] [Commented] (MESOS-5446) NsTest.ROOT_setns and NsTest.ROOT_getns failed in Linux 4.6
[ https://issues.apache.org/jira/browse/MESOS-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298352#comment-15298352 ] Abhishek Dasgupta commented on MESOS-5446: -- Even after disabling lxcfs on Ubuntu 16.04, these tests are still failing. > NsTest.ROOT_setns and NsTest.ROOT_getns failed in Linux 4.6 > --- > > Key: MESOS-5446 > URL: https://issues.apache.org/jira/browse/MESOS-5446 > Project: Mesos > Issue Type: Bug >Reporter: haosdent >Assignee: haosdent >Priority: Minor > > From [~nthakkar%40us.ibm.com] > {quote} > Currently, because the "cgroup" namespace is not supported, the following two > test cases are failing: > 1. NsTest.ROOT_setns > 2. NsTest.ROOT_getns > The error observed is: "nstype: Unknown namespace 'cgroup'" > This is because the contents of the directory "/proc/self/ns" have been > changed in kernel version 4.6 (cgroup was added). > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5430) Design the improvement of the home page of mesos.apache.org
[ https://issues.apache.org/jira/browse/MESOS-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298260#comment-15298260 ] haosdent commented on MESOS-5430: - Or you could use the sketch plugin https://github.com/utom/sketch-measure to export a measurements HTML page. > Design the improvement of the home page of mesos.apache.org > --- > > Key: MESOS-5430 > URL: https://issues.apache.org/jira/browse/MESOS-5430 > Project: Mesos > Issue Type: Improvement > Components: project website >Reporter: Vinod Kone >Assignee: Jonathan Manalus > > The idea is to come up with a minimal improvement for the design of the home > page of mesos.apache.org. > Proposed Redesign: https://invis.io/CV7DZF1JW#/159898819_Mesos-apache-org -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5406) Validate ACLs on creating an instance of local authorizer.
[ https://issues.apache.org/jira/browse/MESOS-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298130#comment-15298130 ] Jay Guo commented on MESOS-5406: Some more thoughts: # Should we sort ACLs and apply some mechanism like the longest-prefix match used in routing tables, instead of relying on the order in which the user specified them? # Also, should we aggregate ACLs for a given action? I saw a TODO in the codebase: TODO(vinod): Do aggregation of ACLs when possible. > Validate ACLs on creating an instance of local authorizer. > -- > > Key: MESOS-5406 > URL: https://issues.apache.org/jira/browse/MESOS-5406 > Project: Mesos > Issue Type: Improvement > Components: security >Reporter: Alexander Rukletsov >Assignee: Jay Guo > Labels: mesosphere, security > > Some combinations of ACLs are not allowed, for example, specifying both > {{SetQuota}} and {{UpdateQuota}}. We should capture such issues and error out > early. > This ticket aims to add as many validations as possible to a dedicated > {{validate()}} routine, instead of having them implicitly in the codebase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5410) Support cgroup namespace in unified container
[ https://issues.apache.org/jira/browse/MESOS-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297786#comment-15297786 ] haosdent commented on MESOS-5410: - Cool! Could you send an email to the dev mailing list to become a contributor in JIRA, so that I can change the assignee of MESOS-5446 to you? > Support cgroup namespace in unified container > - > > Key: MESOS-5410 > URL: https://issues.apache.org/jira/browse/MESOS-5410 > Project: Mesos > Issue Type: Improvement >Reporter: Qian Zhang >Assignee: Qian Zhang > > In the Linux 4.6 kernel, a new namespace (the cgroup namespace) was introduced so that > a process can be created in its own cgroup namespace and the global > cgroup hierarchy is not leaked to the process. See the following link > for more details about this namespace: > http://man7.org/linux/man-pages/man7/cgroup_namespaces.7.html > We need to support this namespace in the unified container to provide better > isolation for the containers created by Mesos. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5410) Support cgroup namespace in unified container
[ https://issues.apache.org/jira/browse/MESOS-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297764#comment-15297764 ] Nirav commented on MESOS-5410: -- Hi, alternatively we can add a macro guard in the file, since that would help in the future. I tried adding it and it worked well:
{code}
#ifndef CLONE_NEWCGROUP
#define CLONE_NEWCGROUP 0x02000000
#endif

nstypes["cgroup"] = CLONE_NEWCGROUP;
{code}
I can submit the required patch. > Support cgroup namespace in unified container > - > > Key: MESOS-5410 > URL: https://issues.apache.org/jira/browse/MESOS-5410 > Project: Mesos > Issue Type: Improvement >Reporter: Qian Zhang >Assignee: Qian Zhang > > In the Linux 4.6 kernel, a new namespace (the cgroup namespace) was introduced so that > a process can be created in its own cgroup namespace and the global > cgroup hierarchy is not leaked to the process. See the following link > for more details about this namespace: > http://man7.org/linux/man-pages/man7/cgroup_namespaces.7.html > We need to support this namespace in the unified container to provide better > isolation for the containers created by Mesos. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4565) slave recovers and attempts to destroy executor's child containers, then begins rejecting task status updates
[ https://issues.apache.org/jira/browse/MESOS-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297752#comment-15297752 ] haosdent commented on MESOS-4565: - [~giaosuddau] Do you encounter the following error? {code} E0130 02:22:21.009094 12686 containerizer.cpp:553] Failed to clean up an isolator when destroying orphan container kube-proxy: Failed to remove cgroup '/sys/fs/cgroup/memory/mesos/1d965a20-849c-40d8-9446-27cb723220a9/kube-proxy': Device or resource busy {code} A quick workaround is to unmount it manually, which makes the agent recover successfully. > slave recovers and attempts to destroy executor's child containers, then > begins rejecting task status updates > > > Key: MESOS-4565 > URL: https://issues.apache.org/jira/browse/MESOS-4565 > Project: Mesos > Issue Type: Bug > Components: docker >Affects Versions: 0.26.0 >Reporter: James DeFelice > Labels: mesosphere > > AFAICT the slave is doing this: > 1) recovering from some kind of failure > 2) checking the containers that it pulled from its state store > 3) complaining about cgroup children hanging off of executor containers > 4) rejecting task status updates related to the executor container, the first > of which in the logs is: > {code} > E0130 02:22:21.979852 12683 slave.cpp:2963] Failed to update resources for > container 1d965a20-849c-40d8-9446-27cb723220a9 of executor > 'd701ab48a0c0f13_k8sm-executor' running task > pod.f2dc2c43-c6f7-11e5-ad28-0ad18c5e6c7f on status update for terminal task, > destroying container: Container '1d965a20-849c-40d8-9446-27cb723220a9' not > found > {code} > To be fair, I don't believe that my custom executor is re-registering > properly with the slave prior to attempting to send these (failing) status > updates. But the slave doesn't complain about that... it complains that it > can't find the **container**. > slave log here: > https://gist.github.com/jdef/265663461156b7a7ed4e -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-5446) NsTest.ROOT_setns and NsTest.ROOT_getns failed in Linux 4.6
[ https://issues.apache.org/jira/browse/MESOS-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haosdent reassigned MESOS-5446: --- Assignee: haosdent > NsTest.ROOT_setns and NsTest.ROOT_getns failed in Linux 4.6 > --- > > Key: MESOS-5446 > URL: https://issues.apache.org/jira/browse/MESOS-5446 > Project: Mesos > Issue Type: Bug >Reporter: haosdent >Assignee: haosdent >Priority: Minor > > From [~nthakkar%40us.ibm.com] > {quote} > Currently, because the "cgroup" namespace is not supported, the following two > test cases are failing: > 1. NsTest.ROOT_setns > 2. NsTest.ROOT_getns > The error observed is: "nstype: Unknown namespace 'cgroup'" > This is because the contents of the directory "/proc/self/ns" have been > changed in kernel version 4.6 (cgroup was added). > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-5446) NsTest.ROOT_setns and NsTest.ROOT_getns failed in Linux 4.6
haosdent created MESOS-5446: --- Summary: NsTest.ROOT_setns and NsTest.ROOT_getns failed in Linux 4.6 Key: MESOS-5446 URL: https://issues.apache.org/jira/browse/MESOS-5446 Project: Mesos Issue Type: Bug Reporter: haosdent Priority: Minor From [~nthakkar%40us.ibm.com] {quote} Currently, because the "cgroup" namespace is not supported, the following two test cases are failing: 1. NsTest.ROOT_setns 2. NsTest.ROOT_getns The error observed is: "nstype: Unknown namespace 'cgroup'" This is because the contents of the directory "/proc/self/ns" have been changed in kernel version 4.6 (cgroup was added). {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5410) Support cgroup namespace in unified container
[ https://issues.apache.org/jira/browse/MESOS-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297740#comment-15297740 ] haosdent commented on MESOS-5410: - I think we could add {code} namespaces.erase("cgroup"); {code} as a workaround. Let me file a JIRA for this. > Support cgroup namespace in unified container > - > > Key: MESOS-5410 > URL: https://issues.apache.org/jira/browse/MESOS-5410 > Project: Mesos > Issue Type: Improvement >Reporter: Qian Zhang >Assignee: Qian Zhang > > In the Linux 4.6 kernel, a new namespace (the cgroup namespace) was introduced so that > a process can be created in its own cgroup namespace and the global > cgroup hierarchy is not leaked to the process. See the following link > for more details about this namespace: > http://man7.org/linux/man-pages/man7/cgroup_namespaces.7.html > We need to support this namespace in the unified container to provide better > isolation for the containers created by Mesos. -- This message was sent by Atlassian JIRA (v6.3.4#6332)