Re: hostname in task

2019-08-03 Thread James Peach
> On Aug 3, 2019, at 10:59 PM, Marc Roos wrote: > > > I read you can add a hostname option to the container in this issue[0], > however I still have the uuid. Is this in available in mesos 1.8? Yep. > Can I > somewhere read all these options? Like here[1] The Mesos API is defined in the

Re: On adding a debug endpoint for Mesos containerizer

2019-06-05 Thread James Peach
I really like this proposal and I think that it would help operational teams a lot. Let’s make sure that it is well documented :) > On Jun 5, 2019, at 1:05 AM, Andrei Budnik wrote: > > Hi folks, > > We have been encountering container stuck issues for quite a long time. Some > of these issues

Re: ssl mesos-executor not using /etc/default/mesos

2019-02-18 Thread James Peach
> On Feb 16, 2019, at 9:46 AM, Marc Roos wrote: > > > > Looks like the mesos-executor is not using /etc/default/mesos > environment variables Depending on your configuration, the executor runs inside the container, which means that /etc/default/mesos is probably not available. > > If

Re: How is running 1.7.0 in production?

2018-11-13 Thread James Peach
> On Nov 13, 2018, at 5:45 PM, Stuart Elston wrote: > > Hi everyone, > > We are contemplating an upgrade to Mesos 1.7.0 but are generally a little > wary of running .0 releases. Has anyone encountered any showstoppers while > running 1.7.0? We'd be curious to hear your experiences! I’ve

[ANNOUNCE] mesos_exporter 1.1.1 released

2018-10-25 Thread James Peach
Hi all, Just a quick note to say that mesos_exporter 1.1.1 has been released. This is a bug fix release that fixes a regression I introduced in v1.1.0. Source code and binaries are available on GitHub. https://github.com/mesos/mesos_exporter/releases/tag/v1.1.1 Thanks to Chase Sillevis who

Re: Propose to run debug container as the same user of its parent container by default

2018-10-25 Thread James Peach
> On Oct 23, 2018, at 7:47 PM, Qian Zhang wrote: > > Hi all, > > Currently when launching a debug container (e.g., via `dcos task exec` or > command health check) to debug a task, by default Mesos agent will use the > executor's user as the debug container's user. There are actually 2

Re: [VOTE] Release Apache Mesos 1.7.0 (rc3)

2018-09-14 Thread James Peach
+1 (binding) make check on Fedora 28 > On Sep 11, 2018, at 11:09 AM, Gastón Kleiman wrote: > > Hi all, > > Please vote on releasing the following candidate as Apache Mesos 1.7.0. > > > 1.7.0 includes the following: >

Re: Libevent bundling ahead.

2018-09-12 Thread James Peach
> On Sep 11, 2018, at 6:14 PM, Till Toenshoff wrote: > > Hey All, > > We are considering bundling/vendoring libevent 2.0.22 with upcoming releases > of Mesos. > > Let me explain the motivation and then go into some details. > > Due to https://issues.apache.org/jira/browse/MESOS-7076, SSL

Re: make check failed, but mesos-tests.sh --gtest_filter="SVNTest.DiffPatch" tests passed

2018-09-04 Thread James Peach
This might be caused by inconsistent linking in Homebrew. Try forcing Homebrew to build svn from source, something like this: brew install --force --build-from-source subversion > On Sep 4, 2018, at 2:29 AM, Chang Shawn wrote: > > After 'make' successfully on my macOS 10.13.6, I run 'make

Re: [VOTE] Release Apache Mesos 1.7.0 (rc2)

2018-08-29 Thread James Peach
+1 (binding) Built and tested on Fedora 28 (clang). > On Aug 24, 2018, at 4:42 PM, Chun-Hung Hsiao wrote: > > Hi all, > > Please vote on releasing the following candidate as Apache Mesos 1.7.0. > > > 1.7.0 includes the following: >

[ANNOUNCE] mesos_exporter 1.1.0 released

2018-08-23 Thread James Peach
, and to the following contributors: Alan Bover, Eric Lubow, Hector Fernandez, Jack Thomasson, James Peach, Jonathan Sokolowski, Philip Norman, Stephan Erb, Trevor Wood and Vinod Kone. cheers, James

Re: Volume ownership and permission

2018-08-16 Thread James Peach
e volume. >> >>> I'd argue that the "rw" on the sandbox path is analogous to the "rw" >> mount option. That is, it is mounted writeable, but says nothing about >> which credentials can write to it. >> >> Can you please elaborate a bit on this? What would

Re: [VOTE] Move the project repos to gitbox

2018-07-17 Thread James Peach
> On Jul 17, 2018, at 7:58 AM, Vinod Kone wrote: > > Hi, > > As discussed in another thread and in the committers sync, there seem to be > heavy interest in moving our project repos ("mesos", "mesos-site") from the > "git-wip" git server to the new "gitbox" server to better avail GitHub >

implicit mesos-local support in scheduler drivers

2018-07-03 Thread James Peach
Hi all, I recently found that the Mesos scheduler drivers will implicitly spin up a `mesos-local` cluster for testing if your scheduler uses the Mesos scheduler drivers, specifies “local” as the master, and exports “MESOS_” environment variables to configure the master. Do any scheduler

Re: narrowing task sandbox permissions

2018-06-15 Thread James Peach
> On Jun 15, 2018, at 11:06 AM, Zhitao Li wrote: > > Sorry for getting back to this really late, but we got bit by this behavior > change in our environment. > > The broken scenario we had: > > 1. We are using Aurora to launch docker containerizer based tasks on > Mesos; > 2. Most of

Re: Deprecating the Python bindings

2018-06-06 Thread James Peach
> On May 9, 2018, at 11:51 AM, Andrew Schwartzmeyer > wrote: > > Hi all, > > There are two parallel efforts underway that would both benefit from > officially deprecating (and then removing) the Python bindings. The first > effort is the move to the CMake system: adding support to generate

Re: Volume ownership and permission

2018-04-26 Thread James Peach
if we want to document it, what is our recommended > solution in the doc? > > > > Regards, > Qian Zhang > > On Fri, Apr 27, 2018 at 1:16 AM, James Peach <jpe...@apache.org> wrote: > >> I commented on the doc, but at least some of the issues raised there I >

Re: Volume ownership and permission

2018-04-26 Thread James Peach
I commented on the doc, but at least some of the issues raised there I would not regard as issues. Rather, they are about setting expectations correctly and ensuring that we are documenting (and maybe enforcing) sensible behavior. I'm not that keen on Mesos automatically "fixing" filesystem

Re: Update the *Minimum Linux Kernel version* supported on Mesos

2018-04-05 Thread James Peach
> On Apr 5, 2018, at 5:00 AM, Andrei Budnik wrote: > > Hi All, > > We would like to update minimum supported Linux kernel from 2.6.23 to > 2.6.28. > Linux kernel supports cgroups v1 starting from 2.6.24, but `freezer` cgroup > functionality was merged into 2.6.28, which

Re: Support deadline for tasks

2018-03-23 Thread James Peach
> On Mar 23, 2018, at 9:57 AM, Renan DelValle wrote: > > Hi Zhitao, > > Since this is something that could potentially be handled by the executor > and/or framework, I was wondering if you could speak to the advantages of > making this a TaskInfo primitive vs

Re: Support deadline for tasks

2018-03-22 Thread James Peach
> On Mar 22, 2018, at 10:06 AM, Zhitao Li wrote: > > In our environment, we run a lot of batch jobs, some of which have a tight > timeline. If any task in the job runs longer than x hours, it does not make > sense to run it anymore. > > For instance, a team would

Re: Build Failure

2018-03-19 Thread James Peach
> On Mar 19, 2018, at 4:38 PM, Shiv Deepak wrote: > > Thanks. I installed unzip. That worked. FWIW the test suite was fixed for 1.6 in 0da7b6cc37786df94465ae98948fd7be669a843e. > > On Mon, Mar 19, 2018 at 3:48 PM, Tomek Janiszewski wrote: > Do you

Re: [VOTE] Release Apache Mesos 1.5.0 (rc2)

2018-02-07 Thread James Peach
+1 (binding) Tested on Fedora 27 > On Feb 1, 2018, at 5:36 PM, Gilbert Song wrote: > > Hi all, > > Please vote on releasing the following candidate as Apache Mesos 1.5.0. > > 1.5.0 includes the following: >

Re: [VOTE] Release Apache Mesos 1.5.0 (rc1)

2018-01-24 Thread James Peach
+1 Verified on CentOS 6 and Fedora 27 > On Jan 22, 2018, at 7:15 PM, Gilbert Song wrote: > > Hi all, > > Please vote on releasing the following candidate as Apache Mesos 1.5.0. > > 1.5.0 includes the following: >

Re: Doc-a-thon - January 11th, 2018

2018-01-09 Thread James Peach
Just a reminder that the Docathon is this Thursday :) > On Nov 21, 2017, at 4:14 PM, Judith Malnick wrote: > > Hi all, > > I'm excited to announce the next Apache Mesos doc-a-thon! > > *Date:* January 11th, 2018 > > Location: > > Mesosphere HQ > > 88 Stevenson

Re: Container user '27' is not supported

2017-12-25 Thread James Peach
0.251715 18595 > runtime.cpp:111] Container user 'sflowrt' is not supported yet for > container 375b21ca-2d12-4a81-8429-897aac75eaa0 > Dec 25 23:15:40 m02 mesos-slave[18569]: W1225 23:15:40.251715 18595 > runtime.cpp:111] Container user 'sflowrt' is not supported yet for &

Re: Container user '27' is not supported

2017-12-24 Thread James Peach
> On Dec 24, 2017, at 5:20 AM, Marc Roos wrote: > > > I am seeing this in the logs: > > Container user '27' is not supported yet for container > d823196a-4ec3-41e3-a4c0-6680ba5cc99 > > I guess this means that the container requests to run under a specific > user

narrowing task sandbox permissions

2017-12-14 Thread James Peach
Hi all, In https://issues.apache.org/jira/browse/MESOS-8332, I'm proposing a change to narrow the permissions used for the task sandbox directory from 0755 to 0750. Note that this change also makes failure to chown this directory into a hard failure. I expect this is a safe change for
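For readers less familiar with the octal modes mentioned in this proposal, here is a minimal Python sketch of what moving from 0755 to 0750 changes. It only decodes permission bits; the helper name `other_access` is made up for illustration and none of this is Mesos code.

```python
import stat

def other_access(mode: int) -> str:
    """Return the rwx string for the 'other' permission class of a mode."""
    # stat.filemode renders a directory with mode 0o755 as 'drwxr-xr-x';
    # the last three characters describe users outside the owner and group.
    return stat.filemode(stat.S_IFDIR | mode)[-3:]

print(other_access(0o755))  # access for "other" under the old mode
print(other_access(0o750))  # access for "other" under the proposed mode
```

Under 0750 the owner and group bits are unchanged; only users who are neither the owner nor in the group lose read and traverse access to the directory.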

Re: Adding a new agent terminates existing executors?

2017-11-15 Thread James Peach
> On Nov 15, 2017, at 8:24 AM, Dan Leary wrote: > > Yes, as I said at the outset, the agents are on the same host, with different > ip's and hostname's and work_dir's. > If having separate work_dirs is not sufficient to keep containers separated > by agent, what

Re: 1.4.1 release

2017-11-03 Thread James Peach
I think MESOS-8169 is a candidate, but I won't be able to get to it until next week. > On Nov 3, 2017, at 1:48 AM, Qian Zhang wrote: > > And I will backport MESOS-8051 to 1.2.x, 1.3.x and 1.4.x. > > > Regards, > Qian Zhang > > On Fri, Nov 3, 2017 at 9:01 AM, Qian Zhang

Re: clearing the executor authentication token from the task environment

2017-11-02 Thread James Peach
> On Nov 1, 2017, at 2:28 PM, James Peach <jor...@gmail.com> wrote: > > Hi all, > > In https://issues.apache.org/jira/browse/MESOS-8140, I'm proposing that we > clear the MESOS_EXECUTOR_AUTHENTICATION_TOKEN environment variable > immediately after consuming i

clearing the executor authentication token from the task environment

2017-11-01 Thread James Peach
Hi all, In https://issues.apache.org/jira/browse/MESOS-8140, I'm proposing that we clear the MESOS_EXECUTOR_AUTHENTICATION_TOKEN environment variable immediately after consuming it in the built-in executors. This protects it from observation by other tasks in the same PID namespace, however I
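The read-then-clear pattern described in this message can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the built-in executors are not written in Python, the function name `consume_token` is hypothetical, and only the variable name comes from the message itself.

```python
import os

TOKEN_VAR = "MESOS_EXECUTOR_AUTHENTICATION_TOKEN"

def consume_token(environ=os.environ):
    """Read the token once and remove it from the process environment."""
    # pop() on os.environ also unsets the variable for child processes
    # started afterwards; it returns None if the variable was never set.
    return environ.pop(TOKEN_VAR, None)
```

A caller would keep the returned value in memory for its own authentication and launch any child processes only after the call, so that they do not inherit the variable.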

Re: Adding the limited resource to TaskStatus messages

2017-10-10 Thread James Peach
s the `unreachable_time` field. I'm not planning to add structured information to any other failure reasons, but I'd support doing it if you have a specific suggestion. > On Mon, Oct 9, 2017, 3:50 PM James Peach <jor...@gmail.com> wrote: > > > On Oct 9, 2017, at

Re: Adding the limited resource to TaskStatus messages

2017-10-09 Thread James Peach
> On Oct 9, 2017, at 1:27 PM, Vinod Kone wrote: > >> In the case that a task is killed because it violated a resource >> constraint (ie. the reason field is REASON_CONTAINER_LIMITATION, >> REASON_CONTAINER_LIMITATION_DISK or REASON_CONTAINER_LIMITATION_MEMORY), >> this

Adding the limited resource to TaskStatus messages

2017-10-09 Thread James Peach
Hi all, In https://reviews.apache.org/r/62644/, I am proposing to add an optional Resources field to the TaskStatus message named `limited_resources`. In the case that a task is killed because it violated a resource constraint (ie. the reason field is REASON_CONTAINER_LIMITATION,

Re: RFC: Partition Awareness

2017-10-05 Thread James Peach
> On Jun 21, 2017, at 10:16 AM, Megha Sharma wrote: > > Thank you all for the feedback. > To summarize, not killing tasks for non-Partition Aware frameworks will make > the schedulers see a higher volume of non terminal updates for tasks for > which they have already

Re: Are there any supported systems without O_CLOEXEC?

2017-09-29 Thread James Peach
ly Ubuntu 14.04 has 3.19. Do we support anything older than that? > > On Fri, Sep 29, 2017 at 9:15 AM, James Peach <jor...@gmail.com> wrote: > >> >>> On Sep 27, 2017, at 5:03 PM, James Peach <jor...@gmail.com> wrote: >>> >>> Hi all, >>

Re: Are there any supported systems without O_CLOEXEC?

2017-09-29 Thread James Peach
> On Sep 27, 2017, at 5:03 PM, James Peach <jor...@gmail.com> wrote: > > Hi all, > > In MESOS-8027 and https://reviews.apache.org/r/62638/, I'm claiming that, in > practice, we do not have any supported platforms that don't implement > O_CLOEXEC to open. All current

Re: Collect feedbacks on TASK_FINISHED

2017-09-22 Thread James Peach
> On Sep 21, 2017, at 10:12 PM, Vinod Kone wrote: > > I think it makes sense for `TASK_KILLED` to be sent in response to a KILL > call irrespective of the exit status. IIRC, that was the original intention. Those are the semantics we implement and expect in our scheduler

Re: TASK_FAILED - Mesos Container Images

2017-09-06 Thread James Peach
> On Sep 6, 2017, at 4:41 AM, Thodoris Zois wrote: > > Hello, > > I am using the Mesos Containerizer with Docker Images. The problem is that > whenever a container exits my task gets TASK_FAILED because the container > exits with ‘1’. > My docker file invokes a shell

Re: Deprecating `--disable-zlib` in libprocess

2017-08-08 Thread James Peach
> On Aug 8, 2017, at 10:57 AM, Chun-Hung Hsiao wrote: > > Hi all, > > In libprocess, we have an optional `--disable-zlib` flag, but it's > currently not used > for conditional compilation and we always use zlib in libprocess, > and there's a requirement check in Mesos to

Re: Command Executor

2017-08-07 Thread James Peach
> On Aug 5, 2017, at 3:03 AM, Oeg Bizz wrote: > > I have a framework that relies on information sent by a custom Java Command > Executor; think of some sort of heartbeat. I start getting heartbeats after I > send a task to that mesos-slave, but never before that. That

Re: Mesos-docker-executor understanding

2017-07-21 Thread James Peach
> On Jul 19, 2017, at 10:05 AM, Thomas HUMMEL wrote: > > Hello, > > I've read some books about Mesos, installed one multi-master cluster (for POC > purposes) with some frameworks (Marathon, Spark for instance) and watch some > talks. > > Everything works and my

Re: Format for attributes with no value

2017-07-14 Thread James Peach
ont of it and save it in the /etc/mesos- slave> directory. For instance, if you want to enable authentication and > want to pass the --authenticate attribute then create an empty file called > /etc/mesos-master/?authenticate. > > Not sure if that is what you meant with your qu

Re: Format for attributes with no value

2017-07-10 Thread James Peach
> On Jul 7, 2017, at 4:46 PM, Jeff Kubina wrote: > > When setting an attribute with no value of a mesos-agent is the colon needed, > optional, or must it be omitted? It's not clear from the documentation. For > example, which line or lines below are correct? > >

Re: Dynamic reservations without a principal

2017-07-05 Thread James Peach
> On Jul 4, 2017, at 5:27 PM, Srikanth Viswanathan wrote: > > Hi folks, > > I am trying to have the Chronos framework consume dynamic reservations in > Mesos. However, it appears that Chronos is unable to do this because it does > not pass the framework principal to

Re: Framework change role

2017-07-05 Thread James Peach
ow to change the role of the > Framework without losing that TreeMap, and also how to set it with version > 1.3.0. > > Hope that everybody understands now…. > Thank you, and i am really sorry for the spam > > >> On 5 Jul 2017, at 12:24, James Peach <jor...@gmail.com>

Re: Framework change role

2017-07-05 Thread James Peach
> On Jul 5, 2017, at 12:54 AM, Thodoris Zois wrote: > > Hi, > > No, i would like my framework to be offered resources from agent with role > (e.g: thz) and after running the specific tasks change its role to (*) in > order to get offers from different agents, but it will

Re: ensuring a particular task is deployed to "all" Mesos Worker hosts

2017-07-01 Thread James Peach
> On Jul 1, 2017, at 11:14 AM, Erik Weathers wrote: > > Thanks for the info Kevin. Seems there's no JIRAs nor design docs floating > about yet for "admin tasks" or "daemon sets". > > Just FYI, this is the ticket in Storm for the problem I've been mentioning: > >

Re: Mesos-Metrics per task

2017-06-29 Thread James Peach
> On Jun 29, 2017, at 3:53 PM, Thodoris Zois wrote: > > Hello, i would like to get some metrics per task. E.g memory/cpu usage is > there any way? > > Thank you! You can use the GET_CONTAINERS agent API call
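As a rough sketch of the call suggested here, the following Python builds a GET_CONTAINERS request for the agent's v1 operator API. The agent address `localhost:5051` is an assumption (substitute your own agent host and port), and the helper name `build_get_containers_request` is made up for this example; the request is only constructed, not sent.

```python
import json
import urllib.request

def build_get_containers_request(agent="localhost:5051"):
    """Build (but do not send) a GET_CONTAINERS call for the agent v1 API."""
    body = json.dumps({"type": "GET_CONTAINERS"}).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{agent}/api/v1",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )

# To send it against a running agent:
#   json.load(urllib.request.urlopen(build_get_containers_request()))
```

The response should describe the containers running on that agent, including their resource statistics, which is where per-task memory and CPU figures would come from.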

Re: Agent Working Directory Best Practices

2017-06-26 Thread James Peach
> On Jun 26, 2017, at 4:05 PM, Steven Schlansker > wrote: > > >> On Jun 25, 2017, at 11:24 PM, Benjamin Mahler wrote: >> >> As a data point, as far as I'm aware, most users are using a local work >> directory, not an NFS mounted one. Would

Re: Work group on Community

2017-06-16 Thread James Peach
> On Jun 15, 2017, at 10:57 AM, Vinod Kone wrote: > > Hi folks, > > Seeing that our first official containerizer WG is off to a good start, we > want to use that momentum to start new WGs. > > I'm proposing that we start a new work group on community. The mission of >

Re: How to filter GET_TASKS api result

2017-04-19 Thread James Peach
> On Apr 19, 2017, at 5:00 PM, Benjamin Mahler wrote: > > We can add a Call.GetTasks message to allow you to specify which task ids you > would like to retrieve. But this isn't supported yet, the code needs to be > written. E.g. > > message Call { > enum Type { >

Re: Structured logging for Mesos (or c++ glog)

2016-12-19 Thread James Peach
> On Dec 19, 2016, at 2:54 PM, Zhitao Li wrote: > > Hi James, > > Stitching events together is only one possible use cases, and I'm not exactly > sure what you meant by directly event logging. > > Taking the hierarchical allocator for example. In a multi-framework

Re: MESOS-6233 Allow agents to re-register post a host reboot

2016-11-29 Thread James Peach
> On Nov 28, 2016, at 6:09 PM, Yan Xu wrote: > > So one thing that was brought up during offline conversations was that if the > host reboot is associated with hardware change (e.g., a new memory stick): > > • Currently: the agent would skip the recovery (and the

Re: Persistent volume ownership issue

2016-06-21 Thread James Peach
ould be to make the owner the creator of the volume, then use ACL inheritance to grant additional access to other users. You'd have to reflow the inheritance, but it could probably be done. -- James Peach | jor...@gmail.com

Re: Persistent volume ownership issue

2016-06-21 Thread James Peach
sive chown. That'll allow the new task to at least > create new files under the persistent volume, but do not change ownership of > files created by previous tasks. It should be a very simple fix which we can > ship in 1.0. We'll ship MESOS-4893 after 1.0. What do you guys think? > > Thanks, > - Jie -- James Peach | jor...@gmail.com

Re: How is the OS X environment created with Mesos

2016-05-18 Thread James Peach
s >> a task as some other user. Clearly it is not running some of the scripts >> normally run during login. This was a constant source of confusion with >> Jenkins. If one can state what exactly is done to create the user >> environment each platform and how it

Re: [Proposal] Remove the default value for agent work_dir

2016-04-12 Thread James Peach
> On Apr 12, 2016, at 3:58 PM, Greg Mann wrote: > > Hey folks! > A number of situations have arisen in which the default value of the Mesos > agent `--work_dir` flag (/tmp/mesos) has caused problems on systems in which > the automatic cleanup of '/tmp' deletes agent

Re: verbose logging with the docker executor

2016-03-19 Thread James Peach
> On Mar 17, 2016, at 10:09 AM, Clarke, Trevor wrote: > > Looking in the docker executor, the docker command line is logged with > VLOG(1) but I'm not sure how to generate that level of log output. Some > googling suggests it's used in the google logging library and verbose

Re: OS X build

2015-09-27 Thread James Peach
r option, and it works AFAICT jpeach$ ./configure --help | grep apr --with-apr=[=DIR] specify where to locate the apr-1 library > > On Sat, Sep 26, 2015 at 9:26 PM, James Peach <jor...@gmail.com> wrote: > > > On Sep 26, 2015, at 12:01 PM, Vaibhav Khanduja &l

Re: OS X build

2015-09-26 Thread James Peach
> On Sep 26, 2015, at 12:01 PM, Vaibhav Khanduja > wrote: > > I am running into issues with build on my Mac - OS X … the configure script > complains about libapr-1 not present. I was able to find a workaround by > passing configure with --with-apr option. Looks

Re: Building portable binaries

2015-09-17 Thread James Peach
> On Sep 17, 2015, at 4:33 PM, F21 wrote: > > Is there any way to build portable binaries for mesos? > > Currently, I have tried building my own libsvn, libsasl2, libcurl, libapr and > then built mesos using the following: > > ../configure CC=gcc-4.8 CXX=g++-4.8 >

Re: Recommended way to discover current master

2015-08-31 Thread James Peach
> On Aug 31, 2015, at 10:25 AM, Philip Weaver wrote: > > My framework knows the list of zookeeper hosts and the list of mesos master > hosts. > > I can think of a few ways for the framework to figure out which host is the > current master. What would be the best?

Re: Build 0.23 gcc Version

2015-07-29 Thread James Peach
. John On Mon, Jul 27, 2015 at 10:56 AM, James Peach jor...@gmail.com wrote: On Jul 24, 2015, at 3:57 PM, Michael Park mcyp...@gmail.com wrote: Hi John, I would first suggest trying CC=gcc CXX=g++ ../configure, and if that works, try to find out what which cc and which c++ return

Re: Build 0.23 gcc Version

2015-07-27 Thread James Peach
On Jul 24, 2015, at 3:57 PM, Michael Park mcyp...@gmail.com wrote: Hi John, I would first suggest trying CC=gcc CXX=g++ ../configure, and if that works, try to find out what which cc and which c++ return and find out what they symlink to. I believe autotools uses cc and c++ rather
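The check suggested in this message (what `which cc` and `which c++` return, and what they symlink to) can be scripted. A minimal Python sketch follows; the helper name `resolve_compiler` is hypothetical and the output depends entirely on the local system.

```python
import os
import shutil

def resolve_compiler(name):
    """Return the real path behind a compiler name on PATH, or None."""
    path = shutil.which(name)
    # realpath follows symlinks such as /usr/bin/cc -> gcc or clang.
    return os.path.realpath(path) if path else None

for name in ("cc", "c++"):
    print(name, "->", resolve_compiler(name))
```

If the resolved paths point at an older or unexpected compiler, passing CC and CXX explicitly to configure, as in the message above, sidesteps the default lookup.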