Re: [VOTE] Move Apache Mesos to Attic

2021-04-06 Thread Benjamin Mahler
+1 (binding) Thanks to all who contributed to the project. On Mon, Apr 5, 2021 at 1:58 PM Vinod Kone wrote: > Hi folks, > > Based on the recent conversations > < > https://lists.apache.org/thread.html/raed89cc5ab78531c48f56aa1989e1e7eb05f89a6941e38e9bc8803ff%40%3Cuser.mesos.apache.org%3E > > >

Re: Feature requests for Mesos

2021-03-08 Thread Benjamin Mahler
I think the key issues have been brought up by Benjamin and Renan. Just to add to Benjamin's comments above, achieving those key markers of a healthy project requires serious corporate backing such that people are being employed to primarily work on Mesos. It takes a lot of work to keep the

Re: Slow communications between components

2020-11-08 Thread Benjamin Mahler
Which version? I'm not sure what you're observing, but slow responses are usually due to backlogging from expensive requests (like /state); however, we made several changes that have made it much less of a potential problem (see the blog posts). How much CPU is the master consuming? What kind of

Re: Changing logging timestamp

2020-09-21 Thread Benjamin Mahler
Mesos uses Google's glog library for logging: https://github.com/google/glog I believe it just prints the local time. You can see that it's produced by a call to gettimeofday(, NULL) and localtime_r (not gmtime_r): https://github.com/google/glog/blob/v0.4.0/src/logging.cc#L1267
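
To make the local-time versus UTC distinction concrete, here is a minimal Python sketch. It is not glog code: `format_both` is a hypothetical helper, and the `%m%d %H:%M:%S` layout only approximates the month/day/time prefix of a glog line. It renders one timestamp through the local-time and the UTC conversions.

```python
import time


def format_both(epoch_seconds):
    """Render one timestamp twice: via local time and via UTC."""
    layout = "%m%d %H:%M:%S"  # assumed approximation of a glog-style prefix
    local = time.strftime(layout, time.localtime(epoch_seconds))
    utc = time.strftime(layout, time.gmtime(epoch_seconds))
    return local, utc


# 1600646400 is 2020-09-21 00:00:00 UTC.
local, utc = format_both(1600646400)
print("local:", local)
print("utc:  ", utc)
```

On a host whose time zone is not UTC the two lines differ, which is the behaviour described above for log timestamps.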

Re: No offers are being made -- how to debug Mesos?

2020-06-06 Thread Benjamin Mahler
Don't worry about that "Ignoring" message on the agent. When the framework information is updated, the master broadcasts it to the agents, and in this case the agent doesn't know about the framework since it has no tasks for it, and so it ignores the updated information. I can't quite tell from

Re: Subject: [VOTE] Release Apache Mesos 1.10.0 (rc1)

2020-05-27 Thread Benjamin Mahler
+1 (binding) On Mon, May 18, 2020 at 4:36 PM Andrei Sekretenko wrote: > Hi all, > > Please vote on releasing the following candidate as Apache Mesos 1.10.0. > > 1.10.0 includes the following major improvements: > >

Re: [VOTE] Release Apache Mesos 1.7.3 (rc1)

2020-05-07 Thread Benjamin Mahler
+1 (binding) On Mon, May 4, 2020 at 1:48 PM Greg Mann wrote: > Hi all, > > Please vote on releasing the following candidate as Apache Mesos 1.7.3. > > The CHANGELOG for the release is available at: > > https://gitbox.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.7.3-rc1 > >

Re: Found no roles suitable for revive repetition.

2020-03-19 Thread Benjamin Mahler
> Thanks, > Marc > > > > -----Original Message- > From: Benjamin Mahler [mailto:bmah...@apache.org] > Sent: 18 March 2020 18:32 > To: user > Subject: Re: Found no roles suitable for revive repetition. > > Hi Marc, can you contact the marathon mailing list or s

Re: registered in SERVER runtime does not implement any provider interfaces applicable in the SERVER runtime.

2020-03-18 Thread Benjamin Mahler
Same here, please reach out to marathon support channels and include additional context. On Wed, Mar 18, 2020 at 12:27 PM Marc Roos wrote: > > I am having these, has been reported already on Jira long time ago. How > to fix these? > > > > der mesosphere.marathon.api.v2.PodsResource will be

Re: Found no roles suitable for revive repetition.

2020-03-18 Thread Benjamin Mahler
Hi Marc, can you contact the marathon mailing list or slack channel. Also, if there is a question here or some more context, please include that so they know what you need help with. On Wed, Mar 18, 2020 at 9:46 AM Marc Roos wrote: > > > Marathon is stuck on 'loading applications' > > > Mar 18

Re: Kill task, but not restarted

2020-02-03 Thread Benjamin Mahler
There's not enough information to understand the situation. How did you kill the task? Did the task get correctly marked as killed? Did the killed notification get correctly acknowledged? On Sun, Feb 2, 2020 at 9:04 AM Marc Roos wrote: > > > > Because the instance was not showing in the

Welcome Andrei Sekretenko as a new committer and PMC member!

2020-01-21 Thread Benjamin Mahler
Please join me in welcoming Andrei Sekretenko as the newest committer and PMC member! Andrei has been active in the project for almost a year at this point and has been a productive and collaborative member of the community. He has helped out a lot with allocator work, both with code and

Re: Task Pinning

2019-10-22 Thread Benjamin Mahler
It's easier to do something custom for your own needs than to bring generic support into the project. For example, in kubernetes, as far as I can tell they offer two modes for the agent: "static" (i.e. pinning for integer requests) and "none" (regular shares / limit model).

Re: large task scheduling on multi-framework cluster

2019-10-01 Thread Benjamin Mahler
Note that with the newest marathon that is capable of handling multiple roles, you would not need to run a dedicated marathon instance. On Tue, Oct 1, 2019 at 8:17 AM Grégoire Seux wrote: > Hello, > > I'm wondering how other mesos users deal with scheduling of large tasks > (using all resources

Re: reservations from terminated frameworks

2019-09-30 Thread Benjamin Mahler
Hi Hendrik, currently reservations are tied to a role, not framework. In this case, it's a static reservation which means you need to update the agent configuration and restart it destructively (we don't currently support a non-destructive non-additive agent resources change). If it was a dynamic

Re: Attach Shared Volume to new tasks

2019-09-26 Thread Benjamin Mahler
Can you show the full resource information from the offer? On Tue, Sep 10, 2019 at 6:50 AM Harold Molina-Bulla wrote: > Hi everybody, > > We are implementing a Scheduler for Mesos in python, and we need to attach > a preconfigured shared volume to a new Task. The Shared volume is now > offered

Re: [VOTE] Release Apache Mesos 1.9.0 (rc1)

2019-08-27 Thread Benjamin Mahler
> We upgraded the version of the bundled boost very late in the release cycle Did we? We still bundle boost 1.65.0, just like we did during 1.8.x. We just adjusted our special stripped bundle to include additional headers. On Tue, Aug 27, 2019 at 1:39 PM Vinod Kone wrote: > -1 > > We upgraded

[Performance / Resource Management WG] August Update

2019-08-21 Thread Benjamin Mahler
Can't make today's meeting, so sending out some notes: On the performance front: - Long Fei reported a slow master, and perf data indicates a lot of time is spent handling executor churn, this can be easily improved: https://issues.apache.org/jira/browse/MESOS-9948 On the resource management

Re: Mesos 1.9.0 release

2019-08-13 Thread Benjamin Mahler
Thanks for taking this on Qian! I seem to be unable to view the dashboard. Also, when are we aiming to make the cut? On Tue, Aug 13, 2019 at 10:58 PM Qian Zhang wrote: > Folks, > > It is time for Mesos 1.9.0 release and I am the release manager. Here is > the dashboard: >

Re: Should mesos 1.8 (and marathon 1.8) drain/migrate tasks or not?

2019-08-13 Thread Benjamin Mahler
(had to join the marathon-framework group to post to it, re-sending) On Tue, Aug 13, 2019 at 1:26 PM Benjamin Mahler wrote: > > I know DRAIN_AGENT is only for mesos 1.9. But what use it to post a > > maintenance schedule, see the node being marked as draining, and nothing

Re: Should mesos 1.8 (and marathon 1.8) drain/migrate tasks or not?

2019-08-13 Thread Benjamin Mahler
> I know DRAIN_AGENT is only for mesos 1.9. But what use it to post a > maintenance schedule, see the node being marked as draining, and nothing > happens with the tasks? The maintenance schedules require that schedulers implement support for them. Nothing happens if the scheduler does not have

Re: Mesos-dns srv weight

2019-08-01 Thread Benjamin Mahler
Please seek support through the mesos DNS channels: https://github.com/mesosphere/mesos-dns#contact On Fri, Jul 26, 2019 at 9:50 AM Marc Roos wrote: > > Is it possible to configure a task with srv record weight? > > > [@ mesos-cni]# dig +short @192.168.10.151 _webchat._tcp.marathon.mesos > SRV

[Performance / Resource Management WG] July Update

2019-07-17 Thread Benjamin Mahler
On the resource management front, Meng Zhu, Andrei Sekretenko, and I have been working on quota limits and enhancing multi-role framework support: - A memory leak in the allocator was fixed: MESOS-9852 - Support for quota limits work is well underway, and at this point the major pieces are

Re: Design doc: Agent draining and deprecation of maintenance primitives

2019-06-06 Thread Benjamin Mahler
> With the new proposal, it's going to be as difficult as before to have SLA-aware maintenances because it will need cooperation from the frameworks anyway and we know this is rarely a priority for them. We will also lose the ability to signal future maintenance in order to optimize allocations.

[Performance / Resource management WG] Notes in lieu of tomorrow's meeting

2019-05-13 Thread Benjamin Mahler
I'm out of the country and so I'm sending out notes in lieu of tomorrow's performance / resource management meeting. Resource Management: - Work is underway for adding the UPDATE_FRAMEWORK scheduler::Call. - Some fixes and small performance improvements landed for the random sorter. - Perf data

Re: Slack upgrade to Standard plan. Thanks Criteo

2019-04-25 Thread Benjamin Mahler
Thank you Criteo! On Tue, Apr 23, 2019 at 1:12 PM Vinod Kone wrote: > Hi folks, > > As you probably realized today, we got our Slack upgraded from "free" plan > to "standard" plan, which allows us to have unlimited message history and > better analytics among other things! This would be great

Performance / Resource Management Update

2019-04-17 Thread Benjamin Mahler
In lieu of today's meeting, this is an email update: The 1.8 release process is underway, and it includes a few performance related changes: - Parallel reads for the v0 API have been extended to all other v0 read only endpoints (e.g. /state-summary, /roles, etc). Whereas in 1.7.0, only /state

Re: Subject: [VOTE] Release Apache Mesos 1.8.0 (rc1)

2019-04-15 Thread Benjamin Mahler
The CHANGELOG highlights seem a bit lacking? - For some reason, the task CLI command is listed in a performance section? - The parallel endpoint serving changes are in the longer list of items, seems like we highlight them in the performance section? Maybe we could be specific too about what we

Re: Mesos Master Crashes when Task launched with LAUNCH_GROUP fails

2019-03-01 Thread Benjamin Mahler
For posterity: https://issues.apache.org/jira/browse/MESOS-9619 On Thu, Feb 28, 2019 at 6:02 PM Meng Zhu wrote: > Hi Nimi: > > Thanks for reporting this. > > From the log snippet, looks like, when de-allocating resources, the agent > does not have the port resources that is supposed to have

Re: Enabling framework authentication Loaded deprecated flag 'authenticate'

2019-02-15 Thread Benjamin Mahler
The --authenticate master flag has been renamed: https://github.com/apache/mesos/blob/1.7.1/src/master/flags.cpp#L221 So yes, the documentation you linked to needs an update. On Fri, Feb 15, 2019 at 12:26 PM Marc Roos wrote: > > [@]# cat /etc/mesos-master/authenticate > true > > Is this page

Re: centos7/el7 newer marathon rpms

2019-02-12 Thread Benjamin Mahler
Hi Marc, You can reach out to the marathon community to get this question answered: https://mesosphere.github.io/marathon/support.html Ben On Fri, Feb 8, 2019 at 6:33 PM Marc Roos wrote: > > Where can I get newer marathon rpms, currently I am getting them from > here > > > >

Re: Check failed: reservationScalarQuantities.contains(role)

2019-02-06 Thread Benjamin Mahler
Thanks for reporting this, we can help investigate this with you in JIRA. On Tue, Feb 5, 2019 at 5:40 PM Jeff Pollard wrote: > Thanks for the info. I did find the "Removed agent" line as you suspected, > but not much else in logging looked promising. I opened a JIRA to track > from here on out

Re: Welcome Benno Evers as committer and PMC member!

2019-01-30 Thread Benjamin Mahler
Welcome Benno! Thanks for all the great contributions On Wed, Jan 30, 2019 at 6:21 PM Alex R wrote: > Folks, > > Please welcome Benno Evers as an Apache committer and PMC member of the > Apache Mesos! > > Benno has been active in the project for more than a year now and has made > significant

Re: [VOTE] Release Apache Mesos 1.7.1 (rc1)

2019-01-02 Thread Benjamin Mahler
+1 (binding) make check passes on macOS 10.14.2 $ clang++ --version Apple LLVM version 10.0.0 (clang-1000.10.44.4) Target: x86_64-apple-darwin18.2.0 Thread model: posix InstalledDir: /Library/Developer/CommandLineTools/usr/bin $ ./configure CC=clang CXX=clang++

Re: New scheduler API proposal: unsuppress and clear_filter

2018-12-10 Thread Benjamin Mahler
I think we're agreed: -There are no schedulers modeling the existing per-agent time-based filters that mesos is tracking, and we shouldn't go in a direction that encourages frameworks to try to model and manage these. So, we should be very careful in considering something like CLEAR_FILTERS.

Re: New scheduler API proposal: unsuppress and clear_filter

2018-12-05 Thread Benjamin Mahler
Thanks for bringing REQUEST_RESOURCES up for discussion, it's one of the mechanisms that we've been considering for further scaling pessimistic offers before we make the migration to optimistic offers. It's also been referred to as "demand" rather than "request", but for the sake of this

Re: [API WG] Proposals for dealing with master subscriber leaks.

2018-11-11 Thread Benjamin Mahler
>- We can add heartbeats to the SUBSCRIBE call. > This would need to be > part of a separate operator Call, because one platform (browsers) that > might subscribe to the master does not support two-way streaming. This doesn't make sense to me, the heartbeats should still be part of the same

Re: Rhythm - time-based job scheduler

2018-11-02 Thread Benjamin Mahler
Thanks for sharing Michał! Could you tell us how you (or your employer) are using it? On Tue, Oct 30, 2018 at 10:34 AM Michał Łowicki wrote: > Hey! > > I would like to announce project I've been working on recently - > https://github.com/mlowicki/rhythm. It's a Cron-like scheduler with > couple

Welcome Meng Zhu as PMC member and committer!

2018-10-31 Thread Benjamin Mahler
Please join me in welcoming Meng Zhu as a PMC member and committer! Meng has been active in the project for almost a year and has been very productive and collaborative. He is now one of the few people who understand the allocator code well, as well as the roadmap for this area of the project. He

Re: Dedup mesos agent status updates at framework

2018-10-29 Thread Benjamin Mahler
l) scheduler will remove the status > update from the queue, and in case of failure, Mesos Master will send > status update again. > > > > On Sun, Oct 28, 2018 at 10:15 PM Benjamin Mahler > wrote: > > > Which version of mesos are you running? > > > > >

Re: Dedup mesos agent status updates at framework

2018-10-28 Thread Benjamin Mahler
ff period from 10s -> 30s or > 60s, and simultaneously explore if dedup is an option. > > Thanks, > Varun > > On Sun, Oct 28, 2018 at 6:49 PM Benjamin Mahler > wrote: > > > Hi Varun, > > > > What problem are you trying to solve precisely? There seems to be

Re: Dedup mesos agent status updates at framework

2018-10-28 Thread Benjamin Mahler
Hi Varun, What problem are you trying to solve precisely? There seems to be an implication that the duplicate acknowledgements are expensive. They should be low cost, so that's rather surprising. Do you have any data related to this? You can also tune the backoff rate on the agents, if the

Re: Proposal: Adding health check definitions to master state output

2018-10-18 Thread Benjamin Mahler
> It's worth mentioning that I believe the original intention of the 'Task' > message was to contain most information contained in 'TaskInfo', except for > those fields which could grow very large, like the 'data' field. +1 all task / executor metadata should be exposed IMO. I look at the 'data'

1.7.x Performance Improvements Blog Post

2018-10-09 Thread Benjamin Mahler
We published a blog post highlighting the performance improvements in Mesos 1.7.x, take a look! https://twitter.com/ApacheMesos/status/1049740950359044096 Ben

Re: Vote now for MesosCon 2018 proposals!

2018-09-25 Thread Benjamin Mahler
Voted! Thanks Jörg and the PC! On Thu, Sep 20, 2018 at 9:51 AM Jörg Schad wrote: > Dear Mesos Community, > > Please take a few minutes over the next few days and review what members > of the community have submitted for MesosCon 2018 > (which will be held in San

Re: [VOTE] Release Apache Mesos 1.4.2 (rc1)

2018-08-13 Thread Benjamin Mahler
+1 (binding) make check passes on macOS 10.13.6 with Apple LLVM version 9.1.0 (clang-902.0.39.2). Thanks Kapil! On Wed, Aug 8, 2018 at 3:06 PM, Kapil Arya wrote: > Hi all, > > Please vote on releasing the following candidate as Apache Mesos 1.4.2. > > 1.4.2 is a bug fix release. The CHANGELOG

Re: [VOTE] Release Apache Mesos 1.4.2 (rc1)

2018-08-13 Thread Benjamin Mahler
This was fixed in https://github.com/apache/mesos/commit/02ad5c8cdd644ee8eec83bf887daa98bb163637d, I don't recall there being any issues due to it. On Mon, Aug 13, 2018 at 4:50 PM, Benjamin Mahler wrote: > Hm.. I ran make check on macOS saw the following: >

Re: [VOTE] Release Apache Mesos 1.4.2 (rc1)

2018-08-13 Thread Benjamin Mahler
Hm.. I ran make check on macOS and saw the following: [ RUN ] AwaitTest.AwaitSingleDiscard src/tests/collect_tests.cpp:275: Failure Value of: promise.future().hasDiscard() Actual: false Expected: true [ FAILED ] AwaitTest.AwaitSingleDiscard (0 ms) On Wed, Aug 8, 2018 at 3:06 PM, Kapil Arya

Re: Understand fixed resource estimator to get oversubscribe resources

2018-08-10 Thread Benjamin Mahler
The fixed resource estimator provides a fixed size revocable pool: if you tell it to create a 24 cpu revocable pool, there will be a 24 cpu revocable pool. It is not looking at utilization slack. On Mon, Aug 6, 2018 at 2:28 PM, Varun Gupta wrote: > Hi, > > I was reading the code >

Re: Backport Policy

2018-07-26 Thread Benjamin Mahler
> consistent >>> > >>> (and safe) within a release. With that as the goal of a branch in >>> > >>> maintenance mode, it makes sense to fix regressions, and make >>> > exceptions to >>> > >>> fix CVEs and other critical/bl

Mesos 1.7.x and JSON clients

2018-07-25 Thread Benjamin Mahler
TLDR: If you use a spec-compliant JSON parser, you will observe no change in Mesos 1.7.x and everything will continue to work as before. Longer version: JSON allows strings to be encoded in several different ways. For example, "/" can be encoded directly as "/", or "\/", "\u002F", or "\u002f". A
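
As a quick check of the claim about equivalent encodings, the following Python sketch feeds the four spellings of "/" mentioned above to the standard-library `json` parser. The helper name `decode_all` is made up for illustration; only `json.loads` is a real API.

```python
import json

# Four JSON texts that all encode the one-character string "/".
ENCODINGS = ['"/"', '"\\/"', '"\\u002F"', '"\\u002f"']


def decode_all(texts=ENCODINGS):
    """Parse each JSON text and return the decoded Python strings."""
    return [json.loads(text) for text in texts]


for raw, value in zip(ENCODINGS, decode_all()):
    print(raw, "->", repr(value))
```

A client that parses responses this way sees the same string regardless of which spelling the server emits; only clients that compare raw bytes would notice a difference.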

Re: Backport Policy

2018-07-12 Thread Benjamin Mahler
he clarification. I'm in agreement with the points you > > made. > > > > Once we have consensus, would you mind updating the doc? > > > > On Wed, Jul 11, 2018 at 5:15 PM Benjamin Mahler > > wrote: > > > > > I realized recently that we aren't all on

Re: Normalization of metric keys

2018-07-06 Thread Benjamin Mahler
Do we also want: 3. Has an unambiguous decoding. Replacing '/' with '#%$' means I don't know if the user actually supplied '#%$' or '/'. But using something like percent-encoding would have property 3. On Fri, Jul 6, 2018 at 10:25 AM, Greg Mann wrote: > Thanks for the reply Ben! > > Yea I
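
The ambiguity argument can be sketched in Python. The helper names below are hypothetical and this is not how Mesos normalizes metric keys; it only contrasts a plain substitution with percent-encoding via the standard `urllib.parse` functions.

```python
from urllib.parse import quote, unquote


def substitute(component):
    """Naive scheme: replace '/' with a marker string (ambiguous)."""
    return component.replace("/", "#%$")


def encode_component(component):
    """Percent-encode every reserved character, including '/' and '%'."""
    return quote(component, safe="")


def decode_component(encoded):
    """Inverse of encode_component."""
    return unquote(encoded)


# Two different inputs collide under the naive substitution...
print(substitute("a/b") == substitute("a#%$b"))
# ...but stay distinct, and round-trip, under percent-encoding.
print(encode_component("a/b"), encode_component("a#%$b"))
```

Because '%' itself is escaped, every encoded key maps back to exactly one original, and the encoded form never contains a literal '/', so splitting on '/' remains safe.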

Re: Normalization of metric keys

2018-07-03 Thread Benjamin Mahler
I don't think the lack of principal normalization was intentional. Why spread that further? Don't we also have some normalization today? Having slashes show up in components complicates parsing (can no longer split on '/'), no? For example, if we were to introduce the ability to query a subset of

Re: [VOTE] Release Apache Mesos 1.3.3 (rc1)

2018-05-29 Thread Benjamin Mahler
d, May 23, 2018 at 11:39 AM, Michael Park wrote: > >> Huh... Super weird. I'll look into it. >> >> Thanks for checking! >> >> MPark >> >> On Wed, May 23, 2018 at 11:34 AM Vinod Kone wrote: >> >>> It's empty for me too! >

Re: [VOTE] Release Apache Mesos 1.5.1 (rc1)

2018-05-23 Thread Benjamin Mahler
+1 (binding) make check passes on macOS 10.13.4 with Apple LLVM version 9.1.0 (clang-902.0.39.1) On Fri, May 11, 2018 at 12:35 PM, Gilbert Song wrote: > Hi all, > > Please vote on releasing the following candidate as Apache Mesos 1.5.1. > > 1.5.1 includes the following: >

Re: Mesos Roles | Min or Max ?

2018-05-21 Thread Benjamin Mahler
Currently a role either has no guarantee and no limit, or a guarantee and limit set to the same amount of resources. The work is underway to allow setting limit distinct from guarantee: https://issues.apache.org/jira/browse/MESOS-8068 On Mon, May 21, 2018 at 4:17 PM Ken Sipe

Re: Operator ReadFile API

2018-05-05 Thread Benjamin Mahler
Yes, it's base64 encoded. The protobuf schema defines this field of type "bytes": https://github.com/apache/mesos/blob/1.5.0/include/mesos/v1/agent/agent.proto#L460 When converted to JSON, this follows the standard protobuf -> JSON conversion by converting "bytes" fields into base64 encoded
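
A client therefore has to base64-decode the field after parsing the JSON. The sketch below assumes a simplified, hypothetical response body with a top-level `data` key; the real response nests the field differently, so adjust the lookup to the actual payload.

```python
import base64
import json


def decode_file_data(response_json):
    """Parse a JSON body and return the raw bytes held in its 'data' field.

    Assumption: the body is shaped like {"data": "<base64>"}; this shape is
    illustrative only, not the exact operator API response.
    """
    body = json.loads(response_json)
    return base64.b64decode(body["data"])


sample = json.dumps({"data": "aGVsbG8="})
print(decode_file_data(sample))
```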

Re: mesos-slave Failed to initialize: Failed to bind on 0.0.0.0:0: Address already in use: Address already in use [98]

2018-05-03 Thread Benjamin Mahler
From the man page for bind: *EADDRINUSE* (Internet domain sockets) The port number was specified as zero in the socket address structure, but, upon attempting to bind to an ephemeral port, it was determined that all port numbers in the
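
For reference, this is what binding to port zero looks like from Python's standard `socket` module. `bind_ephemeral` is a hypothetical helper, not part of Mesos; it returns the port the kernel picked, or `None` when the bind fails with EADDRINUSE as in the man page excerpt.

```python
import errno
import socket


def bind_ephemeral(host="127.0.0.1"):
    """Bind to port 0 and return the ephemeral port chosen by the kernel."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, 0))  # port 0 asks the kernel for any free port
        except OSError as error:
            if error.errno == errno.EADDRINUSE:
                return None  # no ephemeral port could be assigned
            raise
        return sock.getsockname()[1]


print(bind_ephemeral())
```

On a healthy host this prints a port from the ephemeral range; getting `None` would point at the exhaustion case the man page describes.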

Re: Reason of cascaded kill in a group

2018-04-10 Thread Benjamin Mahler
Are you saying that there was no reason previously, and there would be a reason after the change? If so, adding a reason where one did not exist is safe from a backwards compatibility perspective. On Mon, Apr 9, 2018 at 10:32 AM, Zhitao Li wrote: > Hi, > > We are

Re: Troubleshooting Mesos SSL setup

2018-04-10 Thread Benjamin Mahler
Are there bugs here? Is there anything that mesos could have logged / handled better? On Fri, Mar 16, 2018 at 11:46 AM, Renan DelValle wrote: > Follow up, we weren't able to get our wildcard certificate working but we > did get it to work when we used a certificate

Re: Support deadline for tasks

2018-03-23 Thread Benjamin Mahler
a few tasks that should be killed after > some timeout. We currently have some logic in our scheduler to kill these > tasks. Would be nice to delegate this to the executor. > > - Sagar > > On Fri, Mar 23, 2018 at 3:29 PM, Benjamin Mahler <bmah...@apache.org> > wrote: >

Re: Support deadline for tasks

2018-03-23 Thread Benjamin Mahler
Sagar, could you share your use case? Or is it exactly the same as Zhitao's? On Fri, Mar 23, 2018 at 3:15 PM, Sagar Sadashiv Patwardhan <sag...@yelp.com> wrote: > +1 > > This will be useful for us(Yelp) as well. > > On Fri, Mar 23, 2018 at 1:31 PM, Benjamin Mahler <bma

Re: Support deadline for tasks

2018-03-23 Thread Benjamin Mahler
Also, it's advantageous for mesos to be aware of a hard deadline when it comes to resource allocation. We know that some resources will free up and can make better decisions when it comes to pre-emption, for example. Currently, mesos doesn't know if a task will run forever or will run to

Re: Mesos scalability

2018-03-23 Thread Benjamin Mahler
Hi Karan, Only one master can be elected leader in the current architecture. It's unlikely we're at a point where we need to balance work across masters to push scalability further. That comes with a lot of complexity, and we still have a lot of room for performance improvements on a single

Re: Mesos on OS X

2018-03-21 Thread Benjamin Mahler
MacOS is a supported platform, you can see the supported versions here: http://mesos.apache.org/documentation/latest/building/ The containerization maintainers could probably chime in to elaborate on the isolation caveats. For example, you won't have many of the resource isolators available and

Re: 答复: 答复: Status update: task 1 is in state TASK_ERROR

2018-03-16 Thread Benjamin Mahler
kInfos(task.build() > } > > > > And after that I met another problem: my task is always in staging, and > terminates after 1min due to timeout. I think there are many mini process > in a scheduler app including callbacks, such as connect, register, get > offers list,

Re: Welcome Zhitao Li as Mesos Committer and PMC Member

2018-03-12 Thread Benjamin Mahler
Welcome Zhitao! Thanks for your contributions so far On Mon, Mar 12, 2018 at 2:02 PM, Gilbert Song wrote: > Hi, > > I am excited to announce that the PMC has voted Zhitao Li as a new > committer and member of PMC for the Apache Mesos project. Please join me to > congratulate

Re: Welcome Chun-Hung Hsiao as Mesos Committer and PMC Member

2018-03-12 Thread Benjamin Mahler
Welcome Chun! It's been great discussing things with you so far and thanks for all the hard work! On Sat, Mar 10, 2018 at 9:14 PM, Jie Yu wrote: > Hi, > > I am happy to announce that the PMC has voted Chun-Hung Hsiao as a new > committer and member of PMC for the Apache

Re: 答复: Status update: task 1 is in state TASK_ERROR

2018-03-09 Thread Benjamin Mahler
; disk(allocated: controller)(reservations: > [(STATIC,controller)]):550264; ports(allocated: > controller):[31000-32000] > 233 Status update: task 1 is in state TASK_ERROR > > > > 罗辉 > > 基础架构 > -- > *发件人:* Benjamin Mahler <bm

Re: Status update: task 1 is in state TASK_ERROR

2018-03-08 Thread Benjamin Mahler
Can you log the message provided in the TaskStatus? https://github.com/apache/mesos/blob/1.5.0/include/mesos/v1/mesos.proto#L2424 On Wed, Mar 7, 2018 at 11:23 PM, 罗 辉 wrote: > Hi guys: > > I got a mesos test app, mostly likely > >

Re: Tasks may be explicitly dropped by agent in Mesos 1.5

2018-03-01 Thread Benjamin Mahler
Put another way, we currently don't guarantee in-order task delivery to the executor. Due to the changes for MESOS-1720, one special case of task re-ordering now leads to the re-ordered task being dropped (rather than delivered out-of-order as before). Technically, this is strictly better.

Re: is there any docs to show how to secure http(s) for masters

2018-02-23 Thread Benjamin Mahler
+Alexander On Mon, Feb 19, 2018 at 11:00 AM Mclain, Warren wrote: > I am not finding any documentation that tells you how to actually > implement the following on the mesos masters and agents. > > > > authenticate=true > > authenticate_http_readonly=true > >

Re: http://mesos.apache.org/downloads/ is not up to date

2018-02-12 Thread Benjamin Mahler
Thanks for pointing this out Adam, I've added mpark who is the release manager for 1.3.2. On Tue, Feb 6, 2018 at 6:12 AM, Adam Cecile wrote: > Hi guys, > > > Did you notice Mesos 1.3.2 is missing from the official download page ? > > http://mesos.apache.org/downloads/ > >

Reminder: Design Doc for Mesos CLI Re-design

2018-02-12 Thread Benjamin Mahler
I've heard a lot of interest in there being investment in the mesos CLI. For those that are interested, please take a look at the re-design doc and share your feedback: https://docs.google.com/document/d/1r6Iv4Efu8v8IBrcUTjgYkvZ32WVscgYqrD07OyIglsA/edit Feel free to make comments in the doc,

Re: Questions about Pods and the Mesos Containerizer

2018-01-29 Thread Benjamin Mahler
If moving the conversation to slack, it would be great to post back to the list with a summary! On Mon, Jan 29, 2018 at 1:38 PM, Vinod Kone wrote: > Hi David, > > It's probably worth having a synchronous discussion around your proposed > approach in our slack. I would like

Re: java driver/shutdown call

2018-01-17 Thread Benjamin Mahler
on than KILL. > > On Tue, Jan 16, 2018 at 6:40 PM, Benjamin Mahler <bmah...@apache.org> > wrote: > >> Mohit, what are you trying to accomplish by going from KILL to SHUTDOWN? >> >> On Tue, Jan 16, 2018 at 5:15 PM, Joseph Wu <jos...@mesosphere.io> wrote:

Re: Mesos slave ID change after reboot

2018-01-16 Thread Benjamin Mahler
Yes, the agent used to check for the boot id having changed in order to decide whether to try to recover. On Wed, Jan 10, 2018 at 5:53 PM, Srikanth Viswanathan wrote: > I am trying to understand under what cases the mesos slave ID changes in > response to reboot. I

Re: java driver/shutdown call

2018-01-16 Thread Benjamin Mahler
Mohit, what are you trying to accomplish by going from KILL to SHUTDOWN? On Tue, Jan 16, 2018 at 5:15 PM, Joseph Wu wrote: > If a framework launches tasks, then it will use an executor. Mesos > provides a "default" executor if the framework doesn't explicitly specify > an

Re: Duplicate task ID for same framework on different agents

2017-12-21 Thread Benjamin Mahler
It's a known issue: https://issues.apache.org/jira/browse/MESOS-3070 Putting in place a protection mechanism sounds good, but is rather complicated. See the comment in this ticket: https://issues.apache.org/jira/browse/MESOS-6785 On Wed, Dec 20, 2017 at 8:26 PM, Zhitao Li

Re: Mesos 1.5.0 Release

2017-12-21 Thread Benjamin Mahler
Meng is working on https://issues.apache.org/jira/browse/MESOS-8352 and we should land it tonight if not tomorrow. I can cherry pick if it's after your cut, and worst case it can go in 1.5.1. Have you guys gone over the unresolved items targeted for 1.5.0? I see a lot of stuff, might be good to

Re: [VOTE] Release Apache Mesos 1.3.2 (rc1)

2017-12-14 Thread Benjamin Mahler
+1 (binding) make check passes on macOS 10.13.2 with Apple LLVM version 9.0.0 (clang-900.0.39.2) On Thu, Dec 7, 2017 at 2:44 PM, Michael Park wrote: > Hi all, > > Please vote on releasing the following candidate as Apache Mesos 1.3.2. > > The CHANGELOG for the release is

December Performance Working Group Report

2017-12-11 Thread Benjamin Mahler
The December performance working group report is published on the website here: http://mesos.apache.org/blog/performance-working-group-progress-report/ This report highlights the progress we've made recently in the performance of master failover. Special thanks to Dmitry Zhuk, Michael Park and

Re: Resource allocation cycle in DRF for multiple frameworks

2017-12-05 Thread Benjamin Mahler
sed by a > framework, it immediately comes to to next available framework even though > next frameworks share is higher than the previous one. Is that by > implementation or I am getting something wrong here? > > Thanks > > > On Mon, Dec 4, 2017 at 2:37 PM, Benjamin

Re: Resource allocation cycle in DRF for multiple frameworks

2017-12-04 Thread Benjamin Mahler
I don't think I understood the questions here, but let me add some explanation and we can go from there. Mesos will use DRF to choose an ordering amongst the roles that are actively interested in obtaining resources. Within a role, we currently use DRF again to choose an ordering amongst the
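
The ordering idea can be sketched as follows. This is a simplified illustration of dominant-share sorting, not the Mesos allocator: it ignores weights, quota, and filters, and `dominant_share` and `drf_order` are hypothetical names.

```python
def dominant_share(allocated, total):
    """Largest fraction of any single resource that a client holds."""
    return max(allocated.get(name, 0.0) / total[name] for name in total)


def drf_order(allocations, total):
    """Sort clients (roles or frameworks) by ascending dominant share.

    Ties are broken by client name so the ordering is deterministic.
    """
    return sorted(
        allocations,
        key=lambda client: (dominant_share(allocations[client], total), client),
    )


total = {"cpus": 10.0, "mem": 100.0}
allocations = {
    "a": {"cpus": 5.0, "mem": 10.0},  # dominant share 0.5 (cpus)
    "b": {"cpus": 1.0, "mem": 30.0},  # dominant share 0.3 (mem)
}
print(drf_order(allocations, total))
```

The client with the lowest dominant share is served first; the same function can be applied once across roles and again across the frameworks within a role, mirroring the two levels described above.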

Re: Documentation for Mesos On windows

2017-11-29 Thread Benjamin Mahler
+Andrew On Tue, Nov 28, 2017 at 5:41 PM, sweta Das wrote: > Hi > > Is there any other documentation than the one on mesos site > http://mesos.apache.org/documentation/latest/windows/ > > I was able to build mesos on AWS on an windows 2016 server. But I am not > able to find

Re: [VOTE] Release Apache Mesos 1.2.3 (rc1)

2017-11-29 Thread Benjamin Mahler
+1 (binding) make check on macOS 10.13.1 On Wed, Nov 29, 2017 at 9:17 PM, Adam Bordelon wrote: > +1 (binding) > > Passed all tests in DC/OS integration CI, with a bump to 1.2.x at f8706e5, > just one changelog update before 1.2.3-rc1. >

Re: Persistent volumes

2017-11-29 Thread Benjamin Mahler
+jpeach The polling mechanism is used by the "disk/du" isolator to handle the case where we don't have filesystem support for enforcing a quota on a per-directory basis. I believe the "disk/xfs" isolator will stop writes with EDQUOT without killing the task:

Re: Welcome Andrew Schwartzmeyer as a new committer and PMC member!

2017-11-27 Thread Benjamin Mahler
Welcome and thanks for your contributions so far! On Mon, Nov 27, 2017 at 11:00 PM, Joseph Wu wrote: > Hi devs & users, > > I'm happy to announce that Andrew Schwartzmeyer has become a new committer > and member of the PMC for the Apache Mesos project. Please join me in >

Stripping Offer.AllocationInfo and Resource.AllocationInfo for non-MULTI_ROLE schedulers.

2017-11-15 Thread Benjamin Mahler
Hi folks, When we released MULTI_ROLE support, Offers and Resources within them included additional information, specifically the AllocationInfo which indicated which role was being allocated to: https://github.com/apache/mesos/blob/1.3.0/include/mesos/v1/mesos.proto#L907-L923

Re: 1.3.2 Release

2017-11-02 Thread Benjamin Mahler
Great! I cherry picked Gaston's fix for https://issues.apache.org/jira/browse/MESOS-8135. On Wed, Nov 1, 2017 at 6:57 PM, Michael Park wrote: > Please reply to this email if you have pending patches to be backported to > 1.3.x, I'm aiming to cut a 1.3.2 on Friday. > >

Re: orphan executor

2017-11-02 Thread Benjamin Mahler
ket to track this? Any idea when this will be worked on? > > On Tue, Oct 31, 2017 at 5:22 PM, Benjamin Mahler <bmah...@apache.org> > wrote: > >> The question was posed merely to point out that there is no notion of the >> executor "running away" currently, due

Re: orphan executor

2017-10-31 Thread Benjamin Mahler
; include the "orphan" executor in the list there, so framework can find > runaways and kill them(using Mesos provided API)? > > On Tue, Oct 31, 2017 at 3:49 PM, Benjamin Mahler <bmah...@apache.org> > wrote: > >> What defines a runaway executor? >> >> M

Re: orphan executor

2017-10-31 Thread Benjamin Mahler
> container? >> >> On Tue, Oct 31, 2017 at 12:47 PM, Mohit Jaggi <mohit.ja...@uber.com> >> wrote: >> >>> Yes. There is a fix available now in Aurora/Thermos to try and exit in >>> such scenarios. But I am curious to know if Mesos agent has the >

Re: orphan executor

2017-10-31 Thread Benjamin Mahler
wrote: > Yes. There is a fix available now in Aurora/Thermos to try and exit in > such scenarios. But I am curious to know if Mesos agent has the > functionality to reap runaway executors. > > On Tue, Oct 31, 2017 at 12:08 PM, Benjamin Mahler <bmah...@apache.org> > wrote:

Re: rotating secrets when authenticating framework

2017-10-24 Thread Benjamin Mahler
+adam, alexander On Fri, Oct 20, 2017 at 2:54 PM, Devendra Ayalasomayajula < devend...@nvidia.com> wrote: > Corrected the subject > > > > *From:* Devendra Ayalasomayajula > *Sent:* Friday, October 20, 2017 2:40 PM > *To:* user@mesos.apache.org > *Subject:* rotting secrets when authenticating

Design Doc: Hierarchical Quota Guarantees and Limits

2017-10-11 Thread Benjamin Mahler
Hi folks, As part of the ongoing work for hierarchical role support, Michael Park and I have been working on a design doc that describes how the allocation algorithm needs to be updated to handle hierarchical quota guarantees. Also, as part of this work, we realized it makes sense to also make

Re: Are there any supported systems without O_CLOEXEC?

2017-09-29 Thread Benjamin Mahler
Is this altering the minimum Linux or OS X version we support? On Fri, Sep 29, 2017 at 9:15 AM, James Peach wrote: > > > On Sep 27, 2017, at 5:03 PM, James Peach wrote: > > > > Hi all, > > > > In MESOS-8027 and https://reviews.apache.org/r/62638/, I'm

Re: When is support for the AMD GPU driver on Mesos?

2017-09-06 Thread Benjamin Mahler
AMD support is not planned, no users have asked for it as far as I know. Nvidia support in mesos means: (1) Automatic detection of the GPUs via the NVML libraries. (2) Enforced isolation via device access. (3) Automatically making the nvidia driver libraries available within the container. We

Re: Welcome James Peach as a new committer and PMC memeber!

2017-09-06 Thread Benjamin Mahler
Thanks for all that you've done so far for the project James! On Wed, Sep 6, 2017 at 2:08 PM, Yan Xu wrote: > Hi Mesos devs and users, > > Please welcome James Peach as a new Apache Mesos committer and PMC member. > > James has been an active contributor to Mesos for over two
