Hi folks,
As part of the ongoing work for hierarchical role support, Michael Park and
I have been working on a design doc that describes how the allocation
algorithm needs to be updated to handle hierarchical quota guarantees.
Also, as part of this work, we realized it makes sense to also make
Is this altering the minimum Linux or OS X version we support?
On Fri, Sep 29, 2017 at 9:15 AM, James Peach wrote:
>
> > On Sep 27, 2017, at 5:03 PM, James Peach wrote:
> >
> > Hi all,
> >
> > In MESOS-8027 and https://reviews.apache.org/r/62638/, I'm
AMD support is not planned, no users have asked for it as far as I know.
Nvidia support in mesos means:
(1) Automatic detection of the GPUs via the NVML libraries.
(2) Enforced isolation via device access.
(3) Automatically making the nvidia driver libraries available within the
container.
We
Thanks for all that you've done so far for the project James!
On Wed, Sep 6, 2017 at 2:08 PM, Yan Xu wrote:
> Hi Mesos devs and users,
>
> Please welcome James Peach as a new Apache Mesos committer and PMC member.
>
> James has been an active contributor to Mesos for over two
-1 due to https://issues.apache.org/jira/browse/MESOS-7921
Thanks for reporting this Yan, it unfortunately went unnoticed despite CI
failures since Aug 3rd.
On Mon, Aug 28, 2017 at 12:29 PM, Yan Xu wrote:
> Also the libprocess refactor seems to have stability issues:
>
Yes, the UUID is how you would check for a duplicate due to
re-transmission. These duplicates still need to be acknowledged.
Ben
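
For illustration, the duplicate handling described above can be sketched roughly as follows. This is a minimal sketch, not the Mesos scheduler API: `StatusUpdateHandler` and the `acknowledge` callback are hypothetical names, and the real acknowledgement call depends on which scheduler library is in use.

```python
class StatusUpdateHandler:
    """Processes each status update once, but acknowledges every delivery."""

    def __init__(self, acknowledge):
        # Hypothetical callback with signature (task_id, uuid) -> None.
        self._acknowledge = acknowledge
        self._seen_uuids = set()
        self.processed = []

    def handle(self, task_id, state, uuid):
        """Returns True if the update was new, False if it was a duplicate."""
        is_duplicate = uuid in self._seen_uuids
        if not is_duplicate:
            self._seen_uuids.add(uuid)
            self.processed.append((task_id, state))
        # Acknowledge in both cases: a re-transmitted update still
        # needs an acknowledgement, otherwise it keeps being resent.
        self._acknowledge(task_id, uuid)
        return not is_duplicate
```

In a long-running scheduler the set of seen UUIDs would need to be bounded or pruned, which this sketch leaves out.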
On Mon, Aug 28, 2017 at 9:59 AM, Christoph Heer
wrote:
> Hi,
>
> as described in Mesos' documentation [1], a framework scheduler should
>
Looks like you're asking about DC/OS? Their user list is: us...@dcos.io
On Fri, Aug 11, 2017 at 7:04 AM, Mclain, Warren
wrote:
> We (Optum) are interested in the Beta1.10 dcos. The one item that we are
> looking at is whether the OpenId support is in the beta.
>
>
>
>
You're free to write your own long lived executor that can process multiple
tasks. The built in executors self-terminate after running the tasks they
are launched with.
On Tue, Aug 8, 2017 at 2:36 AM, Oeg Bizz wrote:
> It is used to notify some services that the agents are
Sorry, I think this was me, feel free to remove it from libprocess now that
it's required.
On Tue, Aug 8, 2017 at 10:57 AM, Chun-Hung Hsiao
wrote:
> Hi all,
>
> In libprocess, we have an optional `--disable-zlib` flag, but it's
> currently not used
> for conditional
+1 (binding)
./configure CC=clang CXX=clang++ CXXFLAGS=-Wno-deprecated-declarations
--disable-python --disable-java --with-apr=/usr/local/opt/apr/libexec
--with-svn=/usr/local/opt/subversion && make check -j8
Ran into a known flaky test:
https://issues.apache.org/jira/browse/MESOS-7739
On Tue,
That file path looks valid from what I can tell:
/opt/mesos/build/src/../../src/python/cli/src/mesos/__init__.py
Is the file not there? Is the directory not there?
On Fri, Jul 28, 2017 at 9:45 AM, Traiano Welcome wrote:
> Hi All
>
> The latest version of mesos fails to
This is generally not something we want users to do (i.e. leak something
outside of their container).
Mesos will kill all tasks in the cgroup if you're using cgroup isolation,
so you would have to ensure the daemon escapes the cgroup. If you're using
the posix isolation, you also need to be sure
As a data point, as far as I'm aware, most users are using a local work
directory, not an NFS mounted one. Would love to hear from anyone on the
list if they are doing this, and if there are any subtleties that should be
documented.
On Thu, Jun 22, 2017 at 11:13 PM,
Thanks for kicking this off Vinod! (lists to bcc)
I'm happy to join, I would add the following under this umbrella for now:
--> Project PR (e.g. blog posts, twitter, etc)
--> Events
--> Website / documentation
--> New contributor UX
On Thu, Jun 15, 2017 at 10:57 AM, Vinod Kone
a flaky test or a bug?
On Thu, Jun 8, 2017 at 4:07 PM, Benjamin Mahler <bmah...@apache.org> wrote:
> Vinod I think that's the getenv issue from:
> https://issues.apache.org/jira/browse/MESOS-6985
>
> On Wed, May 17, 2017 at 5:57 PM, Till Toenshoff <toensh...@me.com>
Vinod I think that's the getenv issue from:
https://issues.apache.org/jira/browse/MESOS-6985
On Wed, May 17, 2017 at 5:57 PM, Till Toenshoff wrote:
> +1
>
> Ran it through DC/OS builds and integration tests;
> https://github.com/dcos/dcos/pull/1530 => all green
>
> On May 17,
Thanks Yan!
On Fri, Jun 2, 2017 at 10:45 AM, Yan Xu <y...@jxu.me> wrote:
> +1 (binding)
>
> Ran it in a test cluster.
>
> ---
> Jiang Yan Xu <y...@jxu.me> | @xujyan <https://twitter.com/xujyan>
>
> On Thu, Jun 1, 2017 at 2:34 PM, Benjamin Mahler <
+1 (binding)
Looks like ExamplesTest.DynamicReservationFramework is flaky, unfortunately
wasn't able to get the logs for a failed run.
On Thu, Jun 1, 2017 at 2:03 PM, Benjamin Mahler <bmah...@apache.org> wrote:
> Not a blocker, but noticed the parallel test runner isn't bundled in the
If I understood correctly, the proposal is to not kill the tasks for
non-partition aware frameworks? That seems like a pretty big change for
frameworks that are not partition aware and expect the old killing
semantics.
It seems like we should just directly fix the issue, do you have a sense of
Not a blocker, but noticed the parallel test runner isn't bundled in the
release, if you configure with '--enable-parallel-test-execution':
/Users/bmahler/Downloads/mesos-1.3.0/support/mesos-gtest-runner.py
--sequential=*ROOT_* ./stout-tests
/bin/sh:
+Kevin
On Wed, May 31, 2017 at 3:31 PM, Brad wrote:
> Hi all,
>
> I'm interested in the container attach and exec feature added in version
> 1.2.0.
>
> I'm using the LAUNCH_NESTED_CONTAINER_SESSION and ATTACH_CONTAINER_INPUT
> calls on the operator API to launch an
Thanks Zhitao and Anand! I've been looking forward to using arena
allocation to improve performance.
On Fri, May 26, 2017 at 6:01 PM, Qian Zhang wrote:
> Thanks Anand and Zhitao!
>
> So I think we can remove the code like below, and switch to use the native
> maps supported
Thanks for all of your contributions to the project so far! It's been great
having you in the community
On Wed, May 24, 2017 at 10:32 AM, Jie Yu wrote:
> Hi folks,
>
> I'm happy to announce that the PMC has voted Gilbert Song as a new
> committer and member of PMC for the
s]
>> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Rel
>> ease/32/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--
>> verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%
>> 3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ub
I just was targeting a cherry pick and noticed the release isn't closed on
JIRA (see 'Releasing the Version on JIRA' section of the release guide). I
closed it and added a 1.1.3 version for folks to target for bug fixes.
On Fri, May 19, 2017 at 5:36 AM, Alex Rukletsov wrote:
OWLEDGE call with the uuid got in status
> update) also seen in master only ~35ms (lines 18-19 below) after the call.
> I’m starting to conclude that each call using the scheduler library (which
> actually involves HTTP POST) takes ~40ms.
>
>
>
> To sum it up, it seems that the m
-1 (binding)
Two upgrade blockers, these need to be backported to 1.2.x as well:
https://issues.apache.org/jira/browse/MESOS-7478 (I have a patch already)
https://issues.apache.org/jira/browse/MESOS-7460 (mpark is working on this)
Re: MESOS-7378
Any updates on whether this will be fixed?
On
+1 make check passes on macOS 10.12.4 with clang
On Tue, May 2, 2017 at 12:04 PM, Vinod Kone wrote:
> Hi all,
>
>
> Please vote on releasing the following candidate as Apache Mesos 1.0.4.
>
>
> 1.0.4 includes the following:
>
>
We can add a Call.GetTasks message to allow you to specify which task ids
you would like to retrieve. But this isn't supported yet, the code needs to
be written. E.g.
message Call {
  enum Type {
    GET_TASKS = 13; // Retrieves the information about tasks, see `GetTasks` below.
  }
I would recommend avoiding a manual clean up of the work directory, since
it's not guaranteed that this approach will remain correct as things
evolve. To have the agent perform the cleanup using its own logic, you can
run:
mesos-agent --recover=cleanup --work_dir= --master=
Also, there is
Hi Mark,
No, there is no support for this currently.
On Thu, Mar 16, 2017 at 2:11 PM, Mark Hammons wrote:
> Can you suspend a running task with mesos? I see that it can be killed,
> but it would be nice to have the ability to suspend tasks for a preemptive
>
lder()
>
> .setRefuseSeconds(0)
>
> .build()))
>
> .build());
>
> }
>
>
>
> LOGGER.info("Completed handling offers");
>
> }
>
>
>
Have you taken a look at the logs across your scheduler / master / agent to
see where the latency is introduced? We can then discuss options to reduce
the latency.
Ben
On Tue, Mar 7, 2017 at 5:03 AM, wrote:
> Hi,
>
>
>
> I’m implementing my own framework (scheduler +
Hartmann <gabr...@mesosphere.io>
wrote:
> Possibly the suppress/revive problem.
>
> On Thu, Mar 2, 2017 at 4:30 PM Benjamin Mahler <bmah...@apache.org> wrote:
>
>> Can you upload the full logs somewhere and link to them here?
>>
>> How many frameworks are yo
Also, what is the allocation that each framework has when you reach your
steady state?
Are there frameworks that don't have any more work to do but have a really
low share of the cluster?
On Thu, Mar 2, 2017 at 4:29 PM, Benjamin Mahler <bmah...@apache.org> wrote:
> Can you upload the
cpu per task).
>
> The problem is (we think): the mesos-master does not offer resources to
> all the tasks all the time and the declined resources are not re-offered to
> other tasks. Any idea how to change the behavior or the rate at which
> resources are offered to the tasks?
>
> FY
Hey Zameer, great questions. Let us know if there's anything you think
could be improved or documented better.
Re 1:
The 'Viewing maintenance status' section of the documentation should
clarify this:
http://mesos.apache.org/documentation/latest/maintenance/
Re 2:
Both of these sound reasonable
Hi there, more clarification is needed:
> I have close to 800 CPUs, but the system does not assign all the available
> resources to all our tasks.
>
What do you mean precisely here? Can you describe what you're seeing?
Also, you have more than 800GB of RAM, right?
Ben
On Thu, Mar 2, 2017 at 9:00
Hi Hendrik,
> Is it normal that the reserved resources are only available a bit after
> the task ended?
Yes, that's normal since we don't block the forwarding of the terminal
status update behind the allocation of the freed resources. Since the
latter can take some time, we opt to forward the
For GPUs there have been requests to expose the hardware and topology
information in a first class way, so that schedulers can consume it
consistently. Use cases have been: handling heterogeneous GPU hardware,
topology aware scheduling (critical for GPUs given NVLink vs PCI vs QPI
communication
Congrats and welcome!
On Fri, Jan 20, 2017 at 11:03 PM, Vinod Kone wrote:
> Hi folks,
>
> Please welcome Neil Conway as the newest committer and PMC member of the
> Apache Mesos project.
>
> Neil has been an active contributor to Mesos for more than a year now. As
> part
As promised when publishing the multi-role framework design doc, here is
the design doc for hierarchical roles.
Design Doc:
https://docs.google.com/document/d/1Ie2-6O400ayNXtRqipHq6_CCQ4wOoLWzoqql3b0Y6HU/edit?usp=sharing
JIRA Epic:
https://issues.apache.org/jira/browse/MESOS-6375
Take a look
the network is not the
> bottleneck, so the RPC layer is too heavy.
>
> -Original Message-
> From: Benjamin Mahler [mailto:bmah...@apache.org]
> Sent: January 5, 2017 9:26
> To: dev
> Cc: user@mesos.apache.org
> Subject: Re: Optimize libprocess performance
>
> Which area
Which areas does the performance not meet your needs? There are a lot of
aspects to libprocess that can be optimized, so it would be good to focus
on each of your particular use cases via benchmarks, this allows us to have
a shared way to profile and measure improvements.
Copy elimination is one
Maintenance should work in this case, it will just be applied to all agents
on the machine.
On Fri, Dec 9, 2016 at 1:20 PM, Charles Allen wrote:
> Thanks for the insight.
>
> I take that to mean the maintenance primitives might not work right for
> multi-agent
+1 (binding)
On Wed, Nov 30, 2016 at 2:53 PM, Greg Mann wrote:
> +1 (non-binding)
>
> Did `sudo make check` on CentOS 7. Aside from several
> LinuxFilesystemIsolatorTests and two other flaky
> tests,
> CgroupsAnyHierarchyWithFreezerTest.ROOT_CGROUPS_DestroyTracedProcess
>
Yes, if you re-register with the master, this will invalidate all
outstanding offers.
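
On the scheduler side, one way to stay consistent with this is to drop any locally cached offers whenever re-registration happens. The following is only a sketch under that assumption: `OfferCache` and its method names are hypothetical and loosely modeled on scheduler callbacks, not taken from the Mesos API.

```python
class OfferCache:
    """Tracks offers the scheduler believes are still outstanding."""

    def __init__(self):
        self._offers = {}

    def on_offers(self, offers):
        # offers: mapping of offer id -> resources (shape is an assumption).
        self._offers.update(offers)

    def on_rescinded(self, offer_id):
        # An individual offer was withdrawn; forget it if we still hold it.
        self._offers.pop(offer_id, None)

    def on_reregistered(self):
        # Re-registering invalidates all outstanding offers, so
        # nothing cached before this point may be used any more.
        self._offers.clear()

    def outstanding(self):
        return sorted(self._offers)
```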
On Mon, Oct 31, 2016 at 2:28 PM, Hendrik Haddorp
wrote:
> Right, I have written my own scheduler and sometimes end up in a state
> that Mesos believes that there are outstanding offers
to the UI.
> Maintenance of nodes will be presented in a table. The code is on GitHub but
> needs some tweaks and tests with large maintenance JSON. I'll prepare a patch
> shortly (once the GitHub DDoS is over).
> https://issues.apache.org/jira/browse/MESOS-6443
>
> Fri, 21.10.2016 at 20:30 user
When adding features we try to ensure the webui is updated accordingly.
However, there have been a few gaps where the webui has not been updated to
reflect the addition of functionality.
I filed the following epic to collect gaps in functionality:
https://issues.apache.org/jira/browse/MESOS-6440
Thanks for reporting this Rodrick, do you see any errors in your browser's
console?
On Tue, Sep 27, 2016 at 4:29 AM, Rodrick Brown
wrote:
>
> On Sep 27, 2016, at 3:43 AM, haosdent wrote:
>
> Hi, @Rodrick
>
> >"master/frameworks_connected": 0,
>
> Is
You may get better help from the Marathon team:
https://github.com/mesosphere/marathon#help
On Mon, Sep 19, 2016 at 11:49 PM, Cecile, Adam wrote:
> Hello Guys,
>
> We are sometimes experiencing weird behavior between Mesos and Marathon.
> Some jobs that do not seem to
Hard to interpret the error message, it looks like it's pointing to our
$scope variables 'offered_cpus' and 'idle_cpus'.
Is the error consistent? When you say you get this error with the pailer,
what does that mean? You see this in the pailer window? In your browser
console after you click on the
Also I believe the CLI work that Haris / Kevin have been doing would make
this easy to do via the Mesos CLI (it's not integrated into the project
yet).
On Wed, Aug 10, 2016 at 9:57 AM, Erik Weathers
wrote:
> Just for completeness and to provide an alternative, you can
All of the issues I've been shepherding have been fixed.
The only one I see remaining is this one, but doesn't look like a blocking
issue: https://issues.apache.org/jira/browse/MESOS-5985
Anything else that needs to go in?
On Mon, Aug 1, 2016 at 4:19 PM, Vinod Kone wrote:
Unfortunately we log termination messages to stderr rather than the logging
files. Can you show stderr? I suspect we're printing the exit message there.
See: https://issues.apache.org/jira/browse/MESOS-5854
On Fri, Jul 29, 2016 at 5:57 PM, Douglas Nelson wrote:
> It might
+1 (binding)
OS X 10.11.6
./configure --disable-python --disable-java
make check
On Tue, Jul 26, 2016 at 10:24 AM, Greg Mann wrote:
> +1 (non-binding)
>
> * Ran `sudo make distcheck` successfully on CentOS 7.1 with only one test
> failure: ExamplesTest.PythonFramework fails
Just a reminder. If you're using Mesos and want to be featured in our list
of users, send a PR to get your organization added:
https://github.com/apache/mesos/blob/master/docs/powered-by-mesos.md
If you've built a framework, and would like it featured in our list of
frameworks, send a PR to get
+1 (binding)
Make check on OS X 10.11.5.
On Mon, Jun 20, 2016 at 5:10 PM, Kapil Arya wrote:
> +1 (binding) Internal CI build.
>
> Here is a link to the deb/rpm packages:
>
> http://open.mesosphere.com/downloads/mesos-rc/#apache-mesos-0.26.2-rc1
>
>
>
> On Mon, Jun 20, 2016
Moving this to a new thread (see some context below).
It may be worth exploring adding a generic mechanism for doing label-based
injection of volumes: if a container is tagged with a particular label, we
will inject a particular volume into the container.
For Nvidia GPU containers, the operator
Sounds OK to me if there are no objections, since it should not be a
difficult adjustment for users to make and users can use the more
expressive JSON format for resources already. (e.g.
https://github.com/dcos/dcos/blob/1.7-open/gen/dcos-config.yaml#L95)
Also, please document this in the
Is this the right project?
https://github.com/tomas-abrahamsson/gpb
If so, it seems to support package namespacing, just needs to be enabled:
"Gpb can optionally make use of the package attribute by prepending the
name of the package to every contained message type (if defined), which is
useful
> is largely immutable.
> >
> > Another distinction is that some configuration flags control behavior
> > that doesn't need to be consistent between master replicas (e.g.,
> > "--ip", "--port", "--advertise-ip", "--advertise-port"
Welcome Anand and Joseph, thanks for all of your contributions!
Looking forward to seeing your ongoing positive influences on the community
and the project, let's build great software!
On Thu, Jun 9, 2016 at 2:00 PM, Vinod Kone wrote:
> Hi folks,
>
> I'm happy to announce
I'll make sure this gets fixed for 1.0. Apologies for the pain, it looks
like there is a significant amount of debt in the docker containerizer /
executor.
On Wed, May 18, 2016 at 10:51 AM, Steven Schlansker <
sschlans...@opentable.com> wrote:
>
> > On May 18, 2016, at 10:44 AM, haosdent
Cool stuff Andrew, thanks for sharing!
On Thu, Jun 2, 2016 at 11:50 AM, Andrew Spyker
wrote:
> FYI, based on the work others have done in the past, Netflix was able to
> get Mesos agent building and running on Raspberry Pi natively and under
> Docker containers.
+AlexR
On Mon, May 2, 2016 at 2:31 PM, Jeff Schroeder
wrote:
> Some frameworks like Aurora use custom executors to distribute the
> healthchecks with the tasks. This allows the task to survive a network
> partition without the scheduler setting it to TASK_LOST.
>
>
+user as an FYI
Going forward we'll push directly to these branches as backport decisions
are made. Since 0.28.x, 0.27.x, and 0.26.x have just been created, here is
what was already marked for these versions, that we'll have to cherry-pick:
The following need to be cherry-picked for 0.28.2:
guration options, I can see that there are two
> options --strict and --recover but their defaults looks good.
>
> On Fri, Apr 1, 2016 at 2:40 AM, Benjamin Mahler <bmah...@apache.org>
> wrote:
>
>> I'd recommend not using /tmp to store the meta-information because if
+1 (binding)
The following passes on OS X:
$ ./configure CC=clang CXX=clang++ --disable-python --disable-java
$ make check
On Tue, Apr 5, 2016 at 11:41 PM, Michael Park wrote:
> s/No changes from rc4/No changes from rc3/
> s/New fixes in rc5/New fixes in rc4/
>
> On 5 April
+1 (binding)
The following passes on OS X:
$ ./configure CC=clang CXX=clang++ --disable-python --disable-java
$ make check
On Tue, Apr 5, 2016 at 10:51 PM, Michael Park wrote:
> Hi all,
>
> Please vote on releasing the following candidate as Apache Mesos 0.24.2.
>
>
> 0.24.2
I'd recommend not using /tmp to store the meta-information because if there
is a tmpwatch it will remove things that we need for agent recovery. We
probably should change the default --work_dir, or require that the user
specify one.
It's expected that wiping the work directory will cause the
make check fails on OS X. Looks like we're missing the following:
commit 363b0b059bdc7742b2258a33ebfe430fd03f4311
Author: Kapil Arya
Date: Mon Jan 25 00:41:17 2016 -0500
Fixed non-linux build involving glog drop_log_meory flag.
The variable
make check fails on OS X. Looks like we're missing the following:
commit 363b0b059bdc7742b2258a33ebfe430fd03f4311
Author: Kapil Arya
Date: Mon Jan 25 00:41:17 2016 -0500
Fixed non-linux build involving glog drop_log_meory flag.
The variable
I'm seeing the following on OS X for the three RCs that were sent out:
$ ./configure CC=clang CXX=clang++ --disable-python --disable-java
...
$ make check -j7
...
./mesos-tests
dyld: Symbol not found: __ZN3fLB21FLAGS_drop_log_memoryE
Referenced from:
Also, I tagged https://issues.apache.org/jira/browse/MESOS-5021 with a fix
version of 0.26.1. Can you include it?
On Mon, Mar 21, 2016 at 1:59 PM, Benjamin Mahler <bmah...@apache.org> wrote:
> Yes it has existed for a long time but has only been discovered recently.
>
Also, I tagged https://issues.apache.org/jira/browse/MESOS-5021 with a fix
version of 0.25.1. Can you include it?
On Sat, Mar 19, 2016 at 6:33 AM, Michael Park wrote:
> As there are insufficient votes on this rc along with a request
> from Evan Krall to include additional
Also, I tagged https://issues.apache.org/jira/browse/MESOS-5021 with a fix
version of 0.24.2. Can you include it?
On Sat, Mar 19, 2016 at 6:30 AM, Michael Park wrote:
> As there are insufficient votes on this rc along with a request
> from Evan Krall to include additional
Thanks Jie, I've added a fix version of 0.28.1 to:
https://issues.apache.org/jira/browse/MESOS-5021
On Fri, Mar 18, 2016 at 5:52 PM, Jie Yu wrote:
> Hi,
>
> We recently noticed two bugs
>
a little
> as to what the consequences are?
>
> Thanks!
>
> MPark
>
> On 18 March 2016 at 16:20, Benjamin Mahler <bmah...@apache.org> wrote:
>
>> These are captured under:
>> https://issues.apache.org/jira/browse/MESOS-4979
>>
>> On
These are captured under:
https://issues.apache.org/jira/browse/MESOS-4979
On Thu, Mar 17, 2016 at 5:04 PM, Benjamin Mahler <bmah...@apache.org> wrote:
> Thanks for the hard work! Do we need to backport the rmdir fixes on the
> outstanding release candidates
+michael who is managing the release, he'll get back to you shortly,
apologies for the delay!
On Fri, Mar 11, 2016 at 11:35 AM, Evan Krall wrote:
> I humbly request that the fixes for these issues are also included in
> 0.24.2:
>
>
Interesting, why does it take down the slaves?
Because a lot of organizations run with swap disabled (e.g. for more
deterministic performance), we originally did not set the swap limit at
all. When we introduced the '--cgroups_limit_swap' flag we had to make it
default to false initially in case
Non-terminal states are gauges (instantaneous measurements) whereas the
terminal states are counters (always increasing, at least for the lifetime
of a master process).
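
To make the distinction concrete, here is a minimal sketch of how two metrics snapshots might be compared. The metric names in `COUNTERS` are assumptions used for illustration; everything not listed there is treated as a gauge. A counter that goes down is interpreted as a master restart, since counters only increase for the lifetime of a master process.

```python
# Illustrative assumption: which metric names are counters.
COUNTERS = {"master/tasks_finished", "master/tasks_failed"}


def counter_delta(previous, current):
    """Increase of a counter between two snapshots."""
    if current < previous:
        # The counter went backwards, so the master process restarted
        # and the counter started again from zero.
        return current
    return current - previous


def summarize(previous, current):
    """Gauges are reported as-is; counters as the change since `previous`."""
    summary = {}
    for name, value in current.items():
        if name in COUNTERS:
            summary[name] = counter_delta(previous.get(name, 0), value)
        else:
            summary[name] = value
    return summary
```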
Hopefully this image doesn't get stripped, but we improved the wording here
to clarify which are gauges and which are counters:
> I don't have the exit status. We haven't seen a repeat yet, will catch the
> exit status next time it happens.
>
> Yes, removing the metadata directory was the only way it was resolved.
> This happened on multiple hosts requiring the same resolution.
>
>
> On Thu, Feb 25, 20
the
>> detector.cpp:481 log line.
>> -The agents that continue to flap repaired with manual removal of
>> contents in mesos-slave's working dir
>>
>>
>>
>> On Wed, Feb 10, 2016 at 9:43 AM, Benjamin Mahler <bmah...@apache.org>
>> wrote:
>>
Hey Sharma,
I didn't quite follow the timeline of events here or how the agent logs you
posted fit into the timeline of events. Here's how I interpreted it:
-Agent running fine with 0.24.1
-Transient ZK issues, slave flapping with zookeeper_init failure
-ZK issue resolved
-Most agents stop flapping
could try some api generators like http://swagger.io/ or
> https://github.com/apidoc/apidoc
>
> On Tue, Feb 9, 2016 at 1:10 AM, Benjamin Mahler <bmah...@apache.org>
> wrote:
>
>> We now have endpoint documentation published on the website:
>>
>> http://mesos.apa
We now have endpoint documentation published on the website:
http://mesos.apache.org/documentation/latest/endpoints/
https://issues.apache.org/jira/browse/MESOS-3831
A big thank you goes out to Kevin Klues who made this happen, thanks also
goes out to Neil Conway for making the suggestion!
Our
Great! Is a blog post on the way?
On Sun, Jan 31, 2016 at 5:39 PM, Michael Park wrote:
> Hi all,
>
> The vote for Mesos 0.27.0 (rc2) has passed with the
> following votes.
>
> +1 (Binding)
> --
> Vinod Kone
> Joris Van Remoortere
> Till Toenshoff
>
Hi folks,
On behalf of the GPU working group [1] I'd like to share a design doc for
adding some initial support for GPU resources in Mesos:
JIRA Epic: https://issues.apache.org/jira/browse/MESOS-4424
Design Doc:
It's unlikely that a single response took 5 minutes for the master to
generate. It's more likely that the master was backlogged and it took the
majority of the 5 minutes for the backlog to be processed. For example, if
you have a number of webui instances open, they will each be polling
Hi Tom,
I suspect you may be tripping the following issue:
https://issues.apache.org/jira/browse/MESOS-4302
Please have a read through this and see if it applies here. You may also be
able to apply the fix to your cluster to see if that helps things.
Ben
On Wed, Jan 20, 2016 at 10:19 AM, Tom
I see that the following was filed:
https://issues.apache.org/jira/browse/MESOS-4477
But this sounds like a bug: if the master knows about the task during
reconciliation, labels should be sent. For example we had this same bug for
health information:
From the slave (now known as agent) logs:
I0114 14:09:51.297840 23049 slave.cpp:3967] Sending reconnect request to
executor
thermos-1452181970177-USER-prod-JOB_NAME-0-99a16851-42d6-4a52-b768-359b4f499ff3
of framework 20150930-134812-84017418-5050-29407-0001 at
executor(1)@NET.10:57730
I0114
>> isolation). Our initial proposal is not exposing details of GPU but
>> subsequently more detail of GPU resources like (topology, memory, core,
>> bandwidth etc.) will be exposed to do better job scheduling.
>>
>>
>>
>> As Ben indicated very soon we will se
There is a design proposal coming that will include guidance around using
GPUs and better GPU support in mesos, so stay tuned.
Mesos supports adding arbitrary resources, e.g.
--resources=cpus(*):4;gpus(*):4
Mesos will then manage a scalar "gpu" resource with a value of 4. This
means "gpu"
ce
> 0.20 AFAIK.
> - There is a simple workaround.
>
> Bernd
>
> On Dec 10, 2015, at 3:05 AM, Benjamin Mahler <benjamin.mah...@gmail.com>
> wrote:
>
> I'd really like to pull in the fix for:
> https://issues.apache.org/jira/browse/MESOS-4106
>
> This has been a long s
, Dec 10, 2015 at 11:22 AM, Benjamin Mahler <benjamin.mah...@gmail.com
> wrote:
> What is the workaround?
>
> On Thu, Dec 10, 2015 at 4:37 AM, Bernd Mathiske <be...@mesosphere.io>
> wrote:
>
>> I think that whereas this would clearly be a desirable bug fi
I'd really like to pull in the fix for:
https://issues.apache.org/jira/browse/MESOS-4106
This has been a long standing bug that makes the health checking not
function correctly some of the time. While it is rare in CI, it appeared in
a colleague's cluster for about a third of the tasks he was
Great to hear Olivier, would you like to be added to the powered by mesos
list?
https://github.com/apache/mesos/blob/master/docs/powered-by-mesos.md
On Tue, Dec 8, 2015 at 1:04 PM, Arunabha Ghosh
wrote:
> Welcome to the community, Oliver.
>
> On Tue, Dec 8, 2015 at 5:49