Re: Disk Capacity Detection

2015-07-23 Thread Joseph Wu
Chris, There is a workaround. If you wrap up your storage devices prior to starting Mesos, then you can transparently use multiple disks as a single disk. See [1]. ~Joseph [1] https://www.mail-archive.com/user@mesos.apache.org/msg01726.html On Thu, Jul 23, 2015 at 2:31 PM, Jie Yu wrote: > C

Re: Reservations for multiple different agents

2015-09-28 Thread Joseph Wu
Hi Rinaldo, I'd like to point out a small error in your ACLs. If you want to specify "ANY", you should set the "type" field. i.e. For the RegisterFramework ACL: "register_frameworks": [ { "principals": { "values": "mesos-mach5-beta" }, "roles": { "type": 1 } } ] The ANY "type" is pa

Re: Reservations for multiple different agents

2015-09-29 Thread Joseph Wu
rization: Basic bWVzb3MtbWFjaDUtYmV0YTpwYXNzd29yZA==" That base64 blurb is the encoded version of "mesos-mach5-beta:password". ~Joseph On Mon, Sep 28, 2015 at 8:25 PM, DiGiorgio, Mr. Rinaldo S. < rdigior...@pace.edu> wrote: > > On Sep 28, 2015, at 8:03 PM, Joseph Wu wrote: > >
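A minimal sketch of how the header value in the preview above can be reproduced. The function name `basic_auth_header` is hypothetical; the principal and secret are the ones quoted in the message, and the encoding is plain HTTP Basic authentication (base64 of "principal:secret").

```python
import base64


def basic_auth_header(principal: str, secret: str) -> str:
    """Build an HTTP Basic auth header line from a principal and a secret.

    Hypothetical helper for illustration; not part of Mesos.
    """
    # Basic auth encodes "principal:secret" as base64.
    raw = f"{principal}:{secret}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Authorization: Basic {token}"


header = basic_auth_header("mesos-mach5-beta", "password")
print(header)
```

Decoding the token with `base64.b64decode` is a quick way to check which credentials a request is actually sending.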

Re: Viewing old versions of docs

2015-10-02 Thread Joseph Wu
Hi Alan, I don't think it's recommended to refer to older versions of the docs. But if you absolutely need to, you can find those by browsing the source. Take the version of Mesos you're looking for, and substitute it for "" below: https://github.com/apache/mesos/blob//docs/ i.e. For the most r

Re: Web UI Memory Usage in Firefox

2015-12-02 Thread Joseph Wu
Hi John, I wonder if this is just an issue with how Firefox does garbage collection. Can you try navigating to about:memory and clicking the "GC" button? The web UI definitely should not need that much memory. ~Joseph On Wed, Dec 2, 2015 at 9:07 AM, John Omernik wrote: > I am cross posting

[Proposal] Unified logging for containerizers & the external containerizer

2015-12-11 Thread Joseph Wu
Hello All, As part of the work on managing the logs for executors and tasks, we're introducing a "ContainerLogger" module. This module will allow the stdout/stderr of executors and tasks to be managed or redirected. (Existing executor/task logs are written to plain files.) For example: - The

Re: Downloading s3 uris

2016-02-26 Thread Joseph Wu
The sandbox directory structure is a bit deep... See the "Where is the sandbox?" section here: http://mesos.apache.org/documentation/latest/sandbox/ On Fri, Feb 26, 2016 at 10:15 AM, Aaron Carey wrote: > A second question for you all.. > > I'm testing http uri downloads, and all the logs say t

Re: How did the mesos master detect the disconnect of a framework (scheduler)

2016-02-26 Thread Joseph Wu
Here's a brief(?) run-down: 1. https://github.com/apache/mesos/blob/4376803007446b949840d53945547d8a61b91339/src/master/master.cpp#L5739-L5748 When a new framework is added, the master opens a socket conn

Re: [VOTE] Release Apache Mesos 0.28.0 (rc1)

2016-03-08 Thread Joseph Wu
If we're re-cutting the release, can we also add this fix for maintenance? (still under review) https://reviews.apache.org/r/44258/ On Tue, Mar 8, 2016 at 2:43 PM, Kevin Klues wrote: > Here are the list of reviews/patches that have been called out in this > thread for inclusion in 0.28.0-rc2. S

Re: How to manage maintenance windows?

2016-03-14 Thread Joseph Wu
Managing maintenance is currently up to the operator (you, presumably). If you have something to contribute (code, docs, or examples), that would be greatly appreciated :) We haven't prioritized other integration (like the CLI or web UI) since maintenance primitives themselves need to be supporte

Re: [RESULT][VOTE] Release Apache Mesos 0.27.2 (rc1)

2016-03-18 Thread Joseph Wu
Cong Wang, The tags are sync'd. See: https://github.com/apache/mesos/releases You might not have done: git pull --tags On Wed, Mar 16, 2016 at 11:49 AM, Cong Wang wrote: > On Mon, Mar 7, 2016 at 8:29 PM, Michael Park wrote: > > Please find the release at: > > https://dist.apache.org/repos/di

Re: HTTP API

2016-03-19 Thread Joseph Wu
Zameer, In case you haven't seen this already, there is already a Java-based scheduler driver for the HTTP API here: https://github.com/mesosphere/mesos-rxjava On Thu, Mar 17, 2016 at 5:26 PM, Zameer Manji wrote: > > On Thu, Mar 17, 2016 at 10:03 AM, Vinod Kone wrote: > >> Other than the issu

Re: Offers to a framework

2016-05-02 Thread Joseph Wu
Both the Mesos master and Marathon have metrics that tell you how many offers have been sent, but not the contents of said offers. Marathon does not keep offers long enough for them to show up as "outstanding offers" in the Mesos UI. As far as I know, one way to get the offer contents is by setti

Re: Dynamic scaling of DCOS slave

2016-05-03 Thread Joseph Wu
You should be able to add as many agents as you like via manual installation. The list of agents you supply in the genconf is used for the automated install processes. For manual installation, the list of agents is inconsequential (since you, the operator, are SSH-ing into each box). See: https:/

Re: Marathon GUI -Scaling An Application Issue.

2016-05-09 Thread Joseph Wu
Can you check the following? - Are offers being sent to Marathon from both agents? This will show up in the master logs, with at least INFO level logging (default). - Do the resources from your second agent actually satisfy your container's constraints? It would help to see your Mar

Re: Enable s3a for fetcher

2016-05-10 Thread Joseph Wu
Mesos does not explicitly support HDFS and S3. Rather, Mesos will assume you have a hadoop binary and use it (blindly) for certain types of URIs. If the hadoop binary is not present, the mesos-fetcher will fail to fetch your HDFS or S3 URIs. Mesos does not ship/package hadoop, so these URIs are n

Re: Enable s3a for fetcher

2016-05-10 Thread Joseph Wu
configuration/> for how to > configure the slave with a path to the Hadoop client. [emphasis added] > > What you are saying is that dcos simply wont install hadoop on agents? > > Next question then: will you be nerfing fetcher.cpp, or will I be able to > install hadoop on th

Re: Marathon MySQL and Wordpress Deployment

2016-05-12 Thread Joseph Wu
You may want to elaborate on what exactly you want to do. But yes. Command tasks just run commands. As long as the syntax is correct (i.e. can you run it locally?), it will run. On Wed, May 11, 2016 at 10:13 PM, wrote: > Hi, > > Would like to know can we deploy MySQL and Wordpress together throu

Re: Cannot pull from private docker v1 registry

2016-05-18 Thread Joseph Wu
The stderr you posted suggests that Mesos successfully fetched your .dockercfg. If the following docker pull fails, there should be additional logs printed either in the Mesos agent logs, or in the task stderr. Can you check those as well? (And post them here.) On Wed, May 18, 2016 at 2:29 PM,

Re: Completed tasks logs missing from mesos UI

2016-05-25 Thread Joseph Wu
Can you check that the ExecutorID and AgentID (actually SlaveID) match the path where you found the logs on your box? See the diagram here for which ID each part of the path corresponds to: http://mesos.apache.org/documentation/latest/sandbox/#where-is-it On Wed, May 25, 2016 at 3:04 AM, shakeel w

Re: 1.0 Release Candidate

2016-05-25 Thread Joseph Wu
I'm guessing you mean the "medium term" bullet point on the Roadmap ( https://cwiki.apache.org/confluence/display/MESOS/Roadmap): > >- Deprecate Docker containerizer (in favor of Unified containerizer w/ >Docker support) > > This was never meant to be done as part of the 1.0 release. I'm

Re: Benign 'Shutdown failed on fd' error messages

2016-05-27 Thread Joseph Wu
This log line is part of some socket cleanup Mesos performs for all sockets. Mesos calls the "shutdown" syscall on the socket: http://man7.org/linux/man-pages/man2/shutdown.2.html This part of the log line: > Transport endpoint is not connected comes from the *ENOTCONN* error code. We generally
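A minimal sketch of the condition described in the preview above: calling shutdown on a socket that is not connected fails with ENOTCONN. This uses Python's standard `socket` module rather than Mesos code, and the function name is hypothetical.

```python
import errno
import os
import socket


def shutdown_unconnected_socket() -> int:
    """Call shutdown() on a TCP socket that was never connected.

    Returns the errno reported by the failed call, or 0 if it succeeded.
    Hypothetical helper for illustration; not part of Mesos.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.shutdown(socket.SHUT_RDWR)
    except OSError as error:
        return error.errno
    finally:
        sock.close()
    return 0


code = shutdown_unconnected_socket()
# os.strerror() gives the platform's human-readable text for the error code.
print(errno.errorcode.get(code), os.strerror(code))
```

The same error appears when the peer has already closed the connection, which is why the log line is usually harmless.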

Re: Mesos 0.28.2 does not start

2016-06-10 Thread Joseph Wu
I'm guessing you mis-configured a master flag/environment-variable somewhere. Since it looks like you're using systemd, can you run this command, "journalctl -xe", right after you try to start the service? That should show you more info on why your master aborted. On Fri, Jun 10, 2016 at 8:50

Re: Mesos 0.28.2 does not start

2016-06-10 Thread Joseph Wu
The log directory is based on your configuration. See the master config section here: http://mesos.apache.org/documentation/latest/configuration/ If you've set the --log_dir flag, you'll find your logs there. Otherwise, the logs will be in stderr. If you launched the master via a systemd service

Re: Mesos 0.28.2 does not start

2016-06-10 Thread Joseph Wu
alocal systemd[1]: *mesos-master.service: main > process exited, code=killed, status=6/ABRT* > > giu 09 23:26:15 master.novalocal systemd[1]: *Unit mesos-master.service > entered failed state.* > > giu 09 23:26:15 master.novalocal systemd[1]: *mesos-master.service > failed.* > > g

Re: Master slow to process status updates after massive killing of tasks?

2016-06-17 Thread Joseph Wu
A couple questions about your test: - By "killed off", were your agents killed permanently (i.e. powered off) or temporarily (i.e. network partition)? And how long were your agents killed/down during the test? - How many of the 1000 accidentally killed tasks were running on your killed-off agents

Re: Master slow to process status updates after massive killing of tasks?

2016-06-20 Thread Joseph Wu
w things down is communication with ZooKeeper, which is a > possibility. > - Singularity calls reconcileTasks() every 10 minutes. How often would you > expect to see that log line? At the worst point, we saw it printed 637 > times in one minute in the master logs. > > Thanks, >

Re: Setting up SSL for mesos

2016-07-06 Thread Joseph Wu
If you can see the WebUI via HTTP, without downgrade support, you might be inadvertently running a different version of Mesos than the one you built. You can quickly sanity check this by removing either SSL_KEY_FILE or SSL_CERT_FILE and starting your master. If your build has SSL support, it shou

Re: Setting up SSL for mesos

2016-07-07 Thread Joseph Wu
mesos not > connecting with the environment variable I set for some reason? > > > > On Wed, Jul 6, 2016 at 2:20 PM, Joseph Wu wrote: > >> If you can see the WebUI via HTTP, without downgrade support, you might >> be inadvertently running a different version of Mesos than th

Re: Setting up SSL for mesos

2016-07-07 Thread Joseph Wu
ibssl.so.1.0.0 > > So I am missing the libssl3.so line. Is that another package I need to > install as a prerequisite? In case it's relevant, I'm running ubuntu. > > On Thu, Jul 7, 2016 at 1:14 PM, Joseph Wu wrote: > >> Can you double-check if your master is link

Re: Windows Build

2016-07-11 Thread Joseph Wu
There are some instructions here: https://github.com/apache/mesos/blob/master/docs/getting-started.md#building-mesos-windows When the website's update is pushed, the instructions will show up here too: http://mesos.apache.org/gettingstarted/ On Sat, Jul 9, 2016 at 3:43 PM, Artem Harutyunyan wrot

Re: mesos/dcos user issue?

2016-07-13 Thread Joseph Wu
Looks like you solved your problem: > either remove the "USER" statement or add the user locally on the mesos agent machines You can't run as a user that doesn't exist :) On Wed, Jul 13, 2016 at 7:18 AM, Clarke, Trevor wrote: > I've got an image with a local user and a 'USER myuser' statement

Re: Mesos fine-grained multi-user mode failed to allocate tasks

2016-07-13 Thread Joseph Wu
Looks like you're running Spark in "fine-grained" mode (deprecated). (The Spark website appears to be down right now, so here's the doc on Github:) https://github.com/apache/spark/blob/master/docs/running-on-mesos.md#fine-grained-deprecated Note that while Spark tasks in fine-grained will relinqu

Re: Windows Build on Jenkins almost working

2016-07-15 Thread Joseph Wu
A few notes: * Lowering the number of warnings is on our TODO list. Currently, seeing 1000's of warnings is fairly common :( * The windows build does not work if your files have Unix-style line endings. If you use Git on Windows, you should run: git config core.autocrlf true * The CMake warnings

Re: What will happen in maintenance mode

2016-07-18 Thread Joseph Wu
My guess is that your agents don't match the machines you specified. Note: The maintenance endpoints in Mesos allow you to specify maintenance against non-existent machines, because the operator may add agents on those machines in future. In Mesos' maintenance primitives, a "machine" is a hostnam

Re: Does a executing task has a expiry time?

2016-07-18 Thread Joseph Wu
The behavior and lifetime of a task is up to the executor (which is, in turn, controlled by the framework; which is decided by the operator). The default command executor does not have any timeouts for running tasks. On Mon, Jul 18, 2016 at 2:59 AM, Bryan Fok wrote: > Hi all > > Does a executin

Re: mesos crash

2016-07-19 Thread Joseph Wu
When you start a new group of masters, the masters will not initialize their replicated log (from the EMPTY state) until all masters are present. This means (quorum * 2 - 1) masters must be up and reachable. We enforce this behavior because the replicated log can get into an inconsistent state othe
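The arithmetic in the preview above, as a small sketch. The function name is hypothetical; the formula (quorum * 2 - 1) is the one stated in the message.

```python
def masters_required_for_init(quorum: int) -> int:
    """Masters that must be up before an EMPTY replicated log initializes.

    Applies the (quorum * 2 - 1) rule quoted in the message.
    Hypothetical helper for illustration; not part of Mesos.
    """
    if quorum < 1:
        raise ValueError("quorum must be at least 1")
    return quorum * 2 - 1


for quorum in (1, 2, 3):
    print(quorum, masters_required_for_init(quorum))
```

For example, a quorum of 2 means 3 masters must be reachable on first start.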

Re: mesos crash

2016-07-21 Thread Joseph Wu
tryEE8onFailedISt5_BindIFSt7_Mem_fnIMS4_FbRKSsEES4_St12_PlaceholderILi1bEERKS4_OT_NS4_6PreferEENUlS9_E_clES9_ > @ 0x7fb9b6452752 > > _ZNSt17_Function_handlerIFvRKSsEZNK7process6FutureIN5mesos8internal8RegistryEE8onFailedISt5_BindIFSt7_Mem_fnIMS8_FbS1_EES8_St12_PlaceholderILi1bE

Re: What will happen in maintenance mode

2016-07-25 Thread Joseph Wu
t above is not correct, as if I only specific >> the hostname or IP, it will NOT take effect for the maintenance agents. >> but should specific both will OK. >> >> On 2016-07-19 02:17, Joseph Wu wrote:

Re: Attributes cause agent to fail

2016-07-29 Thread Joseph Wu
Works fine for me. Make sure the agent isn't just complaining about invalid flags. i.e. This is invalid: --attributes="something" This is valid: --attributes="something:foo" --attributes="something:foo; nothing:bar" And make sure your agent's work directory doesn't contain info from an agent st

Re: Using mesos' cfs limits on a docker container?

2016-08-13 Thread Joseph Wu
If you're not against running Docker containers without the Docker daemon, try using the Unified containerizer. See the latter half of this document: http://mesos.apache.org/documentation/latest/mesos-containerizer/ On Sat, Aug 13, 2016 at 7:02 PM, Mark Hammons wrote: > Hi All, > > > > I was hav

Re: Marathon constantly unregisters on particular slaves

2016-08-24 Thread Joseph Wu
> > Scenario is: > * Marathon registers on slave, > Why is Marathon registering on the agent? This shouldn't even be possible, as frameworks must talk to the master. Marathon dies on two of them constantly. How are you starting Marathon? Via some init service? And are you starting Marathon on

Re: can we use mesos and spark with consul or etcd?

2016-08-25 Thread Joseph Wu
There's a bit of ongoing work on decoupling ZK from Mesos, but this is still some way off. See this epic: https://issues.apache.org/jira/browse/MESOS-1806 and its children. Most likely, you'll run into headaches regardless of ZK/Consul/Etcd. All have their own set of quirks. (i.e. the grass is al

Re: what is the status on this?

2016-08-25 Thread Joseph Wu
There is no timeline as no one has done any work on the issue. On Thu, Aug 25, 2016 at 4:54 PM, kant kodali wrote: > Hi Guys, > > I see this ticket and other related tickets should be part of sprints in > 2015 and it is still not resolved yet. can we have a timeline on this? This > would be real

Re: Is this a CI system for is it a development system

2016-08-30 Thread Joseph Wu
The Windows CI can be found here: https://builds.apache.org/job/Mesos-Windows/ On Tue, Aug 30, 2016 at 7:14 AM, Alexander Rojas wrote: > Didn’t take it as such, I’m just trying to help you as good as I can :) > > > On 30 Aug 2016, at 16:11, DiGiorgio, Mr. Rinaldo S. > wrote: > > > On Aug 30, 20

Re: Mesos 1.0 WebUI does not display cluster name

2016-08-30 Thread Joseph Wu
Try "--cluster" instead of "—cluster". On Tue, Aug 30, 2016 at 2:01 PM, Haripriya Ayyalasomayajula < aharipriy...@gmail.com> wrote: > Hi all, > > > Mesos Web UI does not display the name of the cluster. > > I have a config file named cluster under /etc/mesos-master/ along with > other configurati

Re: Failed to shutdown socket

2016-09-06 Thread Joseph Wu
You can ignore that log line. It's something Mesos prints when the client side of a socket closes the socket before Mesos does. We've kept the log line thus far because it can be surprisingly insightful when tracking down things like FD leaks based on logs alone :) On Mon, Sep 5, 2016 at 4:26 AM

Re: Failed to shutdown socket

2016-09-06 Thread Joseph Wu
ignore it > versus a leak? > > > Thanks, > June Taylor > System Administrator, Minnesota Population Center > University of Minnesota > > On Tue, Sep 6, 2016 at 1:07 PM, Joseph Wu wrote: > >> You can ignore that log line. It's something Mesos prints when

Re: what is the status on this?

2016-09-06 Thread Joseph Wu
> mesos and consul working together in which we would be ready to jump at it >>>> and make a switch for YARN to Mesos. >>>> >>>> Thanks, >>>> Kant >>>> >>>> >>>> >>>> >>>> On Wed, Aug 31, 201

0.28.3 release dashboard!

2016-11-03 Thread Joseph Wu
Hi everyone! Anand and I will be the Release Managers for 0.28.3! We are planning to cut this patch release within three workdays - that would be around Monday next week. So, if you have any patches that need to get into 0.28.3 make sure that either it is already in the 0.28.x branch or the corre

Re: framework failover

2016-11-04 Thread Joseph Wu
A couple questions/notes: What do you mean by: > the system will deploy the framework on a new node within less than three > minutes. Are you running your frameworks via Marathon? How are you terminating the Mesos Agent? If you send a `kill -SIGUSR1`, the agent will immediately kill all of its

Re: 0.28.3 release dashboard!

2016-11-07 Thread Joseph Wu
Thanks for the suggestions Benjamin! I've re-purposed one of the dashboard queries to track "Issues affecting 0.28.x that are resolved in versions later than 0.28". https://issues.apache.org/jira/issues/?filter=12338701 ^ That will show up on the dashboard too. There are 26 issues in that list, w

Re: framework failover

2016-11-07 Thread Joseph Wu
> > > // Setup recovery timer. > > delay(ALLOCATION_HOLD_OFF_RECOVERY_TIMEOUT, self(), &Self::resume); > > > > // NOTE: `quotaRoleSorter` is updated implicitly in `setQuota()`. > > foreachpair (const string& role, const Quota& quota, quotas) { > >

Re: framework failover

2016-11-08 Thread Joseph Wu
we > have tried both with and without marathon health checks. No affect. > > > > Ø If you lose the Mesos agent forever, the master tells Marathon that > tasks are lost, but not the corresponding frameworks. > > This is interesting, so the loss off of framework in the fail

Re: 答复: Mesos Documentation Project

2016-11-09 Thread Joseph Wu
> My name is James Neiman. I have been working with Benjamin Hindman, Artem > Harutyunyan, Neil Conway, and Joseph Wu on improving the Mesos > documentation. We now have a proposal for the community to critique. > > Our goal is to satisfy the needs of Operators, Developers, and Contribu

Re: Mesos Docker logs

2016-11-10 Thread Joseph Wu
You can think of Mesos sandbox logs like: docker run ... > $MESOS_SANDBOX/stdout 2> $MESOS_SANDBOX/stderr If `docker stop` happens to interrupt the above streams, you will lose some lines. The behavior here seems to depend heavily on the version of Docker you happen to be using. On Thu, Nov 1

Re: [CNI] Proxying connections from outside the cluster to tasks with IP from host-local IPAM

2016-11-14 Thread Joseph Wu
There are a variety of setups out there. For example, DC/OS has a variety of components supporting service-discovery/load-balancing: https://dcos.io/docs/1.8/usage/service-discovery/ I'm sure others in the community can provide other examples. On Sun, Nov 13, 2016 at 7:12 AM, Frank Scholten wro

Re: Mesos V1 Operator HTTP API - Java Proto Classes

2016-11-16 Thread Joseph Wu
Added. Welcome to the contributors list :) On Wed, Nov 16, 2016 at 9:49 AM, Vijay Srinivasaraghavan < vijikar...@yahoo.com> wrote: > I have created a JIRA and will submit a patch. Could someone please add me > to the contributor list as I am not able to assign the JIRA to myself? > > https://iss

Re: [VOTE] Release Apache Mesos 0.28.3 (rc1)

2016-11-29 Thread Joseph Wu
AlexR, Thanks for pointing out those test failures. As of 0.28, the LinuxFilesystemIsolatorTests were notoriously flaky on distributions with "large" root filesystems. The test would essentially copy the root filesystem, leading to timeouts in multiple places in the tests. CentOS 7 was known to

Re: High Availability Mesos and Zookeeper Security

2017-01-04 Thread Joseph Wu
Enabling SSL on Zookeeper will likely not work, as the Zookeeper C library (which Mesos uses to talk to Zookeeper) does not contain any concept of SSL. If they added SSL support to the C library in that alpha version, you would need to bump the library in the Mesos code and rebuild, possibly with

Re: Sigkill while running mesos agent (1.0.1) in docker

2017-01-12 Thread Joseph Wu
If Apache JIRA were up, I'd point you to a JIRA noting the problem with naming docker containers `mesos-*`, as Mesos reserves that prefix (and kills everything it considers "unknown"). As a quick workaround, try setting this flag to false: https://github.com/apache/mesos/blob/1.1.x/src/slave/flags

Re: Understanding Mesos Maintenance

2017-03-03 Thread Joseph Wu
Inverse offers have the same offer cycle as normal offers. They can be Accepted/Declined with a timeout (default 5 seconds). On Fri, Mar 3, 2017 at 5:29 PM, Zameer Manji wrote: > Ben, > > Thanks for responding to my questions. I have a follow up on #3. > > I have a framework which accepts invers

Re: [VOTE] Release Apache Mesos 1.2.0 (rc2)

2017-03-07 Thread Joseph Wu
+1 (binding) Deployed on a small-ish test cluster for about a week. Monitoring of that test cluster has not caught any problems with Mesos. Also confirmed that this SSL socket FD leak does not affect Mesos, except in tests: https://issues.apache.org/jira/browse/MESOS-6919 On Mon, Mar 6, 2017 at

Re: Mesos fetcher error when running as non-root user

2017-04-26 Thread Joseph Wu
There was a change in 1.2.0 which changed how the fetcher would chown the sandbox: https://issues.apache.org/jira/browse/MESOS-5218 Prior to 1.2, when the fetcher ran, it would recursively chown the entire sandbox to the given user. This was incorrect behavior, since the Mesos agent will create t

Re: Mesos Executor Failing

2017-05-19 Thread Joseph Wu
What version of Mesos are you using? (Just based on the word "slave" in that error message, I'm guessing 0.28 or older.) The "Failed to synchronize" error is something that can occur while the agent is launching the executor. During the launch, the agent will create a pipe to the executor subpro

Re: Mesos Executor Failing

2017-05-24 Thread Joseph Wu
g? > > Regards > Sumit Chawla > > > On Fri, May 19, 2017 at 2:31 PM, Joseph Wu wrote: > >> What version of Mesos are you using? (Just based on the word "slave" in >> that error message, I'm guessing 0.28 or older.) >> >> The "Failed

Re: Custom isolators - External container

2017-08-07 Thread Joseph Wu
First off, the external containerizer was officially removed in Mesos 1.1.0 (it had been deprecated long before that release): https://issues.apache.org/jira/browse/MESOS-3370 --- If you want to develop/deploy a new isolation method for Mesos, you should first consider writing isolator modules (M

[Design Doc] Standalone Container API

2017-08-07 Thread Joseph Wu
As part of work to improve storage support in Mesos [1], we will be adding the ability to launch containers via the Mesos Containerizer, without going through the traditional method (i.e. framework -> offer cycle -> launch executor/task -> status updates -> etc). Below I've linked a short design d

Re: Mesos containerizer with marathon

2017-10-13 Thread Joseph Wu
A quick modification to try... Replace the container type: "container": { > "type": "DOCKER", > With this: "container": { "type": "MESOS", That will tell Marathon to use the Mesos containerizer, rather than the Docker containerizer. On Fri, Oct 13, 2017 at 2:38 PM, Marc Roos wrote:

Welcome Andrew Schwartzmeyer as a new committer and PMC member!

2017-11-27 Thread Joseph Wu
Hi devs & users, I'm happy to announce that Andrew Schwartzmeyer has become a new committer and member of the PMC for the Apache Mesos project. Please join me in congratulating him! Andrew has been an active contributor to Mesos for about a year. He has been the primary contributor behind our e

Re: java driver/shutdown call

2018-01-16 Thread Joseph Wu
If a framework launches tasks, then it will use an executor. Mesos provides a "default" executor if the framework doesn't explicitly specify an executor. (And the Shutdown call will work with that default executor.) On Tue, Jan 16, 2018 at 4:49 PM, Mohit Jaggi wrote: > Gotcha. Another question

Re: Maintenance document

2018-04-16 Thread Joseph Wu
Unscheduling maintenance is equivalent to POST-ing to /maintenance/schedule with node(s) removed from the existing schedule. Inverse offers are more of a side-effect of scheduling maintenance. These notify frameworks of any scheduled maintenance, but do not directly influence the states (Up/Down/
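A rough sketch of the "remove the node, then re-POST the schedule" idea from the preview above. It only builds the updated schedule; it does not contact a master. The field names (`windows`, `machine_ids`, `hostname`, `ip`, `unavailability`) and the sample hosts are assumptions for illustration, so check them against the schedule your cluster actually returns.

```python
import copy


def unschedule(schedule: dict, hostname: str) -> dict:
    """Return a copy of `schedule` with one machine removed.

    Hypothetical helper; the schedule shape used here is an assumption.
    Windows left with no machines are dropped entirely.
    """
    updated = copy.deepcopy(schedule)
    remaining_windows = []
    for window in updated.get("windows", []):
        window["machine_ids"] = [
            machine
            for machine in window.get("machine_ids", [])
            if machine.get("hostname") != hostname
        ]
        if window["machine_ids"]:
            remaining_windows.append(window)
    updated["windows"] = remaining_windows
    return updated


# Made-up example schedule with two machines in one window.
current = {
    "windows": [
        {
            "machine_ids": [
                {"hostname": "agent-1", "ip": "10.0.0.1"},
                {"hostname": "agent-2", "ip": "10.0.0.2"},
            ],
            "unavailability": {"start": {"nanoseconds": 0}},
        }
    ]
}
new_schedule = unschedule(current, "agent-1")
print(new_schedule)
```

The result would then be sent as the body of the POST to /maintenance/schedule that the message describes.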

Re: Resource offers - DRF - Mesos

2018-05-22 Thread Joseph Wu
1) DRF is based on the _current_ allocation of resources (from the master's perspective) rather than a historical allocation of resources. 2) So when a new cluster is started, all frameworks will have a current allocation of 0. And assuming all else (like quotas, roles, and weights) are equivalen
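A simplified sketch of the point made in the preview above: DRF compares frameworks by their current dominant share, so on a fresh cluster every framework starts at 0. This is a textbook-style illustration, not the Mesos allocator; it ignores quotas, roles, and weights, and breaking ties by framework name is an assumption made here for determinism.

```python
def dominant_share(allocation: dict, total: dict) -> float:
    """Largest fraction of any single resource currently held."""
    shares = [
        allocation.get(name, 0.0) / amount
        for name, amount in total.items()
        if amount > 0
    ]
    return max(shares, default=0.0)


def next_framework(allocations: dict, total: dict) -> str:
    """Pick the framework with the lowest current dominant share.

    Ties go to the alphabetically first name (an assumption, see above).
    """
    return min(
        sorted(allocations),
        key=lambda name: dominant_share(allocations[name], total),
    )


total = {"cpus": 10.0, "mem": 100.0}
fresh = {"a": {}, "b": {}}
busy = {"a": {"cpus": 2.0, "mem": 10.0}, "b": {"cpus": 1.0, "mem": 30.0}}
print(next_framework(fresh, total), next_framework(busy, total))
```

In the `busy` case framework "a" has a dominant share of 0.2 (cpus) and "b" has 0.3 (mem), so "a" would be offered resources next; history before the current allocation plays no part.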

Re: Install target missing for CMake builds

2018-09-06 Thread Joseph Wu
We have not (yet) implemented the install target for the CMake build. The target does exist if you use the automake build however. On Thu, Sep 6, 2018 at 9:36 AM, Junker, Gregory wrote: > Hi > > I am trying to build Mesos on Linux (Ubuntu 18.04) using CMake (Makefile > generator) and following

[API WG] Proposals for dealing with master subscriber leaks.

2018-11-09 Thread Joseph Wu
Hi all, During some internal scale testing, we noticed that, when Mesos streaming endpoints are accessed via certain proxies (or load balancers), the proxies might not close connections after they are complete. For the Mesos master, which only has the /api/v1 SUBSCRIBE streaming endpoint, this ca

Re: [API WG] Proposals for dealing with master subscriber leaks.

2018-11-14 Thread Joseph Wu
e connection (request and response are infinite and heartbeating) by > default. Splitting into a separate call is messy and shouldn't be what we > force everyone to do, it should only be done in cases that it's impossible > to use a single connection (e.g. browsers). > >

Re: Check failed: reservationScalarQuantities.contains(role)

2019-02-05 Thread Joseph Wu
From the stack, it looks like the master is attempting to remove an agent from the master's in-memory state. In the master's logs you should find a line shortly before the exit, like: master.cpp:] Removed agent : The agent's ID should at least give you some pointer to which agent is causi

Re: How to parse -v docker flags

2019-02-13 Thread Joseph Wu
Since you are using the Mesos containerizer, docker will not be part of the equation (even if you are using a docker image). By the looks of it, you are trying to mount a specific volume ("data") provided by the docker volume driver. In which case, you'll want to take a look at this documentation

Re: Failed to accept socket: Failed accept: connection error: error:1407609C:SSL routines:SSL23_GET_CLIENT_HELLO:http request

2019-02-20 Thread Joseph Wu
The "SSL routines:SSL23_GET_CLIENT_HELLO:http request" is OpenSSL's cryptic way of saying the client is using HTTP to talk to an HTTPS server. Since you've disabled LIBPROCESS_SSL_SUPPORT_DOWNGRADE, the error should be expected. On Wed, Feb 20, 2019 at 2:06 PM Marc Roos wrote: > > > Why am I ge

Re: [VOTE] Release Apache Mesos 1.8.0 (rc2)

2019-04-23 Thread Joseph Wu
-1 (binding) We found a serious bug when upgrading from 1.7.x to 1.8.x, which prevents agents from reregistering after upgrading the masters: https://issues.apache.org/jira/browse/MESOS-9740 On Tue, Apr 23, 2019 at 8:27 AM Andrei Budnik wrote: > +1 > > sudo make -j16 distcheck > DISTCHECK_CONFI

Design doc: Agent draining and deprecation of maintenance primitives

2019-05-29 Thread Joseph Wu
Hi all, A few years back, we added some constructs called maintenance primitives to Mesos. This feature was meant to allow operators and frameworks to cooperate in draining tasks off nodes scheduled for maintenance. As far as we've observed since, this feature never achieved enough adoption to b

Re: Design doc: Agent draining and deprecation of maintenance primitives

2019-05-30 Thread Joseph Wu
As far as I can tell, the document is public. On Thu, May 30, 2019 at 12:22 AM Marc Roos wrote: > > Is the doc not public? > > > -Original Message- > From: Joseph Wu [mailto:jos...@mesosphere.io] > Sent: Thursday, 30 May 2019 2:07 > To: dev; user > Subject: