Re: [VOTE] Release Apache Mesos 1.4.1 (rc1)

2017-11-14 Thread Anand Mazumdar
ache.org/view/M-R/view/Mesos/job/Mesos-Rel
>> ease/43/BUILDTOOL=cmake,COMPILER=clang,CONFIGURATION=--
>> verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A1
>> 4.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
>>
>>
>> On Thu, Nov 9, 2017 at 6:27 PM, Kapil Arya <ka...@mesosphere.io> wrote:
>>
>> > Hi all,
>> >
>> > Please vote on releasing the following candidate as Apache Mesos 1.4.1.
>> >
>> > 1.4.1 includes the following:
>> >
>> > * [MESOS-7873] - Expose `ExecutorInfo.ContainerInfo.NetworkInfo` in Mesos
>> >   `state` endpoint.
>> > * [MESOS-7921] - ProcessManager::resume sometimes crashes accessing
>> >   EventQueue.
>> > * [MESOS-7964] - Heavy-duty GC makes the agent unresponsive.
>> > * [MESOS-7968] - Handle `/proc/self/ns/pid_for_children` when parsing
>> >   available namespace.
>> > * [MESOS-7969] - Handle cgroups v2 hierarchy when parsing
>> >   /proc/self/cgroups.
>> > * [MESOS-7980] - Stout fails to compile with libc >= 2.26.
>> > * [MESOS-8051] - Killing TASK_GROUP fail to kill some tasks.
>> > * [MESOS-8080] - The default executor does not propagate missing task exit
>> >   status correctly.
>> > * [MESOS-8090] - Mesos 1.4.0 crashes with 1.3.x agent with oversubscription
>> > * [MESOS-8135] - Masters can lose track of tasks' executor IDs.
>> > * [MESOS-8169] - Incorrect master validation forces executor IDs to be
>> >   globally unique.
>> >
>> >
>> > The CHANGELOG for the release is available at:
>> > https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.4.1-rc1
>> >
>> > ----
>> >
>> > The candidate for Mesos 1.4.1 release is available at:
>> > https://dist.apache.org/repos/dist/dev/mesos/1.4.1-rc1/mesos-1.4.1.tar.gz
>> >
>> > The tag to be voted on is 1.4.1-rc1:
>> > https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=1.4.1-rc1
>> >
>> > The MD5 checksum of the tarball can be found at:
>> > https://dist.apache.org/repos/dist/dev/mesos/1.4.1-rc1/mesos-1.4.1.tar.gz.md5
>> >
>> > The signature of the tarball can be found at:
>> > https://dist.apache.org/repos/dist/dev/mesos/1.4.1-rc1/mesos-1.4.1.tar.gz.asc
>> >
>> > The PGP key used to sign the release is here:
>> > https://dist.apache.org/repos/dist/release/mesos/KEYS
>> >
>> > The JAR is in a staging repository here:
>> > https://repository.apache.org/content/repositories/orgapachemesos-1216
>> >
>> > Please vote on releasing this package as Apache Mesos 1.4.1!
>> >
>> > The vote is open until Monday, November 13, 2017, 11:59 PM EST and passes
>> > if a majority of at least 3 +1 PMC votes are cast.
>> >
>> > [ ] +1 Release this package as Apache Mesos 1.4.1
>> > [ ] -1 Do not release this package because ...
>> >
>> > Thanks,
>> > Anand and Kapil
>> >
>>
>
>


-- 
Anand Mazumdar


Re: [VOTE] Release Apache Mesos 1.4.0 (rc5)

2017-09-18 Thread Anand Mazumdar
+1 (binding)

make check passed on Ubuntu 16.04

-anand

On Fri, Sep 15, 2017 at 2:12 PM, Kapil Arya  wrote:

> +1 (binding)
>
> Internal CI with Centos 6/7, Fedora 23, Debian 8, and Ubuntu 12/14/16.
>
> On Fri, Sep 15, 2017 at 5:08 PM, Vinod Kone  wrote:
>
>> Ok. Looks like a test issue per https://reviews.apache.org/r/60467/
>>
>> +1(binding)
>>
>> On Fri, Sep 15, 2017 at 12:16 PM, Michael Park  wrote:
>>
>>> Vinod, regarding MESOS-7729:
>>>
>>> I found MESOS-6345 related to persistent volume framework, which leads me
>>> to believe that this is not new.
>>>
>>> Thanks,
>>>
>>> MPark
>>>
>>> On Tue, Sep 12, 2017 at 12:01 PM Vinod Kone 
>>> wrote:
>>>
 Tested this on ASF CI.

 Saw 3 flaky tests.

 https://issues.apache.org/jira/browse/MESOS-7729
 

 https://issues.apache.org/jira/browse/MESOS-7971
 https://issues.apache.org/jira/browse/MESOS-7972

 The first one was a known (since 1.4.0) flaky test with a double free
 corruption. @Kapil and @MPark can you verify that this is an issue with
 the
 test and not the source code? Once verified, I'll give a +1.

 *Revision*: b3fd2e7ab26e118222fe18af4b92c53a3c01e6cc

- refs/tags/1.4.0-rc5

 Configuration Matrix:

 OS            Configuration                             Build      gcc      clang
 centos:7      --verbose --enable-libevent --enable-ssl  autotools  Success  Not run
 centos:7      --verbose --enable-libevent --enable-ssl  cmake      Success  Not run
 centos:7      --verbose                                 autotools  Failed   Not run
 centos:7      --verbose                                 cmake      Success  Not run
 ubuntu:14.04  --verbose --enable-libevent --enable-ssl  autotools  Success  Success
 ubuntu:14.04  --verbose --enable-libevent --enable-ssl  cmake      Success  Success
 ubuntu:14.04  --verbose                                 autotools  Success  Success
 ubuntu:14.04  --verbose                                 cmake      Failed

Mesos 1.4.0

2017-07-27 Thread Anand Mazumdar
Hello everyone,

It's about time for Mesos 1.4.0 (somewhat late, though: 1.3 rc1 was cut on
5/5). Kapil will be the primary release manager and I will be the
co-release manager.

We expect to cut rc1 in the coming couple of weeks. Here's how you can help:
- Set *Target Version = "1.4.0"* for anything that needs to go into this
release. Anything not critical can wait for Mesos 1.5.
- Upgrade release blockers to *"Blocker" priority*. Use "Critical" for any
issues that would be painful (but possible) to ship Mesos 1.4 without.

Mesos 1.4 release dashboard:
https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12331513

-anand


Re: Plan for upgrading protobuf==3.2.0 in Mesos

2017-05-26 Thread Anand Mazumdar
We recently committed this [1] and it would be part of the *next major
release* (1.4.0). Also, we upgraded to the newer protobuf release 3.3.0.

For Mesos developers, this means that we can use proto3 features like arena
allocation [2], maps [3] etc. Note that we still need to use the proto2
syntax version for backward compatibility.

Thanks Zhitao for the contributions!

[1] https://issues.apache.org/jira/browse/MESOS-7228
[2] https://issues.apache.org/jira/browse/MESOS-5783
[3] https://developers.google.com/protocol-buffers/docs/proto#maps

-anand


On Thu, Apr 27, 2017 at 10:28 AM, Anand Mazumdar <an...@apache.org> wrote:

> + dev
>
> Bumping up the thread to ensure it's not missed.
>
> -anand
>
> On Tue, Apr 25, 2017 at 11:01 AM, Zhitao Li <zhitaoli...@gmail.com> wrote:
> > Dear framework owners and users,
> >
> > We are working on upgrading the protobuf library in Mesos to 3.2.0 in
> > https://issues.apache.org/jira/browse/MESOS-7228, to overcome some
> protobuf
> > limitation on message size as well as preparing for further improvement.
> We
> > aim to release this with the upcoming Mesos 1.3.0.
> >
> > Because we upgraded the protoc compiler in this process, all generated
> java
> > and python code may not be compatible with protobuf 2.6.1 (the previous
> > dependency), and we ask you to upgrade the protobuf dependency to 3.2.0
> when
> > you upgrade your framework dependency to 1.3.0.
> >
> > For java, a snapshot maven artifact has been prepared (by Anand
> Mazumdar's
> > courtesy) at
> > https://repository.apache.org/content/repositories/
> snapshots/org/apache/mesos/mesos/1.3.0-SNAPSHOT/
> > . Please feel free to play out with it and let us know if you run into
> any
> > issues.
> >
> > Note that the binary upgrade process should still be compatible: any java or
> > python based framework (scheduler or executor) should still work out of box with
> > Mesos 1.3.0 once released. It is suggested to get your cluster upgraded
> to
> > 1.3.0 first, then come back and upgrade your executors and schedulers.
> >
> > We understand this may expose inconvenience around updating the protobuf
> > dependency, so please let us know if you have any concern or further
> > questions.
> >
> > --
> >
> > Cheers,
> >
> > Zhitao Li and Anand Mazumdar,
>


Mesos 1.4.0 Release Manager(s)

2017-05-15 Thread Anand Mazumdar
1.4 is still some time away but I was wondering if there are any committers
volunteering to be release managers?

If not, Kevin and I would be happy to volunteer. Also, let us know if you
would be interested in helping us manage the release as a new
committer.

-anand


Re: [VOTE] Release Apache Mesos 1.0.4 (rc2)

2017-05-03 Thread Anand Mazumdar
+1 (binding)

make check passed on Ubuntu 16.04 with clang 3.6

-anand

On Wed, May 3, 2017 at 10:01 AM, Vinod Kone  wrote:

> +1 (binding)
>
> *Revision*: 4154f66d6c6dde8fd2cf2bbf0bfa155f24ac55d4
>
>- refs/tags/1.0.4-rc2
>
> Configuration Matrix:
>
> OS            Configuration                             Build      gcc      clang
> centos:7      --verbose --enable-libevent --enable-ssl  autotools  Success  Not run
> centos:7      --verbose --enable-libevent --enable-ssl  cmake      Success  Not run
> centos:7      --verbose                                 autotools  Success  Not run
> centos:7      --verbose                                 cmake      Success  Not run
> ubuntu:14.04  --verbose --enable-libevent --enable-ssl  autotools  Success  Success
> ubuntu:14.04  --verbose --enable-libevent --enable-ssl  cmake      Success  Success
> ubuntu:14.04  --verbose                                 autotools  Success  Success
> ubuntu:14.04  --verbose                                 cmake      Success  Success
>
> On Tue, May 2, 2017 at 4:03 PM, Benjamin Mahler 
> wrote:
>
>> +1 make check passes on macOS 10.12.4 with clang
>>
>> On Tue, May 2, 2017 at 12:04 PM, Vinod Kone  wrote:
>>
>> > Hi all,
>> >
>> >
>> > Please vote on releasing the following candidate as Apache Mesos 1.0.4.
>> >
>> >
>> > 1.0.4 includes the following:
>> >
>> > 
>> > 
>> >
>> > * [MESOS-2537] - AC_ARG_ENABLED checks are broken
>> >
>> >
>> > * [MESOS-6606] - Reject optimized builds with libcxx before 3.9
>> >
>> >
>> > * [MESOS-7008] - Quota not recovered from registry in empty cluster.
>> >
>> >
>> > * [MESOS-7265] - Containerizer startup may cause sensitive data to
>> leak
>> > into sandbox logs.
>> >
>> > * [MESOS-7366] - Agent sandbox gc could accidentally delete the
>> entire
>> > persistent volume content.
>> >
>> > * [MESOS-7383] - Docker executor logs possibly sensitive parameters.
>> >
>> >
>> > * [MESOS-7422] - Docker 

Re: Plan for upgrading protobuf==3.2.0 in Mesos

2017-04-27 Thread Anand Mazumdar
+ dev

Bumping up the thread to ensure it's not missed.

-anand

On Tue, Apr 25, 2017 at 11:01 AM, Zhitao Li <zhitaoli...@gmail.com> wrote:
> Dear framework owners and users,
>
> We are working on upgrading the protobuf library in Mesos to 3.2.0 in
> https://issues.apache.org/jira/browse/MESOS-7228, to overcome some protobuf
> limitation on message size as well as preparing for further improvement. We
> aim to release this with the upcoming Mesos 1.3.0.
>
> Because we upgraded the protoc compiler in this process, all generated java
> and python code may not be compatible with protobuf 2.6.1 (the previous
> dependency), and we ask you to upgrade the protobuf dependency to 3.2.0 when
> you upgrade your framework dependency to 1.3.0.
>
> For java, a snapshot maven artifact has been prepared (by Anand Mazumdar's
> courtesy) at
> https://repository.apache.org/content/repositories/snapshots/org/apache/mesos/mesos/1.3.0-SNAPSHOT/
> . Please feel free to play out with it and let us know if you run into any
> issues.
>
> Note that the binary upgrade process should still be compatible: any java or
> python based framework (scheduler or executor) should still work out of box with
> Mesos 1.3.0 once released. It is suggested to get your cluster upgraded to
> 1.3.0 first, then come back and upgrade your executors and schedulers.
>
> We understand this may expose inconvenience around updating the protobuf
> dependency, so please let us know if you have any concern or further
> questions.
>
> --
>
> Cheers,
>
> Zhitao Li and Anand Mazumdar,


[Design doc][RFC] Agent Lifecycle Management

2017-04-25 Thread Anand Mazumdar
Hello everyone,

We are working on adding support for agent lifecycle management [1] that
will provide a feedback mechanism for frameworks in case of agent node
failures. The existing agent lost [2] signal is not sufficient for
frameworks to ascertain that a given agent node isn't coming back.

Here is a link to the design doc:
https://docs.google.com/document/d/1XvP0acT8xadSev8UG2BXtsPlEh0Rb7R3WV3s-TnTeqg

Please feel free to provide any feedback via comments on the doc.

[1] JIRA Epic: https://issues.apache.org/jira/browse/MESOS-7426

[2]
https://github.com/apache/mesos/blob/master/include/mesos/v1/scheduler/scheduler.proto#L151

-anand


Re: protbuf to json not compatible

2017-03-24 Thread Anand Mazumdar
Hi Tomek,

Looks like we dropped the ball on MESOS-5995
(https://issues.apache.org/jira/browse/MESOS-5995). I assigned myself
as the shepherd and would take a look next week.

-anand

On Thu, Mar 23, 2017 at 2:09 AM, Tomek Janiszewski  wrote:
> I have a similar problem with protobuf and json. In my case numbers were
> incorrectly unmarshaled. Here is an issue
> https://issues.apache.org/jira/browse/MESOS-970 and review
> https://reviews.apache.org/r/50851/
>
> czw., 23.03.2017, 09:54 użytkownik Olivier Sallou 
> napisał:
>
>> Hi,
>>
>> When transforming a protobuf message to JSON with MessageToJson, the
>> JSON is not compatible with the JSON format expected by the Mesos master.
>>
>> For example, for volumes it generates
>>
>> volumes: [
>>   {'hostPath': '',
>>    'containerPath': '...',
>>    ...
>>   }
>> ]
>>
>> but the HTTP API expects "source" and "container_path".
>>
>> Is this expected behavior? It prevents "creating" a task in
>> protobuf format and sending it to the HTTP API with a protobuf to JSON
>> conversion.
>>
>> Thanks
>>
>> Olivier
>>
>> --
>> Olivier Sallou
>> IRISA / University of Rennes 1
>> Campus de Beaulieu, 35000 RENNES - FRANCE
>> Tel: 02.99.84.71.95
>>
>> gpg key id: 4096R/326D8438  (keyring.debian.org)
>> Key fingerprint = 5FB4 6F83 D3B9 5204 6335  D26D 78DC 68DB 326D 8438
>>
>>
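
The naming mismatch described in this thread (lowerCamelCase keys such as
'hostPath' where the master expects snake_case such as 'container_path') can
be worked around on the client side. The following is only a minimal sketch,
not something taken from Mesos: the helper names are made up for illustration,
and it assumes every dict key in the payload is a protobuf field name (it
would wrongly rewrite user-chosen keys of a map field). Newer releases of the
protobuf Python package also accept a `preserving_proto_field_name` argument
to `MessageToJson`, which may avoid the rename altogether, depending on the
installed version.

```python
import re

# Zero-width boundary between a lowercase letter or digit and an uppercase one.
_CAMEL_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])")


def to_snake_case(name):
    """Turn 'containerPath' into 'container_path'; snake_case input is unchanged."""
    return _CAMEL_BOUNDARY.sub("_", name).lower()


def snake_case_keys(value):
    """Recursively rename dict keys; lists are walked, other values are kept."""
    if isinstance(value, dict):
        return {to_snake_case(k): snake_case_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [snake_case_keys(item) for item in value]
    return value


# Hypothetical payload shaped like the example in the quoted message.
camel = {"volumes": [{"hostPath": "/a/b", "containerPath": "/mnt/home"}]}
print(snake_case_keys(camel))
```

Only keys are rewritten; string values such as paths pass through untouched.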


[Proposal] Media type for streaming requests/responses

2017-01-07 Thread Anand Mazumdar
Hello All,

We recently added support for request streaming as part of the Debugging
epic (MESOS-6460). As a follow up on that, we want your suggestions and
feedback via comments on the proposal draft [1] around the media type to
use for the 'Content-Type' header for streaming requests/responses.

[1] http://bit.ly/2iovQVe

-anand


[RESULT][VOTE] Release Apache Mesos 0.28.3 (rc1)

2016-12-05 Thread Anand Mazumdar
Hi all,

The vote for Mesos 0.28.3 (rc1) has passed with the
following votes.

+1 (Binding)
--
Alex Rukletsov
Vinod Kone
Benjamin Mahler

+1 (Non-binding)
--
Greg Mann

There were no 0 or -1 votes.

Please find the release at:
https://dist.apache.org/repos/dist/release/mesos/0.28.3

It is recommended to use a mirror to download the release:
http://www.apache.org/dyn/closer.cgi

The CHANGELOG for the release is available at:
https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.28.3

The mesos-0.28.3.jar has been released to:
https://repository.apache.org

The website (http://mesos.apache.org) will be updated shortly to reflect
this release.

Thanks,
Anand & Joseph


[VOTE] Release Apache Mesos 0.28.3 (rc1)

2016-11-23 Thread Anand Mazumdar
Hi all,

Please vote on releasing the following candidate as Apache Mesos 0.28.3.


0.28.3 includes the following:


** Bug
  * [MESOS-2043] - Framework auth fail with timeout error and never
get authenticated
  * [MESOS-4638] - Versioning preprocessor macros.
  * [MESOS-5073] - Mesos allocator leaks role sorter and quota role sorters.
  * [MESOS-5330] - Agent should backoff before connecting to the master.
  * [MESOS-5390] - v1 Executor Protos not included in maven jar
  * [MESOS-5543] - /dev/fd is missing in the Mesos containerizer environment.
  * [MESOS-5571] - Scheduler JNI throws exception when the major
versions of JAR and libmesos don't match.
  * [MESOS-5576] - Masters may drop the first message they send
between masters after a network partition.
  * [MESOS-5673] - Port mapping isolator may cause segfault if it bind
mount root does not exist.
  * [MESOS-5691] - SSL downgrade support will leak sockets in CLOSE_WAIT status.
  * [MESOS-5698] - Quota sorter not updated for resource changes at agent.
  * [MESOS-5723] - SSL-enabled libprocess will leak incoming links to forks.
  * [MESOS-5740] - Consider adding `relink` functionality to libprocess.
  * [MESOS-5748] - Potential segfault in `link` when linking to a
remote process.
  * [MESOS-5763] - Task stuck in fetching is not cleaned up after
--executor_registration_timeout.
  * [MESOS-5913] - Stale socket FD usage when using libevent + SSL.
  * [MESOS-5927] - Unable to run "scratch" Dockerfiles with Unified
Containerizer.
  * [MESOS-5943] - Incremental http parsing of URLs leads to decoder error.
  * [MESOS-5986] - SSL Socket CHECK can fail after socket receives EOF.
  * [MESOS-6104] - Potential FD double close in libevent's
implementation of `sendfile`.
  * [MESOS-6142] - Frameworks may RESERVE for an arbitrary role.
  * [MESOS-6152] - Resource leak in libevent_ssl_socket.cpp.
  * [MESOS-6233] - Master CHECK fails during recovery while relinking
to other masters.
  * [MESOS-6234] - Potential socket leak during Zookeeper network changes.
  * [MESOS-6246] - Libprocess links will not generate an ExitedEvent
if the socket creation fails.
  * [MESOS-6299] - Master doesn't remove task from pending when it is invalid.
  * [MESOS-6457] - Tasks shouldn't transition from TASK_KILLING to TASK_RUNNING.
  * [MESOS-6502] - _version uses incorrect
MESOS_{MAJOR,MINOR,PATCH}_VERSION in libmesos java binding.
  * [MESOS-6527] - Memory leak in the libprocess request decoder.
  * [MESOS-6621] - SSL downgrade path will CHECK-fail when using both
temporary and persistent sockets


The CHANGELOG for the release is available at:
https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.28.3-rc1


The candidate for Mesos 0.28.3 release is available at:
https://dist.apache.org/repos/dist/dev/mesos/0.28.3-rc1/mesos-0.28.3.tar.gz

The tag to be voted on is 0.28.3-rc1:
https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.28.3-rc1

The MD5 checksum of the tarball can be found at:
https://dist.apache.org/repos/dist/dev/mesos/0.28.3-rc1/mesos-0.28.3.tar.gz.md5

The signature of the tarball can be found at:
https://dist.apache.org/repos/dist/dev/mesos/0.28.3-rc1/mesos-0.28.3.tar.gz.asc

The PGP key used to sign the release is here:
https://dist.apache.org/repos/dist/release/mesos/KEYS

The JAR is up in Maven in a staging repository here:
https://repository.apache.org/content/repositories/orgapachemesos-1170

Please vote on releasing this package as Apache Mesos 0.28.3!

The vote is open until Sat Nov 26 14:59:10 PST 2016 and passes if a
majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Mesos 0.28.3
[ ] -1 Do not release this package because ...

Thanks,
Anand & Joseph
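
For anyone checking a candidate before voting, the MD5 comparison mentioned in
the vote email can be sketched as below. This is only an illustration under
assumptions: the function names are made up, the tarball and the .md5 file are
assumed to be already downloaded, and the .md5 file is assumed to contain the
digest as a single 32-character hex token (other layouts would need different
parsing). It does not replace verifying the .asc signature against KEYS.

```python
import hashlib
import re


def md5_of_file(path, chunk_size=1 << 20):
    """Hash the file in chunks so a large tarball is never fully in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def extract_md5(checksum_text):
    """Pull the first 32-hex-digit token out of the text of a .md5 file."""
    match = re.search(r"\b[0-9a-fA-F]{32}\b", checksum_text)
    if match is None:
        raise ValueError("no MD5 digest found in checksum text")
    return match.group(0).lower()


def tarball_matches(tarball_path, checksum_text):
    """True when the tarball's MD5 equals the digest found in checksum_text."""
    return md5_of_file(tarball_path) == extract_md5(checksum_text)
```

A caller would read the downloaded .md5 file as text and pass it, together
with the local tarball path, to `tarball_matches`.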


Re: Mesos V1 Operator HTTP API - Java Proto Classes

2016-11-16 Thread Anand Mazumdar
We wanted to move the project away from officially supporting anything
other than C++ and to discuss further whether we should be responsible for
publishing to the various language-specific channels. However, for the time
being, we had decided to include the v1 protobufs in the mesos JAR itself
(it already contains the v1 Scheduler/Executor protos).

Please file an issue as Zameer pointed out.

-anand

On Wed, Nov 16, 2016 at 8:34 AM, Zameer Manji  wrote:

> I think this is a bug, I feel the jar should include all v1 protobuf files.
>
> Vijay, I encourage you to file a ticket.
>
> On Tue, Nov 15, 2016 at 8:04 PM, Vijay Srinivasaraghavan <
> vijikar...@yahoo.com.invalid> wrote:
>
>> I believe the HTTP API will use the same underlying message format (proto
>> def) and hence the request/response value objects (java) needs to be
>> auto-generated from the proto files for it to be used in Jersey based java
>> rest client?
>>
>> On Tuesday, November 15, 2016 12:37 PM, Tomek Janiszewski <
>> jani...@gmail.com> wrote:
>>
>>
>>  I suspect jar is deprecated and includes only old API used by mesoslib.
>> The
>> goal is to create HTTP API and stop supporting native libs (jars, so,
>> etc).
>> I think you shouldn't use that jar in your project.
>>
>> wt., 15.11.2016, 20:38 użytkownik Vijay Srinivasaraghavan <
>> vijikar...@yahoo.com> napisał:
>>
>> > Hello,
>> >
>> > I am writing a rest client for "operator APIs" and found that some of
>> the
>> > protobuf java classes (like "include/mesos/v1/quota/quota.proto",
>> > "include/mesos/v1/master/master.proto") are not included in the mesos
>> jar
>> > file. While investigating, I have found that the "Make" file does not
>> > include these proto definition files.
>> >
>> > I have updated the Make file and added the protos that I am interested
>> in
>> > and built a new jar file. Is there any reason why these proto
>> definitions
>> > are not included in the original build apart from the reason that the
>> APIs
>> > are still evolving?
>> >
>> > Regards
>> > Vijay
>> >
>>
>> --
>> Zameer Manji
>>
>


Re: mesos git commit: Added MESOS-6497 to CHANGELOG.

2016-10-28 Thread Anand Mazumdar
Neil,

I had already committed it to master.

There were some merge conflicts with the CHANGELOG for the 1.1.x branch.
So, I had asked Till to resolve them and then commit it.

-anand

On Oct 28, 2016, at 1:23 PM, Neil Conway  wrote:

This commit should also appear in the master branch, not just 1.1.x

Neil

On Fri, Oct 28, 2016 at 4:06 PM,   wrote:

Repository: mesos
Updated Branches:
 refs/heads/1.1.x bc7ecb8cf -> 7fce1b33f


Added MESOS-6497 to CHANGELOG.


Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/7fce1b33
Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/7fce1b33
Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/7fce1b33

Branch: refs/heads/1.1.x
Commit: 7fce1b33fd7b0ef3f8dcfaa2d6557da1e3c6f957
Parents: bc7ecb8
Author: Till Toenshoff 
Authored: Fri Oct 28 20:23:48 2016 +0200
Committer: Till Toenshoff 
Committed: Fri Oct 28 22:03:39 2016 +0200

--
CHANGELOG | 1 +
1 file changed, 1 insertion(+)
--


http://git-wip-us.apache.org/repos/asf/mesos/blob/7fce1b33/CHANGELOG
--
diff --git a/CHANGELOG b/CHANGELOG
index 3f03be0..d0a679d 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -213,6 +213,7 @@ All Issues:
  * [MESOS-6446] - WebUI redirect doesn't work with stats from
/metric/snapshot.
  * [MESOS-6482] - Master check failure when marking an agent unreachable.
  * [MESOS-6483] - Check failure when a 1.1 master marking a 0.28 agent as
unreachable.
+  * [MESOS-6497] - Java Scheduler Adapter does not surface MasterInfo.

** Documentation
  * [MESOS-5221] - Add Documentation for Nvidia GPU support.


Re: Protobuf long number JSON serialisation

2016-08-04 Thread Anand Mazumdar
Tomek,

Thanks for reporting this. Looks like a bug in our JSON -> Protobuf parsing 
code. Mind filing a JIRA issue?

-anand
 

> On Aug 4, 2016, at 2:04 PM, Tomek Janiszewski  wrote:
> 
> Hi
> 
> I have a problem with the HTTP API. Proto2 does not specify JSON mappings, but
> Proto3 does, and it recommends mapping 64-bit numbers as strings.
> Unfortunately, Mesos does not accept strings in place of uint64 and returns a
> 400 Bad Request error: Failed to convert JSON into Call protobuf: Not
> expecting a JSON string for field 'value'.
> Is this on purpose or is this a bug?
> 
> Best
> Tomek
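
Until the parsing is changed on the Mesos side, a client that produces
proto3-style JSON could turn string-encoded 64-bit integers back into JSON
numbers before sending. The following is a rough sketch under assumptions: the
helper and the field set are hypothetical, and matching on a bare key name is
crude (a real client would consult the message descriptor to find the 64-bit
fields).

```python
import json
import re

# Hypothetical: field names the client knows to be 64-bit integers.
INT64_FIELDS = frozenset({"nanoseconds"})

_INTEGER = re.compile(r"-?[0-9]+")


def coerce_int64_strings(node, int64_fields=INT64_FIELDS):
    """Return a copy of node where listed fields holding digit strings become ints."""
    if isinstance(node, dict):
        result = {}
        for key, child in node.items():
            if (key in int64_fields and isinstance(child, str)
                    and _INTEGER.fullmatch(child)):
                result[key] = int(child)
            else:
                result[key] = coerce_int64_strings(child, int64_fields)
        return result
    if isinstance(node, list):
        return [coerce_int64_strings(child, int64_fields) for child in node]
    return node


# Illustrative payload only; the surrounding structure is not a complete call.
proto3_style = {"grace_period": {"nanoseconds": "5000000000"}}
print(json.dumps(coerce_int64_strings(proto3_style)))
```

Python's json module writes arbitrarily large integers exactly, so no
precision is lost on the sending side.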



[HTTP API] Client Libraries

2016-07-06 Thread Anand Mazumdar
Hi,

We recently committed documentation around available client libraries for the
Scheduler/Executor HTTP APIs.

Link to doc: 
https://github.com/apache/mesos/blob/master/docs/api-client-libraries.md 
 

It would be great if folks can send a PR or review to add more implementations 
that they maintain/use.

-anand

Re: how to debug HTTP API

2016-06-07 Thread Anand Mazumdar
Olivier,

You are missing the “task_infos” key in your “ACCEPT” call. The master treats
“Accept” operations with no tasks to launch as implicitly declining the offers. I
will file a follow-up JIRA to ensure this is logged on the master (if it is not already).

An example correct JSON: 
https://gist.github.com/hatred/7325d8a4afde607ecc0f376ab62d60eb 
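
For readers without access to the gist, the shape of the fix can be sketched
in Python as follows. This is only an illustration, not the gist's content:
`build_accept_call` is a made-up helper, the overall layout mirrors the
request quoted further down in this thread, and the only change is that the
tasks sit in a list under "task_infos" inside "launch". The sample task is
deliberately incomplete.

```python
import json


def build_accept_call(framework_id, offer_ids, task_infos):
    """Wrap task dicts in an ACCEPT call with a single LAUNCH operation."""
    return {
        "type": "ACCEPT",
        "framework_id": {"value": framework_id},
        "accept": {
            "offer_ids": [{"value": offer_id} for offer_id in offer_ids],
            "operations": [
                {
                    "type": "LAUNCH",
                    # Tasks are listed under "task_infos", not placed
                    # directly inside "launch".
                    "launch": {"task_infos": list(task_infos)},
                }
            ],
        },
    }


# Partial task for illustration; a real task also needs resources, an agent
# reference and a command or container, as in the quoted request.
task = {"name": "testr", "task_id": {"value": "128"}}
call = build_accept_call(
    "e303a1f0-4e7c-4c32-aafc-8707ea2b2718-0020",
    ["e303a1f0-4e7c-4c32-aafc-8707ea2b2718-O28"],
    [task],
)
print(json.dumps(call, indent=2))
```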


-anand

> On Jun 7, 2016, at 8:38 AM, Olivier Sallou  wrote:
> 
> 
> 
> On 06/07/2016 04:53 PM, Guangya Liu wrote:
>> So how many agent nodes are there in your cluster? If you continue
>> receiving offer but without getting UPDATE message, then it may be caused
>> by that your task definition and the framework continually decline offer.
> I have only one node (master/slave), for development. It worked fine
> with the python API.
> we see on master that it received the ACCEPT, and no DECLINE. However,
> as I receive no UPDATE, I suppose that mesos "drops" the ACCEPT (wrong
> task definition maybe), and sends new offers several seconds after I
> sent the ACCEPT.
>> 
>> Can you please share your framework code here for the logic of "Event::
>> OFFERS"?
> Code is available here:
> 
> https://bitbucket.org/osallou/go-docker/src/b1948063fb7f68fbc77f5de6b473d832a7dd36af/plugins/mesos.py?at=master=file-view-default
>  
> 
> 
> in method run of MesosThread, line 613
> 
> Code is a little complex, as it is a port of existing code using mesos
> python lib.
> 
> Code related to HTTP is in development, so there may be further errors,
> but registration is fine as well as offer messages.
> 
> I have added locally a debug print to show any message received by mesos
> (in case I would have received an other message indicating an error),
> but I received no other than offer and heartbeats.
> 
> If Mesos see the ACCEPT message as it appears in logs, that it should
> either reject it (with a different status code than 202) or send an
> UPDATE error message if there is an error with my task definition.
> 
> Olivier
>> 
>> Thanks,
>> 
>> Guangya
>> 
>> On Tue, Jun 7, 2016 at 8:29 PM, Olivier Sallou 
>> wrote:
>> 
>>> 
>>> On 06/07/2016 01:59 PM, Guangya Liu wrote:
>>>> I can see that your framework is now holding the offer, how did you launch
>>>> task?
>>> I execute an HTTP POST request in Python with json content-type:
>>> 
>>> {'type': 'ACCEPT',
>>>  'framework_id': {'value': u'e303a1f0-4e7c-4c32-aafc-8707ea2b2718-0020'},
>>>  'accept': {
>>>      'operations': [
>>>          {'type': 'LAUNCH',
>>>           'launch': {
>>>               'container': {
>>>                   'docker': {'image': u'centos:latest',
>>>                              'force_pull_image': True,
>>>                              'port_mappings': [],
>>>                              'network': 2},
>>>                   'type': 1,
>>>                   'volumes': [
>>>                       {'host_path': u'/a/b',
>>>                        'container_path': u'/mnt/home', 'mode': 1},
>>>                       {'host_path': u'/a/b/c',
>>>                        'container_path': u'/mnt/go-docker', 'mode': 1},
>>>                       {'host_path': u'/b/c/d',
>>>                        'container_path': u'/mnt/god-data', 'mode': 2}
>>>                   ]
>>>               },
>>>               'name': u'testr',
>>>               'task_id': {'value': '128'},
>>>               'command': {'uris': [{'value': u'/home/osallou/docker.tar.gz'}],
>>>                           'value': u'/mnt/go-docker/wrapper.sh'},
>>>               'slave_id': {'value': u'e303a1f0-4e7c-4c32-aafc-8707ea2b2718-S0'},
>>>               'resources': [
>>>                   {'scalar': {'value': 1}, 'type': 0, 'name': 'cpus'},
>>>                   {'scalar': {'value': 2000}, 'type': 0, 'name': 'mem'}
>>>               ]
>>>           }  # end launch
>>>          }  # end operation
>>>      ],
>>>      'offer_ids': [{'value': u'e303a1f0-4e7c-4c32-aafc-8707ea2b2718-O28'}]
>>>  }
>>> }
>>> 
>>> We can see that Mesos received the ACCEPT:
>>> 
>>> I0607 11:45:15.873584 14896 master.cpp:3104] Processing ACCEPT call for
>>> offers: [ e303a1f0-4e7c-4c32-aafc-8707ea2b2718-O28 ] on slave
>>> e303a1f0-4e7c-4c32-aafc-8707ea2b2718-S0 at slave(1)@127.0.1.1:5051
>>> (tifenn.irisa.fr) for framework
>>> 
>>> 
>>> and I continue to receive new offers, so "connection" is OK. I should
>>> receive an UPDATE message even if there is an error, but I receive none
>>> (I track/log all messages received, whatever the type).
>>> 
>>> Olivier
>>> 
>>>> Perhaps you can take a look at
>>>> https://github.com/apache/mesos/blob/master/src/cli/execute.cpp#L311 which
>>>> is an example framework using HTTP API
 
 Thanks,
 
 Guangya
 
 On Tue, Jun 7, 2016 at 7:19 PM, Olivier Sallou 
 wrote:
 
> On 06/07/2016 12:25 PM, Guangya Liu wrote:
>> Olivier,
>> 
>> For such case, seems there is sth wrong with your framework? can you
> please
>> run the following two commands and check the output?
> I don't think it is a 

Re: Status acknowledgements in MesosExecutor

2016-06-06 Thread Anand Mazumdar
Hi Evers,

Thanks for taking this on. Vinod has agreed to shepherd this and I would be 
happy to be the initial reviewer for the patches.

-anand


> On Jun 1, 2016, at 10:27 AM, Evers Benno  wrote:
> 
> Some more context about this bug:
> 
> We did some tests with a framework that does nothing but send empty
> tasks and sample executor that does nothing but send TASK_FINISHED and
> shut itself down.
> 
> Running on two virtual machines on the same host (i.e. no network
> involved), we see TASK_FAILED in about 3% of all tasks (271 out of
> 9000). Adding some megabytes of data into update.data, this can go up
> to 80%. In all cases where I looked manually, the logs look like this:
> (id's shortened to three characters for better readability)
> 
> [...]
> I0502 14:40:33.151075 394179 slave.cpp:3002] Handling status update
> TASK_FINISHED (UUID: 20c) for task 24c of framework f20 from
> executor(1)@[2a02:6b8:0:1a16::165]:49266
> I0502 14:40:33.151175 394179 slave.cpp:3528]
> executor(1)@[2a02:6b8:0:1a16::165]:49266 exited
> I0502 14:40:33.151190 394179 slave.cpp:3886] Executor 'executor_24c' of
> framework f20 exited with status 0
> I0502 14:40:33.151216 394179 slave.cpp:3002] Handling status update
> TASK_FAILED (UUID: 01b) for task 24c of framework f20 from @0.0.0.0:0
> [...]
> 
> The random failure chance is a bit too high to ignore, so we're
> currently writing/testing a patch to wait for confirmations for all
> status updates on executor shutdown.
> 
> It would be great if someone would like to shepherd this.
> 
> Best regards,
> Benno
> 
> On 03.05.2016 14:49, Evers Benno wrote:
>> Hi,
>> 
>> I was wondering about the semantics of the Executor::sendStatusUpdate()
>> method. It is described as
>> 
>>     // Sends a status update to the framework scheduler, retrying as
>>     // necessary until an acknowledgement has been received or the
>>     // executor is terminated (in which case, a TASK_LOST status update
>>     // will be sent). See Scheduler::statusUpdate for more information
>>     // about status update acknowledgements.
>> 
>> I was understanding this to say that the function blocks until an
>> acknowledgement is received, but looking at the implementation of
>> MesosExecutor it seems that "retrying as necessary" only means
>> re-sending of unacknowledged updates when the slave reconnects.
>> (exec/exec.cpp:274)
>> 
>> I'm wondering because we're currently running a python executor which
>> ends its life like this:
>> 
>>     driver.sendStatusUpdate(_create_task_status(TASK_FINISHED))
>>     driver.stop()
>>     # in a different thread:
>>     sys.exit(0 if driver.run() == mesos_pb2.DRIVER_STOPPED else 1)
>> 
>> and we're seeing situations (roughly once per 10,000 tasks) where the
>> executor process is reaped before the acknowledgement for TASK_FINISHED
>> is sent from slave to executor. This results in mesos generating a
>> TASK_FAILED status update, probably from
>> Slave::sendExecutorTerminatedStatusUpdate().
>> 
>> So, did I misunderstand how MesosExecutor works? Or is it indeed a race,
>> and we have to change the executor shutdown?
>> 
>> Best regards,
>> Benno
>> 



Re: MESOS-3777: Looking for a shepherd

2016-05-19 Thread Anand Mazumdar
Hi Jose,

Would you like to work on https://issues.apache.org/jira/browse/MESOS-5359
instead? It’s part of the Scheduler API v1 improvements epic.

-anand

> On May 19, 2016, at 12:40 PM, José Guilherme Vanz  
> wrote:
> 
> Ok. Do you think I should work on other issue in the roadmap?
> 
> On Thu, 19 May 2016 at 01:18 Vinod Kone  wrote:
> 
>> Hi Jose,
>> 
>> I'm shepherding this epic. Given the complexity of MESOS-3777, I think it
>> is best to move it to Phase 2.
>> 
>> On Wed, May 18, 2016 at 8:00 PM, José Guilherme Vanz <
>> guilherme@gmail.com> wrote:
>> 
>>> Hi guys
>>> 
>>> I'm looking for shepherd in the issue that is in the roadmap:
>>> MESOS-3777: Replace Master/Slave Terminology Phase I - Modify public
>>> interfaces 
>>> 
>>> Thanks
>>> 
>> 



Re: Status acknowledgements in MesosExecutor

2016-05-03 Thread Anand Mazumdar
Also, we will be modifying the agent to always acknowledge status updates from 
the executor (MESOS-5262).

Once that is done, it should be sufficient for an executor to terminate itself 
on receiving an acknowledgment message from the agent, instead of relying on 
the best-effort hack of sleeping for some duration.
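
The best-effort workaround of sleeping before shutdown could look roughly like the sketch below. It is a minimal illustration, not actual Mesos code: `finish_and_stop`, `grace_seconds` and the `FakeDriver` stand-in are hypothetical names, and the status is a plain string rather than a TaskStatus message. Only `sendStatusUpdate()` and `stop()` are taken from the executor snippet quoted below.

```python
import time


def finish_and_stop(driver, status, grace_seconds=1.0, sleep=time.sleep):
    """Send the terminal update, pause briefly, then stop the driver.

    Best effort only: the pause gives the agent a chance to process the
    update before the executor exits, but nothing is confirmed.
    """
    driver.sendStatusUpdate(status)
    sleep(grace_seconds)  # hypothetical grace period; the value is a guess
    driver.stop()


class FakeDriver:
    """Stand-in for the real executor driver so the sketch runs anywhere."""

    def __init__(self):
        self.calls = []

    def sendStatusUpdate(self, status):
        self.calls.append(("sendStatusUpdate", status))

    def stop(self):
        self.calls.append(("stop",))


driver = FakeDriver()
finish_and_stop(driver, "TASK_FINISHED", grace_seconds=0.01)
print(driver.calls)
```

The delay narrows the race window but does not close it, which is why terminating on an explicit acknowledgment is the better long-term fix.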

-anand

> On May 3, 2016, at 6:37 AM, Alex Rukletsov  wrote:
> 
> Benno—
> 
> you may be seeing MESOS-4111. Also, have a look at this comment:
> https://github.com/apache/mesos/blob/9f472b1eff904d0d96063d3bed535a8e81263d69/src/launcher/executor.cpp#L611-L617
> 
> On Tue, May 3, 2016 at 2:49 PM, Evers Benno  wrote:
> 
>> Hi,
>> 
>> I was wondering about the semantics of the Executor::sendStatusUpdate()
>> method. It is described as
>> 
>>// Sends a status update to the framework scheduler, retrying as
>>// necessary until an acknowledgement has been received or the
>>// executor is terminated (in which case, a TASK_LOST status update
>>// will be sent). See Scheduler::statusUpdate for more information
>>// about status update acknowledgements.
>> 
>> I was understanding this to say that the function blocks until an
>> acknowledgement is received, but looking at the implementation of
>> MesosExecutor it seems that "retrying as necessary" only means
>> re-sending of unacknowledged updates when the slave reconnects.
>> (exec/exec.cpp:274)
>> 
>> I'm wondering because we're currently running a python executor which
>> ends its life like this:
>> 
>>driver.sendStatusUpdate(_create_task_status(TASK_FINISHED))
>>driver.stop()
>># in a different thread:
>>sys.exit(0 if driver.run() == mesos_pb2.DRIVER_STOPPED else 1)
>> 
>> and we're seeing situations (roughly once per 10,000 tasks) where the
>> executor process is reaped before the acknowledgement for TASK_FINISHED
>> is sent from slave to executor. This results in mesos generating a
>> TASK_FAILED status update, probably from
>> Slave::sendExecutorTerminatedStatusUpdate().
>> 
>> So, did I misunderstand how MesosExecutor works? Or is it indeed a race,
>> and we have to change the executor shutdown?
>> 
>> Best regards,
>> Benno
>> 



Change minimum supported version for GCC to 4.8.1

2016-01-04 Thread Anand Mazumdar
I would like to propose that we bump our minimum supported version for gcc from 
4.8.0 to 4.8.1. The main motivation behind this is that there are at least 2 
outstanding reviews on RB that want to use ref-qualifiers 
<http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2439.htm> introduced 
in GCC 4.8.1. The reviews in question are r41870 
<https://reviews.apache.org/r/41870> and r41593 
<https://reviews.apache.org/r/41593/diff/2?file=1173648#file1173648line58>.

Also, our Getting Started document <http://mesos.apache.org/gettingstarted/> 
for Mesos already lists the minimum gcc version as > 4.8. Looking at the 
release timeline <https://gcc.gnu.org/gcc-4.8/> for GCC, it seems that 
4.8.0/4.8.1 were released within a week of each other.

Does anyone have a strong opinion against this change?

-anand




Re: Change minimum supported version for GCC to 4.8.1

2016-01-04 Thread Anand Mazumdar
Are you referring to the spreadsheet linked in MESOS-2604?

AFAICT, it just shows that a particular variant of GCC 4.8+ is available on 
each of the supported distributions. So, this should not be an issue unless I 
am missing something?

-anand 

> On Jan 4, 2016, at 5:42 PM, Benjamin Mahler  wrote:
> 
> When we moved to 4.8 there was a spreadsheet that showed how folks can get
> 4.8 on various distributions, have you checked that 4.8.1 is available
> across distributions?
> 
> On Mon, Jan 4, 2016 at 4:07 PM, Adam Bordelon  wrote:
> 
>> +1
>> 
>> On Mon, Jan 4, 2016 at 3:19 PM, Joris Van Remoortere <
>> joris.van.remoort...@gmail.com> wrote:
>> 
>>> +1 (binding)
>>> 
> >>>> I would like to propose that we bump our minimum supported version for
> >>>> gcc from 4.8.0 to 4.8.1. The main motivation behind this is that there
> >>>> are at least 2 outstanding reviews on RB that want to use ref-qualifiers
> >>>> <http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2439.htm>
> >>>> introduced in GCC 4.8.1. The reviews in question are r41870
> >>>> <https://reviews.apache.org/r/41870> and r41593
> >>>> <https://reviews.apache.org/r/41593/diff/2?file=1173648#file1173648line58>.
> >>>> 
> >>>> Also, our Getting Started document
> >>>> <http://mesos.apache.org/gettingstarted/> for Mesos already lists the
> >>>> minimum gcc version as > 4.8. Looking at the release timeline
> >>>> <https://gcc.gnu.org/gcc-4.8/> for GCC, it seems that 4.8.0/4.8.1 were
> >>>> released within a week of each other.
> >>>> 
> >>>> Does anyone have a strong opinion against this change?
> >>>> 
> >>>> -anand
> >>>> 
>>> 
>> 



Re: mesos git commit: Added documentation for API versioning.

2016-01-04 Thread Anand Mazumdar
Vinod alluded to a similar concern as a review comment and we added the 
following to the beginning of the doc:

“API versioning was introduced in Mesos 0.24.0 and this scheme only applies to 
Mesos 1.0.0 and higher.”

I was under the impression that the above statement should suffice?

-anand


> On Jan 4, 2016, at 3:02 AM, Alex R <ruklet...@gmail.com> wrote:
> 
> The "Upgrades" section says "The master and agent are typically compatible
> as long as they are running the same major version.". Is my understanding
> correct that this will apply to 1.0.0+ versions? Currently we advise people
> not to skip any Mesos releases, could you please clarify that in the doc?
> 
> 
> On 31 December 2015 at 02:12, <vinodk...@apache.org> wrote:
> 
>> Repository: mesos
>> Updated Branches:
>>  refs/heads/master b54d9ee06 -> 4568e584d
>> 
>> 
>> Added documentation for API versioning.
>> 
>> Review: https://reviews.apache.org/r/41661/
>> 
>> 
>> Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
>> Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/4568e584
>> Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/4568e584
>> Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/4568e584
>> 
>> Branch: refs/heads/master
>> Commit: 4568e584d15e2da68f777aff2570d6b0ceaa14fa
>> Parents: b54d9ee
>> Author: Anand Mazumdar <mazumdar.an...@gmail.com>
>> Authored: Wed Dec 30 17:11:21 2015 -0800
>> Committer: Vinod Kone <vinodk...@gmail.com>
>> Committed: Wed Dec 30 17:11:21 2015 -0800
>> 
>> --
>> docs/home.md   |   1 +
>> docs/versioning.md | 100 
>> 2 files changed, 101 insertions(+)
>> --
>> 
>> 
>> http://git-wip-us.apache.org/repos/asf/mesos/blob/4568e584/docs/home.md
>> --
>> diff --git a/docs/home.md b/docs/home.md
>> index d929838..6f0f4b9 100644
>> --- a/docs/home.md
>> +++ b/docs/home.md
>> @@ -54,6 +54,7 @@ layout: documentation
>> * [Javadoc](/api/latest/java/) documents the Mesos Java API.
>> * [Doxygen](/api/latest/c++/namespacemesos.html) documents the Mesos C++
>> API.
>> * [Developer Tools](/documentation/latest/tools/) for hacking on Mesos or
>> writing frameworks.
>> +* [Versioning](/documentation/latest/versioning/) describes how Mesos
>> does API and release versioning.
>> 
>> ## Extending Mesos
>> 
>> 
>> 
>> http://git-wip-us.apache.org/repos/asf/mesos/blob/4568e584/docs/versioning.md
>> --
>> diff --git a/docs/versioning.md b/docs/versioning.md
>> new file mode 100644
>> index 000..7af6581
>> --- /dev/null
>> +++ b/docs/versioning.md
>> @@ -0,0 +1,100 @@
>> +---
>> +layout: documentation
>> +---
>> +
>> +# Mesos Versioning
>> +
>> +The Mesos API and release versioning policy gives operators and
>> developers clear guidelines on:
>> +
>> +* Making modifications to the existing APIs without affecting backward
>> compatibility.
>> +* How long a Mesos API will be supported.
>> +* Upgrading the Mesos installation across release versions.
>> +
>> +API versioning was introduced in Mesos 0.24.0 and this scheme only
>> applies to Mesos 1.0.0 and higher.
>> +
>> +## Terminology
>> +
>> +* **Release Versioning**: This refers to the version of Mesos that is
>> being released. It is of the form **Mesos X.Y.Z** (X is the major version,
>> Y is the minor version, and Z is the patch version).
>> +* **API Versioning**: This refers to the version of the Mesos API. It is
>> of the form **vX** (X is the major version).
>> +
>> +## How does it work?
>> +
>> +The Mesos APIs (constituting Scheduler, Executor, Internal,
>> Operator/Admin APIs) will have a version in the URL. The versioned URL will
>> have a prefix of **`/api/vN`** where "N" is the version of the API. The
>> "/api" prefix is chosen to distinguish API resources from Web UI paths.
>> +
>> +Examples:
>> +
>> +* http://localhost:5050/api/v1/scheduler :  Scheduler HTTP API hosted by
>> the master.
>> +* http://localhost:5051/api/v1/executor  :  Executor HTTP API hosted by
>> the agent.
>> +
>> +A given Mesos installation might host multiple versions of the same

Re: writing a scheduler against the v1 C++ API

2015-10-23 Thread Anand Mazumdar
James,

Mesos 0.24 included experimental support for the Scheduler V1 API. So, it is 
indeed quite a reasonable thing to do, and we would welcome your feedback. :)

- You can have a look at the C++ low level v1 scheduler library:
 https://github.com/apache/mesos/blob/master/include/mesos/v1/scheduler.hpp 

- An example C++ framework using the v1 API: 
https://github.com/apache/mesos/blob/master/src/examples/event_call_framework.cpp
 

- Unit tests depicting the C++ library usage:
https://github.com/apache/mesos/blob/master/src/tests/scheduler_tests.cpp 

- V1 API Docs:
http://mesos.apache.org/documentation/latest/scheduler-http-api/ 


-anand

> On Oct 23, 2015, at 3:28 PM, James Peach  wrote:
> 
> Hi all,
> 
> I was going to kick the tires by trying to write a toy scheduler against the 
> v1 C++ API. Is the v1 API ready enough that that is a reasonable thing to try 
> to do?
> 
> J



[Design Doc] Executor HTTP API

2015-10-17 Thread Anand Mazumdar
Folks,

The Scheduler HTTP API was introduced in Mesos 0.24. Building on that, we would 
like to propose a design document for the Executor HTTP API around 
agent-executor communication.

The document is still a work in progress and any feedback would be greatly 
appreciated.

Link to the design doc: https://goo.gl/KgdZL1 
Relevant JIRA: https://issues.apache.org/jira/browse/MESOS-2708 

HTTP API Epic: https://issues.apache.org/jira/browse/MESOS-3302 


-anand



Re: Do we still need to add InverseOffer support to Scheduler API?

2015-09-15 Thread Anand Mazumdar
Hi Qian,

Yes, the eventual plan is to only support the (C++) Scheduler Library in the 
Mesos repository going forward and deprecate the old (C++/Java/Python) 
Scheduler/Scheduler Driver. The deprecation cycle would/should start after 
1.0 is released. We would encourage the community to build up clients for other 
languages and link them from the Mesos webpage/docs.

Vinod sent out an email [1] to dev@ about how to manage API Client Libraries 
going forward. This was also discussed at length during the last (Sep 3) 
community sync [2].

[1] http://bit.ly/1LeLiy4
[2] http://bit.ly/1F0vq16


-anand



> On Sep 15, 2015, at 8:19 AM, Qian AZ Zhang <zhang...@cn.ibm.com> wrote:
> 
> Thanks Anand for your clarification! I understand the intention now :-)
> 
> BTW, what is the future plan for the old C++ Scheduler/Scheduler Driver and 
> the Java/Python bindings? Will we keep supporting them, or will they 
> eventually be deprecated?
> 
> 
> Regards,
> Qian Zhang
> 
> Anand Mazumdar ---09/15/2015 11:00:00---Hi Qian, We currently don’t intend to 
> move the old C++ Scheduler/Scheduler Driver <https://github.co 
> <https://github.co/>
> 
> From: Anand Mazumdar <an...@mesosphere.io>
> To:   dev@mesos.apache.org
> Date: 09/15/2015 11:00
> Subject:  Re: Do we still need to add InverseOffer support to Scheduler 
> API?
> 
> 
> 
> 
> Hi Qian,
> 
> We currently don’t intend to move the old C++ Scheduler/Scheduler Driver 
> <https://github.com/apache/mesos/blob/master/src/sched/sched.cpp> interface 
> to use the Mesos V1 API. 
> 
> If you want to use the new V1 APIs, you can use the low-level C++ Scheduler 
> Library 
> <https://github.com/apache/mesos/blob/master/src/scheduler/scheduler.cpp> 
> that speaks the new Call/Event lingo. Joris already pointed you to a very 
> good example of an existing test using the Inverse Offer functionality: 
> https://reviews.apache.org/r/37283
> 
> Let me know if this resolves your confusion.
> 
> -anand
> 
> 
> > On Sep 14, 2015, at 7:27 PM, Qian AZ Zhang <zhang...@cn.ibm.com> wrote:
> > 
> > If we keep the current C++ scheduler API as it is, then I think a framework 
> > can never receive an inverse offer in its "resourceOffers()" callback. The 
> > reason is that in SchedulerProcess::initialize() we have the following code:
> > install<ResourceOffersMessage>(
> >     &SchedulerProcess::resourceOffers,
> >     &ResourceOffersMessage::offers,
> >     &ResourceOffersMessage::pids);
> > In the above code, only the "offers" and "pids" fields of 
> > ResourceOffersMessage are passed into SchedulerProcess::resourceOffers() 
> > when it is invoked, but the "inverse_offers" field of ResourceOffersMessage 
> > is NOT passed into it.
> > 
> > 
> > Regards,
> > Qian Zhang (张乾)
> > Developer, IBM Platform Computing
> > Phone: 86-29-68797144 | Tie-Line: 87144
> > E-mail: zhang...@cn.ibm.com
> > Chat: zhq527725
> > “An educated man should know everything about something and something about 
> > everything"
> > 
> > 
> > 
> > 陕西省西安市高新区
> > 高新六路42号中清大厦3层
> > Xian, Shaanxi Province 710075
> > China
> > 
> > Guangya Liu ---09/14/2015 23:29:45---Thanks Haosdent and Joris, I see that 
> > the host maintain patch ( https://reviews.apache.org/r/37180/d
> > 
> > From:   Guangya Liu <gyliu...@gmail.com>
> > To: dev@mesos.apache.org
> > Date:   09/14/2015 23:29
> > Subject:Re: Do we still need to add InverseOffer support to Scheduler 
> > API?
> > 
> > 
> > 
> > Thanks Haosdent and Joris, I see that the host maintain patch (
> > https://reviews.apache.org/r/37180/diff/8#0 
> > <https://reviews.apach

Re: Do we still need to add InverseOffer support to Scheduler API?

2015-09-14 Thread Anand Mazumdar
Hi Qian,

We currently don’t intend to move the old C++ Scheduler/Scheduler Driver 
interface to use the Mesos V1 API. 

If you want to use the new V1 APIs, you can use the low-level C++ Scheduler 
Library that speaks the new Call/Event lingo. Joris already pointed you to a 
very good example of an existing test using the Inverse Offer functionality: 
https://reviews.apache.org/r/37283
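
The limitation Qian describes in the quoted message below can be illustrated with a small sketch. It is written in Python rather than C++ and uses no Mesos code: `FieldDispatcher`, its `install()`/`deliver()` methods and the dict-shaped message are hypothetical stand-ins that only mimic a handler table forwarding the fields named at install time.

```python
from typing import Any, Callable, Dict, List, Tuple


class FieldDispatcher:
    """Toy handler table: forwards only the fields named at install time."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Tuple[Callable[..., None], Tuple[str, ...]]] = {}

    def install(self, kind: str, handler: Callable[..., None], *fields: str) -> None:
        self._handlers[kind] = (handler, fields)

    def deliver(self, kind: str, message: Dict[str, Any]) -> None:
        handler, fields = self._handlers[kind]
        # Any field not listed in install() is silently dropped here.
        handler(*(message[name] for name in fields))


received: List[Tuple[Any, ...]] = []


def resource_offers(offers: List[str], pids: List[str]) -> None:
    # The callback signature has no slot for inverse offers.
    received.append((offers, pids))


dispatcher = FieldDispatcher()
dispatcher.install("ResourceOffersMessage", resource_offers, "offers", "pids")
dispatcher.deliver(
    "ResourceOffersMessage",
    {
        "offers": ["offer-1"],
        "pids": ["master@host:5050"],
        "inverse_offers": ["inverse-1"],
    },
)
print(received)
```

The callback sees the offers and pids, but the inverse offer in the message never reaches it. A Call/Event style library avoids this by handing the whole event to the framework.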

Let me know if this resolves your confusion.

-anand


> On Sep 14, 2015, at 7:27 PM, Qian AZ Zhang  wrote:
> 
> If we keep the current C++ scheduler API as it is, then I think a framework 
> can never receive an inverse offer in its "resourceOffers()" callback. The 
> reason is that in SchedulerProcess::initialize() we have the following code:
> install<ResourceOffersMessage>(
>     &SchedulerProcess::resourceOffers,
>     &ResourceOffersMessage::offers,
>     &ResourceOffersMessage::pids);
> In the above code, only the "offers" and "pids" fields of 
> ResourceOffersMessage are passed into SchedulerProcess::resourceOffers() 
> when it is invoked, but the "inverse_offers" field of ResourceOffersMessage 
> is NOT passed into it.
> 
> 
> Regards,
> Qian Zhang (张乾)
> Developer, IBM Platform Computing
>   Phone: 86-29-68797144 | Tie-Line: 87144
> E-mail: zhang...@cn.ibm.com 
> Chat: zhq527725
> “An educated man should know everything about something and something about 
> everything"
> 
> 
> 
> 陕西省西安市高新区
> 高新六路42号中清大厦3层
> Xian, Shaanxi Province 710075
> China
> 
> Guangya Liu ---09/14/2015 23:29:45---Thanks Haosdent and Joris, I see that 
> the host maintain patch ( https://reviews.apache.org/r/37180/d 
> 
> 
> From: Guangya Liu 
> To:   dev@mesos.apache.org
> Date: 09/14/2015 23:29
> Subject:  Re: Do we still need to add InverseOffer support to Scheduler 
> API?
> 
> 
> 
> Thanks Haosdent and Joris, I see that the host maintenance patch
> (https://reviews.apache.org/r/37180/diff/8#0) is also sending
> "ResourceOffersMessage" to the framework, so the framework can still use
> "ResourceOffer" to handle the inverse offer when it receives one, right?
> 
> Thanks,
> 
> Guangya
> 
> On Mon, Sep 14, 2015 at 11:15 PM, haosdent  wrote:
> 
> > Hi @Guangya Liu. The V1 API supports both Mesos calling frameworks and
> > frameworks calling Mesos.
> >
> >
> > https://docs.google.com/document/d/1pnIY_HckimKNvpqhKRhbc9eSItWNFT-priXh_urR-T0/edit
> >  
> > 
> >
> > And I think the Java and Python API libraries would be deprecated and moved
> > out to a better place to maintain in the future (also maybe supporting more
> > languages through the V1 API). Continuing to add features to the old APIs
> > may not be a good choice.
> >
> > On Mon, Sep 14, 2015 at 11:02 PM, Guangya Liu  wrote:
> >
> > > Hi Joris,
> > >
> > > I think that those APIs are still needed, as the HTTP API is mainly
> > > initiated by the operator. The current calls for the HTTP API include
> > > TEARDOWN, ACCEPT, DECLINE, REVIVE, KILL, SHUTDOWN etc., but offer-related
> > > operations such as offers and InverseOffers are initiated by the Mesos
> > > master, and the master needs to notify the framework of those offers via
> > > the callbacks. Comments?
> > >
> > > Thanks,
> > >
> > > Guangya
> > >
> > > On Mon, Sep 14, 2015 at 10:42 PM, Joris Van Remoortere <
> > > jo...@mesosphere.io>
> > > wrote:
> > >
> > > > Hi Qian,
> > > >
> > > > There is no current plan to add this to the old API. Those tickets were
> > > > created pre-V1 API.
> > > > Currently the goal is to encourage developers to use the V1 API to have
> > > > access to new features such as maintenance primitives.
> > > >
> > > > Joris
> > > >
> > > > On Mon, Sep 14, 2015 at 10:22 AM, Qian AZ Zhang 
> > > > wrote:
> > > >
> > > > >
> > > > >
> > > > > Hi,
> > > > >
> > > > > In the maintenance epic (MESOS-1474), I see there are 3 tasks created
> > > > > to add InverseOffer support to Scheduler API:
> > > > > MESOS-2063  Add InverseOffer to C++ Scheduler API
> > > > > MESOS-2064  Add InverseOffer to Java Scheduler API
> > > > > MESOS-2065  Add InverseOffer to Python Scheduler API
> > > > >
> > > > > I think we already support the Scheduler HTTP API, so do we still
> > > > > need to update the C++ scheduler API (and the Java/Python bindings) to
> > > > > support InverseOffer? If so, I think we may need to update all the
> > > > > example frameworks as well. Take the C++ scheduler API as an example:
> > > > > we may need to add a new callback inverseResourceOffers() in the
> > > > > Scheduler class, and each
> > > > > 

[Breaking Change 0.24, MESOS 1988] Silently ignore launchTask/acceptOffers calls when disconnected

2015-06-22 Thread Anand Mazumdar
Hi All,

We intend to introduce a breaking change [1] in the driver to silently ignore 
launchTasks/acceptOffers(…) calls when disconnected from the master in 0.24. 
The previous behavior was to send out “TASK_LOST” messages since there was no 
way to know that these task launches were dropped. However, with the advent of 
Task Reconciliation, this feature is redundant. Other calls like 
killTask/requestResources et al. already had this behavior.

If your existing framework relied on this behavior, I would encourage you to 
use the Task Reconciliation API [2] in lieu of this feature/hack. Let me know 
if you have any queries/concerns.
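
As a rough sketch of what relying on reconciliation instead of TASK_LOST might look like, a framework can remember which launches it has not heard back about and ask the master about them after it reconnects. Everything below is illustrative: `LaunchTracker`, its method names and the `FakeDriver` stand-in are hypothetical, and task IDs are passed as plain strings rather than TaskStatus messages. Only the idea of calling the driver's `reconcileTasks()` comes from the existing scheduler driver API.

```python
class LaunchTracker:
    """Tracks launched tasks that have not produced a status update yet."""

    def __init__(self, driver):
        self._driver = driver
        self._unconfirmed = set()

    def launched(self, task_id):
        # Call right after launchTasks()/acceptOffers(); the call may have
        # been dropped silently if the driver was disconnected.
        self._unconfirmed.add(task_id)

    def status_update(self, task_id):
        # Once an update arrives, the task is no longer in doubt.
        self._unconfirmed.discard(task_id)

    def reregistered(self):
        # After reconnecting, ask the master about everything still in doubt.
        if self._unconfirmed:
            self._driver.reconcileTasks(sorted(self._unconfirmed))


class FakeDriver:
    """Stand-in that records reconciliation requests instead of sending them."""

    def __init__(self):
        self.requests = []

    def reconcileTasks(self, task_ids):
        self.requests.append(list(task_ids))


driver = FakeDriver()
tracker = LaunchTracker(driver)
tracker.launched("task-1")
tracker.launched("task-2")
tracker.status_update("task-1")
tracker.reregistered()
print(driver.requests)
```

Only the task without a status update is reconciled; the master's answer then arrives as an ordinary status update.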

Links:
[1] Tracking JIRA: https://issues.apache.org/jira/browse/MESOS-1988
[2] Task Reconciliation API: 
http://mesos.apache.org/documentation/latest/reconciliation/

-anand

Re: [Breaking Change 0.24, MESOS 1988] Silently ignore launchTask/acceptOffers calls when disconnected

2015-06-22 Thread Anand Mazumdar
Vinod can add a bit more color to it. 

This is not directly linked to the HTTP API per se, and hence was initially 
marked to be fixed for the 0.22 release. However, it got delayed and it was 
decided to fix this behavior as part of the HTTP API epic, primarily to ensure 
that future HTTP clients don't make/rely on the same erroneous promises.

-anand


 On Jun 22, 2015, at 6:22 PM, Benjamin Mahler benjamin.mah...@gmail.com 
 wrote:
 
 +vinod
 
 Hm.. I can't tell from MESOS-1988, why is this required for the HTTP API? I
 see MESOS-1972 as a link to more context, but that is for validation. The
 disconnected case does not overlap with the master's validation logic, it
 is an artifact of the driver implementation (the scheduler can't tell when
 its launch calls are enqueued behind a disconnected event).
 
 On Mon, Jun 22, 2015 at 5:23 PM, Anand Mazumdar an...@mesosphere.io wrote:
 
 Hi All,
 
 We intend to introduce a breaking change [1] in the driver to silently
 ignore launchTasks/acceptOffers(…) calls when disconnected from the master
 in 0.24. The previous behavior was to send out “TASK_LOST” messages since
 there was no way to know that these task launches were dropped. However,
 with the advent of Task Reconciliation, this feature is redundant. Other
 calls like killTask/requestResource et al already had this behavior.
 
 If your existing framework relied on this behavior, I would encourage you
 to use the Task Reconciliation API [2] in lieu of this feature/hack. Let me
 know if you have any queries/concerns.
 
 Links:
 [1] Tracking JIRA: https://issues.apache.org/jira/browse/MESOS-1988
 [2] Task Reconciliation API:
 http://mesos.apache.org/documentation/latest/reconciliation/
 
 -anand