Re: Threading model of mesos API (C++)

2015-06-10 Thread James Vanns
Thanks for the responses, guys. That link of the 'detailed description'
will be handy - I've not come across that before. I do now have another
question though! Aren't these two a contradiction;

Alex;
you launch a task, before the method returns (say you do some blocking
stuff after, like sync update zookeeper), you might get a statusUpdate()
callback.
Ben;
Methods will not be invoked concurrently, and each method must complete
before the next is called.

??

Jim


On 10 June 2015 at 02:22, Benjamin Mahler benjamin.mah...@gmail.com wrote:

 If that's really what you're seeing, it is a bug and a very surprising
 one, so please provide evidence :)

 See the detailed description here:
 http://mesos.apache.org/api/latest/c++/classmesos_1_1Scheduler.html

 The scheduler driver will serially invoke methods on your Scheduler
 implementation. Methods will not be invoked concurrently, and each method
 must complete before the next is called.

 So, we recommend that you don't block inside the callbacks. Otherwise,
 you're blocking the driver as well and your own ability to continue
 processing callbacks.
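
One way to follow that advice is to hand slow work to a background thread so that the callback itself returns quickly. The sketch below is standalone C++ and deliberately does not use the Mesos headers: Worker, SketchScheduler and run_demo are hypothetical names, and the two member functions only mimic the shape of resourceOffers() and statusUpdate(). Once work runs on a second thread, any state it shares with the callbacks does need a lock, even though the driver itself invokes callbacks serially.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical helper, not part of Mesos: runs posted jobs in order on one
// background thread so that callbacks never have to block on them.
class Worker {
public:
  Worker() : thread_([this] { run(); }) {}

  // Finishes any queued jobs, then stops the thread.
  ~Worker() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      done_ = true;
    }
    cv_.notify_one();
    thread_.join();
  }

  void post(std::function<void()> job) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      jobs_.push(std::move(job));
    }
    cv_.notify_one();
  }

private:
  void run() {
    for (;;) {
      std::function<void()> job;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return done_ || !jobs_.empty(); });
        if (jobs_.empty()) return;  // only reached once done_ is set
        job = std::move(jobs_.front());
        jobs_.pop();
      }
      job();  // run outside the lock so post() is never held up
    }
  }

  std::mutex mutex_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> jobs_;
  bool done_ = false;
  std::thread thread_;  // declared last: starts after the members above exist
};

// Stand-in for a Scheduler subclass (assumption: real callbacks take driver
// and offer/status arguments, omitted here to keep the sketch standalone).
struct SketchScheduler {
  std::mutex mutex;  // guards both counters: they are touched from two threads
  int updates_seen = 0;
  int slow_jobs_done = 0;

  void resourceOffers(Worker& worker) {
    // A real scheduler would call driver->launchTasks(...) here, then return.
    worker.post([this] {
      // Stand-in for a blocking call such as a synchronous ZooKeeper update.
      std::this_thread::sleep_for(std::chrono::milliseconds(10));
      std::lock_guard<std::mutex> lock(mutex);
      ++slow_jobs_done;
    });
  }

  void statusUpdate() {
    std::lock_guard<std::mutex> lock(mutex);
    ++updates_seen;
  }
};

// Invokes the callbacks one at a time from a single thread, as the driver is
// documented to do, and returns how many slow jobs eventually completed.
int run_demo() {
  SketchScheduler scheduler;
  {
    Worker worker;
    for (int i = 0; i < 3; ++i) {
      scheduler.resourceOffers(worker);  // returns without waiting for the job
      scheduler.statusUpdate();          // so the next callback is not delayed
    }
  }  // ~Worker drains the queue and joins the thread
  assert(scheduler.updates_seen == 3);
  return scheduler.slow_jobs_done;
}
```

Calling run_demo() processes three offer/update pairs without either callback waiting on the slow job, and returns 3 once the worker has drained its queue.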

 On Tue, Jun 9, 2015 at 8:58 AM, James Vanns jvanns@gmail.com wrote:

 Hi. I'm toying with the mesos scheduler (C++) API and running into
 unexpected race conditions. I have *not* synchronised access to attributes
 of my Scheduler-derived class. Is the mesos library code threaded and
 network communication asynchronous? What it *looks like* I'm seeing is my
 statusUpdate() callback being executed before the return of
 resourceOffers(). Naturally I call driver->launchTasks() inside
 resourceOffers(). This is intermittent but generally triggered by tasks
 that report status changes very quickly; eg. a task that fails instantly.

 Can anyone point me in the right direction of any online API docs that
 explain how callbacks are invoked? Distributed over a pool of worker
 threads?

 Also are the state transitions documented? Eg.
 mesos::TASK_STAGING -> mesos::TASK_STARTING -> etc.

 Cheers,

 Jim

 --
 Senior Code Pig
 Industrial Light & Magic





-- 
--
Senior Code Pig
Industrial Light & Magic


Re: Threading model of mesos API (C++)

2015-06-10 Thread Alexander Gallego
Jim,

Let me prototype something small today. After reading my scheduler (in C++),
I do have comments and synchronization on some state vars, but that might
have to do with a more complex async code base I manage.

I'll get back to you.

- alex


Re: Can Mesos master offer resources to multiple frameworks simultaneously?

2015-06-10 Thread Alex Rukletsov
I'll try to answer these questions.

1. Currently, the only language you can use is C++. You can work around this
by writing a proxy in C++ that delegates the calls to, say, Python scripts.
See http://mesos.apache.org/documentation/latest/allocation-module/ for
more details.

2. The default allocator is called dominant resource fairness since it
tries to distribute resources fairly between active frameworks. This means
it will offer all available resources to all frameworks, but each framework
will get only a certain share. For more information I encourage you to take
a look at the DRF paper.

3. Offered and not declined resources are considered to be used, therefore
they can't be re-offered until freed.
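
To make the idea of a "dominant share" in point 2 concrete, here is a minimal sketch of the allocation loop described in the DRF paper: repeatedly give one task's worth of resources to the framework whose dominant share is currently smallest, and stop when that framework's next task no longer fits. This is not the Mesos allocator's actual code; Demand, dominant_share and drf_allocate are hypothetical names, and only two resource types (CPUs and memory) are modelled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Per-task demand of one framework (illustrative units: CPUs and GB of memory).
struct Demand {
  double cpus;
  double mem;
};

// Dominant share: the largest fraction of any single resource that a
// framework holds after launching `tasks` tasks.
double dominant_share(const Demand& d, int tasks,
                      double total_cpus, double total_mem) {
  return std::max(tasks * d.cpus / total_cpus, tasks * d.mem / total_mem);
}

// Returns the number of tasks granted to each framework, in input order.
std::vector<int> drf_allocate(const std::vector<Demand>& demands,
                              double total_cpus, double total_mem) {
  std::vector<int> tasks(demands.size(), 0);
  if (demands.empty()) return tasks;

  double used_cpus = 0.0;
  double used_mem = 0.0;
  for (;;) {
    // Pick the framework with the smallest dominant share (first one on ties).
    std::size_t pick = 0;
    for (std::size_t i = 1; i < demands.size(); ++i) {
      if (dominant_share(demands[i], tasks[i], total_cpus, total_mem) <
          dominant_share(demands[pick], tasks[pick], total_cpus, total_mem)) {
        pick = i;
      }
    }

    const Demand& d = demands[pick];
    assert(d.cpus > 0.0 || d.mem > 0.0);  // a zero demand would never terminate

    // Stop as soon as the chosen framework's next task does not fit.
    if (used_cpus + d.cpus > total_cpus || used_mem + d.mem > total_mem) break;

    used_cpus += d.cpus;
    used_mem += d.mem;
    ++tasks[pick];
  }
  return tasks;
}
```

With the example from the DRF paper (9 CPUs and 18 GB in total, one framework needing 1 CPU and 4 GB per task, another needing 3 CPUs and 1 GB), drf_allocate returns 3 and 2 tasks respectively, which leaves both frameworks with a dominant share of 2/3.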

Hope this helps.
On 10 Jun 2015 7:53 am, Qian Zhang zhq527...@gmail.com wrote:

 Thanks Adam, this is very helpful!

 I have a few more questions:
 1. For the pluggable allocator modules, can I write my own allocator in
 any programming language (e.g., Python, Go, etc)?
 2. For the default DRF allocator, when it offers resources to a framework,
 will it offer all the available resources (resources not being used by any
 frameworks) to it? Or just part of the available resources?
 3. If there are multiple frameworks and the default DRF allocator will
 only offer resources to a single framework at a time, then that means
 framework 2 has to wait for framework 1 until framework 1 makes its
 placement decision?





Re: Threading model of mesos API (C++)

2015-06-10 Thread James Vanns
You are a star, Alex. Thank you :)

Jim


-- 
--
Senior Code Pig
Industrial Light & Magic


Re: Debugging framework registration from inside docker

2015-06-10 Thread Steven Schlansker
On Jun 10, 2015, at 10:10 AM, James Vanns jvanns@gmail.com wrote:

 Hi. When attempting to run my scheduler inside a docker container in 
 --net=bridge mode it never receives acknowledgement or a reply to that 
 request. However, it works fine in --net=host mode. It does not listen on any 
 port as a service so does not expose any.
 
 The scheduler receives the mesos master (leader) from zookeeper fine but 
 fails to register the framework with that master. It just loops trying to do 
 so - the master sees the registration but deactivates it immediately as 
 apparently it disconnects. It doesn't disconnect but is obviously 
 unreachable. I see the reason for this in the sendto() and the master log 
 file -- because the internal docker bridge IP is included in the POST and 
 perhaps that is how the master is trying to talk back
 to the requesting framework?? 
 
 Inside the container is this;
 tcp        0      0 0.0.0.0:44431    0.0.0.0:*    LISTEN    1/scheduler
 
 This is not my code! I'm at a loss where to go from here. Anyone got any 
 further suggestions
 to fix this?

You may need to try setting LIBPROCESS_IP and LIBPROCESS_PORT to hide the fact 
that you are on a virtual Docker interface.
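
A minimal sketch of that suggestion, assuming the scheduler is free to set the two variables itself before it constructs the scheduler driver (that is, before libprocess is initialised). The function name and the address and port values in the usage note are hypothetical. Whether a given address can be bound inside the container and reached from the master depends on the Docker network setup, which this sketch does not address; a fixed port would presumably also have to be published by the container.

```cpp
#include <cassert>
#include <stdlib.h>  // setenv, getenv (POSIX)
#include <string>

// Hypothetical helper: exports the two variables named above so that the
// scheduler uses a known address and port instead of picking its own.
// Assumption: this runs before the scheduler driver is created.
// Returns false if either variable could not be set.
bool export_libprocess_endpoint(const std::string& ip, const std::string& port) {
  const int overwrite = 1;  // replace any value inherited from the environment
  return setenv("LIBPROCESS_IP", ip.c_str(), overwrite) == 0 &&
         setenv("LIBPROCESS_PORT", port.c_str(), overwrite) == 0;
}
```

A scheduler's main() might call export_libprocess_endpoint("192.0.2.10", "9090") as its first step (both values are placeholders), or the same two variables can simply be set in the container's environment when it is started.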




Debugging framework registration from inside docker

2015-06-10 Thread James Vanns
Hi. When attempting to run my scheduler inside a docker container in
--net=bridge mode it never receives acknowledgement or a reply to that
request. However, it works fine in --net=host mode. It does not listen on
any port as a service so does not expose any.

The scheduler receives the mesos master (leader) from zookeeper fine but
fails to register the framework with that master. It just loops trying to
do so - the master sees the registration but deactivates it immediately as
apparently it disconnects. It doesn't disconnect but is obviously
unreachable. I see the reason for this in the sendto() and the master log
file -- because the internal docker bridge IP is included in the POST and
perhaps that is how the master is trying to talk back
to the requesting framework??

Inside the container is this;
tcp        0      0 0.0.0.0:44431    0.0.0.0:*    LISTEN    1/scheduler

This is not my code! I'm at a loss where to go from here. Anyone got any
further suggestions
to fix this?

Cheers,

Jim

--
Senior Code Pig
Industrial Light & Magic


Apply Now #MesosCon Conference Diversity Scholarship

2015-06-10 Thread Kiersten Gaffney
Hi Mesos friends,

We need your help promoting the #MesosCon diversity scholarship.

#MesosCon, the annual open source #ApacheMesos developers conference, is
now accepting applications for their diversity scholarship. It provides
financial assistance for women (cis and trans), genderqueer people, people
of color, and people with disabilities.

Scholarship recipients will receive a free registration ticket, can request
support for travel and hotel, and will automatically be enrolled in our
buddy system program.

To apply and learn more, click here
http://events.linuxfoundation.org/events/mesoscon/attend/scholarship.
To help promote via twitter click here
https://twitter.com/apachemesos/status/608327682569433088.

Thank you for your support,

Kiersten Gaffney

Planning Committee Member, #MesosCon
Manager of Events, Mesosphere

-- 
Kiersten Gaffney
Manager of Events
kiers...@mesosphere.io
415-559-3771


MesosCon 2015 Lightning Talk CFP now open

2015-06-10 Thread Dave Lester
Good news, everyone:

We’ve expanded the MesosCon program (http://mesoscon.org) to add
lightning talks: 5-minute presentations for speakers to introduce a
project they’re working on, or share an idea related to Mesos. Lightning
talks will take place during lunchtime of the conference, which takes
place August 20-21st, 2015 in Seattle WA. 

The form to propose a lightning talk is available here:
https://docs.google.com/forms/d/1raB-IqA4gi0elYPBHmh5lB17jQdCQWrEeMslQGeihbs/viewform

When preparing your proposal, keep in mind:

 * 5-minute presentations will be enforced by a time-keeper. The
 5-minute presentation includes any time you may wish for Q&A, so use
 your time wisely.
 * Slides are allowed, but not required. We will have a laptop on stage
 with slides queued up for those that submit them in advance; using your
 own laptop and transitioning to use it will be included in your 5
 minutes so use your time wisely!
 * We encourage submissions that may have previously been shared as full
 proposals
 * Only one lightning talk may be submitted per person
 * Lightning talk speakers will be expected to purchase full tickets to
 the conference

The CFP opens Wednesday, June 10th 2015 and will close July 15th;
speakers will be selected by members of the program committee and contacted
by July 22nd regarding the status of their proposal.

Good luck with your proposals! Hope to see you all at MesosCon.

Dave


Re: Can Mesos master offer resources to multiple frameworks simultaneously?

2015-06-10 Thread Qian Zhang
Thanks Alex.

For 1. I understand currently the only choice is C++. However, as Adam
mentioned, true pluggable allocator modules (MESOS-2160
https://issues.apache.org/jira/browse/MESOS-2160) are landing in Mesos
0.23, so at that time, I assume we will have more choices, right?

For 2 and 3, my understanding is that the Mesos allocator partitions all the
available resources into multiple subsets with no overlap between them
(i.e., a single resource can only be in one subset), and then offers these
subsets to multiple frameworks (e.g., subset1 to framework1, subset2 to
framework2, and so on). It is then up to each framework's scheduler to
decide whether to accept the resources and launch tasks or to decline them.
In this way, each framework's scheduler can make its scheduling decisions
independently, since the schedulers never compete for the same resource.

If my understanding is correct, then I have one more question:
4. What if it takes a very long time (e.g., minutes or hours) for a
framework's scheduler to make its scheduling decision? Does that mean that
during this period the resources offered to this framework cannot be used
by any other framework? Is there a timeout for the framework's scheduler to
make the scheduling decision, so that when the timeout is reached, the
resources offered to it are revoked by the Mesos allocator and can be
offered to another framework?