> On March 28, 2016, 5:31 p.m., Vinod Kone wrote:
> > src/examples/long_lived_framework.cpp, line 77
> > <https://reviews.apache.org/r/45067/diff/2/?file=1312598#file1312598line77>
> >
> >     s/. Reject/. Reject/ <-- extraneous space

Oops, that's a habit (and the Apache license uses 2 spaces :)


> On March 28, 2016, 5:31 p.m., Vinod Kone wrote:
> > src/examples/long_lived_framework.cpp, lines 95-96
> > <https://reviews.apache.org/r/45067/diff/2/?file=1312598#file1312598line95>
> >
> >     So if resources are not enough we do not explicitly decline the offer?
> >     
> >     How about:
> >     
> >     ```
> >     // Reject offer if the offer belongs to a different slave than the one 
> > currently running the executor(s).
> >     if (slaveID.isSome() && slaveID != offers[i].slave_id()) {
> >       // decline offer.
> >     }
> >     
> >     // Reject the offer if the resources are not enough.
> >     if (!Resources(offers[i].resources()).flatten().contains(TASK_RESOURCES 
> > + EXECUTOR_RESOURCES)) {
> >       // decline offer.
> >     }
> >     
> >     // Launch the task.
> >     
> >     ```
> >     
> >     Also, are you expecting the executor to be multi-task? If yes, the 
> > resources check above should only include task resources? 
> >     
> >     Since you do not control whether the executor is single or multi-task, 
> > I would recommend to launch each task with an unique executor id.

We actually do expect the same executor to be used for each task.  (The 
executor's ID is "default".)

There are really 5 cases here:
1) Executor running & offer from different slave.
2) Executor running & offer not big enough for a task.
3) Executor running & offer from same slave & offer big enough for task.
4) Executor not running & offer not big enough for task + executor.
5) Executor not running & offer big enough for task + executor.


> On March 28, 2016, 5:31 p.m., Vinod Kone wrote:
> > src/examples/long_lived_framework.cpp, line 173
> > <https://reviews.apache.org/r/45067/diff/2/?file=1312598#file1312598line173>
> >
> >     why is this called "residency" instead of "slaveID" ?

I wanted a variable name that somehow expressed where the `long-lived-executor` 
was "living".  But `slaveId` would work too.


> On March 28, 2016, 5:31 p.m., Vinod Kone wrote:
> > src/examples/long_lived_framework.cpp, line 196
> > <https://reviews.apache.org/r/45067/diff/2/?file=1312598#file1312598line196>
> >
> >     why does the framework assume master lives on the same machine?

Hm, I guess the location of the master doesn't really play a role here.  
Removed from comment.


> On March 28, 2016, 5:31 p.m., Vinod Kone wrote:
> > src/examples/long_lived_framework.cpp, line 151
> > <https://reviews.apache.org/r/45067/diff/2/?file=1312598#file1312598line151>
> >
> >     I thought we could compare option directly with a value instead of 
> > doing a .get()?

Yup, my mistake.


> On March 28, 2016, 5:31 p.m., Vinod Kone wrote:
> > src/examples/long_lived_framework.cpp, line 160
> > <https://reviews.apache.org/r/45067/diff/2/?file=1312598#file1312598line160>
> >
> >     seems weird that you are setting residency/slaveID to None() if one 
> > executor is lost. isn't it still possible for other executors to be running 
> > on that slave? do we want to launch new executors on a different slave in 
> > that case?
> >     
> >     I would recommend to keep track of running executors with 
> > `set<ExecutorID>`.

As mentioned above, this framework only launches on executor with ID "default".


- Joseph


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45067/#review125780
-----------------------------------------------------------


On March 29, 2016, 12:08 p.m., Joseph Wu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45067/
> -----------------------------------------------------------
> 
> (Updated March 29, 2016, 12:08 p.m.)
> 
> 
> Review request for mesos, Greg Mann, Artem Harutyunyan, Kevin Klues, and 
> Vinod Kone.
> 
> 
> Bugs: MESOS-5062
>     https://issues.apache.org/jira/browse/MESOS-5062
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This gives the example `long-lived-framework` enough options to run outside 
> of the build environment.
> 
> This also updates:
> 
> * The style of the framework code.
> * Gives the `ExecutorInfo` some resources (needed for some cgroups isolators).
> * Restricts the framework to one agent.  Otherwise, it would grab a small 
> chunk of every machine in the cluster.
> * Adds filters for declined offers.
> 
> 
> Diffs
> -----
> 
>   src/examples/long_lived_framework.cpp 
> ef498d63bc5f0a8deb46d71edd85a76a1d38fdd0 
> 
> Diff: https://reviews.apache.org/r/45067/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> Ran this on the master node on a Mesos cluster:
> ```
> ./long-lived-framework --master=zk://localhost:2181/mesos 
> --executor_uri="https://s3.amazonaws.com/url/to/long-lived-executor"; 
> --executor_command="LD_LIBRARY_PATH=/path/to/libmesos && 
> ./long-lived-executor"
> ```
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>

Reply via email to