Meeting notes about reservations

2015-02-20 Thread Jie Yu
Hi,

BenM, BenH, MPark and I had a sync today discussing reservations
(dynamic+static) in Mesos. It was a very productive meeting and here are
some notes I took from the meeting. I think all of us are on the same page
now and are happy with the current design.

- Jie



1) we still keep the static reservation, but we don't introduce a new
reservation type for that. If role = R and reservation is none, then it's
statically reserved (cannot be released by anyone).

2) /reserve and /unreserve endpoints.

Specify the role, slaveid (or hostname, or both, and we validate),
resources when reserving resources. It only takes resources from static *
resources. The operator can reserve both OP and FW reservations.

When hit the reserve endpoint, we use the offered resources and invoke the
offer rescind mechanism.

Question: should we release dynamic reservations when a framework is
shutting down? Multiple frameworks might share the same role!

3) reservation id

We might want to have a reservation id (the id can be the framework id as
well). The goal is to distinguish who reserved this resources if two
frameworks are under the same role.

4) introduce principal for reservation

The idea is that we need to track who reserved a resource so that we can
decide who can unreserve it. The principal and ACLs can help us setup those
rules for deciding who can unreserve what.

5) protobuf

message Resources {
  ...
  message Reservation {
 optional string principal;
 optional string reservation_id;  // introduce it later maybe
  }
  ...
  optional string role [ Default = * ];
  optional Reservation reservation;
}


Re: Review Request 31000: stout: Moved MAC and IP classes to stout/mac.hpp and stout/ip.hpp

2015-02-20 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31000/#review73367
---


Thanks! This is great! Can we commit this first? (and libprocess and mesos ones)

- Jie Yu


On Feb. 21, 2015, 12:02 a.m., Evelina Dumitrescu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/31000/
 ---
 
 (Updated Feb. 21, 2015, 12:02 a.m.)
 
 
 Review request for mesos and Dominic Hamon.
 
 
 Repository: mesos
 
 
 Description
 ---
 
 see summary
 
 
 Diffs
 -
 
   3rdparty/libprocess/3rdparty/stout/include/stout/ip.hpp PRE-CREATION 
   3rdparty/libprocess/3rdparty/stout/include/stout/mac.hpp PRE-CREATION 
   3rdparty/libprocess/3rdparty/stout/include/stout/net.hpp 
 b464e517bb1e7b6b381c6cd6c0466ed788a82615 
   3rdparty/libprocess/3rdparty/stout/tests/ip_tests.cpp PRE-CREATION 
   3rdparty/libprocess/3rdparty/stout/tests/mac_tests.cpp PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/31000/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Evelina Dumitrescu
 




Re: Help us review #MesosCon 2015 proposals

2015-02-20 Thread Jie Yu
+1

It does not allow me to continue to see the second page if I don't make
choices on the first one.

- Jie

On Fri, Feb 20, 2015 at 4:54 PM, Benjamin Mahler benjamin.mah...@gmail.com
wrote:

 Great to see so many proposals!

 Is it intentional that we have to review them in small subsets? It's hard
 to tell what to consider as an Average proposal when you can only see a
 small subset at a time. Just curious on the reasoning behind that.

 On Wed, Feb 18, 2015 at 2:44 PM, Dave Lester d...@davelester.org wrote:


 A total of 63 proposals were submitted for #MesosCon[1], up
 significantly from 24 submitted for last year’s conference. Similar to
 last year, the MesosCon program committee is opening these proposals up
 for community review/feedback to better-inform our decisions about what
 should be included in the program.

 In order to make it easier to review a subset of the proposals, we’ve
 segmented them based upon three loose themes: Frameworks, Users / Ops,
 and Mesos Internals and Extensions. We encourage you to review proposals
 based upon one theme, or all three!

 *Frameworks (18 Proposals):* bit.ly/MesosCon2015Frameworks Talks on how
 frameworks can be used, developed, and integrate with Mesos.

 *Users / Ops (28 Proposals):* bit.ly/MesosCon2015UsersOps A combination
 of talks that are use cases (how company x uses Mesos), and
 operations-focused (how we deploy x, use Docker, etc).

 *Mesos Internals and Extensions (17 Proposals):*
 bit.ly/MesosCon2015InternalsExt Features of the Mesos core, or software
 integrations with the internals of Mesos. Some proposals have overlap
 with frameworks and ops, but most are focused on the foundational
 aspects of how Mesos works.

 The forms above also include an opportunity to indicate which sessions
 you didn't see proposed but would like to attend.

 Thanks in advance for your participation! The forms will close on March
 4th 2015, two weeks from today.

 Dave


 Links:

   1. http://mesoscon.org





Re: Review Request 30911: Updated the filter abstraction for Resources.

2015-02-20 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30911/#review73294
---


You may wanna do a rebase. LGTM, will give a shipit once you updated it. Thanks 
Mpark!


include/mesos/resources.hpp
https://reviews.apache.org/r/30911/#comment119561

We don't use snake_case for function names. Please
```
s/persitent_volumes/persistentVolumes
```



src/master/allocator/mesos/hierarchical.hpp
https://reviews.apache.org/r/30911/#comment119562

Can we punt on that change right now? It does not seem to simply the code 
and it's actually more hard to read.



src/master/master.hpp
https://reviews.apache.org/r/30911/#comment119563

s/isCheckpoint/needCheckpointing/ ?



src/slave/slave.cpp
https://reviews.apache.org/r/30911/#comment119564

Can we simply the logic here to be:
```
Resources resources = task.resources();
if (task.has_executor()) {
  resources += task.executor().resources();
}

CHECK(checkpointedResources.contains(resources.persistentVolumes())
   ...;
```


- Jie Yu


On Feb. 19, 2015, 2:54 a.m., Michael Park wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30911/
 ---
 
 (Updated Feb. 19, 2015, 2:54 a.m.)
 
 
 Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, and 
 Vinod Kone.
 
 
 Bugs: MESOS-2348
 https://issues.apache.org/jira/browse/MESOS-2348
 
 
 Repository: mesos
 
 
 Description
 ---
 
 See [JIRA Ticket](https://issues.apache.org/jira/browse/MESOS-2348).
 
 
 Diffs
 -
 
   include/mesos/resources.hpp c7cc46e0183ea97013dd088a717da6c0e6ed5cf0 
   src/common/resources.cpp 98371f6873482d0cdbefeb279b58ae6cc680a88f 
   src/master/allocator/mesos/hierarchical.hpp 
 2680d6231927867d5a8d75cbc42b81d6c75fc7f2 
   src/master/master.hpp 6a39df04514c756415354fae66c5835ada191c52 
   src/master/validation.cpp acc35b25c93f2d3900d79c3070b1d681416ed66b 
   src/slave/slave.cpp ec7ec1356e745bb07484ae1755c9183b038043b3 
   src/tests/hierarchical_allocator_tests.cpp 
 eeecfb64540b1915074aaffaa5d506b203bc 
   src/tests/resources_tests.cpp 3f98782fd437dba808d720bf8e9b94b8fa7e0feb 
 
 Diff: https://reviews.apache.org/r/30911/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Michael Park
 




Re: Review Request 30911: Updated the filter abstraction for Resources.

2015-02-20 Thread Jie Yu


 On Feb. 20, 2015, 6:30 p.m., Jie Yu wrote:
  include/mesos/resources.hpp, lines 148-150
  https://reviews.apache.org/r/30911/diff/3/?file=868705#file868705line148
 
  We don't use snake_case for function names. Please
  ```
  s/persitent_volumes/persistentVolumes
  ```

I am a little hesitate if we want to introduce this helper or not. In many 
cases, the code is much simpler if we do
```
foreach (const Resource resource, task.resources()) {
  if (!resource.isPersistentVolume()) {
continue;
  }
  ...
}
```

Also, looks like the following code is not too bad as well.
```
resources.filter(Resources::isPersistentVolume)
```

What do you think?


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30911/#review73294
---


On Feb. 19, 2015, 2:54 a.m., Michael Park wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30911/
 ---
 
 (Updated Feb. 19, 2015, 2:54 a.m.)
 
 
 Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, and 
 Vinod Kone.
 
 
 Bugs: MESOS-2348
 https://issues.apache.org/jira/browse/MESOS-2348
 
 
 Repository: mesos
 
 
 Description
 ---
 
 See [JIRA Ticket](https://issues.apache.org/jira/browse/MESOS-2348).
 
 
 Diffs
 -
 
   include/mesos/resources.hpp c7cc46e0183ea97013dd088a717da6c0e6ed5cf0 
   src/common/resources.cpp 98371f6873482d0cdbefeb279b58ae6cc680a88f 
   src/master/allocator/mesos/hierarchical.hpp 
 2680d6231927867d5a8d75cbc42b81d6c75fc7f2 
   src/master/master.hpp 6a39df04514c756415354fae66c5835ada191c52 
   src/master/validation.cpp acc35b25c93f2d3900d79c3070b1d681416ed66b 
   src/slave/slave.cpp ec7ec1356e745bb07484ae1755c9183b038043b3 
   src/tests/hierarchical_allocator_tests.cpp 
 eeecfb64540b1915074aaffaa5d506b203bc 
   src/tests/resources_tests.cpp 3f98782fd437dba808d720bf8e9b94b8fa7e0feb 
 
 Diff: https://reviews.apache.org/r/30911/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Michael Park
 




Re: Review Request 30545: cgroups: added support to listen on memory pressures.

2015-02-20 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30545/#review73359
---


The code looks great! Given that the tests are quite large. Can you split this 
patch and I'll give a shipit for the code.

For tests, I need to do a pass next week. Thanks! THis is great, Chi!


src/linux/cgroups.hpp
https://reviews.apache.org/r/30545/#comment119660

Any reason this is not a static method of pressure::Listener?



src/linux/cgroups.hpp
https://reviews.apache.org/r/30545/#comment119661

Well, you don't need this friend declaration if 'create' is a static member 
funciton, right?



src/linux/cgroups.cpp
https://reviews.apache.org/r/30545/#comment119664

Look like only one place is using this constant. Sugguest killing it.



src/linux/cgroups.cpp
https://reviews.apache.org/r/30545/#comment119663

I think we indent 2 spaces for switch 'case'.



src/linux/cgroups.cpp
https://reviews.apache.org/r/30545/#comment119662

I think here should be able to do:
```
  : coutner_(0),
error(None()),
process(new event::Listener(
hierarchy,
cgroup,
memory.pressure_level,
stringify(level))) {}
```



src/linux/cgroups.cpp
https://reviews.apache.org/r/30545/#comment119668

Remove the tailing period.



src/linux/cgroups.cpp
https://reviews.apache.org/r/30545/#comment119669

We don't use std:: prefix in cpp files. Please remove them (do a sweep in 
this file).


- Jie Yu


On Feb. 20, 2015, 9:37 p.m., Chi Zhang wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30545/
 ---
 
 (Updated Feb. 20, 2015, 9:37 p.m.)
 
 
 Review request for mesos, Dominic Hamon, Ian Downes, and Jie Yu.
 
 
 Bugs: MESOS-2136
 https://issues.apache.org/jira/browse/MESOS-2136
 
 
 Repository: mesos
 
 
 Description
 ---
 
 cgroups: added support to listen on memory pressures.
 
 
 Diffs
 -
 
   src/Makefile.am d372404d84c9ac0bc49da7407ad366701b9586a6 
   src/linux/cgroups.hpp e07772fec11b9bb73a44c54326f24d7ee1e8a64b 
   src/linux/cgroups.cpp a307e27c5840270e28cced2d5aeb90c6679bff1d 
   src/tests/cgroups_tests.cpp 9d50a47c939f5f136c85fdcc555459512c6f2259 
   src/tests/memory_test_helper.hpp PRE-CREATION 
   src/tests/memory_test_helper.cpp PRE-CREATION 
   src/tests/memory_test_helper_main.cpp PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/30545/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Chi Zhang
 




Re: Review Request 29288: stout: Created IP address abstraction for different protocol families

2015-02-19 Thread Jie Yu


 On Feb. 18, 2015, 1:28 a.m., Jie Yu wrote:
  Glad to see this being worked on, Thanks for all the effort!
  
  This is a partial review. Let's address the following comments first and 
  then I'll take a second pass.
  
  One high level comment is that we should separate this giant patch. Many 
  changes are being mixed together, making it very hard for reviewers to 
  digest. For example, you renamed `ip` to `IP::fromLinkDevice`. I think that 
  can be pulled into a separate patch, right?
  
  Also, let's try not to introduce helper functions that are not currently 
  being used. You can leave a TODO there. That can also bring this patch 
  smaller. For example, you don't have to add the implementation for IPv6 
  case in `IP::fromLinkDevice` because it's not being used yet. You can 
  always leave a TODO and fill the implementation later in a separate patch. 
  That'll make the reviewer's life much better.
  
  I would suggest we first refactor `net::IP` and separate the concepts of IP 
  address and IP network (see my comments below). No need to introduce the 
  concept of family (and all IPv6 stuffs) yet in this patch. You need to 
  update all the callsites that use `net::IP` (e.g., the port mapping 
  isolator might want to take `net::IPNetwork` instead of `net::IP` in some 
  cases). Let's start with this first!
  
  And then, in the second patch, you can make `net::IP` IPv6 capable. That 
  means you need to store a union. Simply add TODO in those helper functions 
  that you think you need to fill IPv6 implementaion in. Let's try to keep 
  that patch small as well.
  
  Then, we'll see what we go from there. Does that sound good to you?
 
 Evelina Dumitrescu wrote:
 I appreciate a lot that you took a look at my patches!
 
 If we splitt the patch into IP and IPNetwork, then the netmask from the 
 IPNetwork class should be optional.
 A situation where we need the net mask as optional is fromLinkDevice.
 We will end up with the IP and IPNetwork classes representing the same 
 thing.
 
 I agree with splitting this patch and introduce the changes step by step.

Can you expand on why fromLinkDevice needs netmask to be optional? Looking at 
the original logics, we only create a IPNetwork if both address and netmask 
exist.


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29288/#review72771
---


On Feb. 12, 2015, 8:05 p.m., Evelina Dumitrescu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/29288/
 ---
 
 (Updated Feb. 12, 2015, 8:05 p.m.)
 
 
 Review request for mesos, Benjamin Hindman, Dominic Hamon, Jie Yu, Joris Van 
 Remoortere, and Niklas Nielsen.
 
 
 Bugs: MESOS-1919
 https://issues.apache.org/jira/browse/MESOS-1919
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Created the inner class InAddrStorage encapsulated inside the IP class.
 The class uses a union with the in_addr and in6_addr fields.
 I considered that the The MasterInfo protobuffers should have both an ipv4 
 and an ipv6 field.
 I intend to use the same Classifiers, addition, removal and update of 
 container filters, but write different encode/decode functions for IPv4/ICMP 
 and IPv6/ICMPv6 because the processing of the protocol headers differ.
 
 
 Diffs
 -
 
   3rdparty/libprocess/3rdparty/stout/include/stout/net.hpp 
 a0210ea6440086246aafe632f86498abbb70719a 
   3rdparty/libprocess/3rdparty/stout/tests/net_tests.cpp 
 425132e5d7c3770be4a5a39feea5a2f22179b871 
 
 Diff: https://reviews.apache.org/r/29288/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Evelina Dumitrescu
 




Re: Review Request 30545: cgroups: added support to listen on memory pressures.

2015-02-18 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30545/#review72971
---


First pass, haven't looked at the tests yet. The main issue is that Listener is 
not thread safe. You probably need to create a libprocess.


src/linux/cgroups.hpp
https://reviews.apache.org/r/30545/#comment119089

We usually put the comments above the forward declaration:
```
// Forward declaration.
class Listener;
```



src/linux/cgroups.hpp
https://reviews.apache.org/r/30545/#comment119090

+1 on Dominic's comment.

Let's create a enum and simply expose the static method 'create' in 
pressure::Listener. You can kill the forward declaration as well.

```
enum Level
{
  LOW,
  MEDIUM,
  CRITICAL
};

TryOwnedListener pressure::Listener::create(
const string hierarchy,
const string cgroup,
Level level);
```



src/linux/cgroups.hpp
https://reviews.apache.org/r/30545/#comment119092

Remove On a single error..., and replace it with

Please see comments of 'counter' for details.



src/linux/cgroups.hpp
https://reviews.apache.org/r/30545/#comment119099

What's the concurrency story here?

Looks like this class is not thread safe? Some thread is reading 
counter/error while other thread can modify counter/error. I guess you still 
need a process here.



src/linux/cgroups.hpp
https://reviews.apache.org/r/30545/#comment119091

Returns a failure if any error occurs while monitoring the pressure 
events, and any subsequent calls to 'counter' will return the same failure. In 
that case, the user should consider create a new Listener.



src/linux/cgroups.hpp
https://reviews.apache.org/r/30545/#comment119093

See my comments above, please expose this function



src/linux/cgroups.hpp
https://reviews.apache.org/r/30545/#comment119095

s/read/listen/?

Also, please use const ref for 'future'
```
const Futureuint64_t future
```



src/linux/cgroups.hpp
https://reviews.apache.org/r/30545/#comment119098

Let's not conflate 'counter' and 'error'. Use two member fields:

```
OptionError error;
uint64_t counter_; (initialize it to 0)
```



src/linux/cgroups.cpp
https://reviews.apache.org/r/30545/#comment119094

Kill those functions.



src/linux/cgroups.cpp
https://reviews.apache.org/r/30545/#comment119096

s/read/listen



src/linux/cgroups.cpp
https://reviews.apache.org/r/30545/#comment119097

s/read/listen/


- Jie Yu


On Feb. 17, 2015, 8:12 p.m., Chi Zhang wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30545/
 ---
 
 (Updated Feb. 17, 2015, 8:12 p.m.)
 
 
 Review request for mesos, Dominic Hamon, Ian Downes, and Jie Yu.
 
 
 Bugs: MESOS-2136
 https://issues.apache.org/jira/browse/MESOS-2136
 
 
 Repository: mesos
 
 
 Description
 ---
 
 cgroups: added support to listen on memory pressures.
 
 
 Diffs
 -
 
   src/Makefile.am d372404d84c9ac0bc49da7407ad366701b9586a6 
   src/linux/cgroups.hpp bf8efee05b995a37d4e6e7679a493b2d5238aa0b 
   src/linux/cgroups.cpp b6f75b14dea7609b90627eccdd33ef891ed9d21f 
   src/tests/cgroups_tests.cpp 9d50a47c939f5f136c85fdcc555459512c6f2259 
   src/tests/memory_test_helper.hpp PRE-CREATION 
   src/tests/memory_test_helper.cpp PRE-CREATION 
   src/tests/memory_test_helper_main.cpp PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/30545/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Chi Zhang
 




Re: Review Request 31024: Changed the slave logic so that it always waits for containerizer update to finish.

2015-02-18 Thread Jie Yu


 On Feb. 18, 2015, 6:53 p.m., Vinod Kone wrote:
  src/slave/slave.hpp, line 280
  https://reviews.apache.org/r/31024/diff/1-3/?file=863795#file863795line280
 
  s/flushTasks/runTasks/ so that we don't add a new verb flush?

We already have 'runTask'. Sounds a little confusing if we have both 'runTask' 
and 'runTasks'. Because of that, I still prefer 'flushTasks'. What do you think?


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31024/#review72975
---


On Feb. 18, 2015, 12:01 a.m., Jie Yu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/31024/
 ---
 
 (Updated Feb. 18, 2015, 12:01 a.m.)
 
 
 Review request for mesos, Ian Downes and Vinod Kone.
 
 
 Bugs: MESOS-998
 https://issues.apache.org/jira/browse/MESOS-998
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Changed the slave logic so that it always waits for containerizer update to 
 finish.
 
 The key idea is to leverage `executor-queuedTasks` when 
 containerizer-update is in progress.
 
 
 Diffs
 -
 
   include/mesos/type_utils.hpp 32dc6ed489f43ba3695507f59c4c2d94028c8df1 
   src/slave/slave.hpp 7a399f6df50c69b7e1e12d74f076fa57b6edb1b3 
   src/slave/slave.cpp ec7ec1356e745bb07484ae1755c9183b038043b3 
   src/tests/master_tests.cpp e69a7fb39e93a9a999e0991985f5d060b72aac65 
   src/tests/registrar_zookeeper_tests.cpp 
 30d3cd8b1e7d56fbb326c9feccd4c3cbae2015de 
   src/tests/scheduler_tests.cpp 080ec7236246794de6c8cc7356a06331c5e48cd5 
   src/tests/slave_tests.cpp 21709e2858d91a2746e0343786f73dec22735da6 
 
 Diff: https://reviews.apache.org/r/31024/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Jie Yu
 




Re: Review Request 31162: Backout change 30300 (Move internal protos from mesos::internal to mesos namespace)

2015-02-18 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31162/#review73025
---



include/mesos/authentication/authentication.hpp
https://reviews.apache.org/r/31162/#comment119185

We don't do using namespace in headers.


- Jie Yu


On Feb. 18, 2015, 10:27 p.m., Kapil Arya wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/31162/
 ---
 
 (Updated Feb. 18, 2015, 10:27 p.m.)
 
 
 Review request for mesos, Ben Mahler, Jie Yu, Niklas Nielsen, Till Toenshoff, 
 and Vinod Kone.
 
 
 Bugs: MESOS-2371
 https://issues.apache.org/jira/browse/MESOS-2371
 
 
 Repository: mesos
 
 
 Description
 ---
 
 This changeset puts the internal protos back into mesos::internal namespace.
 This is required for upgrade compatibility.  Without it, a newer Master would
 reject all messages from an older Slave and vice-versa.
 
 https://reviews.apache.org/r/30300/
 
 
 Diffs
 -
 
   include/mesos/authentication/authentication.hpp 
 699aa886286bc7d9c05592e71232ab1c1084871f 
   include/mesos/authentication/authentication.proto 
 38a6f781d6a0b4618a14e1681420564d78b840a8 
   include/mesos/module/module.hpp e83be2822b7c0e7935ab1c8af36e5cb9b5180f20 
   include/mesos/module/module.proto 821fc0e72ece7c497595859fc5efc1c64ea49b9b 
   include/mesos/type_utils.hpp cdf5864389a72002b538c263d70bcade2bdffa45 
   src/common/parse.hpp 547b32041f39f0ff0c38179b66a32b2239134abc 
   src/common/type_utils.cpp 12a36bbd7d7773b25dedf2d0d951c79e0b5141d6 
   src/log/consensus.hpp ee9e9081ffe2a5f18433d9eff1a2e2cf17c81863 
   src/log/leveldb.hpp 8f5df5bdfb08de02e4129494fba188f97c3b8010 
   src/log/leveldb.cpp 1d679425c6df3498be67d048bc05d02ffe03a636 
   src/log/replica.hpp b2602083564447cf75827257a2548c1b5aeb49e2 
   src/log/storage.hpp 5e81f4e9774ad239145019320cf991fcf51a9da5 
   src/master/registry.hpp d3cbe564dc8e14f3dd301e5bc06ed60d82856eee 
   src/master/registry.proto 29a309763bca9db76c443d7c039ca152a2afbc5b 
   src/messages/log.proto 12b3572e0fe0e30c5eff20c7af27eb3d7cfe8f4b 
   src/messages/messages.hpp 25769b797ec9bd1a1c348c467616aef8407c45d6 
   src/messages/messages.proto 58484ae45071a80afd2b11803dd66a88f88ad9ed 
   src/messages/state.proto 7fc4883b9fe590afbbb0261988c929c3b1f8f676 
   src/state/storage.hpp f5cd607b47d44f72d72cbd0c5adb7368410b83d8 
   src/tests/rate_limiting_tests.cpp 784ac76ee81918466de44196f5adc2b88adfee96 
   src/tests/state_tests.cpp b30812322dd8ae507e1d45912ab559940ac5ee73 
 
 Diff: https://reviews.apache.org/r/31162/diff/
 
 
 Testing
 ---
 
 make check.
 
 Also tested with master running HEAD with this patch, and slave running 
 0.21.0 with src/test-framework.
 
 The tests succeed when src/test-framework is running from the same version as 
 the slave.  However, if I run src/test-framework from the master version, it 
 doesn't succeed and errors out with unexpected TASK_LOST.  Not sure if this 
 is the desired behavior.
 
 
 Thanks,
 
 Kapil Arya
 




Re: Review Request 31162: Backout change 30300 (Move internal protos from mesos::internal to mesos namespace)

2015-02-18 Thread Jie Yu


 On Feb. 18, 2015, 10:42 p.m., Jie Yu wrote:
  include/mesos/authentication/authentication.hpp, lines 25-27
  https://reviews.apache.org/r/31162/diff/1/?file=868264#file868264line25
 
  We don't do using namespace in headers.
 
 Kapil Arya wrote:
 Yes, but this was to avoid changing numerous cpp files to include `using 
 namespace`.  Is it possible to make an exception until we can figure out a 
 better way to handle `internal`?

Can we first discuss what the right solution should be? And what's the upgrade 
story if we still want to move from mesos::internal to mesos? At least, we want 
to make sure we have a plausible way forward. Thoughts?


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31162/#review73025
---


On Feb. 18, 2015, 10:27 p.m., Kapil Arya wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/31162/
 ---
 
 (Updated Feb. 18, 2015, 10:27 p.m.)
 
 
 Review request for mesos, Ben Mahler, Jie Yu, Niklas Nielsen, Till Toenshoff, 
 and Vinod Kone.
 
 
 Bugs: MESOS-2371
 https://issues.apache.org/jira/browse/MESOS-2371
 
 
 Repository: mesos
 
 
 Description
 ---
 
 This changeset puts the internal protos back into mesos::internal namespace.
 This is required for upgrade compatibility.  Without it, a newer Master would
 reject all messages from an older Slave and vice-versa.
 
 https://reviews.apache.org/r/30300/
 
 
 Diffs
 -
 
   include/mesos/authentication/authentication.hpp 
 699aa886286bc7d9c05592e71232ab1c1084871f 
   include/mesos/authentication/authentication.proto 
 38a6f781d6a0b4618a14e1681420564d78b840a8 
   include/mesos/module/module.hpp e83be2822b7c0e7935ab1c8af36e5cb9b5180f20 
   include/mesos/module/module.proto 821fc0e72ece7c497595859fc5efc1c64ea49b9b 
   include/mesos/type_utils.hpp cdf5864389a72002b538c263d70bcade2bdffa45 
   src/common/parse.hpp 547b32041f39f0ff0c38179b66a32b2239134abc 
   src/common/type_utils.cpp 12a36bbd7d7773b25dedf2d0d951c79e0b5141d6 
   src/log/consensus.hpp ee9e9081ffe2a5f18433d9eff1a2e2cf17c81863 
   src/log/leveldb.hpp 8f5df5bdfb08de02e4129494fba188f97c3b8010 
   src/log/leveldb.cpp 1d679425c6df3498be67d048bc05d02ffe03a636 
   src/log/replica.hpp b2602083564447cf75827257a2548c1b5aeb49e2 
   src/log/storage.hpp 5e81f4e9774ad239145019320cf991fcf51a9da5 
   src/master/registry.hpp d3cbe564dc8e14f3dd301e5bc06ed60d82856eee 
   src/master/registry.proto 29a309763bca9db76c443d7c039ca152a2afbc5b 
   src/messages/log.proto 12b3572e0fe0e30c5eff20c7af27eb3d7cfe8f4b 
   src/messages/messages.hpp 25769b797ec9bd1a1c348c467616aef8407c45d6 
   src/messages/messages.proto 58484ae45071a80afd2b11803dd66a88f88ad9ed 
   src/messages/state.proto 7fc4883b9fe590afbbb0261988c929c3b1f8f676 
   src/state/storage.hpp f5cd607b47d44f72d72cbd0c5adb7368410b83d8 
   src/tests/rate_limiting_tests.cpp 784ac76ee81918466de44196f5adc2b88adfee96 
   src/tests/state_tests.cpp b30812322dd8ae507e1d45912ab559940ac5ee73 
 
 Diff: https://reviews.apache.org/r/31162/diff/
 
 
 Testing
 ---
 
 make check.
 
 Also tested with master running HEAD with this patch, and slave running 
 0.21.0 with src/test-framework.
 
 The tests succeed when src/test-framework is running from the same version as 
 the slave.  However, if I run src/test-framework from the master version, it 
 doesn't succeed and errors out with unexpected TASK_LOST.  Not sure if this 
 is the desired behavior.
 
 
 Thanks,
 
 Kapil Arya
 




Re: Review Request 31162: Backout change 30300 (Move internal protos from mesos::internal to mesos namespace)

2015-02-18 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31162/#review73066
---


The high level comment is: you should only put the namespace hacks in protobuf 
wrapper headers so that once we move forward, we can just go to all the 
protobuf header wrappers and remove those hacks. Rest the code base should not 
have any internal:: namespace (maybe one exception which is unfortunate).


include/mesos/authentication/authentication.hpp
https://reviews.apache.org/r/31162/#comment119249

Could you expand the comments here explaining why we need to do this 
include? Maybe link the ticket as well. Here and everywhere else.



include/mesos/module/module.hpp
https://reviews.apache.org/r/31162/#comment119250

Please move this to common/type_utils.hpp



include/mesos/type_utils.hpp
https://reviews.apache.org/r/31162/#comment119248

Instead of pulling it into module.hpp, can you keep it here? Also, if you 
include module/module.hpp, do you still need to change this code?



src/common/parse.hpp
https://reviews.apache.org/r/31162/#comment119263

This is a little unfortunate, could you please add a TODO here and link the 
ticket?



src/common/type_utils.cpp
https://reviews.apache.org/r/31162/#comment119251

I am wondering why this is necessary if you already included 
messages/messages.hpp (which has using namespace mesos::internal)?



src/log/consensus.hpp
https://reviews.apache.org/r/31162/#comment119253

Let's keep all the namespace hacks in protobuf wrapper headers so that it's 
easy for us to find them later. In this case, you can put `using namespace 
mesos::internal::log` in messages/log.hpp.



src/log/leveldb.hpp
https://reviews.apache.org/r/31162/#comment119260

Ditto above.



src/log/leveldb.cpp
https://reviews.apache.org/r/31162/#comment119259

Ditto above.



src/log/replica.hpp
https://reviews.apache.org/r/31162/#comment119254

Ditto above.



src/messages/messages.hpp
https://reviews.apache.org/r/31162/#comment119256

Curious why these are not in type_utils?



src/state/storage.hpp
https://reviews.apache.org/r/31162/#comment119258

Let's do the similiar thing here (like the log above). Kill this and put 
that into messages/state.hpp.



src/tests/rate_limiting_tests.cpp
https://reviews.apache.org/r/31162/#comment119261

I would suggest just change this comment to not include the namespace so 
that we don't forget to change this once we move forward.
```
Message RegisterFrameworkMessage ...
```

Please do a sweep to find all such cases and fix them.



src/tests/state_tests.cpp
https://reviews.apache.org/r/31162/#comment119262

typedef Registry::Slaves Slaves;
typedef Registry::Slave Slave;


- Jie Yu


On Feb. 19, 2015, 12:59 a.m., Kapil Arya wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/31162/
 ---
 
 (Updated Feb. 19, 2015, 12:59 a.m.)
 
 
 Review request for mesos, Ben Mahler, Jie Yu, Niklas Nielsen, Till Toenshoff, 
 and Vinod Kone.
 
 
 Bugs: MESOS-2371
 https://issues.apache.org/jira/browse/MESOS-2371
 
 
 Repository: mesos
 
 
 Description
 ---
 
 This changeset puts the internal protos back into mesos::internal namespace.
 This is required for upgrade compatibility.  Without it, a newer Master would
 reject all messages from an older Slave and vice-versa.
 
 https://reviews.apache.org/r/30300/
 
 
 Diffs
 -
 
   include/mesos/authentication/authentication.hpp 
 699aa886286bc7d9c05592e71232ab1c1084871f 
   include/mesos/authentication/authentication.proto 
 38a6f781d6a0b4618a14e1681420564d78b840a8 
   include/mesos/module/module.hpp e83be2822b7c0e7935ab1c8af36e5cb9b5180f20 
   include/mesos/module/module.proto 821fc0e72ece7c497595859fc5efc1c64ea49b9b 
   include/mesos/type_utils.hpp cdf5864389a72002b538c263d70bcade2bdffa45 
   src/common/parse.hpp 547b32041f39f0ff0c38179b66a32b2239134abc 
   src/common/type_utils.cpp 12a36bbd7d7773b25dedf2d0d951c79e0b5141d6 
   src/log/consensus.hpp ee9e9081ffe2a5f18433d9eff1a2e2cf17c81863 
   src/log/leveldb.hpp 8f5df5bdfb08de02e4129494fba188f97c3b8010 
   src/log/leveldb.cpp 1d679425c6df3498be67d048bc05d02ffe03a636 
   src/log/replica.hpp b2602083564447cf75827257a2548c1b5aeb49e2 
   src/log/storage.hpp 5e81f4e9774ad239145019320cf991fcf51a9da5 
   src/master/registry.hpp d3cbe564dc8e14f3dd301e5bc06ed60d82856eee 
   src/master/registry.proto 29a309763bca9db76c443d7c039ca152a2afbc5b 
   src/messages/log.proto 12b3572e0fe0e30c5eff20c7af27eb3d7cfe8f4b 
   src/messages/messages.hpp 25769b797ec9bd1a1c348c467616aef8407c45d6 
   src/messages/messages.proto 58484ae45071a80afd2b11803dd66a88f88ad9ed 
   src/messages/state.proto

Re: Review Request 31008: Refactored EventListener to allow for continuous monitoring of an event.

2015-02-17 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31008/#review72866
---



src/linux/cgroups.cpp
https://reviews.apache.org/r/31008/#comment118987

This is still buggy right? You call fd.get() even if fd is an error?



src/linux/cgroups.cpp
https://reviews.apache.org/r/31008/#comment118988

Please update the comments.



src/linux/cgroups.cpp
https://reviews.apache.org/r/31008/#comment118986

Indent.


- Jie Yu


On Feb. 18, 2015, 1:14 a.m., Chi Zhang wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/31008/
 ---
 
 (Updated Feb. 18, 2015, 1:14 a.m.)
 
 
 Review request for mesos, Dominic Hamon, Ian Downes, and Jie Yu.
 
 
 Bugs: mesos-2136
 https://issues.apache.org/jira/browse/mesos-2136
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Refactored EventListener to allow for continuous monitoring of an event.
 
 
 Diffs
 -
 
   src/linux/cgroups.hpp bf8efee05b995a37d4e6e7679a493b2d5238aa0b 
   src/linux/cgroups.cpp b6f75b14dea7609b90627eccdd33ef891ed9d21f 
 
 Diff: https://reviews.apache.org/r/31008/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Chi Zhang
 




Re: Review Request 31024: Changed the slave logic so that it always waits for containerizer update to finish.

2015-02-17 Thread Jie Yu


 On Feb. 17, 2015, 7:09 p.m., Vinod Kone wrote:
  src/slave/slave.cpp, line 1695
  https://reviews.apache.org/r/31024/diff/1/?file=863796#file863796line1695
 
  Not yours, but can you add a comment here:
  
  // Send terminal update which removes the task from 'queuedTasks'.

I think there is a NOTE above already. I'll move it down to here.


 On Feb. 17, 2015, 7:09 p.m., Vinod Kone wrote:
  src/slave/slave.cpp, lines 1738-1747
  https://reviews.apache.org/r/31024/diff/1/?file=863796#file863796line1738
 
  Hmmm. I don't think we want to do this. If an executor has been running 
  for a while, and all its launched tasks have terminated, we want it to keep 
  running and wait for more tasks. With the new semantics, it's possible that 
  we kill the executor if a kill task comes for a queued task.
  
  We did this optimization for unregistered executor as a stop gap 
  because there was no shutdown executor api call.

Per our offline discussion, this is a stop gap for those single task executors 
that do not have a proper logic for self terminating if no task is received for 
an extended period of time. I added a TODO here to remove this logic because 
looks like a bug of the executor.


 On Feb. 17, 2015, 7:09 p.m., Vinod Kone wrote:
  src/slave/slave.cpp, lines 2441-2451
  https://reviews.apache.org/r/31024/diff/1/?file=863796#file863796line2441
 
  Shouldn't the monitoring be done after the containerizer update 
  completes?

I'll leave a TODO here. If 'update' fails, the container will be terminated and 
thus stopping the monitor.


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31024/#review72747
---


On Feb. 13, 2015, 10:43 p.m., Jie Yu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/31024/
 ---
 
 (Updated Feb. 13, 2015, 10:43 p.m.)
 
 
 Review request for mesos, Ian Downes and Vinod Kone.
 
 
 Bugs: MESOS-998
 https://issues.apache.org/jira/browse/MESOS-998
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Changed the slave logic so that it always waits for containerizer update to 
 finish.
 
 The key idea is to leverage `executor-queuedTasks` when 
 containerizer-update is in progress.
 
 
 Diffs
 -
 
   include/mesos/type_utils.hpp 32dc6ed489f43ba3695507f59c4c2d94028c8df1 
   src/slave/slave.hpp 7a399f6df50c69b7e1e12d74f076fa57b6edb1b3 
   src/slave/slave.cpp f39a876cdd6b580a7a75fd053e6923761bee7635 
   src/tests/master_tests.cpp c678527942506ef23fa4f1aebf9664e4cc21b561 
   src/tests/registrar_zookeeper_tests.cpp 
 30d3cd8b1e7d56fbb326c9feccd4c3cbae2015de 
   src/tests/scheduler_tests.cpp 080ec7236246794de6c8cc7356a06331c5e48cd5 
   src/tests/slave_tests.cpp a02e335576bf68b449a6286fa5cf5093b1b7182a 
 
 Diff: https://reviews.apache.org/r/31024/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Jie Yu
 




Re: Review Request 31008: Refactored EventListener to allow for continuous monitoring of an event.

2015-02-17 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31008/#review72825
---

Ship it!


Looking good! Only a few nits.


src/linux/cgroups.cpp
https://reviews.apache.org/r/31008/#comment118917

and assumes that arguments are validated.

s/public function/public interface/



src/linux/cgroups.cpp
https://reviews.apache.org/r/31008/#comment118918

```s/eventfd_/fd```



src/linux/cgroups.cpp
https://reviews.apache.org/r/31008/#comment118920

Hum, what if `registerNotifier` fails? This will be a segfault! I would 
suggest reverting the change and making `eventfd` an `Optionint`.



src/linux/cgroups.cpp
https://reviews.apache.org/r/31008/#comment118923

Not your fault, but let's fail the promise here
```
promise.get()-fail(Event listener is terminating);
```



src/linux/cgroups.cpp
https://reviews.apache.org/r/31008/#comment118924

This comment is confusing. I would suggest just kill it.



src/linux/cgroups.cpp
https://reviews.apache.org/r/31008/#comment118927

insert one blank line above. Also, i would suggest the following style:

```
if (reading.isReady()  reading.get() == sizeof(data)) {
  promise.get()-set(data);
  
  // ...
  promise = None();
  return;
}

// Error cases.
```



src/linux/cgroups.cpp
https://reviews.apache.org/r/31008/#comment118928

You don't need std:: prefix here as this is in cpp. Please do a sweep to 
fix all such code in this file. Thanks!

Also, see my comments below.



src/linux/cgroups.cpp
https://reviews.apache.org/r/31008/#comment118929

I would suggest simply set the error directly:

```
if (reading.isDiscarded()) {
  error = Error(...);
} else if (...) {
  error = Error(...);
} else {
  error = Error(...);
}

promise.get()-fail(error.get().message);
```



src/linux/cgroups.cpp
https://reviews.apache.org/r/31008/#comment118932

Read less than expected: expect yyy bytes and actual xxx bytes



src/linux/cgroups.cpp
https://reviews.apache.org/r/31008/#comment118922

Why reordering these fields?



src/linux/cgroups.cpp
https://reviews.apache.org/r/31008/#comment118921

See my above comments, please revert this change.



src/linux/cgroups.cpp
https://reviews.apache.org/r/31008/#comment118936

Style nits. Align them probably:

```
future
  .onDiscard(lambda::bind(
  ...)
  .onAny(lambda::bind(
  ...);
```



src/linux/cgroups.cpp
https://reviews.apache.org/r/31008/#comment118934

One more blank line


- Jie Yu


On Feb. 17, 2015, 8:10 p.m., Chi Zhang wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/31008/
 ---
 
 (Updated Feb. 17, 2015, 8:10 p.m.)
 
 
 Review request for mesos, Dominic Hamon, Ian Downes, and Jie Yu.
 
 
 Bugs: mesos-2136
 https://issues.apache.org/jira/browse/mesos-2136
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Refactored EventListener to allow for continuous monitoring of an event.
 
 
 Diffs
 -
 
   src/linux/cgroups.hpp bf8efee05b995a37d4e6e7679a493b2d5238aa0b 
   src/linux/cgroups.cpp b6f75b14dea7609b90627eccdd33ef891ed9d21f 
 
 Diff: https://reviews.apache.org/r/31008/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Chi Zhang
 




Re: Review Request 30911: Updated the filter abstraction for Resources.

2015-02-17 Thread Jie Yu


 On Feb. 16, 2015, 11:43 p.m., Jie Yu wrote:
  include/mesos/resources.hpp, line 61
  https://reviews.apache.org/r/30911/diff/2/?file=863975#file863975line61
 
  We avoid using typedefs as much as possible trying to be more explicit.
 
 Alexander Rukletsov wrote:
 I think it's good place to make an exception to our rules: this typedef 
 makes client code brief and easier to understand. Look for example into 
 `Resources::flatten()`.

OK, looks like so if we do not allow 'auto'. Feel free to drop this one.


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30911/#review72660
---


On Feb. 14, 2015, 12:11 a.m., Michael Park wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30911/
 ---
 
 (Updated Feb. 14, 2015, 12:11 a.m.)
 
 
 Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, and 
 Vinod Kone.
 
 
 Bugs: MESOS-2348
 https://issues.apache.org/jira/browse/MESOS-2348
 
 
 Repository: mesos
 
 
 Description
 ---
 
 See [JIRA Ticket](https://issues.apache.org/jira/browse/MESOS-2348).
 
 
 Diffs
 -
 
   include/mesos/resources.hpp c7cc46e0183ea97013dd088a717da6c0e6ed5cf0 
   src/common/resources.cpp 98371f6873482d0cdbefeb279b58ae6cc680a88f 
   src/master/allocator/mesos/hierarchical.hpp 
 2680d6231927867d5a8d75cbc42b81d6c75fc7f2 
   src/master/master.hpp 6a39df04514c756415354fae66c5835ada191c52 
   src/master/validation.cpp acc35b25c93f2d3900d79c3070b1d681416ed66b 
   src/slave/slave.cpp ec7ec1356e745bb07484ae1755c9183b038043b3 
   src/tests/hierarchical_allocator_tests.cpp 
 eeecfb64540b1915074aaffaa5d506b203bc 
   src/tests/resources_tests.cpp 3f98782fd437dba808d720bf8e9b94b8fa7e0feb 
 
 Diff: https://reviews.apache.org/r/30911/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Michael Park
 




Re: Review Request 31024: Changed the slave logic so that it always waits for containerizer update to finish.

2015-02-17 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31024/
---

(Updated Feb. 18, 2015, 12:01 a.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Changes
---

Rebased.


Bugs: MESOS-998
https://issues.apache.org/jira/browse/MESOS-998


Repository: mesos


Description
---

Changed the slave logic so that it always waits for containerizer update to 
finish.

The key idea is to leverage `executor-queuedTasks` when containerizer-update 
is in progress.


Diffs (updated)
-

  include/mesos/type_utils.hpp 32dc6ed489f43ba3695507f59c4c2d94028c8df1 
  src/slave/slave.hpp 7a399f6df50c69b7e1e12d74f076fa57b6edb1b3 
  src/slave/slave.cpp ec7ec1356e745bb07484ae1755c9183b038043b3 
  src/tests/master_tests.cpp e69a7fb39e93a9a999e0991985f5d060b72aac65 
  src/tests/registrar_zookeeper_tests.cpp 
30d3cd8b1e7d56fbb326c9feccd4c3cbae2015de 
  src/tests/scheduler_tests.cpp 080ec7236246794de6c8cc7356a06331c5e48cd5 
  src/tests/slave_tests.cpp 21709e2858d91a2746e0343786f73dec22735da6 

Diff: https://reviews.apache.org/r/31024/diff/


Testing
---

make check


Thanks,

Jie Yu



Re: Review Request 31024: Changed the slave logic so that it always waits for containerizer update to finish.

2015-02-17 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31024/
---

(Updated Feb. 17, 2015, 11:19 p.m.)


Review request for mesos, Ian Downes and Vinod Kone.


Changes
---

Vinod's comments.


Bugs: MESOS-998
https://issues.apache.org/jira/browse/MESOS-998


Repository: mesos


Description
---

Changed the slave logic so that it always waits for containerizer update to 
finish.

The key idea is to leverage `executor-queuedTasks` when containerizer-update 
is in progress.


Diffs (updated)
-

  include/mesos/type_utils.hpp 32dc6ed489f43ba3695507f59c4c2d94028c8df1 
  src/slave/slave.hpp 7a399f6df50c69b7e1e12d74f076fa57b6edb1b3 
  src/slave/slave.cpp ec7ec1356e745bb07484ae1755c9183b038043b3 
  src/tests/master_tests.cpp e69a7fb39e93a9a999e0991985f5d060b72aac65 
  src/tests/registrar_zookeeper_tests.cpp 
30d3cd8b1e7d56fbb326c9feccd4c3cbae2015de 
  src/tests/scheduler_tests.cpp 080ec7236246794de6c8cc7356a06331c5e48cd5 
  src/tests/slave_tests.cpp 21709e2858d91a2746e0343786f73dec22735da6 

Diff: https://reviews.apache.org/r/31024/diff/


Testing
---

make check


Thanks,

Jie Yu



Re: Review Request 30906: Added a metric for launcher's container destruction failure.

2015-02-17 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30906/#review72812
---



src/slave/containerizer/linux_launcher.cpp
https://reviews.apache.org/r/30906/#comment118898

You may wanna add a note here saying that we expect the lifetime of 
launcher is longer than the time it take for `ns::pid::destroy` to finish. 
Otherwise, we may delete `this` before onAny is called, leading to a segfault 
potentially.

Can you double check that the above is always enforced in our code 
(especially in tests).


- Jie Yu


On Feb. 13, 2015, 10:37 p.m., Vinod Kone wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30906/
 ---
 
 (Updated Feb. 13, 2015, 10:37 p.m.)
 
 
 Review request for mesos, Ian Downes and Jie Yu.
 
 
 Bugs: MESOS-1690
 https://issues.apache.org/jira/browse/MESOS-1690
 
 
 Repository: mesos
 
 
 Description
 ---
 
 See summary.
 
 
 Diffs
 -
 
   src/Makefile.am d372404d84c9ac0bc49da7407ad366701b9586a6 
   src/slave/containerizer/linux_launcher.hpp 
 37304222323d3089b95df605d2077a3f88d11277 
   src/slave/containerizer/linux_launcher.cpp 
 3fdfb5709d48d079bbe904a91b773f131131b08d 
   src/slave/containerizer/mesos/containerizer.cpp 
 d5b90d12d63becfeb4c3efa9c6f5d826417f582c 
   src/tests/launcher_tests.cpp PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/30906/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Vinod Kone
 




Re: Review Request 29288: stout: Created IP address abstraction for different protocol families

2015-02-17 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29288/#review72771
---


Glad to see this being worked on, Thanks for all the effort!

This is a partial review. Let's address the following comments first and then 
I'll take a second pass.

One high level comment is that we should separate this giant patch. Many 
changes are being mixed together, making it very hard for reviewers to digest. 
For example, you renamed `ip` to `IP::fromLinkDevice`. I think that can be 
pulled into a separate patch, right?

Also, let's try not to introduce helper functions that are not currently being 
used. You can leave a TODO there. That can also bring this patch smaller. For 
example, you don't have to add the implementation for IPv6 case in 
`IP::fromLinkDevice` because it's not being used yet. You can always leave a 
TODO and fill the implementation later in a separate patch. That'll make the 
reviewer's life much better.

I would suggest we first refactor `net::IP` and separate the concepts of IP 
address and IP network (see my comments below). No need to introduce the 
concept of family (and all IPv6 stuffs) yet in this patch. You need to update 
all the callsites that use `net::IP` (e.g., the port mapping isolator might 
want to take `net::IPNetwork` instead of `net::IP` in some cases). Let's start 
with this first!

And then, in the second patch, you can make `net::IP` IPv6 capable. That means 
you need to store a union. Simply add TODO in those helper functions that you 
think you need to fill IPv6 implementaion in. Let's try to keep that patch 
small as well.

Then, we'll see what we go from there. Does that sound good to you?


3rdparty/libprocess/3rdparty/stout/include/stout/net.hpp
https://reviews.apache.org/r/29288/#comment118944

It's partially my fault, but we really should separate the concepts of IP 
address and IP network. This is pretty standard in other lauguages. For 
instance, Python has `IPAddress` and `IPNetwork`. Go has `IP` and `IPNet`.

Given that we already have `net::IP`, I would sugguest that we use that to 
represent a single IP address and introduce `net::IPNetwork` to represent an IP 
network (with netmask/prefix).

Highly suggest that you take a look at the Go and Python APIs:
```
http://golang.org/pkg/net/
https://docs.python.org/3/library/ipaddress.html
```



3rdparty/libprocess/3rdparty/stout/include/stout/net.hpp
https://reviews.apache.org/r/29288/#comment118960

I would suggest rename it to `parse`. We need to have both 
`IP::parse(10.0.0.1)` and `IPNetwork::parse(10.0.0.1/8)`.



3rdparty/libprocess/3rdparty/stout/include/stout/net.hpp
https://reviews.apache.org/r/29288/#comment118963

This need to be moved to `IPNetwork`. Suggest rename it to `TryIPNetwork 
IPNetwork::create(const IP address, const IP netmask)`.



3rdparty/libprocess/3rdparty/stout/include/stout/net.hpp
https://reviews.apache.org/r/29288/#comment118965

Ditto above. This should be:
```
TryIPNetwork IPNetwork::create(const IP address, size_t prefix);
```



3rdparty/libprocess/3rdparty/stout/include/stout/net.hpp
https://reviews.apache.org/r/29288/#comment118966

Can you remind me why this is needed?



3rdparty/libprocess/3rdparty/stout/include/stout/net.hpp
https://reviews.apache.org/r/29288/#comment118843

Looking at the comments of fields `IP::address_` and `IP::netmask_`, looks 
like we always store addresses in host order. That makes me wondering how this 
function is possible?



3rdparty/libprocess/3rdparty/stout/include/stout/net.hpp
https://reviews.apache.org/r/29288/#comment118982

Is this needed anywhere? If not, let's keep this patch a pure refactor.



3rdparty/libprocess/3rdparty/stout/include/stout/net.hpp
https://reviews.apache.org/r/29288/#comment118981

See my top level comments. Let's add the IPv6 stuffs later in a different 
patch!



3rdparty/libprocess/3rdparty/stout/include/stout/net.hpp
https://reviews.apache.org/r/29288/#comment118983

The renaming can be pulled into a separate patch. See my top level 
comments. We can also introduce the family later once we start to fill IPv6 
stuffs.



3rdparty/libprocess/3rdparty/stout/include/stout/net.hpp
https://reviews.apache.org/r/29288/#comment118980

See my top level comments. Let's implement the IPv6 case in a separate 
patch.


- Jie Yu


On Feb. 12, 2015, 8:05 p.m., Evelina Dumitrescu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/29288/
 ---
 
 (Updated Feb. 12, 2015, 8:05 p.m.)
 
 
 Review request for mesos, Benjamin Hindman, Dominic Hamon, Jie Yu, Joris Van 
 Remoortere, and Niklas Nielsen.
 
 
 Bugs: MESOS-1919
 https

Re: Review Request 31008: Refactored EventListener to allow for continuous monitoring of an event.

2015-02-17 Thread Jie Yu


 On Feb. 16, 2015, 10:48 p.m., Jie Yu wrote:
  src/linux/cgroups.cpp, lines 1204-1210
  https://reviews.apache.org/r/31008/diff/1/?file=863519#file863519line1204
 
  To be consistent and symetric, putting `registerNotifier` in 
  `initialize` is better than putting it here. What if the caller creates the 
  process but forgets to spawn it? Do you leak an opened event file?
  
  I would rather keep the existing constructor. We can hide the public 
  constructor and make those exposed public functions friend. Validations can 
  go into those public exposed functions.
 
 Chi Zhang wrote:
 If we validate in the public / friended function by calling 
 registerNotifier and closing the fd, can we just assume it'd always succeed 
 in the constructor / initialize (which are void functions)? even if we could, 
 it's still 2 pieces of the same code. 
 
 I agree with being consistent and symetric, and to address the above 
 leak, how about we put unregisterNotifier to destructor? On the construction 
 side, we use a public creation function as the only construction path that 
 provides explicit validation; the private constructor still takes the opened 
 eventfd to avoid some code duplication? Does it sound like a better balance?
 
 IMHO more commonly client code forgets about proper closing after it has 
 got the results than forgetting about getting the results -- i.e., it's 
 probably more likely to forget to 'terminate' than to 'spawn'. Terminate 
 requires an explict call; but compiler invokes destructor for us -- we could 
 reduce the possible leak more for client code this way? Just my 2 cents.

Putting `registerNotifier` in static create function and putting 
`unregiserNotifier` in destructor does not sound very symetric to me. The 
lifecycle of the eventfd is controlled by two entities: static create and 
event::Listener. I still sugguest putting `registerNotifier` and 
`unregisterNotifier` in `initialize` and `finalize`. You need to store an 
`OptionError error;` in `event::Listener`. If `error.isSome()`, simply fail 
all `listen` attempts. During initializtion of event::Listener, register the 
notifier and set `error` if any error occurs.


 On Feb. 16, 2015, 10:48 p.m., Jie Yu wrote:
  src/linux/cgroups.cpp, line 1318
  https://reviews.apache.org/r/31008/diff/1/?file=863519#file863519line1318
 
  Why do you need a OwnedPromise here? Can it simply be:
  ```
  OptionPromiseuint64 promsie;
  ```
 
 Chi Zhang wrote:
 I don't think that'd compile since promise is not copyable and not 
 assignable. More, Promise doesn't seem to have a 'reset' function to reset 
 its associated future to PENDING again (could this be a useful patch, BTW?), 
 which means a new promise has to be created to have a unset future. 
 
 How about ResultPromise* promise? We can have:
 None -- whenever there isn't a pending read and no errors have 
 encounterred before. Also the initial value. 
 Some -- Have a pending read, so we can reuse this promise. When the read 
 completes successfully, future is set and promise is (deleted and) reset to 
 none again.
 Error -- Set if a read is failed. The public listen functiion returns 
 error from this point on and won't start a new listen. Leave it to the client 
 code to clean this process up.
 
 Re: what if client discards the future returned by listen:
 How about change it to discard the current read if it hasn't completed or 
 had an error? If it has, use the above logic. Basically future returned by 
 listen only controls a single listen; client uses the process pointer from 
 the constructor path to control the process' lifetime?

IC. I think Owned has a `reset` function. Maybe use that?

Regarding the discard semantics, what if there are two entities called 
`listen`? If one of them discard the future, what happens to the other one?


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31008/#review72643
---


On Feb. 13, 2015, 6:07 p.m., Chi Zhang wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/31008/
 ---
 
 (Updated Feb. 13, 2015, 6:07 p.m.)
 
 
 Review request for mesos, Dominic Hamon, Ian Downes, and Jie Yu.
 
 
 Bugs: mesos-2136
 https://issues.apache.org/jira/browse/mesos-2136
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Refactored EventListener to allow for continuous monitoring of an event.
 
 
 Diffs
 -
 
   src/linux/cgroups.hpp bf8efee05b995a37d4e6e7679a493b2d5238aa0b 
   src/linux/cgroups.cpp b6f75b14dea7609b90627eccdd33ef891ed9d21f 
 
 Diff: https://reviews.apache.org/r/31008/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Chi Zhang
 




Re: Review Request 30952: Adding scheduler validations to master

2015-02-16 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30952/#review72634
---


Looks nice! Do you want to remove these logics in the drivers (old and new)?


src/master/master.cpp
https://reviews.apache.org/r/30952/#comment118679

Hum, why do you want to move the 'accept' invokation to the bottom? I would 
suggest keep this function the same as before.



src/master/master.cpp
https://reviews.apache.org/r/30952/#comment118680

We usually avoid using non-const references because that makes the 
reasoning not local. See my comments below, this is not needed.



src/master/master.cpp
https://reviews.apache.org/r/30952/#comment118681

I would suggest move this logic to `_accept`. See my comments below.



src/master/master.cpp
https://reviews.apache.org/r/30952/#comment118682

You probably want to move the framework ID filling logic here right before 
the task validation. You need to make a copy of `task`:

```
// Make a copy of the original task so that we can
// fill the missing `framework_id` in ExecutorInfo
// if needed. This field was added to the API later
// and thus was made optional.
TaskInfo _task = task;
if (task.has_executor()  !task.executor().has_framework_id()) {
  _task.mutable_executor()-mutable_framework_id()-CopyFrom(framework-id);
}

// Validate the task.
const OptionError validationError = validation::task::validate(
_task,
framework,
slave,
_offeredResources);
...
```



src/master/validation.cpp
https://reviews.apache.org/r/30952/#comment118683

Do you want to add a test, or there is a test already?


- Jie Yu


On Feb. 16, 2015, 8:53 a.m., Isabel Jimenez wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30952/
 ---
 
 (Updated Feb. 16, 2015, 8:53 a.m.)
 
 
 Review request for mesos, Jie Yu, Joris Van Remoortere, and Vinod Kone.
 
 
 Bugs: MESOS-2290
 https://issues.apache.org/jira/browse/MESOS-2290
 
 
 Repository: mesos-incubating
 
 
 Description
 ---
 
 There is a scheduler validation 
 (https://github.com/apache/mesos/blob/master/src/scheduler/scheduler.cpp#L275)
  missing in master. See motivation on MESOS-2288.
 
 
 Diffs
 -
 
   src/master/master.hpp 6a39df0 
   src/master/master.cpp f10a3cf 
   src/master/validation.cpp acc35b2 
 
 Diff: https://reviews.apache.org/r/30952/diff/
 
 
 Testing
 ---
 
 make check and distcheck. Tests will be added with new HTTP API endpoints 
 tests.
 
 
 Thanks,
 
 Isabel Jimenez
 




Re: Review Request 30906: Added a metric for launcher's container destruction failure.

2015-02-16 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30906/#review72639
---



src/slave/containerizer/linux_launcher.cpp
https://reviews.apache.org/r/30906/#comment118684

Any reason not place this metrics at top level (i.e., in 
`MesosContainerizerProcess::__destroy`)? In that way, you don't need to do 
`++metrics.destroy_error` multiple times, thoughts?


- Jie Yu


On Feb. 13, 2015, 10:37 p.m., Vinod Kone wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30906/
 ---
 
 (Updated Feb. 13, 2015, 10:37 p.m.)
 
 
 Review request for mesos, Ian Downes and Jie Yu.
 
 
 Bugs: MESOS-1690
 https://issues.apache.org/jira/browse/MESOS-1690
 
 
 Repository: mesos
 
 
 Description
 ---
 
 See summary.
 
 
 Diffs
 -
 
   src/Makefile.am d372404d84c9ac0bc49da7407ad366701b9586a6 
   src/slave/containerizer/linux_launcher.hpp 
 37304222323d3089b95df605d2077a3f88d11277 
   src/slave/containerizer/linux_launcher.cpp 
 3fdfb5709d48d079bbe904a91b773f131131b08d 
   src/slave/containerizer/mesos/containerizer.cpp 
 d5b90d12d63becfeb4c3efa9c6f5d826417f582c 
   src/tests/launcher_tests.cpp PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/30906/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Vinod Kone
 




Re: Review Request 30958: PortMappingIsolator: Better ingress qdisc management on eth0 and lo.

2015-02-16 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30958/#review72640
---



src/slave/containerizer/isolators/network/port_mapping.cpp
https://reviews.apache.org/r/30958/#comment118685

Do you have a ticket explaining why this is needed?



src/slave/containerizer/isolators/network/port_mapping.cpp
https://reviews.apache.org/r/30958/#comment118686

If the ingress qdisc is not installed by mesos, will you be removing it? 
That being said, I want to understand the motivation for this in a JIRA.


- Jie Yu


On Feb. 13, 2015, 6:24 p.m., Chi Zhang wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30958/
 ---
 
 (Updated Feb. 13, 2015, 6:24 p.m.)
 
 
 Review request for mesos, Jie Yu and Cong Wang.
 
 
 Repository: mesos
 
 
 Description
 ---
 
 PortMappingIsolator: Better ingress qdisc management on eth0 and lo.
 
 Remove ingress qdiscs on eth0 and lo after the last container exits. They will
 be added back when the next container comes in. This makes sure after a clean
 shutdown, the sytem will be clean.
 
 
 Diffs
 -
 
   src/slave/containerizer/isolators/network/port_mapping.cpp 
 b860b17e59486cd6097183a4d3139fbd5c897b44 
 
 Diff: https://reviews.apache.org/r/30958/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Chi Zhang
 




Re: Review Request 31008: Refactored EventListener to allow for continuous monitoring of an event.

2015-02-16 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31008/#review72643
---


Thanks Chi! This is my first pass. Will do a second pass one the following 
issues are addressed.


src/linux/cgroups.cpp
https://reviews.apache.org/r/31008/#comment118687

We usually call a libprocess process a 'process'. So please revert this 
change.



src/linux/cgroups.cpp
https://reviews.apache.org/r/31008/#comment118703

To be consistent and symetric, putting `registerNotifier` in `initialize` 
is better than putting it here. What if the caller creates the process but 
forgets to spawn it? Do you leak an opened event file?

I would rather keep the existing constructor. We can hide the public 
constructor and make those exposed public functions friend. Validations can go 
into those public exposed functions.



src/linux/cgroups.cpp
https://reviews.apache.org/r/31008/#comment118704

OK, let's document the semantics for this public function. The semantics 
you have is slightly different than I initially thought, but I think you 
sematics is simpler and easy to implement.

Your semantics is actually waiting for the *next* event to occur and 
returns the value read from the event file. Calling multiple `read` returns the 
same future which will be satisfied when the *next* event occurs. Because of 
that, I think we probably should rename `read` to `listen`. What do you think?

Also, what is the discard semantics. What if the user discard the returned 
future, what's gonna happen?



src/linux/cgroups.cpp
https://reviews.apache.org/r/31008/#comment118709

See my comments below. No need to print path and args as they should be 
printed by the caller.



src/linux/cgroups.cpp
https://reviews.apache.org/r/31008/#comment118710

Ditto..



src/linux/cgroups.cpp
https://reviews.apache.org/r/31008/#comment118708

Thanks for expanding the error message. However, you don't need to print 
the path and args because these are caller's responsibility.



src/linux/cgroups.cpp
https://reviews.apache.org/r/31008/#comment118707

So you reset the promise once a `read` has finished, even if it has failed. 
This is problematic because what if the previous read only returns a partial 
result (i.e., reading.get() != sizeof(data)), will that affect the following 
reads?

I guess the behavior is undefined if a previous read returns a failure. So 
you probably want to return failure if there is a failure in the previous read.



src/linux/cgroups.cpp
https://reviews.apache.org/r/31008/#comment118706

Why do you need a OwnedPromise here? Can it simply be:
```
OptionPromiseuint64 promsie;
```



src/linux/cgroups.cpp
https://reviews.apache.org/r/31008/#comment118711

You should not call a method in a process directly. Please use dispatch!


- Jie Yu


On Feb. 13, 2015, 6:07 p.m., Chi Zhang wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/31008/
 ---
 
 (Updated Feb. 13, 2015, 6:07 p.m.)
 
 
 Review request for mesos, Dominic Hamon, Ian Downes, and Jie Yu.
 
 
 Bugs: mesos-2136
 https://issues.apache.org/jira/browse/mesos-2136
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Refactored EventListener to allow for continuous monitoring of an event.
 
 
 Diffs
 -
 
   src/linux/cgroups.hpp bf8efee05b995a37d4e6e7679a493b2d5238aa0b 
   src/linux/cgroups.cpp b6f75b14dea7609b90627eccdd33ef891ed9d21f 
 
 Diff: https://reviews.apache.org/r/31008/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Chi Zhang
 




Re: Review Request 29742: Added utility functions to determine types of resources.

2015-02-16 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29742/#review72659
---

Ship it!


Thanks for the cleanup!


src/master/validation.cpp
https://reviews.apache.org/r/29742/#comment118712

The indent here should be 4 spaces.


- Jie Yu


On Feb. 13, 2015, 10:59 p.m., Michael Park wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/29742/
 ---
 
 (Updated Feb. 13, 2015, 10:59 p.m.)
 
 
 Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, and 
 Vinod Kone.
 
 
 Repository: mesos
 
 
 Description
 ---
 
 # Motivation
 
 The main motivation for introducing these functions is to capture the 
 definition of various identification of resources. With these functions 
 capturing various definitions of concepts for us, we gain:
   - readability.
   - engineering benefits.
 
 ## Example
 
 For example, consider the concept of persistent volume. Currently we do `if 
 (resource.has_disk()  resource.has_persistence())` throughout the codebase 
 to test to identify this type of resource.
 
 ### Readability
 
 From a readability perspective, `if (resource.has_disk()  
 resource.has_persistence())` simply harder to read than `if 
 (Resource::persistentVolume(resource))`. A foreign reader also can't be sure 
 that the first predicate is checking for a persistent volume without digging 
 deeper into the codebase. (Maybe we actually have an additional requirement 
 for a resource to be considered a persistent volume.)
 
 ### Engineering Benefit
 
 If and when we realize that the definition needs to be updated, we shouldn't 
 have to change the predicate every `if` statement that checks for a 
 persistent volume.
 
 If you're thinking, just grep for `if (resource.has_disk()  
 resource.has_persistence())`..., what if we didn't use `resource` as the 
 variable name? what if we actually did `if (!(resource.has_disk()  
 resource.has_persistence()))`? what about `if (!resource.has_disk() || 
 !resource.has_persistence()))`? In general I believe this approach makes it 
 hard to keep the definitions consistent throughout the codebase.
 
 Instead, we should consistently use the predicates that capture the 
 definition, (e.g. `Resource::persistentVolume(resource)`) and later on if we 
 change the definition of persistent volume, we simply update the definition 
 of `persistentVolume` and we're done.
 
 ## Why not just have a Filter instead?
 
 Fundamentally a `Filter` is built on a **unary predicate**. Given a list of 
 elements, we keep elements that satisfy the predicate. We *could* embed these 
 predicates into a `Filter` and only provide those. But 1. I don't think a 
 `Filter` is necessarily the right tool for every job. 2. Unary predicates are 
 the basis of many algorithms (e.g. `all_of`, `any_of`, `none_of`, `count_if`, 
 `find_if`) and therefore deserve to exist in their own right.
 
 
 Diffs
 -
 
   include/mesos/resources.hpp c7cc46e0183ea97013dd088a717da6c0e6ed5cf0 
   src/common/resources.cpp 98371f6873482d0cdbefeb279b58ae6cc680a88f 
   src/master/master.hpp 6a39df04514c756415354fae66c5835ada191c52 
   src/master/validation.cpp acc35b25c93f2d3900d79c3070b1d681416ed66b 
   src/slave/slave.cpp ec7ec1356e745bb07484ae1755c9183b038043b3 
 
 Diff: https://reviews.apache.org/r/29742/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Michael Park
 




Re: Review Request 30911: Updated the filter abstraction for Resources.

2015-02-16 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30911/#review72660
---



include/mesos/resources.hpp
https://reviews.apache.org/r/30911/#comment118713

We avoid using typedefs as much as possible trying to be more explicit.



include/mesos/resources.hpp
https://reviews.apache.org/r/30911/#comment118715

See my comments below. Can we punt this for now and leave a TODO for 
supporing filtering for RepeatedPtrField?

This struct itself is confusing because we have `reserved` as well.

```
Resources::Reserved(role1)
```
vs
```
resources.reserved(role1)
```

I think being explicit and using lambda::bind is not a bad thing. As least 
the readers immediately know that `filter` takes a predicate function. What do 
you think?



include/mesos/resources.hpp
https://reviews.apache.org/r/30911/#comment118716

I don't think we should remove these two helper functions as they are quite 
convinent to use. (just like we still have empty() in std containers even if we 
can use size() == 0 to replace it).



include/mesos/resources.hpp
https://reviews.apache.org/r/30911/#comment118714

Can we make `filter` function a member of `Resources` instead. We usually 
prefer member functions than global functions. Also, looking at the callsites, 
this version does not read well. For example:
```
filter(Resources::Reserved(role2), slave2.resources())
```

It's not obvious what does this function do. For instance, Readers might 
think that `Reserved(role2)` is a function which returns all resources 
reserved for `role2`.

I understand that you want to make filter seamlessly work for 
RepeatedPtrField. Let's punt that for now and leave a TODO. For the above case, 
simply do the following for now (i.e., still keep `reserved` and `unreserved`).
```  
Resources(slave2.resourecs()).reserved(role2);
```


- Jie Yu


On Feb. 14, 2015, 12:11 a.m., Michael Park wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30911/
 ---
 
 (Updated Feb. 14, 2015, 12:11 a.m.)
 
 
 Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, and 
 Vinod Kone.
 
 
 Bugs: MESOS-2348
 https://issues.apache.org/jira/browse/MESOS-2348
 
 
 Repository: mesos
 
 
 Description
 ---
 
 See [JIRA Ticket](https://issues.apache.org/jira/browse/MESOS-2348).
 
 
 Diffs
 -
 
   include/mesos/resources.hpp c7cc46e0183ea97013dd088a717da6c0e6ed5cf0 
   src/common/resources.cpp 98371f6873482d0cdbefeb279b58ae6cc680a88f 
   src/master/allocator/mesos/hierarchical.hpp 
 2680d6231927867d5a8d75cbc42b81d6c75fc7f2 
   src/master/master.hpp 6a39df04514c756415354fae66c5835ada191c52 
   src/master/validation.cpp acc35b25c93f2d3900d79c3070b1d681416ed66b 
   src/slave/slave.cpp ec7ec1356e745bb07484ae1755c9183b038043b3 
   src/tests/hierarchical_allocator_tests.cpp 
 eeecfb64540b1915074aaffaa5d506b203bc 
   src/tests/resources_tests.cpp 3f98782fd437dba808d720bf8e9b94b8fa7e0feb 
 
 Diff: https://reviews.apache.org/r/30911/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Michael Park
 




Review Request 31024: Changed the slave logic so that it always waits for containerizer update to finish.

2015-02-13 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31024/
---

Review request for mesos, Ian Downes and Vinod Kone.


Bugs: MESOS-998
https://issues.apache.org/jira/browse/MESOS-998


Repository: mesos


Description
---

Changed the slave logic so that it always waits for containerizer update to 
finish.

The key idea is to leverage `executor-queuedTasks` when containerizer-update 
is in progress.


Diffs
-

  include/mesos/type_utils.hpp 32dc6ed489f43ba3695507f59c4c2d94028c8df1 
  src/slave/slave.hpp 7a399f6df50c69b7e1e12d74f076fa57b6edb1b3 
  src/slave/slave.cpp f39a876cdd6b580a7a75fd053e6923761bee7635 
  src/tests/master_tests.cpp c678527942506ef23fa4f1aebf9664e4cc21b561 
  src/tests/registrar_zookeeper_tests.cpp 
30d3cd8b1e7d56fbb326c9feccd4c3cbae2015de 
  src/tests/scheduler_tests.cpp 080ec7236246794de6c8cc7356a06331c5e48cd5 
  src/tests/slave_tests.cpp a02e335576bf68b449a6286fa5cf5093b1b7182a 

Diff: https://reviews.apache.org/r/31024/diff/


Testing
---

make check


Thanks,

Jie Yu



Re: Review Request 30906: Added a metric for launcher's container destruction failure.

2015-02-12 Thread Jie Yu


 On Feb. 12, 2015, 5:41 p.m., Ben Mahler wrote:
  src/slave/containerizer/mesos/containerizer.hpp, line 280
  https://reviews.apache.org/r/30906/diff/1/?file=861366#file861366line280
 
  Should this just be mesos_containerizer/launcher_destroy_errors? 
  Looking at our port_mapping/ metrics, and other subcomponent metrics, that 
  seems like a more consistent name?
 
 Ben Mahler wrote:
 Or, is it possible to have this error in the launcher and just call it: 
 launcher/destroy_errors or linux_launcher/destroy_errors?

Or we can modify port_mapping metrics to be 
`slave/containerizer/mesos/isolators/network/port_mapping/xxx`. Maybe that's 
too long?


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30906/#review72190
---


On Feb. 12, 2015, 1:34 a.m., Vinod Kone wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30906/
 ---
 
 (Updated Feb. 12, 2015, 1:34 a.m.)
 
 
 Review request for mesos, Ian Downes and Jie Yu.
 
 
 Bugs: MESOS-1690
 https://issues.apache.org/jira/browse/MESOS-1690
 
 
 Repository: mesos
 
 
 Description
 ---
 
 See summary.
 
 
 Diffs
 -
 
   src/slave/containerizer/mesos/containerizer.hpp 
 074a2d82dcd882e52f8cd62ed7493295596acb26 
   src/slave/containerizer/mesos/containerizer.cpp 
 d5b90d12d63becfeb4c3efa9c6f5d826417f582c 
 
 Diff: https://reviews.apache.org/r/30906/diff/
 
 
 Testing
 ---
 
 make check
 
 No existing launcher tests to inject metrics test. Will work with @idownes to 
 figure how to write one.
 
 
 Thanks,
 
 Vinod Kone
 




Review Request 30945: Waited containerizer update to finish before sending task to the executor in Slave::_runTask.

2015-02-12 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30945/
---

Review request for mesos, Ian Downes and Vinod Kone.


Bugs: MESOS-998
https://issues.apache.org/jira/browse/MESOS-998


Repository: mesos


Description
---

Waited containerizer update to finish before sending task to the executor in 
Slave::_runTask.


Diffs
-

  src/slave/slave.hpp 7a399f6df50c69b7e1e12d74f076fa57b6edb1b3 
  src/slave/slave.cpp f39a876cdd6b580a7a75fd053e6923761bee7635 

Diff: https://reviews.apache.org/r/30945/diff/


Testing
---

make check


Thanks,

Jie Yu



Re: Review Request 30945: Waited containerizer update to finish before sending task to the executor in Slave::_runTask.

2015-02-12 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30945/#review72227
---


Just realized that there still a race here with killTask (killTask happens 
between `_runTask` and `__runTask`). We could end up with the situation where 
the executor receives an KillTaskMessage with an unknown task since 
RunTaskMessage hasn't reached the executor yet. :(

- Jie Yu


On Feb. 12, 2015, 7:30 p.m., Jie Yu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30945/
 ---
 
 (Updated Feb. 12, 2015, 7:30 p.m.)
 
 
 Review request for mesos, Ian Downes and Vinod Kone.
 
 
 Bugs: MESOS-998
 https://issues.apache.org/jira/browse/MESOS-998
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Waited containerizer update to finish before sending task to the executor in 
 Slave::_runTask.
 
 
 Diffs
 -
 
   src/slave/slave.hpp 7a399f6df50c69b7e1e12d74f076fa57b6edb1b3 
   src/slave/slave.cpp f39a876cdd6b580a7a75fd053e6923761bee7635 
 
 Diff: https://reviews.apache.org/r/30945/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Jie Yu
 




Re: Review Request 29742: Added utility functions to determine types of resources.

2015-02-12 Thread Jie Yu


 On Feb. 3, 2015, 8:19 p.m., Vinod Kone wrote:
  include/mesos/resources.hpp, line 92
  https://reviews.apache.org/r/29742/diff/4/?file=839661#file839661line92
 
  We didn't do isempty above, so how about getting rid of is as 
  prefix? I think returning a bool signals a is.
 
 Michael Park wrote:
 Hm, that's a good point. I'll fix that.
 
 Michael Park wrote:
 Fixed.

Hum, actually, I prefer the name 'isEmpty', 'isPersistentVolume', 'isReserved' 
because there are 'reserved' and 'unreserved' functions (same name) in 
Resources which return reserved/unreserved resources (different purpose).


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29742/#review70817
---


On Feb. 11, 2015, 11:49 p.m., Michael Park wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/29742/
 ---
 
 (Updated Feb. 11, 2015, 11:49 p.m.)
 
 
 Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, and 
 Vinod Kone.
 
 
 Repository: mesos
 
 
 Description
 ---
 
 # Motivation
 
 The main motivation for introducing these functions is to capture the 
 definition of various identification of resources. With these functions 
 capturing various definitions of concepts for us, we gain:
   - readability.
   - engineering benefits.
 
 ## Example
 
 For example, consider the concept of persistent volume. Currently we do `if 
 (resource.has_disk()  resource.has_persistence())` throughout the codebase 
 to test to identify this type of resource.
 
 ### Readability
 
 From a readability perspective, `if (resource.has_disk()  
 resource.has_persistence())` simply harder to read than `if 
 (Resource::persistentVolume(resource))`. A foreign reader also can't be sure 
 that the first predicate is checking for a persistent volume without digging 
 deeper into the codebase. (Maybe we actually have an additional requirement 
 for a resource to be considered a persistent volume.)
 
 ### Engineering Benefit
 
 If and when we realize that the definition needs to be updated, we shouldn't 
 have to change the predicate every `if` statement that checks for a 
 persistent volume.
 
 If you're thinking, just grep for `if (resource.has_disk()  
 resource.has_persistence())`..., what if we didn't use `resource` as the 
 variable name? what if we actually did `if (!(resource.has_disk()  
 resource.has_persistence()))`? what about `if (!resource.has_disk() || 
 !resource.has_persistence()))`? In general I believe this approach makes it 
 hard to keep the definitions consistent throughout the codebase.
 
 Instead, we should consistently use the predicates that capture the 
 definition, (e.g. `Resource::persistentVolume(resource)`) and later on if we 
 change the definition of persistent volume, we simply update the definition 
 of `persistentVolume` and we're done.
 
 ## Why not just have a Filter instead?
 
 Fundamentally a `Filter` is built on a **unary predicate**. Given a list of 
 elements, we keep elements that satisfy the predicate. We *could* embed these 
 predicates into a `Filter` and only provide those. But 1. I don't think a 
 `Filter` is necessarily the right tool for every job. 2. Unary predicates are 
 the basis of many algorithms (e.g. `all_of`, `any_of`, `none_of`, `count_if`, 
 `find_if`) and therefore deserve to exist in its own right.
 
 
 Diffs
 -
 
   include/mesos/resources.hpp c7cc46e0183ea97013dd088a717da6c0e6ed5cf0 
   src/common/resources.cpp 98371f6873482d0cdbefeb279b58ae6cc680a88f 
   src/master/master.hpp 6a39df04514c756415354fae66c5835ada191c52 
   src/master/validation.cpp acc35b25c93f2d3900d79c3070b1d681416ed66b 
   src/slave/slave.cpp f39a876cdd6b580a7a75fd053e6923761bee7635 
 
 Diff: https://reviews.apache.org/r/29742/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Michael Park
 




Re: Review Request 30945: Waited containerizer update to finish before sending task to the executor in Slave::_runTask.

2015-02-12 Thread Jie Yu


 On Feb. 12, 2015, 7:47 p.m., Jie Yu wrote:
  Just realized that there still a race here with killTask (killTask happens 
  between `_runTask` and `__runTask`). We could end up with the situation 
  where the executor receives an KillTaskMessage with an unknown task since 
  RunTaskMessage hasn't reached the executor yet. :(
 
 Timothy Chen wrote:
 True if kill task is sent without any task status update yet, should we 
 have a new state for the executor so that we don't proxy until it's fully 
 ready?
 Or what's in your mind to fix this?

I am thinking about using the queuedTasks because it's in fact queued (and not 
yet launched).


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30945/#review72227
---


On Feb. 12, 2015, 7:30 p.m., Jie Yu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30945/
 ---
 
 (Updated Feb. 12, 2015, 7:30 p.m.)
 
 
 Review request for mesos, Ian Downes and Vinod Kone.
 
 
 Bugs: MESOS-998
 https://issues.apache.org/jira/browse/MESOS-998
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Waited containerizer update to finish before sending task to the executor in 
 Slave::_runTask.
 
 
 Diffs
 -
 
   src/slave/slave.hpp 7a399f6df50c69b7e1e12d74f076fa57b6edb1b3 
   src/slave/slave.cpp f39a876cdd6b580a7a75fd053e6923761bee7635 
 
 Diff: https://reviews.apache.org/r/30945/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Jie Yu
 




Re: Review Request 30850: Added validation for checkpointed resources during slave recovery.

2015-02-11 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30850/
---

(Updated Feb. 11, 2015, 6:38 p.m.)


Review request for mesos, Ben Mahler, Michael Park, and Vinod Kone.


Repository: mesos


Description
---

Added validation for checkpointed resources during slave recovery.

This is a rebased version for https://reviews.apache.org/r/29913/


Diffs
-

  src/slave/slave.hpp 7a399f6df50c69b7e1e12d74f076fa57b6edb1b3 
  src/slave/slave.cpp f39a876cdd6b580a7a75fd053e6923761bee7635 
  src/tests/mesos.hpp 2b0c90d166ed33065f2a4a2909c61cc5119da511 
  src/tests/mesos.cpp ac2dc737eff48d122cfd604d4820f423e5bc0472 
  src/tests/persistent_volume_tests.cpp 
b4f2298069233b27f6b4b8d668753810f43bb58d 

Diff: https://reviews.apache.org/r/30850/diff/


Testing
---

make check


Thanks,

Jie Yu



Re: Review Request 30850: Added validation for checkpointed resources during slave recovery.

2015-02-11 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30850/
---

(Updated Feb. 11, 2015, 6:38 p.m.)


Review request for mesos, Ben Mahler, Michael Park, and Vinod Kone.


Changes
---

Review comments.


Repository: mesos


Description
---

Added validation for checkpointed resources during slave recovery.

This is a rebased version for https://reviews.apache.org/r/29913/


Diffs (updated)
-

  src/slave/slave.hpp 7a399f6df50c69b7e1e12d74f076fa57b6edb1b3 
  src/slave/slave.cpp f39a876cdd6b580a7a75fd053e6923761bee7635 
  src/tests/mesos.hpp 2b0c90d166ed33065f2a4a2909c61cc5119da511 
  src/tests/mesos.cpp ac2dc737eff48d122cfd604d4820f423e5bc0472 
  src/tests/persistent_volume_tests.cpp 
b4f2298069233b27f6b4b8d668753810f43bb58d 

Diff: https://reviews.apache.org/r/30850/diff/


Testing
---

make check


Thanks,

Jie Yu



Re: Review Request 30812: Updated the allocator in Master::addSlave using slave-totalResources.

2015-02-11 Thread Jie Yu


 On Feb. 11, 2015, 12:32 a.m., Ben Mahler wrote:
  Before you commit this, mind adding a test to ensure that the checkpointed 
  resources are offered? (in a follow up patch)

Added a test in https://reviews.apache.org/r/30900/


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30812/#review71886
---


On Feb. 10, 2015, 1:35 a.m., Jie Yu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30812/
 ---
 
 (Updated Feb. 10, 2015, 1:35 a.m.)
 
 
 Review request for mesos and Ben Mahler.
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Updated the allocator in Master::addSlave using slave-totalResources.
 
 
 Diffs
 -
 
   src/master/master.cpp 7c3aa220e72fdc156fb9a0998dd68beb7c464256 
 
 Diff: https://reviews.apache.org/r/30812/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Jie Yu
 




Review Request 30900: Added a persistent volume test for verifying master failover.

2015-02-11 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30900/
---

Review request for mesos, Ben Mahler and Vinod Kone.


Repository: mesos


Description
---

Added a persistent volume test for verifying master failover.

Followup for https://reviews.apache.org/r/30812/


Diffs
-

  src/tests/persistent_volume_tests.cpp 
b4f2298069233b27f6b4b8d668753810f43bb58d 

Diff: https://reviews.apache.org/r/30900/diff/


Testing
---

make check


Thanks,

Jie Yu



Review Request 30861: Consistency fix for slave logging.

2015-02-10 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30861/
---

Review request for mesos, Ben Mahler and Vinod Kone.


Repository: mesos


Description
---

Consistency fix for slave logging. Removed `'` for 
framework/executor/container/task ids. Removed empty leading spaces.


Diffs
-

  src/slave/slave.cpp 8a1b12ebd0fad0c427624a8bacc47205b4311fc9 

Diff: https://reviews.apache.org/r/30861/diff/


Testing
---

make check


Thanks,

Jie Yu



Re: Review Request 30850: Added validation for checkpointed resources during slave recovery.

2015-02-10 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30850/
---

(Updated Feb. 10, 2015, 11:10 p.m.)


Review request for mesos, Ben Mahler, Michael Park, and Vinod Kone.


Repository: mesos


Description
---

Added validation for checkpointed resources during slave recovery.

This is a rebased version for https://reviews.apache.org/r/29913/


Diffs
-

  src/slave/slave.hpp 35c7c8bf6eb8efee332c13713f6087c6f899d6e3 
  src/slave/slave.cpp 7a29f86bf873141759957a6ef5502f6a3b6ab269 
  src/tests/mesos.hpp 83a369968ab2403fa341829ac5d11f7243095190 
  src/tests/mesos.cpp 21a405366f56c963611324076efe775f85b9d9f7 
  src/tests/persistent_volume_tests.cpp 
e3a8d33b6ae2b6afe5f4ab7c029b2f37a861f112 

Diff: https://reviews.apache.org/r/30850/diff/


Testing
---

make check


Thanks,

Jie Yu



Review Request 30850: Added validation for checkpointed resources during slave recovery.

2015-02-10 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30850/
---

Review request for mesos, Ben Mahler, Michael Park, and Vinod Kone.


Repository: mesos


Description
---

Added validation for checkpointed resources during slave recovery.

This is a rebased version for https://reviews.apache.org/r/29913/


Diffs
-

  src/slave/slave.hpp 35c7c8bf6eb8efee332c13713f6087c6f899d6e3 
  src/slave/slave.cpp 7a29f86bf873141759957a6ef5502f6a3b6ab269 
  src/tests/mesos.hpp 83a369968ab2403fa341829ac5d11f7243095190 
  src/tests/mesos.cpp 21a405366f56c963611324076efe775f85b9d9f7 
  src/tests/persistent_volume_tests.cpp 
e3a8d33b6ae2b6afe5f4ab7c029b2f37a861f112 

Diff: https://reviews.apache.org/r/30850/diff/


Testing
---

make check


Thanks,

Jie Yu



Review Request 30812: Updated the allocator in Master::addSlave using slave-totalResources.

2015-02-09 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30812/
---

Review request for mesos and Ben Mahler.


Repository: mesos


Description
---

Updated the allocator in Master::addSlave using slave-totalResources.


Diffs
-

  src/master/master.cpp 7c3aa220e72fdc156fb9a0998dd68beb7c464256 

Diff: https://reviews.apache.org/r/30812/diff/


Testing
---

make check


Thanks,

Jie Yu



Re: Review Request 28809: Started to maintain and checkpoint persisted resource in slave.

2015-02-09 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28809/
---

(Updated Feb. 9, 2015, 7:33 p.m.)


Review request for mesos and Ben Mahler.


Changes
---

Vinod's comments.


Bugs: MESOS-2031
https://issues.apache.org/jira/browse/MESOS-2031


Repository: mesos


Description
---

Started to maintain and checkpoint persisted resource in slave. That includes:
1) responds to update resources message
2) checkpoint resources
3) recover checkpointed resources
4) send checkpointed resources during register/reregister


Diffs (updated)
-

  src/slave/slave.hpp 9adee17cb94a72f0e1e139b3fd8978a9a1ff6237 
  src/slave/slave.cpp fff2d725fe49eee984d9151cfb2131202c47994f 
  src/slave/state.hpp 04084fc5c9130bf7d824b2ced4eb7053a995edd0 
  src/tests/persistent_volume_tests.cpp 
ffbaedddbd21d0eb7964be0bff4368384389b0a0 

Diff: https://reviews.apache.org/r/28809/diff/


Testing
---

make check


Thanks,

Jie Yu



Re: Review Request 30508: Prepare persistent volumes in slave.

2015-02-09 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30508/
---

(Updated Feb. 9, 2015, 6 p.m.)


Review request for mesos, Ben Mahler, Michael Park, and Vinod Kone.


Changes
---

Rebased.


Bugs: MESOS-2031
https://issues.apache.org/jira/browse/MESOS-2031


Repository: mesos


Description
---

This patch prepares persistent volumes in slave when receiving new 
task/executor.


Diffs (updated)
-

  src/slave/slave.hpp 9adee17cb94a72f0e1e139b3fd8978a9a1ff6237 
  src/slave/slave.cpp fff2d725fe49eee984d9151cfb2131202c47994f 

Diff: https://reviews.apache.org/r/30508/diff/


Testing
---

make check.


Thanks,

Jie Yu



Re: Review Request 28809: Started to maintain and checkpoint persisted resource in slave.

2015-02-09 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28809/
---

(Updated Feb. 9, 2015, 5:59 p.m.)


Review request for mesos and Ben Mahler.


Changes
---

Rebased.


Bugs: MESOS-2031
https://issues.apache.org/jira/browse/MESOS-2031


Repository: mesos


Description
---

Started to maintain and checkpoint persisted resource in slave. That includes:
1) responds to update resources message
2) checkpoint resources
3) recover checkpointed resources
4) send checkpointed resources during register/reregister


Diffs (updated)
-

  src/slave/slave.hpp 9adee17cb94a72f0e1e139b3fd8978a9a1ff6237 
  src/slave/slave.cpp fff2d725fe49eee984d9151cfb2131202c47994f 
  src/slave/state.hpp 04084fc5c9130bf7d824b2ced4eb7053a995edd0 
  src/tests/persistent_volume_tests.cpp 
ffbaedddbd21d0eb7964be0bff4368384389b0a0 

Diff: https://reviews.apache.org/r/28809/diff/


Testing
---

make check


Thanks,

Jie Yu



Re: Review Request 30509: Added executor working directory to Container struct in Mesos containerizer.

2015-02-09 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30509/
---

(Updated Feb. 9, 2015, 6 p.m.)


Review request for mesos, Ben Mahler, Ian Downes, Michael Park, and Vinod Kone.


Changes
---

Rebased.


Bugs: MESOS-2031
https://issues.apache.org/jira/browse/MESOS-2031


Repository: mesos


Description
---

Added executor working directory to Container struct in Mesos containerizer.

The executor directory is needed when linking persistent volumes inside the 
container.


Diffs (updated)
-

  src/slave/containerizer/mesos/containerizer.hpp 
b3aafe4efa9f3469d7a3fd39243ad66b46d6a54d 
  src/slave/containerizer/mesos/containerizer.cpp 
fa40d47aee7803833bcde6cce1e86a21d7cf27d0 

Diff: https://reviews.apache.org/r/30509/diff/


Testing
---

make check


Thanks,

Jie Yu



Re: Review Request 30510: Allowed Mesos containerizer to prepare and update volumes.

2015-02-09 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30510/
---

(Updated Feb. 9, 2015, 6 p.m.)


Review request for mesos, Ben Mahler, Ian Downes, Michael Park, and Vinod Kone.


Changes
---

Rebased.


Bugs: MESOS-2031
https://issues.apache.org/jira/browse/MESOS-2031


Repository: mesos


Description
---

Allowed Mesos containerizer to prepare and update volumes.

Mesos containerizer setup volumes for the container.


Diffs (updated)
-

  src/slave/containerizer/mesos/containerizer.hpp 
b3aafe4efa9f3469d7a3fd39243ad66b46d6a54d 
  src/slave/containerizer/mesos/containerizer.cpp 
fa40d47aee7803833bcde6cce1e86a21d7cf27d0 

Diff: https://reviews.apache.org/r/30510/diff/


Testing
---

make check


Thanks,

Jie Yu



Re: Review Request 30510: Allowed Mesos containerizer to prepare and update volumes.

2015-02-09 Thread Jie Yu


 On Feb. 2, 2015, 9:20 p.m., Ian Downes wrote:
  src/slave/containerizer/mesos/containerizer.cpp, lines 1161-1167
  https://reviews.apache.org/r/30510/diff/1/?file=843874#file843874line1161
 
  How can we ensure that such a filesystem isolator is being used? 
  Shouldn't all of this functionality be in one place!?
 
 Jie Yu wrote:
 For now, in master, we will only allow relative non-nested container path.

Revised the comments.


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30510/#review70617
---


On Feb. 9, 2015, 6 p.m., Jie Yu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30510/
 ---
 
 (Updated Feb. 9, 2015, 6 p.m.)
 
 
 Review request for mesos, Ben Mahler, Ian Downes, Michael Park, and Vinod 
 Kone.
 
 
 Bugs: MESOS-2031
 https://issues.apache.org/jira/browse/MESOS-2031
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Allowed Mesos containerizer to prepare and update volumes.
 
 Mesos containerizer setup volumes for the container.
 
 
 Diffs
 -
 
   src/slave/containerizer/mesos/containerizer.hpp 
 b3aafe4efa9f3469d7a3fd39243ad66b46d6a54d 
   src/slave/containerizer/mesos/containerizer.cpp 
 fa40d47aee7803833bcde6cce1e86a21d7cf27d0 
 
 Diff: https://reviews.apache.org/r/30510/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Jie Yu
 




Re: Review Request 30510: Allowed Mesos containerizer to prepare and update volumes.

2015-02-09 Thread Jie Yu


 On Feb. 9, 2015, 8:24 p.m., Vinod Kone wrote:
  src/slave/containerizer/mesos/containerizer.cpp, line 1157
  https://reviews.apache.org/r/30510/diff/3/?file=858756#file858756line1157
 
  Why is this a CHECK? What is the guarantee that a container won't be 
  cleaned up?

Added a comment. This method is only supposed to be called when the container 
is alive.


 On Feb. 9, 2015, 8:24 p.m., Vinod Kone wrote:
  src/slave/containerizer/mesos/containerizer.cpp, lines 819-820
  https://reviews.apache.org/r/30510/diff/3/?file=858756#file858756line819
 
  Why pull it into the middle of 'futures' handling?

Added a comment.


 On Feb. 9, 2015, 8:24 p.m., Vinod Kone wrote:
  src/slave/containerizer/mesos/containerizer.cpp, line 1181
  https://reviews.apache.org/r/30510/diff/3/?file=858756#file858756line1181
 
  Add a comment here,

Done.


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30510/#review71690
---


On Feb. 9, 2015, 6 p.m., Jie Yu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30510/
 ---
 
 (Updated Feb. 9, 2015, 6 p.m.)
 
 
 Review request for mesos, Ben Mahler, Ian Downes, Michael Park, and Vinod 
 Kone.
 
 
 Bugs: MESOS-2031
 https://issues.apache.org/jira/browse/MESOS-2031
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Allowed Mesos containerizer to prepare and update volumes.
 
 Mesos containerizer setup volumes for the container.
 
 
 Diffs
 -
 
   src/slave/containerizer/mesos/containerizer.hpp 
 b3aafe4efa9f3469d7a3fd39243ad66b46d6a54d 
   src/slave/containerizer/mesos/containerizer.cpp 
 fa40d47aee7803833bcde6cce1e86a21d7cf27d0 
 
 Diff: https://reviews.apache.org/r/30510/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Jie Yu
 




Re: Review Request 30508: Prepare persistent volumes in slave.

2015-02-09 Thread Jie Yu


 On Feb. 9, 2015, 8:12 p.m., Vinod Kone wrote:
  src/slave/slave.cpp, line 3765
  https://reviews.apache.org/r/30508/diff/2/?file=858752#file858752line3765
 
  also, add a check for persistence?

This is not needed because the if guard above should have already verified it.


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30508/#review71689
---


On Feb. 9, 2015, 6 p.m., Jie Yu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30508/
 ---
 
 (Updated Feb. 9, 2015, 6 p.m.)
 
 
 Review request for mesos, Ben Mahler, Michael Park, and Vinod Kone.
 
 
 Bugs: MESOS-2031
 https://issues.apache.org/jira/browse/MESOS-2031
 
 
 Repository: mesos
 
 
 Description
 ---
 
 This patch prepares persistent volumes in slave when receiving new 
 task/executor.
 
 
 Diffs
 -
 
   src/slave/slave.hpp 9adee17cb94a72f0e1e139b3fd8978a9a1ff6237 
   src/slave/slave.cpp fff2d725fe49eee984d9151cfb2131202c47994f 
 
 Diff: https://reviews.apache.org/r/30508/diff/
 
 
 Testing
 ---
 
 make check.
 
 
 Thanks,
 
 Jie Yu
 




Re: Review Request 30508: Prepare persistent volumes in slave.

2015-02-09 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30508/
---

(Updated Feb. 9, 2015, 10:37 p.m.)


Review request for mesos, Ben Mahler, Michael Park, and Vinod Kone.


Changes
---

Review comments.

Moved the logic to checkpointResources. Added a test.


Bugs: MESOS-2031
https://issues.apache.org/jira/browse/MESOS-2031


Repository: mesos


Description
---

This patch prepares persistent volumes in slave when receiving new 
task/executor.


Diffs (updated)
-

  src/slave/slave.cpp 7a29f86bf873141759957a6ef5502f6a3b6ab269 
  src/tests/persistent_volume_tests.cpp 
e3a8d33b6ae2b6afe5f4ab7c029b2f37a861f112 

Diff: https://reviews.apache.org/r/30508/diff/


Testing
---

make check.


Thanks,

Jie Yu



Re: Review Request 30509: Added executor working directory to Container struct in Mesos containerizer.

2015-02-09 Thread Jie Yu


 On Feb. 5, 2015, 1:24 a.m., Timothy Chen wrote:
  src/slave/containerizer/mesos/containerizer.hpp, line 260
  https://reviews.apache.org/r/30509/diff/2/?file=849046#file849046line260
 
  I wonder if we should consolidate our terminalogy for the working 
  directory.
  I've been calling it sandbox in the docker containerizer, and sometimes 
  it's referred to workDir, and here just directory.
  
  Can we just call this sandbox?

I agree that we should make it consistent. However, I have a different opinion 
about whether the executor working directory should be named as 'sandbox'. The 
main reason is that we may want to add more stuff to executor's working 
directory and make 'sandbox' the actual sandbox for executors/tasks. For 
example, the following layout:
```
executor_working_dir
   |--- sandbox/   // the actual sandbox for executor/tasks
   |--- volumes/   // volumes for the executor/tasks, will be bind mount into 
container under /mnt/mesos/volumes
```

Since MesosContainerizer currently uses 'directory' as the executor working 
directory already, I would rather keep it consistent for now.


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30509/#review71118
---


On Feb. 9, 2015, 6 p.m., Jie Yu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30509/
 ---
 
 (Updated Feb. 9, 2015, 6 p.m.)
 
 
 Review request for mesos, Ben Mahler, Ian Downes, Michael Park, and Vinod 
 Kone.
 
 
 Bugs: MESOS-2031
 https://issues.apache.org/jira/browse/MESOS-2031
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Added executor working directory to Container struct in Mesos containerizer.
 
 The executor directory is needed when linking persistent volumes inside the 
 container.
 
 
 Diffs
 -
 
   src/slave/containerizer/mesos/containerizer.hpp 
 b3aafe4efa9f3469d7a3fd39243ad66b46d6a54d 
   src/slave/containerizer/mesos/containerizer.cpp 
 fa40d47aee7803833bcde6cce1e86a21d7cf27d0 
 
 Diff: https://reviews.apache.org/r/30509/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Jie Yu
 




Re: Review Request 30510: Allowed Mesos containerizer to prepare and update volumes.

2015-02-09 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30510/
---

(Updated Feb. 10, 2015, 1:09 a.m.)


Review request for mesos, Ben Mahler, Ian Downes, Michael Park, and Vinod Kone.


Changes
---

Review comments. Added tests.


Bugs: MESOS-2031
https://issues.apache.org/jira/browse/MESOS-2031


Repository: mesos


Description
---

Allowed Mesos containerizer to prepare and update volumes.

Mesos containerizer setup volumes for the container.


Diffs (updated)
-

  src/slave/containerizer/mesos/containerizer.hpp 
b3aafe4efa9f3469d7a3fd39243ad66b46d6a54d 
  src/slave/containerizer/mesos/containerizer.cpp 
fa40d47aee7803833bcde6cce1e86a21d7cf27d0 
  src/tests/persistent_volume_tests.cpp 
e3a8d33b6ae2b6afe5f4ab7c029b2f37a861f112 

Diff: https://reviews.apache.org/r/30510/diff/


Testing
---

make check


Thanks,

Jie Yu



Re: Review Request 30694: Fixed bugs in CREATE/DESTROY operation handlers and added tests.

2015-02-08 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30694/
---

(Updated Feb. 6, 2015, 11:31 p.m.)


Review request for mesos, Ben Mahler, Michael Park, and Vinod Kone.


Changes
---

Review comments.


Bugs: MESOS-2100
https://issues.apache.org/jira/browse/MESOS-2100


Repository: mesos


Description
---

Fixed bugs in CREATE/DESTROY operation handlers and added tests.


Diffs (updated)
-

  include/mesos/resources.hpp 3b57568c10233a0c692787de6464f21af5eaadf4 
  src/Makefile.am 93537d17d3c7604a8532ee1453e405630c481ddc 
  src/master/master.hpp c3c77f840f089c5754c764e7f150a3c1971d311f 
  src/master/master.cpp 22fece79c6e511a1b61eb674d7f82216e7a25e00 
  src/tests/persistent_volume_tests.cpp PRE-CREATION 

Diff: https://reviews.apache.org/r/30694/diff/


Testing
---

make check


Thanks,

Jie Yu



Re: Review Request 30694: Fixed bugs in CREATE/DESTROY operation handlers and added tests.

2015-02-08 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30694/
---

(Updated Feb. 6, 2015, 9:43 p.m.)


Review request for mesos, Ben Mahler, Michael Park, and Vinod Kone.


Changes
---

Review comments.


Bugs: MESOS-2100
https://issues.apache.org/jira/browse/MESOS-2100


Repository: mesos


Description
---

Fixed bugs in CREATE/DESTROY operation handlers and added tests.


Diffs (updated)
-

  include/mesos/resources.hpp 3b57568c10233a0c692787de6464f21af5eaadf4 
  src/Makefile.am 93537d17d3c7604a8532ee1453e405630c481ddc 
  src/master/master.hpp c3c77f840f089c5754c764e7f150a3c1971d311f 
  src/master/master.cpp 22fece79c6e511a1b61eb674d7f82216e7a25e00 
  src/tests/persistent_volume_tests.cpp PRE-CREATION 

Diff: https://reviews.apache.org/r/30694/diff/


Testing
---

make check


Thanks,

Jie Yu



Re: Review Request 30694: Fixed bugs in CREATE/DESTROY operation handlers and added tests.

2015-02-06 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30694/
---

(Updated Feb. 6, 2015, 7:47 p.m.)


Review request for mesos, Ben Mahler, Michael Park, and Vinod Kone.


Changes
---

Review comments.


Bugs: MESOS-2100
https://issues.apache.org/jira/browse/MESOS-2100


Repository: mesos


Description
---

Fixed bugs in CREATE/DESTROY operation handlers and added tests.


Diffs (updated)
-

  include/mesos/resources.hpp 3b57568c10233a0c692787de6464f21af5eaadf4 
  src/Makefile.am 93537d17d3c7604a8532ee1453e405630c481ddc 
  src/master/master.hpp dcfd38ae2fa9e1bd0b477e9719724dba37114d30 
  src/master/master.cpp 234bbecc4205036d790b026abd59100eb188f913 
  src/tests/persistent_volume_tests.cpp PRE-CREATION 

Diff: https://reviews.apache.org/r/30694/diff/


Testing
---

make check


Thanks,

Jie Yu



Re: Review Request 30694: Fixed bugs in CREATE/DESTROY operation handlers and added tests.

2015-02-06 Thread Jie Yu


 On Feb. 6, 2015, 1:25 a.m., Ben Mahler wrote:
  src/master/master.cpp, lines 4787-4792
  https://reviews.apache.org/r/30694/diff/1/?file=851240#file851240line4787
 
  Hm.. relying on this error to determine whether the volume applies to 
  checkpointed resources seems a bit implicit. How do we know it's not due to 
  some other error? :)
  
  Have you considered something like the following approach:
  
  ```
  // There should be an easy way to get the slave's current total 
  resources
  // (checkpointed + non-checkpointed) since we likely need to use this
  // elsewhere too (e.g. http endpoints).
  Resources slaveResources; // checkpointed + non-checkopinted
  
  // Add the volumes.
  TryResources resources = slaveResources.apply(operation.create());
  
  CHECK_SOME(resources);
  
  slaveResources = resources.get();
  slave-checkpointedResources = slaveResources.filter(checkpointFilter);
  ```
  
  This approach should work for the DESTROY case as well, thoughts?

Refactored. Much better. Let me know!


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30694/#review71316
---


On Feb. 6, 2015, 7:47 p.m., Jie Yu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30694/
 ---
 
 (Updated Feb. 6, 2015, 7:47 p.m.)
 
 
 Review request for mesos, Ben Mahler, Michael Park, and Vinod Kone.
 
 
 Bugs: MESOS-2100
 https://issues.apache.org/jira/browse/MESOS-2100
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Fixed bugs in CREATE/DESTROY operation handlers and added tests.
 
 
 Diffs
 -
 
   include/mesos/resources.hpp 3b57568c10233a0c692787de6464f21af5eaadf4 
   src/Makefile.am 93537d17d3c7604a8532ee1453e405630c481ddc 
   src/master/master.hpp dcfd38ae2fa9e1bd0b477e9719724dba37114d30 
   src/master/master.cpp 234bbecc4205036d790b026abd59100eb188f913 
   src/tests/persistent_volume_tests.cpp PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/30694/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Jie Yu
 




Re: Review Request 28809: Started to maintain and checkpoint persisted resource in slave.

2015-02-05 Thread Jie Yu


 On Feb. 5, 2015, 7:38 p.m., Ben Mahler wrote:
  src/slave/slave.cpp, lines 1335-1353
  https://reviews.apache.org/r/28809/diff/4/?file=848405#file848405line1335
 
  Is it safe to do this once on the combined resources?
  
  ```
  Resources resources = task.resources();
  
  if (task.has_executor()) {
resources += task.executor().resources();
  }
  
  foreach (const Resource resource, resources) {
...
  }
  ```

It's safe, but I want to print different ERROR messages for task and executor.


 On Feb. 5, 2015, 7:38 p.m., Ben Mahler wrote:
  src/slave/slave.cpp, lines 3478-3479
  https://reviews.apache.org/r/28809/diff/4/?file=848405#file848405line3478
 
  It's too bad the `errors` are just unsigned ints and not a collection 
  of `OptionError`s that we can print here.

Added a TODO.


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28809/#review71279
---


On Feb. 4, 2015, 7:13 p.m., Jie Yu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/28809/
 ---
 
 (Updated Feb. 4, 2015, 7:13 p.m.)
 
 
 Review request for mesos and Ben Mahler.
 
 
 Bugs: MESOS-2031
 https://issues.apache.org/jira/browse/MESOS-2031
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Started to maintain and checkpoint persisted resource in slave. That includes:
 1) responds to update resources message
 2) checkpoint resources
 3) recover checkpointed resources
 4) send checkpointed resources during register/reregister
 
 
 Diffs
 -
 
   src/slave/slave.hpp 70bd8c1fde4ea09fa54c76aa93424a1adb0309f6 
   src/slave/slave.cpp a8b262174ab5c9a524db8318d3d1438cd75a702b 
 
 Diff: https://reviews.apache.org/r/28809/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Jie Yu
 




Re: Review Request 28809: Started to maintain and checkpoint persisted resource in slave.

2015-02-05 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28809/
---

(Updated Feb. 6, 2015, 12:05 a.m.)


Review request for mesos and Ben Mahler.


Changes
---

Addressed review comments and added tests.


Bugs: MESOS-2031
https://issues.apache.org/jira/browse/MESOS-2031


Repository: mesos


Description
---

Started to maintain and checkpoint persisted resource in slave. That includes:
1) responds to update resources message
2) checkpoint resources
3) recover checkpointed resources
4) send checkpointed resources during register/reregister


Diffs (updated)
-

  src/slave/slave.hpp 70bd8c1fde4ea09fa54c76aa93424a1adb0309f6 
  src/slave/slave.cpp a8b262174ab5c9a524db8318d3d1438cd75a702b 
  src/slave/state.hpp f92808ad9b1623cea0c35ec735c53a3d6457bdbe 
  src/tests/persistent_volume_tests.cpp PRE-CREATION 

Diff: https://reviews.apache.org/r/28809/diff/


Testing
---

make check


Thanks,

Jie Yu



Review Request 30682: Changed default disk quota check interval to be 15 seconds.

2015-02-05 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30682/
---

Review request for mesos and Vinod Kone.


Repository: mesos


Description
---

Changed default disk quota check interval to be 15 seconds.


Diffs
-

  src/slave/flags.hpp facf6d3e68e412897a1381d7bbdcceaf5d1fbb94 

Diff: https://reviews.apache.org/r/30682/diff/


Testing
---

make check


Thanks,

Jie Yu



Re: Review Request 30624: Refactor and increase test coverage for slave paths

2015-02-04 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30624/#review70989
---

Ship it!



src/tests/paths_tests.cpp
https://reviews.apache.org/r/30624/#comment116478

2 lines apart here please:)



src/tests/paths_tests.cpp
https://reviews.apache.org/r/30624/#comment116480

s/Pid/LibprocessPid/

Because we also have forked pid.


- Jie Yu


On Feb. 4, 2015, 5:43 p.m., Dominic Hamon wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30624/
 ---
 
 (Updated Feb. 4, 2015, 5:43 p.m.)
 
 
 Review request for mesos and Jie Yu.
 
 
 Bugs: MESOS-2314
 https://issues.apache.org/jira/browse/MESOS-2314
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Refactor and increase test coverage for slave paths
 
 
 Diffs
 -
 
   src/tests/paths_tests.cpp 417aa51b99d54055ad7254b4f274fb59d0794ca6 
 
 Diff: https://reviews.apache.org/r/30624/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Dominic Hamon
 




Re: Review Request 30531: Remove strings::format and unnecessary constants from paths

2015-02-04 Thread Jie Yu


 On Feb. 4, 2015, 1:32 a.m., Jie Yu wrote:
  src/slave/paths.cpp, line 453
  https://reviews.apache.org/r/30531/diff/1/?file=844643#file844643line453
 
  const string directory
 
 Dominic Hamon wrote:
 the function returns a string by value. there's no improvement here using 
 a const ref (the copy will be elided). using a plan string is much clearer i 
 think.
 
 Jie Yu wrote:
 Does it save a call to the copy constructor (from the temp object to 
 'directory')? Or the compiler will optimize it?

OK, I found this
http://en.wikipedia.org/wiki/Copy_elision

As long as the copy constructor of std::string does not have side effect, the 
copy will be elided. Look good to me. Feel free to commit.


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30531/#review70885
---


On Feb. 4, 2015, 5:22 p.m., Dominic Hamon wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30531/
 ---
 
 (Updated Feb. 4, 2015, 5:22 p.m.)
 
 
 Review request for mesos and Jie Yu.
 
 
 Bugs: MESOS-2314
 https://issues.apache.org/jira/browse/MESOS-2314
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Remove strings::format and unnecessary constants from paths
 
 
 Diffs
 -
 
   src/slave/paths.hpp 4d6897f9eebcd6852f0223bae23a29729a42e750 
   src/slave/paths.cpp fe951abac1254748dbd5bdd81b7e50da75afad6d 
 
 Diff: https://reviews.apache.org/r/30531/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Dominic Hamon
 




Re: Review Request 30531: Remove strings::format and unnecessary constants from paths

2015-02-04 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30531/#review70984
---

Ship it!



src/slave/paths.cpp
https://reviews.apache.org/r/30531/#comment116466

please use const reference here

const string



src/slave/paths.cpp
https://reviews.apache.org/r/30531/#comment116468

Ditto.



src/slave/paths.cpp
https://reviews.apache.org/r/30531/#comment116469

Ditto.



src/slave/paths.cpp
https://reviews.apache.org/r/30531/#comment116470

Ditto.


- Jie Yu


On Feb. 4, 2015, 5:22 p.m., Dominic Hamon wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30531/
 ---
 
 (Updated Feb. 4, 2015, 5:22 p.m.)
 
 
 Review request for mesos and Jie Yu.
 
 
 Bugs: MESOS-2314
 https://issues.apache.org/jira/browse/MESOS-2314
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Remove strings::format and unnecessary constants from paths
 
 
 Diffs
 -
 
   src/slave/paths.hpp 4d6897f9eebcd6852f0223bae23a29729a42e750 
   src/slave/paths.cpp fe951abac1254748dbd5bdd81b7e50da75afad6d 
 
 Diff: https://reviews.apache.org/r/30531/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Dominic Hamon
 




Re: Review Request 30531: Remove strings::format and unnecessary constants from paths

2015-02-04 Thread Jie Yu


 On Feb. 4, 2015, 1:32 a.m., Jie Yu wrote:
  src/slave/paths.cpp, line 453
  https://reviews.apache.org/r/30531/diff/1/?file=844643#file844643line453
 
  const string directory
 
 Dominic Hamon wrote:
 the function returns a string by value. there's no improvement here using 
 a const ref (the copy will be elided). using a plan string is much clearer i 
 think.

Does it save a call to the copy constructor (from the temp object to 
'directory')? Or the compiler will optimize it?


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30531/#review70885
---


On Feb. 4, 2015, 5:22 p.m., Dominic Hamon wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30531/
 ---
 
 (Updated Feb. 4, 2015, 5:22 p.m.)
 
 
 Review request for mesos and Jie Yu.
 
 
 Bugs: MESOS-2314
 https://issues.apache.org/jira/browse/MESOS-2314
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Remove strings::format and unnecessary constants from paths
 
 
 Diffs
 -
 
   src/slave/paths.hpp 4d6897f9eebcd6852f0223bae23a29729a42e750 
   src/slave/paths.cpp fe951abac1254748dbd5bdd81b7e50da75afad6d 
 
 Diff: https://reviews.apache.org/r/30531/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Dominic Hamon
 




Re: Review Request 30628: Add an optional container rootfs for MesosContainerizer.

2015-02-04 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30628/#review70999
---

Ship it!


- Jie Yu


On Feb. 4, 2015, 6:07 p.m., Ian Downes wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30628/
 ---
 
 (Updated Feb. 4, 2015, 6:07 p.m.)
 
 
 Review request for mesos and Jie Yu.
 
 
 Repository: mesos
 
 
 Description
 ---
 
 This is in preparation for supporting different root filesystems for 
 containers using the MesosContainerizer.
 
 
 Diffs
 -
 
   src/slave/containerizer/mesos/containerizer.hpp 
 802988c90ac872b0cefa5e28f06e6fec98e8d032 
 
 Diff: https://reviews.apache.org/r/30628/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Ian Downes
 




Re: Review Request 29918: Introduced a generic checkpoint function.

2015-02-04 Thread Jie Yu


 On Jan. 23, 2015, 6:44 p.m., Jie Yu wrote:
  src/slave/state.hpp, line 120
  https://reviews.apache.org/r/29918/diff/9/?file=829596#file829596line120
 
  One more thought:
  
  Can we introduce a third parameter for this function to let users opt 
  out the temp + rename (we can use a default value)?
  
  This function is super critical. What if /tmp has too many links? It'll 
  cause slave to flap. For majority of the checkpointing work, we don't need 
  this temp + rename because we know the old file does not exist.
  
  What do you think?
 
 Michael Park wrote:
 But if we want to guarantee that the state of the checkpointed file is 
 always valid, I don't think we can get away from writing to temp + renaming.
 For the case where the old file does not exist, we'd do something like, 
 write directly to the file if it doesn't already exist, and delete on error?
 We would end up with an invalid checkpoint file if we crash while 
 writing, or run into an error and crash before deleting the file.
 
 I'm also not sure if /tmp having too many links is the level of problem 
 we need to worry about? It seems like a very unlikely case to me, but I'm not 
 sure.
 
 Just my thoughts.

Looks like it causes this bug: https://issues.apache.org/jira/browse/MESOS-2319


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29918/#review69435
---


On Jan. 24, 2015, 1:21 a.m., Michael Park wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/29918/
 ---
 
 (Updated Jan. 24, 2015, 1:21 a.m.)
 
 
 Review request for mesos, Adam B, Benjamin Hindman, Ben Mahler, Jie Yu, and 
 Vinod Kone.
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Introduced a generic checkpoint function.
 
 
 Diffs
 -
 
   src/slave/state.hpp 70777cf6ab681c29ca4df601fe47903e1dbdf41f 
   src/slave/state.cpp a36fa53099300ee03f051b0f5eaaafe9f1da68d1 
   src/tests/slave_recovery_tests.cpp 809822e63b05a21418cd9297c927d656d6fd871d 
 
 Diff: https://reviews.apache.org/r/29918/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Michael Park
 




Re: Review Request 30514: Rate limited the removal of slaves failing health checks.

2015-02-04 Thread Jie Yu


 On Feb. 4, 2015, 8:01 p.m., Ben Mahler wrote:
  src/master/master.cpp, lines 197-246
  https://reviews.apache.org/r/30514/diff/5/?file=847032#file847032line197
 
  It looks like we only log when we're shutting down the slave, when 
  there's no rate limiter?
  
  Have you considered having a single shutdown code path? This gets you a 
  single dispatch, and a single log statement and it looks like it's less 
  control flow?
  
  ```
void shutdown()
{
  if (shuttingDown.isNone()) {
FutureNothing future = Nothing();
  
if (limiter.isSome()) {
  LOG(INFO)  Scheduling shutdown of slave   slaveId
  due to health check timeout;
  
  future = limiter.get()-acquire();
}
  
shuttingDown = future.onAny(defer(self(), Self::_shutdown));
  
++metrics-slave_shutdowns_scheduled;
  }
}
  
void _shutdown()
{
  CHECK_SOME(shuttingDown);
  
  const FutureNothing future = shuttingDown.get();
  
  CHECK(!future.isFailed())  future.failure();
  
  if (future.isReady()) {
LOG(INFO)  Shutting down slave   slaveId
due to health check timeout;
  
dispatch(master,
 Master::shutdownSlave,
 slaveId,
 health check timed out);
  } else if (future.isDiscarded()) {
// Shutdown was canceled.
LOG(INFO)  Canceling shutdown of slave   slaveId
since a pong is received!;
  
++metrics-slave_shutdowns_canceled;
  }
  
  shuttingDown = None();
}
  ```

Having both 'future' and 'shuttingDown' in 'shutdown' method looks confusing to 
me:) You need to comment about why you need that future. Or, I think having dup 
code is not a too bad thing in this case as the logic is more clear IMO.


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30514/#review71021
---


On Feb. 4, 2015, 1:51 a.m., Vinod Kone wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30514/
 ---
 
 (Updated Feb. 4, 2015, 1:51 a.m.)
 
 
 Review request for mesos, Ben Mahler, David Robinson, and Jie Yu.
 
 
 Bugs: MESOS-1148
 https://issues.apache.org/jira/browse/MESOS-1148
 
 
 Repository: mesos
 
 
 Description
 ---
 
 The algorithm is simple. All the slave observers share a rate limiter whose 
 rate is configured via command line. When a slave times out on health check, 
 a permit is acquired to shutdown the slave *but* the pings are continued to 
 be sent. If a pong arrives before the permit is satisifed, the shutdown is 
 cancelled.
 
 
 Diffs
 -
 
   src/master/flags.hpp 6c18a1af625311ef149b5877b08f63c2b12c040d 
   src/master/master.hpp cd37ee9d3c57bcd91f08cd402ec1e4a09d9e42ee 
   src/master/master.cpp d04b2c4041d8fe8978b877f07579a6f907903e1b 
   src/tests/partition_tests.cpp fea78016268b007590516798eb30ff423fd0ae58 
   src/tests/slave_tests.cpp e7e2af63da785644f3f7e6e23607c02be962a2c6 
 
 Diff: https://reviews.apache.org/r/30514/diff/
 
 
 Testing
 ---
 
 make check
 
 Ran the new tests 100 times
 
 
 Thanks,
 
 Vinod Kone
 




Re: Review Request 28809: Started to maintain and checkpoint persisted resource in slave.

2015-02-04 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28809/
---

(Updated Feb. 4, 2015, 7:13 p.m.)


Review request for mesos and Ben Mahler.


Changes
---

Rebased.


Bugs: MESOS-2031
https://issues.apache.org/jira/browse/MESOS-2031


Repository: mesos


Description
---

Started to maintain and checkpoint persisted resource in slave. That includes:
1) responds to update resources message
2) checkpoint resources
3) recover checkpointed resources
4) send checkpointed resources during register/reregister


Diffs (updated)
-

  src/slave/slave.hpp 70bd8c1fde4ea09fa54c76aa93424a1adb0309f6 
  src/slave/slave.cpp a8b262174ab5c9a524db8318d3d1438cd75a702b 

Diff: https://reviews.apache.org/r/28809/diff/


Testing
---

make check


Thanks,

Jie Yu



Re: Review Request 30514: Rate limited the removal of slaves failing health checks.

2015-02-04 Thread Jie Yu


 On Feb. 4, 2015, 8:01 p.m., Ben Mahler wrote:
  src/master/master.cpp, lines 197-246
  https://reviews.apache.org/r/30514/diff/5/?file=847032#file847032line197
 
  It looks like we only log when we're shutting down the slave, when 
  there's no rate limiter?
  
  Have you considered having a single shutdown code path? This gets you a 
  single dispatch, and a single log statement and it looks like it's less 
  control flow?
  
  ```
void shutdown()
{
  if (shuttingDown.isNone()) {
FutureNothing future = Nothing();
  
if (limiter.isSome()) {
  LOG(INFO)  Scheduling shutdown of slave   slaveId
  due to health check timeout;
  
  future = limiter.get()-acquire();
}
  
shuttingDown = future.onAny(defer(self(), Self::_shutdown));
  
++metrics-slave_shutdowns_scheduled;
  }
}
  
void _shutdown()
{
  CHECK_SOME(shuttingDown);
  
  const FutureNothing future = shuttingDown.get();
  
  CHECK(!future.isFailed())  future.failure();
  
  if (future.isReady()) {
LOG(INFO)  Shutting down slave   slaveId
due to health check timeout;
  
dispatch(master,
 Master::shutdownSlave,
 slaveId,
 health check timed out);
  } else if (future.isDiscarded()) {
// Shutdown was canceled.
LOG(INFO)  Canceling shutdown of slave   slaveId
since a pong is received!;
  
++metrics-slave_shutdowns_canceled;
  }
  
  shuttingDown = None();
}
  ```
 
 Jie Yu wrote:
 Having both 'future' and 'shuttingDown' in 'shutdown' method looks 
 confusing to me:) You need to comment about why you need that future. Or, I 
 think having dup code is not a too bad thing in this case as the logic is 
 more clear IMO.

Another option to avoid dup code is:

1) s/shutdown/scheduleShutdown
2) let 'shutdown' be the actual shutdown (sending message)


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30514/#review71021
---


On Feb. 4, 2015, 1:51 a.m., Vinod Kone wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30514/
 ---
 
 (Updated Feb. 4, 2015, 1:51 a.m.)
 
 
 Review request for mesos, Ben Mahler, David Robinson, and Jie Yu.
 
 
 Bugs: MESOS-1148
 https://issues.apache.org/jira/browse/MESOS-1148
 
 
 Repository: mesos
 
 
 Description
 ---
 
 The algorithm is simple. All the slave observers share a rate limiter whose 
 rate is configured via command line. When a slave times out on health check, 
 a permit is acquired to shutdown the slave *but* the pings are continued to 
 be sent. If a pong arrives before the permit is satisifed, the shutdown is 
 cancelled.
 
 
 Diffs
 -
 
   src/master/flags.hpp 6c18a1af625311ef149b5877b08f63c2b12c040d 
   src/master/master.hpp cd37ee9d3c57bcd91f08cd402ec1e4a09d9e42ee 
   src/master/master.cpp d04b2c4041d8fe8978b877f07579a6f907903e1b 
   src/tests/partition_tests.cpp fea78016268b007590516798eb30ff423fd0ae58 
   src/tests/slave_tests.cpp e7e2af63da785644f3f7e6e23607c02be962a2c6 
 
 Diff: https://reviews.apache.org/r/30514/diff/
 
 
 Testing
 ---
 
 make check
 
 Ran the new tests 100 times
 
 
 Thanks,
 
 Vinod Kone
 




Review Request 30635: Fixed MESOS-2319 by creating the temporary file under the same base directory.

2015-02-04 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30635/
---

Review request for mesos, Ben Mahler, Michael Park, and Vinod Kone.


Bugs: MESOS-2319
https://issues.apache.org/jira/browse/MESOS-2319


Repository: mesos


Description
---

Fixed MESOS-2319 by creating the temporary file under the same base directory.


Diffs
-

  src/slave/state.hpp de631fb2c8a8d2bcbb861c438b18141ba6211024 

Diff: https://reviews.apache.org/r/30635/diff/


Testing
---

make check


Thanks,

Jie Yu



Re: Review Request 30514: Rate limited the removal of slaves failing health checks.

2015-02-04 Thread Jie Yu


 On Feb. 4, 2015, 8:01 p.m., Ben Mahler wrote:
  src/master/master.cpp, lines 197-246
  https://reviews.apache.org/r/30514/diff/5/?file=847032#file847032line197
 
  It looks like we only log when we're shutting down the slave, when 
  there's no rate limiter?
  
  Have you considered having a single shutdown code path? This gets you a 
  single dispatch, and a single log statement and it looks like it's less 
  control flow?
  
  ```
void shutdown()
{
  if (shuttingDown.isNone()) {
FutureNothing future = Nothing();
  
if (limiter.isSome()) {
  LOG(INFO)  Scheduling shutdown of slave   slaveId
  due to health check timeout;
  
  future = limiter.get()-acquire();
}
  
shuttingDown = future.onAny(defer(self(), Self::_shutdown));
  
++metrics-slave_shutdowns_scheduled;
  }
}
  
void _shutdown()
{
  CHECK_SOME(shuttingDown);
  
  const FutureNothing future = shuttingDown.get();
  
  CHECK(!future.isFailed())  future.failure();
  
  if (future.isReady()) {
LOG(INFO)  Shutting down slave   slaveId
due to health check timeout;
  
dispatch(master,
 Master::shutdownSlave,
 slaveId,
 health check timed out);
  } else if (future.isDiscarded()) {
// Shutdown was canceled.
LOG(INFO)  Canceling shutdown of slave   slaveId
since a pong is received!;
  
++metrics-slave_shutdowns_canceled;
  }
  
  shuttingDown = None();
}
  ```
 
 Jie Yu wrote:
 Having both 'future' and 'shuttingDown' in 'shutdown' method looks 
 confusing to me:) You need to comment about why you need that future. Or, I 
 think having dup code is not a too bad thing in this case as the logic is 
 more clear IMO.
 
 Jie Yu wrote:
 Another option to avoid dup code is:
 
 1) s/shutdown/scheduleShutdown
 2) let 'shutdown' be the actual shutdown (sending message)
 
 Ben Mahler wrote:
 Imagine it is s/future/limited/:
 
 ```
 FutureNothing limited = Nothing();
 
 if (limiter.isSome()) {
   limited = limiter.get()-acquire();
 }
 
 shuttingDown = limited.onAny(defer(self(), Self::_shutdown));
 ```
 
 Seems pretty clear? Can you flush out the second option you mentioned in 
 a code snippet? I didn't follow.
 
 Jie Yu wrote:
 Yeah, that reads much better.
 
 Vinod Kone wrote:
 hmm. i actually liked the explicitness of the earlier version. needed 
 less cognitive overhead. but also liked the proposal of always calling 
 _shutdown(). here's my proposal:
 
 ```
   void shutdown()
   {
 if (shuttingDown.isSome()) {
   // Shutdown is already in progress.
   return;
 }
 
 if (limiter.isSome()) {
   // If a rate limit is specified schedule the shutdown according
   // to the rate limit.
   LOG(INFO)  Scheduling shutdown of slave   slaveId
   due to health check timeout;
 
   shuttingDown = limiter.get()-acquire();
 } else {
   // If no rate limit is specified shutdown right away.
   shuttingDown = FutureNothing(Nothing()); // Ready future.
 }
 
 shuttingDown.get()
   .onAny(defer(self(), Self::_shutdown));
   }
 
 ```
 
 what do you guys think?

I like the idea, but this won't compile since shuttingDown.get() returns a 
const ref and onAny is a non-const function ;-)


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30514/#review71021
---


On Feb. 4, 2015, 1:51 a.m., Vinod Kone wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30514/
 ---
 
 (Updated Feb. 4, 2015, 1:51 a.m.)
 
 
 Review request for mesos, Ben Mahler, David Robinson, and Jie Yu.
 
 
 Bugs: MESOS-1148
 https://issues.apache.org/jira/browse/MESOS-1148
 
 
 Repository: mesos
 
 
 Description
 ---
 
 The algorithm is simple. All the slave observers share a rate limiter whose 
 rate is configured via command line. When a slave times out on health check, 
 a permit is acquired to shutdown the slave *but* the pings are continued to 
 be sent. If a pong arrives before the permit is satisifed, the shutdown is 
 cancelled.
 
 
 Diffs
 -
 
   src/master/flags.hpp 6c18a1af625311ef149b5877b08f63c2b12c040d 
   src/master/master.hpp

Re: Review Request 30635: Fixed MESOS-2319 by creating the temporary file under the same base directory.

2015-02-04 Thread Jie Yu


 On Feb. 4, 2015, 8:31 p.m., Ben Mahler wrote:
  Is there a test of this function?
  
  Might be nice to add a NOTE or TODO about how to address the issue of 
  leaving dangling files when we failover at the wrong time?
 
 Jie Yu wrote:
 Yes, SlaveStateTest should catch the regression.
 
 Will add a NOTE, the dangline file will most likely be GCed with the 
 directory since we cannot recover the task/executor.

In fact, this is not true for resources.info. Will add a TODO to cleanup 
dangling temp files.


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30635/#review71035
---


On Feb. 4, 2015, 7:38 p.m., Jie Yu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30635/
 ---
 
 (Updated Feb. 4, 2015, 7:38 p.m.)
 
 
 Review request for mesos, Ben Mahler, Michael Park, and Vinod Kone.
 
 
 Bugs: MESOS-2319
 https://issues.apache.org/jira/browse/MESOS-2319
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Fixed MESOS-2319 by creating the temporary file under the same base directory.
 
 
 Diffs
 -
 
   src/slave/state.hpp de631fb2c8a8d2bcbb861c438b18141ba6211024 
 
 Diff: https://reviews.apache.org/r/30635/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Jie Yu
 




Re: Review Request 30386: Added support for CREATE operation in master.

2015-02-04 Thread Jie Yu


 On Feb. 4, 2015, 8:56 p.m., Ben Mahler wrote:
  src/master/master.cpp, line 1348
  https://reviews.apache.org/r/30386/diff/2/?file=846897#file846897line1348
 
  Maybe we should add a little note there that there is no direct 
  feedback to the framework when an operation is dropped, they will find out 
  that the operation was dropped through subsequent offers.

Added a note.


 On Feb. 4, 2015, 8:56 p.m., Ben Mahler wrote:
  src/master/master.cpp, lines 4610-4613
  https://reviews.apache.org/r/30386/diff/2/?file=846897#file846897line4610
 
  Are you planning to keep this variable? Seems like every operation will 
  require checkpointing for now, hence the name of the method being 
  `updateCheckpointedResource`? :)

Killed and use CHECK in default case.


 On Feb. 4, 2015, 8:56 p.m., Ben Mahler wrote:
  src/master/master.cpp, lines 4616-4627
  https://reviews.apache.org/r/30386/diff/2/?file=846897#file846897line4616
 
  Hm.. seems like this method should enforce that the operation type is 
  supported (via CHECK or LOG(FATAL))? The caller is expected not to call 
  this with anything that might get dropped in here, right?

Yes, add LOG(FATAL) in the default case.


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30386/#review71040
---


On Feb. 4, 2015, 1:09 a.m., Jie Yu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30386/
 ---
 
 (Updated Feb. 4, 2015, 1:09 a.m.)
 
 
 Review request for mesos, Ben Mahler, Michael Park, and Vinod Kone.
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Added support for CREATE operation in master.
 
 
 Diffs
 -
 
   src/master/master.hpp cd37ee9d3c57bcd91f08cd402ec1e4a09d9e42ee 
   src/master/master.cpp d04b2c4041d8fe8978b877f07579a6f907903e1b 
 
 Diff: https://reviews.apache.org/r/30386/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Jie Yu
 




Re: Review Request 30635: Fixed MESOS-2319 by creating the temporary file under the same base directory.

2015-02-04 Thread Jie Yu


 On Feb. 4, 2015, 8:31 p.m., Ben Mahler wrote:
  Is there a test of this function?
  
  Might be nice to add a NOTE or TODO about how to address the issue of 
  leaving dangling files when we failover at the wrong time?

Yes, SlaveStateTest should catch the regression.

Will add a NOTE, the dangline file will most likely be GCed with the directory 
since we cannot recover the task/executor.


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30635/#review71035
---


On Feb. 4, 2015, 7:38 p.m., Jie Yu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30635/
 ---
 
 (Updated Feb. 4, 2015, 7:38 p.m.)
 
 
 Review request for mesos, Ben Mahler, Michael Park, and Vinod Kone.
 
 
 Bugs: MESOS-2319
 https://issues.apache.org/jira/browse/MESOS-2319
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Fixed MESOS-2319 by creating the temporary file under the same base directory.
 
 
 Diffs
 -
 
   src/slave/state.hpp de631fb2c8a8d2bcbb861c438b18141ba6211024 
 
 Diff: https://reviews.apache.org/r/30635/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Jie Yu
 




Re: Review Request 30514: Rate limited the removal of slaves failing health checks.

2015-02-04 Thread Jie Yu


 On Feb. 4, 2015, 8:01 p.m., Ben Mahler wrote:
  src/master/master.cpp, lines 197-246
  https://reviews.apache.org/r/30514/diff/5/?file=847032#file847032line197
 
  It looks like we only log when we're shutting down the slave, when 
  there's no rate limiter?
  
  Have you considered having a single shutdown code path? This gets you a 
  single dispatch, and a single log statement and it looks like it's less 
  control flow?
  
  ```
void shutdown()
{
  if (shuttingDown.isNone()) {
FutureNothing future = Nothing();
  
if (limiter.isSome()) {
  LOG(INFO)  Scheduling shutdown of slave   slaveId
  due to health check timeout;
  
  future = limiter.get()-acquire();
}
  
shuttingDown = future.onAny(defer(self(), Self::_shutdown));
  
++metrics-slave_shutdowns_scheduled;
  }
}
  
void _shutdown()
{
  CHECK_SOME(shuttingDown);
  
  const FutureNothing future = shuttingDown.get();
  
  CHECK(!future.isFailed())  future.failure();
  
  if (future.isReady()) {
LOG(INFO)  Shutting down slave   slaveId
due to health check timeout;
  
dispatch(master,
 Master::shutdownSlave,
 slaveId,
 health check timed out);
  } else if (future.isDiscarded()) {
// Shutdown was canceled.
LOG(INFO)  Canceling shutdown of slave   slaveId
since a pong is received!;
  
++metrics-slave_shutdowns_canceled;
  }
  
  shuttingDown = None();
}
  ```
 
 Jie Yu wrote:
 Having both 'future' and 'shuttingDown' in 'shutdown' method looks 
 confusing to me:) You need to comment about why you need that future. Or, I 
 think having dup code is not a too bad thing in this case as the logic is 
 more clear IMO.
 
 Jie Yu wrote:
 Another option to avoid dup code is:
 
 1) s/shutdown/scheduleShutdown
 2) let 'shutdown' be the actual shutdown (sending message)
 
 Ben Mahler wrote:
 Imagine it is s/future/limited/:
 
 ```
 FutureNothing limited = Nothing();
 
 if (limiter.isSome()) {
   limited = limiter.get()-acquire();
 }
 
 shuttingDown = limited.onAny(defer(self(), Self::_shutdown));
 ```
 
 Seems pretty clear? Can you flush out the second option you mentioned in 
 a code snippet? I didn't follow.

Yeah, that reads much better.


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30514/#review71021
---


On Feb. 4, 2015, 1:51 a.m., Vinod Kone wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30514/
 ---
 
 (Updated Feb. 4, 2015, 1:51 a.m.)
 
 
 Review request for mesos, Ben Mahler, David Robinson, and Jie Yu.
 
 
 Bugs: MESOS-1148
 https://issues.apache.org/jira/browse/MESOS-1148
 
 
 Repository: mesos
 
 
 Description
 ---
 
 The algorithm is simple. All the slave observers share a rate limiter whose 
 rate is configured via command line. When a slave times out on health check, 
 a permit is acquired to shutdown the slave *but* the pings are continued to 
 be sent. If a pong arrives before the permit is satisifed, the shutdown is 
 cancelled.
 
 
 Diffs
 -
 
   src/master/flags.hpp 6c18a1af625311ef149b5877b08f63c2b12c040d 
   src/master/master.hpp cd37ee9d3c57bcd91f08cd402ec1e4a09d9e42ee 
   src/master/master.cpp d04b2c4041d8fe8978b877f07579a6f907903e1b 
   src/tests/partition_tests.cpp fea78016268b007590516798eb30ff423fd0ae58 
   src/tests/slave_tests.cpp e7e2af63da785644f3f7e6e23607c02be962a2c6 
 
 Diff: https://reviews.apache.org/r/30514/diff/
 
 
 Testing
 ---
 
 make check
 
 Ran the new tests 100 times
 
 
 Thanks,
 
 Vinod Kone
 




Re: Review Request 30510: Allowed Mesos containerizer to prepare and update volumes.

2015-02-04 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30510/
---

(Updated Feb. 4, 2015, 10:57 p.m.)


Review request for mesos, Ben Mahler, Ian Downes, Michael Park, and Vinod Kone.


Changes
---

Rebased.


Bugs: MESOS-2031
https://issues.apache.org/jira/browse/MESOS-2031


Repository: mesos


Description
---

Allowed Mesos containerizer to prepare and update volumes.

Mesos containerizer setup volumes for the container.


Diffs (updated)
-

  src/slave/containerizer/mesos/containerizer.hpp 
569119fc591ecb118ecff51b8b19d0a1b8090cb5 
  src/slave/containerizer/mesos/containerizer.cpp 
d712278428889ebdfd598537690138329d8464f0 

Diff: https://reviews.apache.org/r/30510/diff/


Testing
---

make check


Thanks,

Jie Yu



Re: Review Request 30545: cgroups: added support to listen on memory pressures.

2015-02-04 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30545/#review71104
---


Per our dicussion, could you expose and reuse the existing EventListener? Here 
are the thoughts:

1) s/EventListener/EventListenerProcess
2) expose EventListner (the wrapper for EventListenerProcess) in the header file
3) introduce a Futureuint64_t read() function for EventListener (and 
EventListenerProcess)
4) The existing 'listen' function can be implemented as follows: create an 
EventListener, call read(), register an onAny for the returned future to delete 
EventListener
5) You need to be careful if any read fails/discarded. You probably want to 
fail all the subsequent reads if that happens (i.e., having a flag inside 
EventListenerProcess and return failed future for read() if the flag is set).

- Jie Yu


On Feb. 4, 2015, 7:50 p.m., Chi Zhang wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30545/
 ---
 
 (Updated Feb. 4, 2015, 7:50 p.m.)
 
 
 Review request for mesos, Dominic Hamon, Ian Downes, and Jie Yu.
 
 
 Bugs: MESOS-2136
 https://issues.apache.org/jira/browse/MESOS-2136
 
 
 Repository: mesos
 
 
 Description
 ---
 
 cgroups: added support to listen on memory pressures.
 
 
 Diffs
 -
 
   src/linux/cgroups.hpp abf31df 
   src/linux/cgroups.cpp 0b136e1 
 
 Diff: https://reviews.apache.org/r/30545/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Chi Zhang
 




Re: Review Request 30642: Added validation for DESTROY operation.

2015-02-04 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30642/
---

(Updated Feb. 4, 2015, 11:52 p.m.)


Review request for mesos, Ben Mahler, Michael Park, and Vinod Kone.


Changes
---

Updated.


Bugs: MESOS-2100
https://issues.apache.org/jira/browse/MESOS-2100


Repository: mesos


Description
---

Added validation for DESTROY operation.


Diffs (updated)
-

  src/master/validation.hpp 81dc7eedb23ccfd61666280f6cea511cef7eb4ff 
  src/master/validation.cpp 550782ee7f13271f050a6374a33d36133da10f03 
  src/tests/master_validation_tests.cpp 
12773110a61c4478fa968a07a969bd3127d73e5e 

Diff: https://reviews.apache.org/r/30642/diff/


Testing
---

make check


Thanks,

Jie Yu



Re: Review Request 30514: Rate limited the removal of slaves failing health checks.

2015-02-04 Thread Jie Yu


 On Feb. 4, 2015, 11:01 p.m., Ben Mahler wrote:
  src/master/master.cpp, lines 210-223
  https://reviews.apache.org/r/30514/diff/6/?file=849030#file849030line210
 
  The .get() seems a bit hacky to me (as Jie said it's not really const), 
  it seems nice to set `shuttingDown` once at the end of the method to the 
  correct future.
  
  The FutureNothing(Nothing()) also looks a bit cluttered?
  
  This was the motivation for obtaining the 'limited' or 'throttle' 
  future first:
  
  ```
  FutureNothing acquire = Nothing();
  
  if (limiter.isSome()) {
acquire = limiter.get()-acquire();
  }
  
  shuttingDown = acquire.onAny(defer(self(), Self::_shutdown));
  ```
  
  Also, I'm not sure what the comments are adding over what's in the 
  code? Thoughts?

If the LOG is not a concern, another option is:
```
shuttingDown = (limiter.isSome() ? limiter.get()-acquire() : Nothing())
  .onAny(defer(self(), Self::_shutdown));
```


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30514/#review71059
---


On Feb. 4, 2015, 10:43 p.m., Vinod Kone wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30514/
 ---
 
 (Updated Feb. 4, 2015, 10:43 p.m.)
 
 
 Review request for mesos, Ben Mahler, David Robinson, and Jie Yu.
 
 
 Bugs: MESOS-1148
 https://issues.apache.org/jira/browse/MESOS-1148
 
 
 Repository: mesos
 
 
 Description
 ---
 
 The algorithm is simple. All the slave observers share a rate limiter whose 
 rate is configured via command line. When a slave times out on health check, 
 a permit is acquired to shutdown the slave *but* the pings are continued to 
 be sent. If a pong arrives before the permit is satisifed, the shutdown is 
 cancelled.
 
 
 Diffs
 -
 
   src/master/flags.hpp 6c18a1af625311ef149b5877b08f63c2b12c040d 
   src/master/master.hpp cd37ee9d3c57bcd91f08cd402ec1e4a09d9e42ee 
   src/master/master.cpp 69b945df58c173322b95aeae9cf4ed70329c4bb3 
   src/tests/partition_tests.cpp fea78016268b007590516798eb30ff423fd0ae58 
   src/tests/slave_tests.cpp e7e2af63da785644f3f7e6e23607c02be962a2c6 
 
 Diff: https://reviews.apache.org/r/30514/diff/
 
 
 Testing
 ---
 
 make check
 
 Ran the new tests 100 times
 
 
 Thanks,
 
 Vinod Kone
 




Re: Review Request 30459: Refactored task/offer/resource valiation in master.

2015-02-03 Thread Jie Yu


 On Jan. 30, 2015, 10:32 p.m., Dominic Hamon wrote:
  src/master/master.hpp, line 518
  https://reviews.apache.org/r/30459/diff/2/?file=842079#file842079line518
 
  const Offer* ?

I'll add a TODO. Right now, I need to change a bunch of funcitons to make it 
const.


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30459/#review70429
---


On Jan. 31, 2015, 2:09 a.m., Jie Yu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30459/
 ---
 
 (Updated Jan. 31, 2015, 2:09 a.m.)
 
 
 Review request for mesos, Ben Mahler, Dominic Hamon, and Vinod Kone.
 
 
 Bugs: MESOS-2305
 https://issues.apache.org/jira/browse/MESOS-2305
 
 
 Repository: mesos-incubating
 
 
 Description
 ---
 
 Refactored task/offer/resource valiation in master.
 
 See ticket for motivation.
 
 
 Diffs
 -
 
   src/Makefile.am 07bea1fb8f0035413f2119859e16fa4f9383f68e 
   src/master/master.hpp 337e00aa46ea127f3667e3383d631c3fb8e22f30 
   src/master/master.cpp 10056861b95ed9453c971787982db7d09f09f323 
   src/master/validation.hpp PRE-CREATION 
   src/master/validation.cpp PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/30459/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Jie Yu
 




Re: Review Request 30514: Rate limited the removal of slaves failing health checks.

2015-02-03 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30514/#review70776
---

Ship it!



src/master/master.cpp
https://reviews.apache.org/r/30514/#comment116201

indent. align the ``



src/tests/slave_tests.cpp
https://reviews.apache.org/r/30514/#comment116202

Kill process:: prefix by using namespace process?


- Jie Yu


On Feb. 3, 2015, 2:39 a.m., Vinod Kone wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30514/
 ---
 
 (Updated Feb. 3, 2015, 2:39 a.m.)
 
 
 Review request for mesos, Ben Mahler, David Robinson, and Jie Yu.
 
 
 Bugs: MESOS-1148
 https://issues.apache.org/jira/browse/MESOS-1148
 
 
 Repository: mesos
 
 
 Description
 ---
 
 The algorithm is simple. All the slave observers share a rate limiter whose 
 rate is configured via command line. When a slave times out on health check, 
 a permit is acquired to shutdown the slave *but* the pings are continued to 
 be sent. If a pong arrives before the permit is satisifed, the shutdown is 
 cancelled.
 
 
 Diffs
 -
 
   src/master/flags.hpp 6c18a1af625311ef149b5877b08f63c2b12c040d 
   src/master/master.hpp 337e00aa46ea127f3667e3383d631c3fb8e22f30 
   src/master/master.cpp 10056861b95ed9453c971787982db7d09f09f323 
   src/tests/partition_tests.cpp fea78016268b007590516798eb30ff423fd0ae58 
   src/tests/slave_tests.cpp e7e2af63da785644f3f7e6e23607c02be962a2c6 
 
 Diff: https://reviews.apache.org/r/30514/diff/
 
 
 Testing
 ---
 
 make check
 
 Ran the new tests 100 times
 
 
 Thanks,
 
 Vinod Kone
 




Re: Review Request 30514: Rate limited the removal of slaves failing health checks.

2015-02-03 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30514/#review70775
---



src/master/flags.hpp
https://reviews.apache.org/r/30514/#comment116200

Good point. I am wondering should we introduce a Rate abstraction in stout:

```
Rate rate1(3, Minites(3));
Rate rate2(10, Hours(1));
```
And the RateLimiter can take a Rate instance:
```
RateLimiter limiter(rate2);
```
And we can add parser for Rate.


- Jie Yu


On Feb. 3, 2015, 2:39 a.m., Vinod Kone wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30514/
 ---
 
 (Updated Feb. 3, 2015, 2:39 a.m.)
 
 
 Review request for mesos, Ben Mahler, David Robinson, and Jie Yu.
 
 
 Bugs: MESOS-1148
 https://issues.apache.org/jira/browse/MESOS-1148
 
 
 Repository: mesos
 
 
 Description
 ---
 
 The algorithm is simple. All the slave observers share a rate limiter whose 
 rate is configured via command line. When a slave times out on health check, 
 a permit is acquired to shutdown the slave *but* the pings are continued to 
 be sent. If a pong arrives before the permit is satisifed, the shutdown is 
 cancelled.
 
 
 Diffs
 -
 
   src/master/flags.hpp 6c18a1af625311ef149b5877b08f63c2b12c040d 
   src/master/master.hpp 337e00aa46ea127f3667e3383d631c3fb8e22f30 
   src/master/master.cpp 10056861b95ed9453c971787982db7d09f09f323 
   src/tests/partition_tests.cpp fea78016268b007590516798eb30ff423fd0ae58 
   src/tests/slave_tests.cpp e7e2af63da785644f3f7e6e23607c02be962a2c6 
 
 Diff: https://reviews.apache.org/r/30514/diff/
 
 
 Testing
 ---
 
 make check
 
 Ran the new tests 100 times
 
 
 Thanks,
 
 Vinod Kone
 




Re: Review Request 30513: Added validation for CREATE offer operation.

2015-02-03 Thread Jie Yu


 On Feb. 2, 2015, 6:43 p.m., Dominic Hamon wrote:
  src/master/validation.cpp, line 126
  https://reviews.apache.org/r/30513/diff/1/?file=843897#file843897line126
 
  s/resource/volumes/ .. the current code reads as if you're checking 
  that all resources are persistent volumes, which confused me.

Fixed.


 On Feb. 2, 2015, 6:43 p.m., Dominic Hamon wrote:
  src/master/validation.cpp, line 544
  https://reviews.apache.org/r/30513/diff/1/?file=843897#file843897line544
 
  if this is all you need, can you pass in the persisted resources 
  instead of a slave? might make future testing easier.
 
 Vinod Kone wrote:
 +1 why does this need to take a Slave* ?

Fixed.


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30513/#review70600
---


On Feb. 3, 2015, 11:15 p.m., Jie Yu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30513/
 ---
 
 (Updated Feb. 3, 2015, 11:15 p.m.)
 
 
 Review request for mesos, Ben Mahler, Michael Park, and Vinod Kone.
 
 
 Bugs: MESOS-2100
 https://issues.apache.org/jira/browse/MESOS-2100
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Added validation for CREATE offer operation.
 
 This depends on the master validator refactor. Tests will be in a different 
 patch.
 
 
 Diffs
 -
 
   src/Makefile.am 5b1885de132d527c75fc6431cd2d377f0cb7f242 
   src/master/validation.hpp 642c37564375868fae4e398bc4dc8edf2f31c8a4 
   src/master/validation.cpp 8804ba6cf0315a22943c25ec24ef594f1cfadf83 
   src/tests/master_validation_tests.cpp PRE-CREATION 
   src/tests/resource_offers_tests.cpp 
 24a7eabd40b11f4bbbf30b8251561e73683eef9a 
 
 Diff: https://reviews.apache.org/r/30513/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Jie Yu
 




Re: Review Request 30513: Added validation for CREATE offer operation.

2015-02-03 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30513/
---

(Updated Feb. 3, 2015, 11:15 p.m.)


Review request for mesos, Ben Mahler, Michael Park, and Vinod Kone.


Changes
---

Review comments. Added/refactored the tests.


Bugs: MESOS-2100
https://issues.apache.org/jira/browse/MESOS-2100


Repository: mesos


Description
---

Added validation for CREATE offer operation.

This depends on the master validator refactor. Tests will be in a different 
patch.


Diffs (updated)
-

  src/Makefile.am 5b1885de132d527c75fc6431cd2d377f0cb7f242 
  src/master/validation.hpp 642c37564375868fae4e398bc4dc8edf2f31c8a4 
  src/master/validation.cpp 8804ba6cf0315a22943c25ec24ef594f1cfadf83 
  src/tests/master_validation_tests.cpp PRE-CREATION 
  src/tests/resource_offers_tests.cpp 24a7eabd40b11f4bbbf30b8251561e73683eef9a 

Diff: https://reviews.apache.org/r/30513/diff/


Testing
---

make check


Thanks,

Jie Yu



Re: Review Request 30513: Added validation for CREATE offer operation.

2015-02-03 Thread Jie Yu


 On Feb. 3, 2015, 8:15 p.m., Vinod Kone wrote:
  src/master/validation.cpp, line 132
  https://reviews.apache.org/r/30513/diff/1/?file=843897#file843897line132
 
  ditto.

It's clear that resource.name() is 'disk' in this case.


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30513/#review70815
---


On Feb. 3, 2015, 11:15 p.m., Jie Yu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30513/
 ---
 
 (Updated Feb. 3, 2015, 11:15 p.m.)
 
 
 Review request for mesos, Ben Mahler, Michael Park, and Vinod Kone.
 
 
 Bugs: MESOS-2100
 https://issues.apache.org/jira/browse/MESOS-2100
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Added validation for CREATE offer operation.
 
 This depends on the master validator refactor. Tests will be in a different 
 patch.
 
 
 Diffs
 -
 
   src/Makefile.am 5b1885de132d527c75fc6431cd2d377f0cb7f242 
   src/master/validation.hpp 642c37564375868fae4e398bc4dc8edf2f31c8a4 
   src/master/validation.cpp 8804ba6cf0315a22943c25ec24ef594f1cfadf83 
   src/tests/master_validation_tests.cpp PRE-CREATION 
   src/tests/resource_offers_tests.cpp 
 24a7eabd40b11f4bbbf30b8251561e73683eef9a 
 
 Diff: https://reviews.apache.org/r/30513/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Jie Yu
 




Re: Review Request 30513: Added validation for CREATE offer operation.

2015-02-03 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30513/
---

(Updated Feb. 3, 2015, 11:19 p.m.)


Review request for mesos, Ben Mahler, Michael Park, and Vinod Kone.


Changes
---

Rebased.


Bugs: MESOS-2100
https://issues.apache.org/jira/browse/MESOS-2100


Repository: mesos


Description
---

Added validation for CREATE offer operation.

This depends on the master validator refactor. Tests will be in a different 
patch.


Diffs (updated)
-

  src/Makefile.am 5b1885de132d527c75fc6431cd2d377f0cb7f242 
  src/master/validation.hpp 642c37564375868fae4e398bc4dc8edf2f31c8a4 
  src/master/validation.cpp 8804ba6cf0315a22943c25ec24ef594f1cfadf83 
  src/tests/master_validation_tests.cpp PRE-CREATION 
  src/tests/resource_offers_tests.cpp 24a7eabd40b11f4bbbf30b8251561e73683eef9a 

Diff: https://reviews.apache.org/r/30513/diff/


Testing
---

make check


Thanks,

Jie Yu



Re: Review Request 30513: Added validation for CREATE offer operation.

2015-02-03 Thread Jie Yu


 On Feb. 3, 2015, 8:15 p.m., Vinod Kone wrote:
  Test?

Added/refactored tests.


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30513/#review70815
---


On Feb. 3, 2015, 11:15 p.m., Jie Yu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30513/
 ---
 
 (Updated Feb. 3, 2015, 11:15 p.m.)
 
 
 Review request for mesos, Ben Mahler, Michael Park, and Vinod Kone.
 
 
 Bugs: MESOS-2100
 https://issues.apache.org/jira/browse/MESOS-2100
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Added validation for CREATE offer operation.
 
 This depends on the master validator refactor. Tests will be in a different 
 patch.
 
 
 Diffs
 -
 
   src/Makefile.am 5b1885de132d527c75fc6431cd2d377f0cb7f242 
   src/master/validation.hpp 642c37564375868fae4e398bc4dc8edf2f31c8a4 
   src/master/validation.cpp 8804ba6cf0315a22943c25ec24ef594f1cfadf83 
   src/tests/master_validation_tests.cpp PRE-CREATION 
   src/tests/resource_offers_tests.cpp 
 24a7eabd40b11f4bbbf30b8251561e73683eef9a 
 
 Diff: https://reviews.apache.org/r/30513/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Jie Yu
 




Re: Review Request 30584: Added metrics for slave shutdowns.

2015-02-03 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30584/#review70855
---



src/master/master.cpp
https://reviews.apache.org/r/30584/#comment116302

To be consistent, either use shared_ptr for Metrics, or make RateLimiter a 
raw pointer.



src/master/master.cpp
https://reviews.apache.org/r/30584/#comment116303

This doesn't seem to necessary since we have master/slave_removals already 
(which will be incremented in removeSlave(slave) below)?


- Jie Yu


On Feb. 3, 2015, 10:36 p.m., Vinod Kone wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30584/
 ---
 
 (Updated Feb. 3, 2015, 10:36 p.m.)
 
 
 Review request for mesos, Ben Mahler and Jie Yu.
 
 
 Bugs: MESOS-1148
 https://issues.apache.org/jira/browse/MESOS-1148
 
 
 Repository: mesos
 
 
 Description
 ---
 
 See summary.
 
 
 Diffs
 -
 
   src/master/master.cpp 10056861b95ed9453c971787982db7d09f09f323 
   src/master/metrics.hpp 6a43abc914dce24c60b5db57ee01d172c8258e82 
   src/master/metrics.cpp 956fe5042d7478d90d4f03059e248694ba2b95e5 
 
 Diff: https://reviews.apache.org/r/30584/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Vinod Kone
 




Re: Review Request 30584: Added metrics for slave shutdowns.

2015-02-03 Thread Jie Yu


 On Feb. 3, 2015, 11:37 p.m., Jie Yu wrote:
  src/master/master.cpp, line 3908
  https://reviews.apache.org/r/30584/diff/1/?file=846680#file846680line3908
 
  This doesn't seem to necessary since we have master/slave_removals 
  already (which will be incremented in removeSlave(slave) below)?
 
 Vinod Kone wrote:
 Yea, wanted to use another specific one for shutdowns because slaves are 
 also removed without calling shutdown, e.g. reregisterSlave(). Although 
 slave_shutdowns_scheduled - slave_shutdowns_canceled will give this number. 
 So I'm ok with killing it altogether, if that's confusing. LMK.

Yeah, I would suggest removing it since we can derive it from other metrics.


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30584/#review70855
---


On Feb. 4, 2015, 12:24 a.m., Vinod Kone wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30584/
 ---
 
 (Updated Feb. 4, 2015, 12:24 a.m.)
 
 
 Review request for mesos, Ben Mahler and Jie Yu.
 
 
 Bugs: MESOS-1148
 https://issues.apache.org/jira/browse/MESOS-1148
 
 
 Repository: mesos
 
 
 Description
 ---
 
 See summary.
 
 
 Diffs
 -
 
   src/master/master.hpp 337e00aa46ea127f3667e3383d631c3fb8e22f30 
   src/master/master.cpp 10056861b95ed9453c971787982db7d09f09f323 
   src/master/metrics.hpp 6a43abc914dce24c60b5db57ee01d172c8258e82 
   src/master/metrics.cpp 956fe5042d7478d90d4f03059e248694ba2b95e5 
 
 Diff: https://reviews.apache.org/r/30584/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Vinod Kone
 




Re: Review Request 30386: Added support for CREATE operation in master.

2015-02-03 Thread Jie Yu


 On Jan. 29, 2015, 6:49 p.m., Ben Mahler wrote:
  src/master/master.cpp, lines 1342-1353
  https://reviews.apache.org/r/30386/diff/1/?file=839423#file839423line1342
 
  Nice :)
  
  Do you need to pass the slave here? Can you remove it for now?

Killed.


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30386/#review70144
---


On Jan. 29, 2015, 12:12 a.m., Jie Yu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30386/
 ---
 
 (Updated Jan. 29, 2015, 12:12 a.m.)
 
 
 Review request for mesos, Ben Mahler, Michael Park, and Vinod Kone.
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Added support for CREATE operation in master.
 
 
 Diffs
 -
 
   src/master/master.hpp 1d342e56116ad63aade43484b6899ce26f25abfd 
   src/master/master.cpp 54f26900ac8c63e79a1f89562a988c9a2567d209 
 
 Diff: https://reviews.apache.org/r/30386/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Jie Yu
 




Re: Review Request 30386: Added support for CREATE operation in master.

2015-02-03 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30386/
---

(Updated Feb. 4, 2015, 1:09 a.m.)


Review request for mesos, Ben Mahler, Michael Park, and Vinod Kone.


Changes
---

Rebased. BenM's comments.


Repository: mesos


Description
---

Added support for CREATE operation in master.


Diffs (updated)
-

  src/master/master.hpp cd37ee9d3c57bcd91f08cd402ec1e4a09d9e42ee 
  src/master/master.cpp d04b2c4041d8fe8978b877f07579a6f907903e1b 

Diff: https://reviews.apache.org/r/30386/diff/


Testing
---

make check


Thanks,

Jie Yu



Re: Review Request 30513: Added validation for CREATE offer operation.

2015-02-03 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30513/
---

(Updated Feb. 4, 2015, 1:10 a.m.)


Review request for mesos, Ben Mahler, Michael Park, and Vinod Kone.


Changes
---

Killed another unneeded test.


Bugs: MESOS-2100
https://issues.apache.org/jira/browse/MESOS-2100


Repository: mesos


Description
---

Added validation for CREATE offer operation.

This depends on the master validator refactor. Tests will be in a different 
patch.


Diffs (updated)
-

  src/Makefile.am 5b1885de132d527c75fc6431cd2d377f0cb7f242 
  src/master/validation.hpp 642c37564375868fae4e398bc4dc8edf2f31c8a4 
  src/master/validation.cpp 8804ba6cf0315a22943c25ec24ef594f1cfadf83 
  src/tests/master_validation_tests.cpp PRE-CREATION 
  src/tests/resource_offers_tests.cpp 24a7eabd40b11f4bbbf30b8251561e73683eef9a 

Diff: https://reviews.apache.org/r/30513/diff/


Testing
---

make check


Thanks,

Jie Yu



Re: Review Request 30531: Remove strings::format and unnecessary constants from paths

2015-02-03 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30531/#review70885
---


Can we seperate the tests changes in this patch? It is a good practice not to 
change tests for refactors (if not necessary) so that we can capture potentiall 
regressions and have more confidence about the refactor?

The rest looks good to me! Thanks for the cleanup!


src/slave/paths.cpp
https://reviews.apache.org/r/30531/#comment116331

const string directory



src/slave/paths.cpp
https://reviews.apache.org/r/30531/#comment116332

const string last



src/slave/paths.cpp
https://reviews.apache.org/r/30531/#comment116333

ditto.



src/slave/paths.cpp
https://reviews.apache.org/r/30531/#comment116334

ditto.



src/tests/paths_tests.cpp
https://reviews.apache.org/r/30531/#comment116339

Two lines between top level functions. Here and everywhere else.



src/tests/paths_tests.cpp
https://reviews.apache.org/r/30531/#comment116335

const string



src/tests/paths_tests.cpp
https://reviews.apache.org/r/30531/#comment116337

`{` should be in next line. Here and everywhere else.



src/tests/paths_tests.cpp
https://reviews.apache.org/r/30531/#comment116340

const string



src/tests/paths_tests.cpp
https://reviews.apache.org/r/30531/#comment116346

const string



src/tests/paths_tests.cpp
https://reviews.apache.org/r/30531/#comment116342

The format here is a little jagged, how about

```
EXPECT_EQ(
path::join(
executorsRoot,
executorId.value(),
runs,
containerId.value()),
paths::getExecutorRunPath(
rootDir,
slaveId,
frameworkId,
executorId,
containerId));
```



src/tests/paths_tests.cpp
https://reviews.apache.org/r/30531/#comment116344

Ditto here on the jaggedness.



src/tests/paths_tests.cpp
https://reviews.apache.org/r/30531/#comment116345

const string


- Jie Yu


On Feb. 2, 2015, 11:40 p.m., Dominic Hamon wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30531/
 ---
 
 (Updated Feb. 2, 2015, 11:40 p.m.)
 
 
 Review request for mesos and Jie Yu.
 
 
 Bugs: MESOS-2314
 https://issues.apache.org/jira/browse/MESOS-2314
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Remove strings::format and unnecessary constants from paths
 
 
 Diffs
 -
 
   src/slave/paths.hpp 4d6897f9eebcd6852f0223bae23a29729a42e750 
   src/slave/paths.cpp fe951abac1254748dbd5bdd81b7e50da75afad6d 
   src/tests/paths_tests.cpp 417aa51b99d54055ad7254b4f274fb59d0794ca6 
 
 Diff: https://reviews.apache.org/r/30531/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Dominic Hamon
 




Re: Review Request 30509: Added executor working directory to Container struct in Mesos containerizer.

2015-02-02 Thread Jie Yu


 On Feb. 2, 2015, 7:56 p.m., Ian Downes wrote:
  Is this the executor work directory from the host's perspective or from the 
  container's perspective? They will be different if the container has a 
  different root filesystem...

It's the host executor working directory. This is the 'directory' passed to the 
Containerizer (e.g., launch)


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30509/#review70614
---


On Feb. 2, 2015, 6:37 p.m., Jie Yu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30509/
 ---
 
 (Updated Feb. 2, 2015, 6:37 p.m.)
 
 
 Review request for mesos, Ben Mahler, Ian Downes, Michael Park, and Vinod 
 Kone.
 
 
 Bugs: MESOS-2031
 https://issues.apache.org/jira/browse/MESOS-2031
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Added executor working directory to Container struct in Mesos containerizer.
 
 The executor directory is needed when linking persistent volumes inside the 
 container.
 
 
 Diffs
 -
 
   src/slave/containerizer/mesos/containerizer.hpp 
 802988c90ac872b0cefa5e28f06e6fec98e8d032 
   src/slave/containerizer/mesos/containerizer.cpp 
 d712278428889ebdfd598537690138329d8464f0 
 
 Diff: https://reviews.apache.org/r/30509/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Jie Yu
 




Re: Review Request 30514: Rate limited the removal of slaves failing health checks.

2015-02-02 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30514/#review70603
---



src/master/flags.hpp
https://reviews.apache.org/r/30514/#comment115811

The maximum rate? Does it make sense to you to rename the flag name as well?

slave_removal_rate_limit?



src/master/master.hpp
https://reviews.apache.org/r/30514/#comment115816

The moment you pass it to the SlaveObserver, the ownership of the pointer 
is transferred and you cannot access to that anymore.

I guess here you really don't have many options but using a raw pointer 
(and let the slave manages its lifecycle). I guess we did the same thing for 
Registrar, WhitelistWatcher, Allocator, etc.



src/master/master.cpp
https://reviews.apache.org/r/30514/#comment115817

See my above comments. You probably want to use the raw pointer and add a 
NOTE about the ownership is controlled by the master.

OptionRateLimiter* _limiter



src/master/master.cpp
https://reviews.apache.org/r/30514/#comment115849

Can we simply the timeout logic and move all the limiter related stuff to 
`shutdown` function?

```
void timeout()
{
  if (pinged  ++timeouts = XXX) {
// Schedule a shutdown of the slave.
shutdown();
  }
  
  // NOTE: We keep sending PINGs even if a shutdown has
  // been scheduled so that we can cancel the shutdown
  // if the slave becomes reachable again.
  ping();
}

void shutdown()
{
  if (limiter.isNone()) {
LOG(INFO)  ...;

dispatch(master,
 Master::shutdownSlave,
 ...);  
return;
  }
  
  if (shuttingDown.isSome()) {
// A shutdown has been scheduled already.
return;
  }
  
  shuttingDown = limiter.get()-acquire()
.onAny(defer(self(), Self::_shutdown, ...);
}

void _shutdown(const FutureNothing future)
{
  if (future.isReady()) {
// Shutdown the slave.
  } else if (future.isFailed()) {
// Not expected to happen?
  } else if (future.isDiscarded()) {
// Do nothing here, maybe print a message.
  }
  
  shuttingDown = None();
}
```



src/master/master.cpp
https://reviews.apache.org/r/30514/#comment115842

Could you add a NOTE about this function:

```
NOTE: This function schedules a shutdown of the slave. The shutdown of a 
slave is rate limited (MESOS-). The shutdown can be cancelled if a pong is 
received before the actually shutdown is called.
```



src/tests/slave_tests.cpp
https://reviews.apache.org/r/30514/#comment115852

indent?



src/tests/slave_tests.cpp
https://reviews.apache.org/r/30514/#comment115851

indent?


- Jie Yu


On Feb. 2, 2015, 6:51 p.m., Vinod Kone wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30514/
 ---
 
 (Updated Feb. 2, 2015, 6:51 p.m.)
 
 
 Review request for mesos, Ben Mahler, David Robinson, and Jie Yu.
 
 
 Bugs: MESOS-1148
 https://issues.apache.org/jira/browse/MESOS-1148
 
 
 Repository: mesos
 
 
 Description
 ---
 
 The algorithm is simple. All the slave observers share a rate limiter whose 
 rate is configured via command line. When a slave times out on health check, 
 a permit is acquired to shutdown the slave *but* the pings are continued to 
 be sent. If a pong arrives before the permit is satisifed, the shutdown is 
 cancelled.
 
 
 Diffs
 -
 
   src/master/flags.hpp 6c18a1af625311ef149b5877b08f63c2b12c040d 
   src/master/master.hpp 337e00aa46ea127f3667e3383d631c3fb8e22f30 
   src/master/master.cpp 10056861b95ed9453c971787982db7d09f09f323 
   src/tests/partition_tests.cpp fea78016268b007590516798eb30ff423fd0ae58 
   src/tests/slave_tests.cpp e7e2af63da785644f3f7e6e23607c02be962a2c6 
 
 Diff: https://reviews.apache.org/r/30514/diff/
 
 
 Testing
 ---
 
 make check
 
 Ran the new tests 100 times
 
 
 Thanks,
 
 Vinod Kone
 




Re: Review Request 30512: Fixed DiskUsageCollectorTest.SymbolicLink so that it works on both ext3 and ext4.

2015-02-02 Thread Jie Yu


 On Feb. 2, 2015, 7:46 p.m., Vinod Kone wrote:
  Can you briefly mention what the fix is, in the description?

Added.


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30512/#review70609
---


On Feb. 2, 2015, 7:48 p.m., Jie Yu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30512/
 ---
 
 (Updated Feb. 2, 2015, 7:48 p.m.)
 
 
 Review request for mesos, Ben Mahler and Vinod Kone.
 
 
 Bugs: MESOS-2241
 https://issues.apache.org/jira/browse/MESOS-2241
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Fixed DiskUsageCollectorTest.SymbolicLink so that it works on both ext3 and 
 ext4.
 
 The fix is to change the file size to be large enough so that the slight 
 discrepency between ext3 and ext4 (for meta data) shouldn't affect the 
 expections.
 
 
 Diffs
 -
 
   src/tests/disk_quota_tests.cpp 83a98447644a5fa46bffdc7f2ed73bb8411841f5 
 
 Diff: https://reviews.apache.org/r/30512/diff/
 
 
 Testing
 ---
 
 make check on both ext3 and ext4 systems.
 
 
 Thanks,
 
 Jie Yu
 




Re: Review Request 30512: Fixed DiskUsageCollectorTest.SymbolicLink so that it works on both ext3 and ext4.

2015-02-02 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30512/
---

(Updated Feb. 2, 2015, 7:48 p.m.)


Review request for mesos, Ben Mahler and Vinod Kone.


Bugs: MESOS-2241
https://issues.apache.org/jira/browse/MESOS-2241


Repository: mesos


Description (updated)
---

Fixed DiskUsageCollectorTest.SymbolicLink so that it works on both ext3 and 
ext4.

The fix is to change the file size to be large enough so that the slight 
discrepency between ext3 and ext4 (for meta data) shouldn't affect the 
expections.


Diffs
-

  src/tests/disk_quota_tests.cpp 83a98447644a5fa46bffdc7f2ed73bb8411841f5 

Diff: https://reviews.apache.org/r/30512/diff/


Testing
---

make check on both ext3 and ext4 systems.


Thanks,

Jie Yu



Re: Review Request 30509: Added executor working directory to Container struct in Mesos containerizer.

2015-02-02 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30509/
---

(Updated Feb. 2, 2015, 8:06 p.m.)


Review request for mesos, Ben Mahler, Ian Downes, Michael Park, and Vinod Kone.


Changes
---

removed circular dependency in depends on field -- @vinodkone


Bugs: MESOS-2031
https://issues.apache.org/jira/browse/MESOS-2031


Repository: mesos


Description
---

Added executor working directory to Container struct in Mesos containerizer.

The executor directory is needed when linking persistent volumes inside the 
container.


Diffs
-

  src/slave/containerizer/mesos/containerizer.hpp 
802988c90ac872b0cefa5e28f06e6fec98e8d032 
  src/slave/containerizer/mesos/containerizer.cpp 
d712278428889ebdfd598537690138329d8464f0 

Diff: https://reviews.apache.org/r/30509/diff/


Testing
---

make check


Thanks,

Jie Yu



Re: Review Request 28608: Allowed slave to create and link persistent volumes.

2015-02-02 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28608/#review70645
---


Replaced by https://reviews.apache.org/r/30510/ (do that in the containerizer).

- Jie Yu


On Dec. 2, 2014, 10:01 p.m., Jie Yu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/28608/
 ---
 
 (Updated Dec. 2, 2014, 10:01 p.m.)
 
 
 Review request for mesos and Ben Mahler.
 
 
 Bugs: MESOS-2031
 https://issues.apache.org/jira/browse/MESOS-2031
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Allowed slave to create and link persistent volumes.
 
 
 Diffs
 -
 
   src/slave/slave.hpp 70bd8c1fde4ea09fa54c76aa93424a1adb0309f6 
   src/slave/slave.cpp ed63dedbda0bf548a95de7d39002ac56a29303e5 
 
 Diff: https://reviews.apache.org/r/28608/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Jie Yu
 




Re: Review Request 30510: Allowed Mesos containerizer to prepare and update volumes.

2015-02-02 Thread Jie Yu


 On Feb. 2, 2015, 9:20 p.m., Ian Downes wrote:
  src/slave/containerizer/mesos/containerizer.hpp, lines 221-226
  https://reviews.apache.org/r/30510/diff/1/?file=843873#file843873line221
 
  In my current working code I've added an Optionstring rootfs to 
  struct Container. Would that be sufficient?

Sounds good to me! Can you pull this change out and we can commit that first?


 On Feb. 2, 2015, 9:20 p.m., Ian Downes wrote:
  src/slave/containerizer/mesos/containerizer.cpp, line 1196
  https://reviews.apache.org/r/30510/diff/1/?file=843874#file843874line1196
 
  There ordering considerations here for container paths which have 
  common ancestory beneath the container.
  
  
  this won't work:
  /bar - container/foo/bar   # fails because foo doesn't yet exist
  /foo - container/foo
  
  this works:
  /foo - container/foo
  /bar - container/foo/bar   # succeeds, symlink is created inside /foo
  
  The filesystem/shared isolator explicitly chose to not implement 
  re-ordering but it does check whether this is attempted and returns an 
  error stating such.

I'll disable all nested symlinks in master. I'll add a NOTE here.


 On Feb. 2, 2015, 9:20 p.m., Ian Downes wrote:
  src/slave/containerizer/mesos/containerizer.cpp, line 1213
  https://reviews.apache.org/r/30510/diff/1/?file=843874#file843874line1213
 
  This requires the parent of the joined target to exist?
  
  ln -s /persistent1 /container/foo/persistent will fail if 
  /container/foo doesn't exist

I'll disable nested symlinks. I'll add a NOTE here too.


 On Feb. 2, 2015, 9:20 p.m., Ian Downes wrote:
  src/slave/containerizer/mesos/containerizer.cpp, line 1215
  https://reviews.apache.org/r/30510/diff/1/?file=843874#file843874line1215
 
  I presume this always returns an absolute path?

Yes, it is (if not, we need to make sure it is).


 On Feb. 2, 2015, 9:20 p.m., Ian Downes wrote:
  src/slave/containerizer/mesos/containerizer.cpp, lines 1161-1167
  https://reviews.apache.org/r/30510/diff/1/?file=843874#file843874line1161
 
  How can we ensure that such a filesystem isolator is being used? 
  Shouldn't all of this functionality be in one place!?

For now, in master, we will only allow relative non-nested container path.


- Jie


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30510/#review70617
---


On Feb. 2, 2015, 6:36 p.m., Jie Yu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30510/
 ---
 
 (Updated Feb. 2, 2015, 6:36 p.m.)
 
 
 Review request for mesos, Ben Mahler, Ian Downes, Michael Park, and Vinod 
 Kone.
 
 
 Bugs: MESOS-2031
 https://issues.apache.org/jira/browse/MESOS-2031
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Allowed Mesos containerizer to prepare and update volumes.
 
 Mesos containerizer setup volumes for the container.
 
 
 Diffs
 -
 
   src/slave/containerizer/mesos/containerizer.hpp 
 802988c90ac872b0cefa5e28f06e6fec98e8d032 
   src/slave/containerizer/mesos/containerizer.cpp 
 d712278428889ebdfd598537690138329d8464f0 
 
 Diff: https://reviews.apache.org/r/30510/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Jie Yu
 




Review Request 30536: Renamed persisted resources to checkpointed resources.

2015-02-02 Thread Jie Yu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30536/
---

Review request for mesos, Benjamin Hindman, Ben Mahler, Michael Park, and Vinod 
Kone.


Repository: mesos


Description
---

Renamed persisted resources to checkpointed resources per our discussion.

This is to avoid confusion with persistent volumes.


Diffs
-

  src/master/allocator.hpp 318a756e0a8ca1bba6d1144c8160ff24a6b6e646 
  src/master/master.hpp 337e00aa46ea127f3667e3383d631c3fb8e22f30 
  src/master/master.cpp 10056861b95ed9453c971787982db7d09f09f323 
  src/messages/messages.proto c609f500e5e999a7587feb8acfb62aa637824c0a 

Diff: https://reviews.apache.org/r/30536/diff/


Testing
---

make check


Thanks,

Jie Yu



<    1   2   3   4   5   6   7   8   9   10   >