-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45534/#review127721
-----------------------------------------------------------



My high-level thought is that we should start with just the shares within the 
role sorter, since these are the easiest to understand. After Vinod and I 
talked, we realized the "quota" shares were difficult to reason about, so 
let's not include those for now (one thing at a time).

I would also suggest the following naming scheme:

```
allocator/mesos/roles/<role>/shares/dominant
```

Since we may want additional shares to be exposed:

```
allocator/mesos/roles/<role>/shares/dominant: 0.5
allocator/mesos/roles/<role>/shares/cpus: 0.5
allocator/mesos/roles/<role>/shares/mem: 0.2
allocator/mesos/roles/<role>/shares/disk: 0.1
```
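
For reference, here's a minimal sketch of registering a gauge under this key 
scheme. The free function and the `calculateShare` callback are hypothetical 
stand-ins; `Gauge`, `defer`, and `metrics::add` are the libprocess metrics 
primitives:

```
#include <string>

#include <process/defer.hpp>
#include <process/pid.hpp>

#include <process/metrics/gauge.hpp>
#include <process/metrics/metrics.hpp>

// Hypothetical: computes the dominant share for `role`.
double calculateShare(const std::string& role);

// Hypothetical: register a dominant share gauge for `role` under the
// suggested key scheme. The callback is deferred onto the allocator's
// execution context, so the share is computed within that process.
void addDominantShareGauge(
    const process::UPID& allocator,
    const std::string& role)
{
  process::metrics::Gauge gauge(
      "allocator/mesos/roles/" + role + "/shares/dominant",
      process::defer(allocator, [role]() {
        return calculateShare(role);
      }));

  process::metrics::add(gauge);
}
```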


docs/monitoring.md (lines 967 - 973)
<https://reviews.apache.org/r/45534/#comment191107>

    Let's not include this one for now. I chatted with Vinod about this one to 
get some more thoughts, and we agree it is pretty difficult to explain to 
users; the most valuable thing to expose for now is the dominant shares for 
the normal sorter.



src/master/allocator/sorter/drf/metrics.hpp (line 59)
<https://reviews.apache.org/r/45534/#comment191109>

    ```
    // Dominant share of each client.
    hashmap<std::string, process::metrics::Gauge> dominantShares;
    ```
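
    As a hypothetical sketch of how this map might be kept in sync as clients 
come and go (the surrounding `Metrics` struct and `remove` helper here are 
illustrative, not the patch's code):
    
    ```
    #include <string>
    
    #include <process/metrics/gauge.hpp>
    #include <process/metrics/metrics.hpp>
    
    #include <stout/hashmap.hpp>
    
    // Hypothetical container for the per-client gauges.
    struct Metrics
    {
      hashmap<std::string, process::metrics::Gauge> dominantShares;
    
      // Unregister and drop a client's gauge so that stale keys
      // don't linger on the metrics endpoint.
      void remove(const std::string& client)
      {
        if (dominantShares.contains(client)) {
          process::metrics::remove(dominantShares.at(client));
          dominantShares.erase(client);
        }
      }
    };
    ```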



src/master/allocator/sorter/sorter.hpp (lines 46 - 52)
<https://reviews.apache.org/r/45534/#comment191108>

    Why a factory rather than just a string prefix? How about:
    
    ```
      // Provides the allocator's execution context (via a UPID)
      // and a name prefix in order to support metrics within the
      // sorter implementation.
      explicit Sorter(
          const process::UPID& allocator,
          const std::string& metricsPrefix) {}
    ```
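
    As a rough illustration of the intent, a concrete sorter could then 
forward both down to its metrics helper (a sketch, not the patch's code; the 
`Metrics` member is the hypothetical helper from the comment above):
    
    ```
    // Hypothetical: DRFSorter hands the allocator's UPID and the
    // metrics prefix to a Metrics helper, which composes keys like
    // <metricsPrefix>/<role>/shares/dominant and defers the gauge
    // callbacks onto the allocator's execution context.
    class DRFSorter : public Sorter
    {
    public:
      DRFSorter(
          const process::UPID& allocator,
          const std::string& metricsPrefix)
        : Sorter(allocator, metricsPrefix),
          metrics(allocator, metricsPrefix) {}
    
    private:
      Metrics metrics; // Owns the per-client gauges.
    };
    ```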



src/tests/hierarchical_allocator_tests.cpp (lines 2748 - 2752)
<https://reviews.apache.org/r/45534/#comment191105>

    How about:
    
    ```
    // Verifies that per-role dominant share metrics are correctly reported.
    ```



src/tests/hierarchical_allocator_tests.cpp (lines 2762 - 2764)
<https://reviews.apache.org/r/45534/#comment191104>

    Would you mind wrapping to reduce jaggedness in general?
    
    ```
      // Register one agent and one framework. The framework will
      // immediately receive an offer, giving it the maximum
      // possible dominant share.
    ```



src/tests/hierarchical_allocator_tests.cpp (lines 2783 - 2787)
<https://reviews.apache.org/r/45534/#comment191110>

    We need to settle after these allocator calls; otherwise, we may not see 
the changes reflected in the metrics. Did you run this test in repetition?
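
    Concretely, something along these lines (assuming the test pauses the 
clock up front, as the neighboring allocator tests do, and uses the test 
suite's `Metrics()` helper as in the snippets below):
    
    ```
      // The allocator interface is asynchronous: settle the (paused)
      // clock so the allocator has fully processed the calls above
      // before we take a metrics snapshot.
      Clock::settle();
    
      JSON::Object metrics = Metrics();
    ```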



src/tests/hierarchical_allocator_tests.cpp (lines 2791 - 2792)
<https://reviews.apache.org/r/45534/#comment191106>

    How about moving this up to just before the decline happens, to be 
consistent with your other comments that describe what's happening in the 
test? E.g.:
    
    ```
      // Decline the offered resources and expect a zero share.
      Future<Allocation> allocation = allocations.get();
      AWAIT_READY(allocation);
      allocator->recoverResources(
          allocation->frameworkId,
          agent1.id(),
          allocation->resources.get(agent1.id()).get(),
          None());
    ```



src/tests/hierarchical_allocator_tests.cpp (line 2835)
<https://reviews.apache.org/r/45534/#comment191103>

    Can you move this down to be adjacent to the expectation on the metrics? 
That's also how we did it in the metrics test above:
    
    ```
      expected.values = {
          {"allocator/mesos/dominant_share/roles/roleA", 0.5},
      };
    
      metrics = Metrics();
      EXPECT_TRUE(metrics.contains(expected));
    ```


- Ben Mahler


On April 4, 2016, 12:32 p.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45534/
> -----------------------------------------------------------
> 
> (Updated April 4, 2016, 12:32 p.m.)
> 
> 
> Review request for mesos, Alexander Rukletsov and Ben Mahler.
> 
> 
> Bugs: MESOS-4760
>     https://issues.apache.org/jira/browse/MESOS-4760
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Added per-role and quota share metrics to the DRFSorter.
> 
> 
> Diffs
> -----
> 
>   docs/monitoring.md 76be32842e8b437fda40c8565a34ec4f8d8dfbcc
>   src/Makefile.am f22ae5b3bd9336a56c802e0e51d39d6cb675caf2
>   src/master/allocator/mesos/hierarchical.hpp e979fdf60da1409d1c2d08f0e9f03cef067506dd
>   src/master/allocator/sorter/drf/metrics.hpp PRE-CREATION
>   src/master/allocator/sorter/drf/metrics.cpp PRE-CREATION
>   src/master/allocator/sorter/drf/sorter.hpp f316bb5b8bfe93311ecac57198392e104b234b04
>   src/master/allocator/sorter/drf/sorter.cpp c14f9a45b9e2ea8d79bd8d2f092d313afa4cbac3
>   src/master/allocator/sorter/sorter.hpp e2338d5297e11a1ca4f6e5d72a4526aa4579610c
>   src/tests/hierarchical_allocator_tests.cpp 8f78a204d296f94f515f21511710a35c33e27255
> 
> Diff: https://reviews.apache.org/r/45534/diff/
> 
> 
> Testing
> -------
> 
> make check (OS X, clang trunk, not optimized)
> 
> I also benchmarked the slowdown of the allocator due to this addition with
> the benchmark allocator https://reviews.apache.org/r/44853. There I saw that
> for an unoptimized build this patch adds up to 70 ms to the time needed to
> query the metrics endpoint (this was for the case of 5000 slaves and 1000
> frameworks), though one could expect that an optimized build might perform
> better. The numbers I got were:
> 
> `#`slaves | #frameworks | old time [us] | new time [us] | slowdown
> --------|-------------|---------------|---------------|---------
>    1000 |           1 |         38980 |         23847 |    0.6
>    1000 |          50 |         27834 |         42091 |    1.5
>    1000 |         100 |         40060 |         47571 |    1.2
>    1000 |         200 |         63132 |         75806 |    1.2
>    1000 |         500 |        145170 |        171929 |    1.2
>    1000 |        1000 |        427721 |        473822 |    1.1
>    5000 |           1 |         23249 |         21426 |    0.9
>    5000 |          50 |         41032 |         36318 |    0.9
>    5000 |         100 |         43636 |         45210 |    1.0
>    5000 |         200 |         60204 |         65570 |    1.1
>    5000 |         500 |        121509 |        196894 |    1.6
>    5000 |        1000 |        449476 |        496641 |    1.1
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>
