Re: Review Request 71697: Optimized tracking of cluster resource totals.

2019-10-29 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71697/#review218435
---



Bad review!

Reviews applied: [71697]

Error:
2019-10-29 16:48:01 URL:https://reviews.apache.org/r/71697/diff/raw/ [39429/39429] -> "71697.patch" [1]
error: patch failed: src/master/allocator/mesos/hierarchical.hpp:529
error: src/master/allocator/mesos/hierarchical.hpp: patch does not apply
error: patch failed: src/master/allocator/mesos/hierarchical.cpp:570
error: src/master/allocator/mesos/hierarchical.cpp: patch does not apply
error: src/master/allocator/sorter/drf/sorter.hpp: does not exist in index
error: src/master/allocator/sorter/drf/sorter.cpp: does not exist in index
error: src/master/allocator/sorter/random/sorter.hpp: does not exist in index
error: src/master/allocator/sorter/random/sorter.cpp: does not exist in index
error: src/master/allocator/sorter/sorter.hpp: does not exist in index

- Mesos Reviewbot


On Oct. 29, 2019, 3:30 p.m., Andrei Sekretenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71697/
> ---
> 
> (Updated Oct. 29, 2019, 3:30 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Meng Zhu.
> 
> 
> Bugs: MESOS-10015
> https://issues.apache.org/jira/browse/MESOS-10015
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch addresses poor performance of
> `HierarchicalAllocatorProcess::updateAllocation()` for agents with
> a huge number of non-addable resources when many frameworks are present
> (see MESOS-10015).
> 
> Sorter methods for totals tracking that modify `Resources` of an agent
> in the Sorter are replaced with methods that add/remove resource
> quantities of an agent as a whole (which was actually the only use case
> of the old methods). Thus, subtracting/adding `Resources` of a whole
> agent no longer occurs when updating resources of an agent in a Sorter.
> 
> Further, this patch completely removes agent resource tracking logic
> from the random sorter (which itself makes no use of it) by
> implementing cluster totals tracking in the allocator.
> 
> Results of `*BENCHMARK_WithReservationParam.UpdateAllocation*`
> (for the DRF sorter):
> 
> 1.8.x branch:
> Agent resources size: 200 (50 frameworks)
> Made 20 reserve and unreserve operations in 1.938801227secs
> Agent resources size: 400 (100 frameworks)
> Made 20 reserve and unreserve operations in 13.861857374secs
> Agent resources size: 800 (200 frameworks)
> Made 20 reserve and unreserve operations in 2.13412983136667mins
> 
> 1.8.x branch + this patch:
> Agent resources size: 200 (50 frameworks)
> Made 20 reserve and unreserve operations in 214.063821ms
> Agent resources size: 400 (100 frameworks)
> Made 20 reserve and unreserve operations in 425.278671ms
> Agent resources size: 800 (200 frameworks)
> Made 20 reserve and unreserve operations in 1.136214374secs
> ...
> Agent resources size: 6400 (1600 frameworks)
> Made 20 reserve and unreserve operations in 50.094194999secs
> 
> This is a backport of https://reviews.apache.org/r/71646
> 
> 
> Diffs
> ---
> 
>   src/master/allocator/mesos/hierarchical.hpp 4f716820748e070569e988f8dad15670367a74b7 
>   src/master/allocator/mesos/hierarchical.cpp 061b70258f4874f4f2b26a57705b9ba1543c7553 
>   src/master/allocator/sorter/drf/sorter.hpp 7daf1bfd2dfe88e2d8e0af07c8af8aa823f80935 
>   src/master/allocator/sorter/drf/sorter.cpp 9367469132e426f0b4b66a80ad300c157fba6bf2 
>   src/master/allocator/sorter/random/sorter.hpp c8e777be256b4faf931bf1a106185d7f91b3ba6f 
>   src/master/allocator/sorter/random/sorter.cpp 9899cfd570607a60dbd7980d340a8e7d9d3e6df5 
>   src/master/allocator/sorter/sorter.hpp d56a1166a9e82b034564842ac071874ec2885004 
>   src/tests/sorter_tests.cpp 1e4a7893411d2107049a7bb92ee159526588c58c 
> 
> 
> Diff: https://reviews.apache.org/r/71697/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> `*BENCHMARK_WithReservationParam.UpdateAllocation*`:
> 
> **Before:**
> 
> Agent resources size: 200 (50 roles, 1 reservations per role, 1 port ranges)
> Made 20 reserve and unreserve operations in 1.938801227secs
> Average UNRESERVE duration: 49.161884ms
> Average RESERVE duration: 47.778177ms
> 
> Agent resources size: 400 (100 roles, 1 reservations per role, 1 port ranges)
> Made 20 reserve and unreserve operations in 13.861857374secs
> Average UNRESERVE duration: 346.822609ms
> Average RESERVE duration: 346.270259ms
> 
> Agent resources size: 800 (200 roles, 1 reservations per role, 1 port ranges)
> Made 20 reserve and unreserve operations in 2.13412983136667mins
> Average UNRESERVE duration: 3.200348465secs
> Average RESERVE duration: 3.202041028secs
> 
> Agent resources size: 1600 (400 roles, 1 reservations per role, 

Review Request 71697: Optimized tracking of cluster resource totals.

2019-10-29 Thread Andrei Sekretenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71697/
---

Review request for mesos, Benjamin Mahler and Meng Zhu.


Bugs: MESOS-10015
https://issues.apache.org/jira/browse/MESOS-10015


Repository: mesos


Description
---

This patch addresses poor performance of
`HierarchicalAllocatorProcess::updateAllocation()` for agents with
a huge number of non-addable resources when many frameworks are present
(see MESOS-10015).

Sorter methods for totals tracking that modify `Resources` of an agent
in the Sorter are replaced with methods that add/remove resource
quantities of an agent as a whole (which was actually the only use case
of the old methods). Thus, subtracting/adding `Resources` of a whole
agent no longer occurs when updating resources of an agent in a Sorter.

Further, this patch completely removes agent resource tracking logic
from the random sorter (which itself makes no use of it) by
implementing cluster totals tracking in the allocator.
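
As a rough illustration of the idea (this is not the code in the patch),
tracking cluster totals as per-name quantities could look like the sketch
below. `Quantities` and `ClusterTotals` are hypothetical names made up for
this example and are not Mesos classes; the actual patch works with the
Mesos resource types.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the scalar quantities of one agent,
// e.g. {"cpus": 4, "mem": 1024}. Not a Mesos type.
using Quantities = std::map<std::string, double>;

// Hypothetical tracker of cluster-wide totals, kept as per-name sums.
class ClusterTotals
{
public:
  // Adds the quantities of a whole agent to the totals.
  void addAgent(const Quantities& agent)
  {
    for (const auto& entry : agent) {
      totals_[entry.first] += entry.second;
    }
  }

  // Subtracts the quantities of a whole agent from the totals.
  // The agent must have been added before.
  void removeAgent(const Quantities& agent)
  {
    for (const auto& entry : agent) {
      auto it = totals_.find(entry.first);
      assert(it != totals_.end() && it->second >= entry.second);
      it->second -= entry.second;
    }
  }

  // Returns the total for `name`, or 0 if it was never tracked.
  double get(const std::string& name) const
  {
    auto it = totals_.find(name);
    return it == totals_.end() ? 0.0 : it->second;
  }

private:
  std::map<std::string, double> totals_;
};
```

In this sketch, adding or removing an agent costs time proportional to the
number of distinct resource names in `Quantities`, not to the number of
individual resource objects held by the agent.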

Results of `*BENCHMARK_WithReservationParam.UpdateAllocation*`
(for the DRF sorter):

1.8.x branch:
Agent resources size: 200 (50 frameworks)
Made 20 reserve and unreserve operations in 1.938801227secs
Agent resources size: 400 (100 frameworks)
Made 20 reserve and unreserve operations in 13.861857374secs
Agent resources size: 800 (200 frameworks)
Made 20 reserve and unreserve operations in 2.13412983136667mins

1.8.x branch + this patch:
Agent resources size: 200 (50 frameworks)
Made 20 reserve and unreserve operations in 214.063821ms
Agent resources size: 400 (100 frameworks)
Made 20 reserve and unreserve operations in 425.278671ms
Agent resources size: 800 (200 frameworks)
Made 20 reserve and unreserve operations in 1.136214374secs
...
Agent resources size: 6400 (1600 frameworks)
Made 20 reserve and unreserve operations in 50.094194999secs

This is a backport of https://reviews.apache.org/r/71646


Diffs
---

  src/master/allocator/mesos/hierarchical.hpp 4f716820748e070569e988f8dad15670367a74b7 
  src/master/allocator/mesos/hierarchical.cpp 061b70258f4874f4f2b26a57705b9ba1543c7553 
  src/master/allocator/sorter/drf/sorter.hpp 7daf1bfd2dfe88e2d8e0af07c8af8aa823f80935 
  src/master/allocator/sorter/drf/sorter.cpp 9367469132e426f0b4b66a80ad300c157fba6bf2 
  src/master/allocator/sorter/random/sorter.hpp c8e777be256b4faf931bf1a106185d7f91b3ba6f 
  src/master/allocator/sorter/random/sorter.cpp 9899cfd570607a60dbd7980d340a8e7d9d3e6df5 
  src/master/allocator/sorter/sorter.hpp d56a1166a9e82b034564842ac071874ec2885004 
  src/tests/sorter_tests.cpp 1e4a7893411d2107049a7bb92ee159526588c58c 


Diff: https://reviews.apache.org/r/71697/diff/1/


Testing
---

make check

`*BENCHMARK_WithReservationParam.UpdateAllocation*`:

**Before:**

Agent resources size: 200 (50 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 1.938801227secs
Average UNRESERVE duration: 49.161884ms
Average RESERVE duration: 47.778177ms

Agent resources size: 400 (100 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 13.861857374secs
Average UNRESERVE duration: 346.822609ms
Average RESERVE duration: 346.270259ms

Agent resources size: 800 (200 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 2.13412983136667mins
Average UNRESERVE duration: 3.200348465secs
Average RESERVE duration: 3.202041028secs

Agent resources size: 1600 (400 roles, 1 reservations per role, 1 port ranges)
(killed after several minutes)

**After:**

Agent resources size: 200 (50 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 214.063821ms
Average UNRESERVE duration: 5.134867ms
Average RESERVE duration: 5.568323ms

Agent resources size: 400 (100 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 425.278671ms
Average UNRESERVE duration: 10.201193ms
Average RESERVE duration: 11.06274ms

Agent resources size: 800 (200 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 1.136214374secs
Average UNRESERVE duration: 28.336427ms
Average RESERVE duration: 28.474291ms

Agent resources size: 1600 (400 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 3.773618637secs
Average UNRESERVE duration: 93.619424ms
Average RESERVE duration: 95.061507ms

Agent resources size: 3200 (800 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 13.881966194secs
Average UNRESERVE duration: 350.46368ms
Average RESERVE duration: 343.634628ms

Agent resources size: 6400 (1600 roles, 1 reservations per role, 1 port ranges)
Made 20 reserve and unreserve operations in 50.094194999secs
Average UNRESERVE duration: 1.252057472secs
Average RESERVE duration: 1.252652277secs
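
For a quick read of the numbers above: the speedup for each size that
completed in both runs is the "before" total divided by the "after" total.
The small sketch below does that arithmetic with the totals copied from the
output above. `Row`, `benchmarkRows` and `speedup` are hypothetical helper
names used only here; the 800 case is converted from minutes to seconds.

```cpp
#include <cassert>
#include <vector>

// One benchmark case: agent resources size and the total time (in seconds)
// for 20 reserve and unreserve operations, before and after the patch.
struct Row
{
  int size;
  double beforeSecs;
  double afterSecs;
};

// Totals copied from the benchmark output for the sizes that finished
// both before and after the patch.
inline std::vector<Row> benchmarkRows()
{
  return {
    {200, 1.938801227, 0.214063821},
    {400, 13.861857374, 0.425278671},
    {800, 2.13412983136667 * 60.0, 1.136214374},  // Minutes to seconds.
  };
}

// Speedup factor: how many times faster the patched run is.
inline double speedup(const Row& row)
{
  assert(row.afterSecs > 0.0);
  return row.beforeSecs / row.afterSecs;
}
```

This works out to roughly 9x, 33x and 113x for sizes 200, 400 and 800.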


Thanks,