Re: Review Request 68490: Optimized `class Resources` with copy-on-write.

2018-09-20 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68490/#review208810
---


Fix it, then Ship it!





src/v1/resources.cpp
Lines 1657-1675 (original), 1657-1677 (patched)


Why copy the shared pointers in these loops? Ditto above.


- Benjamin Mahler


On Sept. 14, 2018, 10:11 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68490/
> ---
> 
> (Updated Sept. 14, 2018, 10:11 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Benjamin Mahler, and Gastón 
> Kleiman.
> 
> 
> Bugs: MESOS-6765
> https://issues.apache.org/jira/browse/MESOS-6765
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch lets `class Resources` only store shared
> pointers to the underlying resource objects, so that
> read-only filter operations such as `reserved()`,
> `unreserved()` and etc. can avoid making copies of
> the whole resource objects. Instead, only shared pointers
> are copied.
> 
> In write operations, we check if there are more than one
> references to the resource object. If so, a copy is made
> for safe mutation without affecting owners.
> 
> To maintain the usage abstraction that `class Resources`
> still holds resource objects, we utilize
> `boost::indirect_iterator` iterator adapter to deference
> the shared pointers as we iterate.
> 
> 
> Diffs
> -
> 
>   include/mesos/resources.hpp 6f81b14f8bc090a144eeae8f15639c370366166d 
>   include/mesos/v1/resources.hpp 09110530da16678abf6bf6b308906dd8ccc8180a 
>   src/common/resources.cpp 3e63cdedb9261970dbeb9bb9f97eed65819f68a7 
>   src/v1/resources.cpp 3683a331e0859cd6f2ad061db6ba67112ecfcb0d 
> 
> 
> Diff: https://reviews.apache.org/r/68490/diff/5/
> 
> 
> Testing
> ---
> 
> make check
> 
> Benchmarked on a Ubuntu box using clang -O2 optimized build with an i5 
> processor, frequency fixed at 1.6GHz.
> 
> ## Resources benchmarks (Measure geo-mean of various operations):
> 
> Normalized Geomean latency
>   
>   Master  Copy-on-write
> Arithmetic1   1.09
> Filter1   0.28
> Contains  1   0.27
> Parse 1   1.01
> 
> ## Sample allocator benchmark (Measure first allocation latency):
> 
> HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/1 (1k agents, 50 
> frameworks):
> 
> Master 1
> Copy-on-write 0.72
> 
> HierarchicalAllocator_BENCHMARK_Test.ResourceLabels/1 (1k agents, 50 
> frameworks):
> 
> Master 1
> Copy-on-write 0.72
> 
> ## Sorter benchmark
> 
> ### BENCHMARK_FullSort
> 
> Master:
> 
> Using 3 agents and 500 clients
> Added 500 clients in 2.261373ms
> Added 3 agents in 322.252441ms
> Added allocations for 3 agents in 823.631826ms
> Full sort of 500 clients took 8662ns
> No-op sort of 500 clients took 809ns
> Removed allocations for 3 agents in 8.482659084secs
> Removed 3 agents in 417.988004ms
> Removed 500 clients in 17.855575ms
> 
> 
> Copy-on-write:
> 
> Using 3 agents and 500 clients (**normalized to the master above**)
> Added 500 clients in 2.464647ms (**1.09**)
> Added 3 agents in 205.071423ms (**0.64**)
> Added allocations for 3 agents in 156.340883ms (**0.19**)
> Full sort of 500 clients took 1007ns (**0.12**)
> No-op sort of 500 clients took 204ns (**0.02**)
> Removed allocations for 3 agents in 7.215175016secs (**0.85**)
> Removed 3 agents in 304.998116ms (**0.73**)
> Removed 500 clients in 3.195079ms (**0.18**)
> 
> 
> ### BENCHMARK_HierarchyFullSort
> 
> Master:
> 
> Added 1056 clients in 36.571505ms
> Added 1 agents in 114.165111ms
> Added allocations for 1 agents in 569.144686ms
> Full sort of 1056 clients took 43485ns
> No-op sort of 1056 clients took 676ns
> Removed allocations for 1 agents in 5.656517904secs
> Removed 1 agents in 153.016733ms
> Removed 1056 clients in 7.169549ms
> 
> Copy-on-write:
> 
> Added 1056 clients in 36.468822ms (**0.997**)
> Added 1 agents in 67.733365ms (**0.59**)
> Added allocations for 1 agents in 99.695723ms (**0.175**)
> Full sort of 1056 clients took 29729ns (**0.68**)
> No-op sort of 1056 clients took 704ns (**1.04**)
> Removed allocations for 1 agents in 4.824389788secs (**0.85**)
> Removed 1 agents in 101.319295ms (**0.66**)
> Removed 1056 clients in 2.480967ms (**0.35**)
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 68490: Optimized `class Resources` with copy-on-write.

2018-09-14 Thread Meng Zhu


> On Sept. 11, 2018, 7:25 p.m., Benjamin Mahler wrote:
> > src/common/resources.cpp
> > Line 1641 (original), 1655 (patched)
> > 
> >
> > Actually, I was thinking about this, isn't the casting to `Resource` 
> > broken anywhere we use it? We lose the shared count?

Good catch! Fixed. Also scanned other places and added TODO to fix in a 
separate patch.


- Meng


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68490/#review208538
---


On Sept. 14, 2018, 3:11 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68490/
> ---
> 
> (Updated Sept. 14, 2018, 3:11 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Benjamin Mahler, and Gastón 
> Kleiman.
> 
> 
> Bugs: MESOS-6765
> https://issues.apache.org/jira/browse/MESOS-6765
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch lets `class Resources` only store shared
> pointers to the underlying resource objects, so that
> read-only filter operations such as `reserved()`,
> `unreserved()` and etc. can avoid making copies of
> the whole resource objects. Instead, only shared pointers
> are copied.
> 
> In write operations, we check if there are more than one
> references to the resource object. If so, a copy is made
> for safe mutation without affecting owners.
> 
> To maintain the usage abstraction that `class Resources`
> still holds resource objects, we utilize
> `boost::indirect_iterator` iterator adapter to deference
> the shared pointers as we iterate.
> 
> 
> Diffs
> -
> 
>   include/mesos/resources.hpp 6f81b14f8bc090a144eeae8f15639c370366166d 
>   include/mesos/v1/resources.hpp 09110530da16678abf6bf6b308906dd8ccc8180a 
>   src/common/resources.cpp 3e63cdedb9261970dbeb9bb9f97eed65819f68a7 
>   src/v1/resources.cpp 3683a331e0859cd6f2ad061db6ba67112ecfcb0d 
> 
> 
> Diff: https://reviews.apache.org/r/68490/diff/5/
> 
> 
> Testing
> ---
> 
> make check
> 
> Benchmarked on a Ubuntu box using clang -O2 optimized build with an i5 
> processor, frequency fixed at 1.6GHz.
> 
> ## Resources benchmarks (Measure geo-mean of various operations):
> 
> Normalized Geomean latency
>   
>   Master  Copy-on-write
> Arithmetic1   1.09
> Filter1   0.28
> Contains  1   0.27
> Parse 1   1.01
> 
> ## Sample allocator benchmark (Measure first allocation latency):
> 
> HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/1 (1k agents, 50 
> frameworks):
> 
> Master 1
> Copy-on-write 0.72
> 
> HierarchicalAllocator_BENCHMARK_Test.ResourceLabels/1 (1k agents, 50 
> frameworks):
> 
> Master 1
> Copy-on-write 0.72
> 
> ## Sorter benchmark
> 
> ### BENCHMARK_FullSort
> 
> Master:
> 
> Using 3 agents and 500 clients
> Added 500 clients in 2.261373ms
> Added 3 agents in 322.252441ms
> Added allocations for 3 agents in 823.631826ms
> Full sort of 500 clients took 8662ns
> No-op sort of 500 clients took 809ns
> Removed allocations for 3 agents in 8.482659084secs
> Removed 3 agents in 417.988004ms
> Removed 500 clients in 17.855575ms
> 
> 
> Copy-on-write:
> 
> Using 3 agents and 500 clients (**normalized to the master above**)
> Added 500 clients in 2.464647ms (**1.09**)
> Added 3 agents in 205.071423ms (**0.64**)
> Added allocations for 3 agents in 156.340883ms (**0.19**)
> Full sort of 500 clients took 1007ns (**0.12**)
> No-op sort of 500 clients took 204ns (**0.02**)
> Removed allocations for 3 agents in 7.215175016secs (**0.85**)
> Removed 3 agents in 304.998116ms (**0.73**)
> Removed 500 clients in 3.195079ms (**0.18**)
> 
> 
> ### BENCHMARK_HierarchyFullSort
> 
> Master:
> 
> Added 1056 clients in 36.571505ms
> Added 1 agents in 114.165111ms
> Added allocations for 1 agents in 569.144686ms
> Full sort of 1056 clients took 43485ns
> No-op sort of 1056 clients took 676ns
> Removed allocations for 1 agents in 5.656517904secs
> Removed 1 agents in 153.016733ms
> Removed 1056 clients in 7.169549ms
> 
> Copy-on-write:
> 
> Added 1056 clients in 36.468822ms (**0.997**)
> Added 1 agents in 67.733365ms (**0.59**)
> Added allocations for 1 agents in 99.695723ms (**0.175**)
> Full sort of 1056 clients took 29729ns (**0.68**)
> No-op sort of 1056 clients took 704ns (**1.04**)
> Removed allocations for 1 agents in 4.824389788secs (**0.85**)
> Removed 1 agents in 101.319295ms (**0.66**)
> Removed 1056 clients in 2.480967ms (**0.35**)
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 68490: Optimized `class Resources` with copy-on-write.

2018-09-14 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68490/
---

(Updated Sept. 14, 2018, 3:11 p.m.)


Review request for mesos, Benjamin Bannier, Benjamin Mahler, and Gastón Kleiman.


Bugs: MESOS-6765
https://issues.apache.org/jira/browse/MESOS-6765


Repository: mesos


Description
---

This patch lets `class Resources` only store shared
pointers to the underlying resource objects, so that
read-only filter operations such as `reserved()`,
`unreserved()` and etc. can avoid making copies of
the whole resource objects. Instead, only shared pointers
are copied.

In write operations, we check if there are more than one
references to the resource object. If so, a copy is made
for safe mutation without affecting owners.

To maintain the usage abstraction that `class Resources`
still holds resource objects, we utilize
`boost::indirect_iterator` iterator adapter to deference
the shared pointers as we iterate.


Diffs (updated)
-

  include/mesos/resources.hpp 6f81b14f8bc090a144eeae8f15639c370366166d 
  include/mesos/v1/resources.hpp 09110530da16678abf6bf6b308906dd8ccc8180a 
  src/common/resources.cpp 3e63cdedb9261970dbeb9bb9f97eed65819f68a7 
  src/v1/resources.cpp 3683a331e0859cd6f2ad061db6ba67112ecfcb0d 


Diff: https://reviews.apache.org/r/68490/diff/5/

Changes: https://reviews.apache.org/r/68490/diff/4-5/


Testing
---

make check

Benchmarked on a Ubuntu box using clang -O2 optimized build with an i5 
processor, frequency fixed at 1.6GHz.

## Resources benchmarks (Measure geo-mean of various operations):

Normalized Geomean latency

Master  Copy-on-write
Arithmetic  1   1.09
Filter  1   0.28
Contains1   0.27
Parse   1   1.01

## Sample allocator benchmark (Measure first allocation latency):

HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/1 (1k agents, 50 frameworks):

Master 1
Copy-on-write 0.72

HierarchicalAllocator_BENCHMARK_Test.ResourceLabels/1 (1k agents, 50 
frameworks):

Master 1
Copy-on-write 0.72

## Sorter benchmark

### BENCHMARK_FullSort

Master:

Using 3 agents and 500 clients
Added 500 clients in 2.261373ms
Added 3 agents in 322.252441ms
Added allocations for 3 agents in 823.631826ms
Full sort of 500 clients took 8662ns
No-op sort of 500 clients took 809ns
Removed allocations for 3 agents in 8.482659084secs
Removed 3 agents in 417.988004ms
Removed 500 clients in 17.855575ms


Copy-on-write:

Using 3 agents and 500 clients (**normalized to the master above**)
Added 500 clients in 2.464647ms (**1.09**)
Added 3 agents in 205.071423ms (**0.64**)
Added allocations for 3 agents in 156.340883ms (**0.19**)
Full sort of 500 clients took 1007ns (**0.12**)
No-op sort of 500 clients took 204ns (**0.02**)
Removed allocations for 3 agents in 7.215175016secs (**0.85**)
Removed 3 agents in 304.998116ms (**0.73**)
Removed 500 clients in 3.195079ms (**0.18**)


### BENCHMARK_HierarchyFullSort

Master:

Added 1056 clients in 36.571505ms
Added 1 agents in 114.165111ms
Added allocations for 1 agents in 569.144686ms
Full sort of 1056 clients took 43485ns
No-op sort of 1056 clients took 676ns
Removed allocations for 1 agents in 5.656517904secs
Removed 1 agents in 153.016733ms
Removed 1056 clients in 7.169549ms

Copy-on-write:

Added 1056 clients in 36.468822ms (**0.997**)
Added 1 agents in 67.733365ms (**0.59**)
Added allocations for 1 agents in 99.695723ms (**0.175**)
Full sort of 1056 clients took 29729ns (**0.68**)
No-op sort of 1056 clients took 704ns (**1.04**)
Removed allocations for 1 agents in 4.824389788secs (**0.85**)
Removed 1 agents in 101.319295ms (**0.66**)
Removed 1056 clients in 2.480967ms (**0.35**)


Thanks,

Meng Zhu



Re: Review Request 68490: Optimized `class Resources` with copy-on-write.

2018-09-11 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68490/#review208538
---



Almost there! I'm wondering about the casting to `Resource` in loops, seems 
broken since we lose the shared count? :O


src/common/resources.cpp
Line 1641 (original), 1655 (patched)


Actually, I was thinking about this, isn't the casting to `Resource` broken 
anywhere we use it? We lose the shared count?



src/common/resources.cpp
Lines 1671-1674 (patched)


The std::move in add doesn't do anything right now since we don't have an 
ravlue overload for shared_ptr?

Wouldn't this be simpler?

```
if (isReserved(resource_->resource)) {
  Resource_ r_ = *resource_;
  r_->resource.clear_reservations();
  result.add(std::move(r_));
} else {
```



src/common/resources.cpp
Line 1969 (original), 1992 (patched)


Ditto here.



src/v1/resources.cpp
Lines 1687-1690 (patched)


Ditto here



src/v1/resources.cpp
Line 1976 (original), 1997 (patched)


We don't have an rvalue overload for -=, can we just leave this as is for 
now by taking out the std::move?


- Benjamin Mahler


On Sept. 7, 2018, 3:29 a.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68490/
> ---
> 
> (Updated Sept. 7, 2018, 3:29 a.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Benjamin Mahler, and Gastón 
> Kleiman.
> 
> 
> Bugs: MESOS-6765
> https://issues.apache.org/jira/browse/MESOS-6765
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch lets `class Resources` only store shared
> pointers to the underlying resource objects, so that
> read-only filter operations such as `reserved()`,
> `unreserved()` and etc. can avoid making copies of
> the whole resource objects. Instead, only shared pointers
> are copied.
> 
> In write operations, we check if there are more than one
> references to the resource object. If so, a copy is made
> for safe mutation without affecting owners.
> 
> To maintain the usage abstraction that `class Resources`
> still holds resource objects, we utilize
> `boost::indirect_iterator` iterator adapter to deference
> the shared pointers as we iterate.
> 
> 
> Diffs
> -
> 
>   include/mesos/resources.hpp 6f81b14f8bc090a144eeae8f15639c370366166d 
>   include/mesos/v1/resources.hpp 09110530da16678abf6bf6b308906dd8ccc8180a 
>   src/common/resources.cpp 3e63cdedb9261970dbeb9bb9f97eed65819f68a7 
>   src/v1/resources.cpp 3683a331e0859cd6f2ad061db6ba67112ecfcb0d 
> 
> 
> Diff: https://reviews.apache.org/r/68490/diff/4/
> 
> 
> Testing
> ---
> 
> make check
> 
> Benchmarked on a Ubuntu box using clang -O2 optimized build with an i5 
> processor, frequency fixed at 1.6GHz.
> 
> ## Resources benchmarks (Measure geo-mean of various operations):
> 
> Normalized Geomean latency
>   
>   Master  Copy-on-write
> Arithmetic1   1.09
> Filter1   0.28
> Contains  1   0.27
> Parse 1   1.01
> 
> ## Sample allocator benchmark (Measure first allocation latency):
> 
> HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/1 (1k agents, 50 
> frameworks):
> 
> Master 1
> Copy-on-write 0.72
> 
> HierarchicalAllocator_BENCHMARK_Test.ResourceLabels/1 (1k agents, 50 
> frameworks):
> 
> Master 1
> Copy-on-write 0.72
> 
> ## Sorter benchmark
> 
> ### BENCHMARK_FullSort
> 
> Master:
> 
> Using 3 agents and 500 clients
> Added 500 clients in 2.261373ms
> Added 3 agents in 322.252441ms
> Added allocations for 3 agents in 823.631826ms
> Full sort of 500 clients took 8662ns
> No-op sort of 500 clients took 809ns
> Removed allocations for 3 agents in 8.482659084secs
> Removed 3 agents in 417.988004ms
> Removed 500 clients in 17.855575ms
> 
> 
> Copy-on-write:
> 
> Using 3 agents and 500 clients (**normalized to the master above**)
> Added 500 clients in 2.464647ms (**1.09**)
> Added 3 agents in 205.071423ms (**0.64**)
> Added allocations for 3 agents in 156.340883ms (**0.19**)
> Full sort of 500 clients took 1007ns (**0.12**)
> No-op sort of 500 clients took 204ns (**0.02**)
> Removed allocations for 3 agents in 7.215175016secs (**0.85**)
> Removed 3 agents in 304.998116ms (**0.73**)
> Removed 500 clients in 3.195079ms (**0.18**)
> 
> 
> ### BENCHMARK_HierarchyFullSort
> 
> Master:
> 
> Added 1056 clients in 36.571505ms
> Added 1 agents in 114.165111ms
> Added allocations 

Re: Review Request 68490: Optimized `class Resources` with copy-on-write.

2018-09-06 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68490/
---

(Updated Sept. 6, 2018, 8:29 p.m.)


Review request for mesos, Benjamin Bannier, Benjamin Mahler, and Gastón Kleiman.


Changes
---

Posted sorter benchmark results.


Bugs: MESOS-6765
https://issues.apache.org/jira/browse/MESOS-6765


Repository: mesos


Description
---

This patch lets `class Resources` only store shared
pointers to the underlying resource objects, so that
read-only filter operations such as `reserved()`,
`unreserved()` and etc. can avoid making copies of
the whole resource objects. Instead, only shared pointers
are copied.

In write operations, we check if there are more than one
references to the resource object. If so, a copy is made
for safe mutation without affecting owners.

To maintain the usage abstraction that `class Resources`
still holds resource objects, we utilize
`boost::indirect_iterator` iterator adapter to deference
the shared pointers as we iterate.


Diffs
-

  include/mesos/resources.hpp 6f81b14f8bc090a144eeae8f15639c370366166d 
  include/mesos/v1/resources.hpp 09110530da16678abf6bf6b308906dd8ccc8180a 
  src/common/resources.cpp 3e63cdedb9261970dbeb9bb9f97eed65819f68a7 
  src/v1/resources.cpp 3683a331e0859cd6f2ad061db6ba67112ecfcb0d 


Diff: https://reviews.apache.org/r/68490/diff/4/


Testing (updated)
---

make check

Benchmarked on a Ubuntu box using clang -O2 optimized build with an i5 
processor, frequency fixed at 1.6GHz.

## Resources benchmarks (Measure geo-mean of various operations):

Normalized Geomean latency

Master  Copy-on-write
Arithmetic  1   1.09
Filter  1   0.28
Contains1   0.27
Parse   1   1.01

## Sample allocator benchmark (Measure first allocation latency):

HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/1 (1k agents, 50 frameworks):

Master 1
Copy-on-write 0.72

HierarchicalAllocator_BENCHMARK_Test.ResourceLabels/1 (1k agents, 50 
frameworks):

Master 1
Copy-on-write 0.72

## Sorter benchmark

### BENCHMARK_FullSort

Master:

Using 3 agents and 500 clients
Added 500 clients in 2.261373ms
Added 3 agents in 322.252441ms
Added allocations for 3 agents in 823.631826ms
Full sort of 500 clients took 8662ns
No-op sort of 500 clients took 809ns
Removed allocations for 3 agents in 8.482659084secs
Removed 3 agents in 417.988004ms
Removed 500 clients in 17.855575ms


Copy-on-write:

Using 3 agents and 500 clients (**normalized to the master above**)
Added 500 clients in 2.464647ms (**1.09**)
Added 3 agents in 205.071423ms (**0.64**)
Added allocations for 3 agents in 156.340883ms (**0.19**)
Full sort of 500 clients took 1007ns (**0.12**)
No-op sort of 500 clients took 204ns (**0.02**)
Removed allocations for 3 agents in 7.215175016secs (**0.85**)
Removed 3 agents in 304.998116ms (**0.73**)
Removed 500 clients in 3.195079ms (**0.18**)


### BENCHMARK_HierarchyFullSort

Master:

Added 1056 clients in 36.571505ms
Added 1 agents in 114.165111ms
Added allocations for 1 agents in 569.144686ms
Full sort of 1056 clients took 43485ns
No-op sort of 1056 clients took 676ns
Removed allocations for 1 agents in 5.656517904secs
Removed 1 agents in 153.016733ms
Removed 1056 clients in 7.169549ms

Copy-on-write:

Added 1056 clients in 36.468822ms (**0.997**)
Added 1 agents in 67.733365ms (**0.59**)
Added allocations for 1 agents in 99.695723ms (**0.175**)
Full sort of 1056 clients took 29729ns (**0.68**)
No-op sort of 1056 clients took 704ns (**1.04**)
Removed allocations for 1 agents in 4.824389788secs (**0.85**)
Removed 1 agents in 101.319295ms (**0.66**)
Removed 1056 clients in 2.480967ms (**0.35**)


Thanks,

Meng Zhu



Re: Review Request 68490: Optimized `class Resources` with copy-on-write.

2018-09-06 Thread Meng Zhu


> On Sept. 6, 2018, 3:27 a.m., Benjamin Bannier wrote:
> > include/mesos/resources.hpp
> > Line 588 (original), 590 (patched)
> > 
> >
> > `s/Resource/Resource_/`?

`Resource_` is not exposed outside, it is implicitly converted to `Resource`. 
Will comment on this.


> On Sept. 6, 2018, 3:27 a.m., Benjamin Bannier wrote:
> > src/v1/resources.cpp
> > Lines 2117-2118 (original), 2154-2178 (patched)
> > 
> >
> > This function looks redundant and identical to `Resources::add(const 
> > Resource&)` to me. It also introduces another place where we leak 
> > implementation details into the interface.
> > 
> > Suggest to remove and instead let callers perform the deref. We could 
> > always add something similar back once we want to support `move`'ing a 
> > `that`.
> 
> Benjamin Mahler wrote:
> I think Meng forgot to publish a reply here, but you can see the reply on 
> my similar comment above. If we don't have this and the caller has to de-ref, 
> then we have to copy the resource object, whereas this interface allows us to 
> copy the shared_ptr.
> 
> Also note that this is not part of the public interface, it's private.

The difference lies in:

```
 // Cannot be combined with any existing Resource object.
  if (!found) {
resources.push_back(that);
  }
```

I will copy my earlier reply to BenM:

Consider adding two non-mergeble Resources(e.g. 1 cpu + 1 mem), if we only have 
Resources::add(const Resource_& that), we will always need to 
make_shared(Resource_). If we directly pass shared_ptr, we can then just 
push_back. For any non-mergable Resource_, we will be able to save one 
make_shared.


One rule of thumb I figured is that, we should avoid make_shared when possible 
(because it makes a copy). This means if we already have a shared_ptr we should 
just use it and pass it around. This also explains why I use shared_ptr for 
foreach loops.


- Meng


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68490/#review208402
---


On Sept. 6, 2018, 2:56 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68490/
> ---
> 
> (Updated Sept. 6, 2018, 2:56 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Benjamin Mahler, and Gastón 
> Kleiman.
> 
> 
> Bugs: MESOS-6765
> https://issues.apache.org/jira/browse/MESOS-6765
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch lets `class Resources` only store shared
> pointers to the underlying resource objects, so that
> read-only filter operations such as `reserved()`,
> `unreserved()` and etc. can avoid making copies of
> the whole resource objects. Instead, only shared pointers
> are copied.
> 
> In write operations, we check if there are more than one
> references to the resource object. If so, a copy is made
> for safe mutation without affecting owners.
> 
> To maintain the usage abstraction that `class Resources`
> still holds resource objects, we utilize
> `boost::indirect_iterator` iterator adapter to deference
> the shared pointers as we iterate.
> 
> 
> Diffs
> -
> 
>   include/mesos/resources.hpp 6f81b14f8bc090a144eeae8f15639c370366166d 
>   include/mesos/v1/resources.hpp 09110530da16678abf6bf6b308906dd8ccc8180a 
>   src/common/resources.cpp 3e63cdedb9261970dbeb9bb9f97eed65819f68a7 
>   src/v1/resources.cpp 3683a331e0859cd6f2ad061db6ba67112ecfcb0d 
> 
> 
> Diff: https://reviews.apache.org/r/68490/diff/4/
> 
> 
> Testing
> ---
> 
> make check
> 
> Benchmarked on a Ubuntu box using clang -O2 optimized build with an i5 
> processor, frequency fixed at 1.6GHz.
> 
> Resources benchmarks (Measure geo-mean of various operations):
> 
> Normalized Geomean latency
>   
>   Master  Copy-on-write
> Arithmetic1   1.09
> Filter1   0.28
> Contains  1   0.27
> Parse 1   1.01
> 
> Sample allocator benchmark (Measure first allocation latency):
> 
> HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/1 (1k agents, 50 
> frameworks):
> 
> Master 1
> Copy-on-write 0.72
> 
> HierarchicalAllocator_BENCHMARK_Test.ResourceLabels/1 (1k agents, 50 
> frameworks):
> 
> Master 1
> Copy-on-write 0.72
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 68490: Optimized `class Resources` with copy-on-write.

2018-09-06 Thread Benjamin Mahler


> On Sept. 6, 2018, 10:27 a.m., Benjamin Bannier wrote:
> > src/v1/resources.cpp
> > Lines 2117-2118 (original), 2154-2178 (patched)
> > 
> >
> > This function looks redundant and identical to `Resources::add(const 
> > Resource&)` to me. It also introduces another place where we leak 
> > implementation details into the interface.
> > 
> > Suggest to remove and instead let callers perform the deref. We could 
> > always add something similar back once we want to support `move`'ing a 
> > `that`.

I think Meng forgot to publish a reply here, but you can see the reply on my 
similar comment above. If we don't have this and the caller has to de-ref, then 
we have to copy the resource object, whereas this interface allows us to copy 
the shared_ptr.

Also note that this is not part of the public interface, it's private.


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68490/#review208402
---


On Sept. 6, 2018, 9:56 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68490/
> ---
> 
> (Updated Sept. 6, 2018, 9:56 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Benjamin Mahler, and Gastón 
> Kleiman.
> 
> 
> Bugs: MESOS-6765
> https://issues.apache.org/jira/browse/MESOS-6765
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch lets `class Resources` only store shared
> pointers to the underlying resource objects, so that
> read-only filter operations such as `reserved()`,
> `unreserved()` and etc. can avoid making copies of
> the whole resource objects. Instead, only shared pointers
> are copied.
> 
> In write operations, we check if there are more than one
> references to the resource object. If so, a copy is made
> for safe mutation without affecting owners.
> 
> To maintain the usage abstraction that `class Resources`
> still holds resource objects, we utilize
> `boost::indirect_iterator` iterator adapter to deference
> the shared pointers as we iterate.
> 
> 
> Diffs
> -
> 
>   include/mesos/resources.hpp 6f81b14f8bc090a144eeae8f15639c370366166d 
>   include/mesos/v1/resources.hpp 09110530da16678abf6bf6b308906dd8ccc8180a 
>   src/common/resources.cpp 3e63cdedb9261970dbeb9bb9f97eed65819f68a7 
>   src/v1/resources.cpp 3683a331e0859cd6f2ad061db6ba67112ecfcb0d 
> 
> 
> Diff: https://reviews.apache.org/r/68490/diff/4/
> 
> 
> Testing
> ---
> 
> make check
> 
> Benchmarked on a Ubuntu box using clang -O2 optimized build with an i5 
> processor, frequency fixed at 1.6GHz.
> 
> Resources benchmarks (Measure geo-mean of various operations):
> 
> Normalized Geomean latency
>   
>   Master  Copy-on-write
> Arithmetic1   1.09
> Filter1   0.28
> Contains  1   0.27
> Parse 1   1.01
> 
> Sample allocator benchmark (Measure first allocation latency):
> 
> HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/1 (1k agents, 50 
> frameworks):
> 
> Master 1
> Copy-on-write 0.72
> 
> HierarchicalAllocator_BENCHMARK_Test.ResourceLabels/1 (1k agents, 50 
> frameworks):
> 
> Master 1
> Copy-on-write 0.72
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 68490: Optimized `class Resources` with copy-on-write.

2018-09-06 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68490/
---

(Updated Sept. 6, 2018, 2:56 p.m.)


Review request for mesos, Benjamin Bannier, Benjamin Mahler, and Gastón Kleiman.


Changes
---

Addressed Benjamin's comment.


Bugs: MESOS-6765
https://issues.apache.org/jira/browse/MESOS-6765


Repository: mesos


Description
---

This patch lets `class Resources` only store shared
pointers to the underlying resource objects, so that
read-only filter operations such as `reserved()`,
`unreserved()` and etc. can avoid making copies of
the whole resource objects. Instead, only shared pointers
are copied.

In write operations, we check if there are more than one
references to the resource object. If so, a copy is made
for safe mutation without affecting owners.

To maintain the usage abstraction that `class Resources`
still holds resource objects, we utilize
`boost::indirect_iterator` iterator adapter to deference
the shared pointers as we iterate.


Diffs (updated)
-

  include/mesos/resources.hpp 6f81b14f8bc090a144eeae8f15639c370366166d 
  include/mesos/v1/resources.hpp 09110530da16678abf6bf6b308906dd8ccc8180a 
  src/common/resources.cpp 3e63cdedb9261970dbeb9bb9f97eed65819f68a7 
  src/v1/resources.cpp 3683a331e0859cd6f2ad061db6ba67112ecfcb0d 


Diff: https://reviews.apache.org/r/68490/diff/4/

Changes: https://reviews.apache.org/r/68490/diff/3-4/


Testing
---

make check

Benchmarked on a Ubuntu box using clang -O2 optimized build with an i5 
processor, frequency fixed at 1.6GHz.

Resources benchmarks (Measure geo-mean of various operations):

Normalized Geomean latency

Master  Copy-on-write
Arithmetic  1   1.09
Filter  1   0.28
Contains1   0.27
Parse   1   1.01

Sample allocator benchmark (Measure first allocation latency):

HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/1 (1k agents, 50 frameworks):

Master 1
Copy-on-write 0.72

HierarchicalAllocator_BENCHMARK_Test.ResourceLabels/1 (1k agents, 50 
frameworks):

Master 1
Copy-on-write 0.72


Thanks,

Meng Zhu



Re: Review Request 68490: Optimized `class Resources` with copy-on-write.

2018-09-06 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68490/#review208402
---




include/mesos/resources.hpp
Line 588 (original), 590 (patched)


`s/Resource/Resource_/`?



include/mesos/resources.hpp
Lines 596-597 (patched)


I'd leave this above above the iterator types as documentation.



include/mesos/resources.hpp
Lines 691-694 (patched)


nit: While this is likely true, copying a shared pointer can be suprisingly 
non-inexpensive due to copying of its atomic ref count (likely protected with 
memory barries, and copies triggering cache invalidations across threads 
holding ref counts).

Your benchmarks strongly suggest that we still benefit from this patch.



src/v1/resources.cpp
Lines 2117-2118 (original), 2154-2178 (patched)


This function looks redundant and identical to `Resources::add(const 
Resource&)` to me. It also introduces another place where we leak 
implementation details into the interface.

Suggest to remove and instead let callers perform the deref. We could 
always add something similar back once we want to support `move`'ing a `that`.


- Benjamin Bannier


On Sept. 5, 2018, 11:47 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68490/
> ---
> 
> (Updated Sept. 5, 2018, 11:47 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Benjamin Mahler, and Gastón 
> Kleiman.
> 
> 
> Bugs: MESOS-6765
> https://issues.apache.org/jira/browse/MESOS-6765
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch lets `class Resources` only store shared
> pointers to the underlying resource objects, so that
> read-only filter operations such as `reserved()`,
> `unreserved()` and etc. can avoid making copies of
> the whole resource objects. Instead, only shared pointers
> are copied.
> 
> In write operations, we check if there are more than one
> references to the resource object. If so, a copy is made
> for safe mutation without affecting owners.
> 
> To maintain the usage abstraction that `class Resources`
> still holds resource objects, we utilize
> `boost::indirect_iterator` iterator adapter to deference
> the shared pointers as we iterate.
> 
> 
> Diffs
> -
> 
>   include/mesos/resources.hpp 6f81b14f8bc090a144eeae8f15639c370366166d 
>   include/mesos/v1/resources.hpp 09110530da16678abf6bf6b308906dd8ccc8180a 
>   src/common/resources.cpp 3e63cdedb9261970dbeb9bb9f97eed65819f68a7 
>   src/v1/resources.cpp 3683a331e0859cd6f2ad061db6ba67112ecfcb0d 
> 
> 
> Diff: https://reviews.apache.org/r/68490/diff/3/
> 
> 
> Testing
> ---
> 
> make check
> 
> Benchmarked on a Ubuntu box using clang -O2 optimized build with an i5 
> processor, frequency fixed at 1.6GHz.
> 
> Resources benchmarks (Measure geo-mean of various operations):
> 
> Normalized Geomean latency
>   
>   Master  Copy-on-write
> Arithmetic1   1.09
> Filter1   0.28
> Contains  1   0.27
> Parse 1   1.01
> 
> Sample allocator benchmark (Measure first allocation latency):
> 
> HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/1 (1k agents, 50 
> frameworks):
> 
> Master 1
> Copy-on-write 0.72
> 
> HierarchicalAllocator_BENCHMARK_Test.ResourceLabels/1 (1k agents, 50 
> frameworks):
> 
> Master 1
> Copy-on-write 0.72
> 
> 
> Thanks,
> 
> Meng Zhu
> 
>



Re: Review Request 68490: Optimized `class Resources` with copy-on-write.

2018-09-05 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68490/
---

(Updated Sept. 5, 2018, 2:47 p.m.)


Review request for mesos, Benjamin Bannier, Benjamin Mahler, and Gastón Kleiman.


Changes
---

Added benchmark results.


Bugs: MESOS-6765
https://issues.apache.org/jira/browse/MESOS-6765


Repository: mesos


Description
---

This patch lets `class Resources` only store shared
pointers to the underlying resource objects, so that
read-only filter operations such as `reserved()`,
`unreserved()` and etc. can avoid making copies of
the whole resource objects. Instead, only shared pointers
are copied.

In write operations, we check if there are more than one
references to the resource object. If so, a copy is made
for safe mutation without affecting owners.

To maintain the usage abstraction that `class Resources`
still holds resource objects, we utilize
`boost::indirect_iterator` iterator adapter to deference
the shared pointers as we iterate.


Diffs
-

  include/mesos/resources.hpp 6f81b14f8bc090a144eeae8f15639c370366166d 
  include/mesos/v1/resources.hpp 09110530da16678abf6bf6b308906dd8ccc8180a 
  src/common/resources.cpp 3e63cdedb9261970dbeb9bb9f97eed65819f68a7 
  src/v1/resources.cpp 3683a331e0859cd6f2ad061db6ba67112ecfcb0d 


Diff: https://reviews.apache.org/r/68490/diff/3/


Testing (updated)
---

make check

Benchmarked on a Ubuntu box using clang -O2 optimized build with an i5 
processor, frequency fixed at 1.6GHz.

Resources benchmarks (Measure geo-mean of various operations):

Normalized Geomean latency

Master  Copy-on-write
Arithmetic  1   1.09
Filter  1   0.28
Contains1   0.27
Parse   1   1.01

Sample allocator benchmark (Measure first allocation latency):

HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/1 (1k agents, 50 frameworks):

Master 1
Copy-on-write 0.72

HierarchicalAllocator_BENCHMARK_Test.ResourceLabels/1 (1k agents, 50 
frameworks):

Master 1
Copy-on-write 0.72


Thanks,

Meng Zhu



Re: Review Request 68490: Optimized `class Resources` with copy-on-write.

2018-09-05 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68490/
---

(Updated Sept. 5, 2018, 2:30 p.m.)


Review request for mesos, Benjamin Bannier, Benjamin Mahler, and Gastón Kleiman.


Changes
---

Made foreach loops inside `class Resources` iterate over 
`shared_ptr` consistently.


Bugs: MESOS-6765
https://issues.apache.org/jira/browse/MESOS-6765


Repository: mesos


Description
---

This patch lets `class Resources` only store shared
pointers to the underlying resource objects, so that
read-only filter operations such as `reserved()`,
`unreserved()` and etc. can avoid making copies of
the whole resource objects. Instead, only shared pointers
are copied.

In write operations, we check if there are more than one
references to the resource object. If so, a copy is made
for safe mutation without affecting owners.

To maintain the usage abstraction that `class Resources`
still holds resource objects, we utilize
`boost::indirect_iterator` iterator adapter to deference
the shared pointers as we iterate.


Diffs (updated)
-

  include/mesos/resources.hpp 6f81b14f8bc090a144eeae8f15639c370366166d 
  include/mesos/v1/resources.hpp 09110530da16678abf6bf6b308906dd8ccc8180a 
  src/common/resources.cpp 3e63cdedb9261970dbeb9bb9f97eed65819f68a7 
  src/v1/resources.cpp 3683a331e0859cd6f2ad061db6ba67112ecfcb0d 


Diff: https://reviews.apache.org/r/68490/diff/3/

Changes: https://reviews.apache.org/r/68490/diff/2-3/


Testing
---

make check

Did a quick test on Mac with an optimized build, running benchmark 
`HierarchicalAllocator_BENCHMARK_Test.ResourceLabels`, here are the results of 
comparing "before" and "after". Note, DVFS is not fixed. And we only did a 
partial run to verify the validity of the patch, full evaluation coming soon.

**Overall, 33% performance improvement for the 1st steup (1000 agents and 1 
frameworks) and 32% improvement for the first 50 iterarions of the 2nd setup 
(1000 agents and 50 frameworks).**

Before:

[ RUN  ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ResourceLabels/0

Using 1000 agents and 1 frameworks
Added 1 frameworks in 1.220022ms
Added 1000 agents in 465.045382ms
round 0 allocate() took 54.803773ms to make 0 offers after filtering 1000 offers
round 1 allocate() took 66.690456ms to make 0 offers after filtering 1000 offers

Using 1000 agents and 50 frameworks
Added 50 frameworks in 1.40774ms
Added 1000 agents in 460.562488ms
round 0 allocate() took 155.500548ms to make 1000 offers after filtering 1000 
offers
round 1 allocate() took 155.194986ms to make 1000 offers after filtering 2000 
offers
round 2 allocate() took 160.706712ms to make 1000 offers after filtering 3000 
offers
round 3 allocate() took 156.701883ms to make 1000 offers after filtering 4000 
offers
round 4 allocate() took 164.246987ms to make 1000 offers after filtering 5000 
offers
round 5 allocate() took 160.791809ms to make 1000 offers after filtering 6000 
offers
round 6 allocate() took 166.064652ms to make 1000 offers after filtering 7000 
offers
round 7 allocate() took 162.177213ms to make 1000 offers after filtering 8000 
offers
round 8 allocate() took 166.620163ms to make 1000 offers after filtering 9000 
offers
round 9 allocate() took 167.970045ms to make 1000 offers after filtering 1 
offers
round 10 allocate() took 166.189463ms to make 1000 offers after filtering 11000 
offers
round 11 allocate() took 170.863462ms to make 1000 offers after filtering 12000 
offers
round 12 allocate() took 172.260473ms to make 1000 offers after filtering 13000 
offers
round 13 allocate() took 170.553911ms to make 1000 offers after filtering 14000 
offers
round 14 allocate() took 235.764593ms to make 1000 offers after filtering 15000 
offers
round 15 allocate() took 247.250433ms to make 1000 offers after filtering 16000 
offers
round 16 allocate() took 276.932824ms to make 1000 offers after filtering 17000 
offers
round 17 allocate() took 248.888469ms to make 1000 offers after filtering 18000 
offers
round 18 allocate() took 193.890556ms to make 1000 offers after filtering 19000 
offers
round 19 allocate() took 195.105346ms to make 1000 offers after filtering 2 
offers
round 20 allocate() took 194.447428ms to make 1000 offers after filtering 21000 
offers
round 21 allocate() took 205.486287ms to make 1000 offers after filtering 22000 
offers
round 22 allocate() took 199.241922ms to make 1000 offers after filtering 23000 
offers
round 23 allocate() took 200.885488ms to make 1000 offers after filtering 24000 
offers
round 24 allocate() took 216.132361ms to make 1000 offers after filtering 25000 
offers
round 25 allocate() took 210.638273ms to make 1000 offers after filtering 26000 
offers
round 26 allocate() took 232.397778ms to make 1000 offers after filtering 27000 
offers
round 27 allocate() took 239.633708ms to make 1000 

Re: Review Request 68490: Optimized `class Resources` with copy-on-write.

2018-09-05 Thread Meng Zhu


> On Sept. 4, 2018, 2:36 a.m., Benjamin Bannier wrote:
> > Have you benchmarked against a dedicated copy-on-write abstraction? I 
> > understand that the current implementation which only copies when needed 
> > might be more performant, but it also appears to carry a huge mental 
> > overhead to me as users of `Resources` need to always pay a lot of 
> > attention to how used objects could be shared. Using a slightly safer 
> > copy-on-write abstraction might introduce some overhead, but could IMO be 
> > much easier to use and reduced the likelihood of subtle bugs. It would also 
> > help encapsulate low-level issues like `use_count`, see e.g., 
> > https://wg21.cmeerw.net/lwg/issue2776, 
> > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0521r0.html.
> > 
> > I imagine this could be useful elsewhere as well in the future.
> > 
> > Note that implementations like the one presented in 
> > https://en.wikibooks.org/wiki/More_C%2B%2B_Idioms/Copy-on-write which allow 
> > getting references to wrapped value are prone to subtle issues in user 
> > code. Not directly giving users a reference requires a little more 
> > determination to introduce bugs.
> > 
> > Dummy:
> > 
> > // UNTESTED AND BUGGY. FOR DEMONSTRATION PURPOSES ONLY ;)
> > template  struct cow {
> >   // We can only construct from a value we need to wrap. All other 
> > constructors
> >   // should be deleted as needed.
> >   //
> >   // Verify with e.g., SFINAE that
> >   //
> >   // static_assert(is_convertible_v, T>);
> >   template  explicit cow(U);
> > 
> >   // Allow assignments from a `cow` we own. All other assignment 
> > operators
> >   // should be deleted as needed. These operators need to internally 
> > synchronize
> >   // concurrent access to the wrapped value.
> >   cow =(cow);
> > 
> >   // Non-mutable access to wrapped value.
> >   const T >() const;
> >   const T& operator*() const;
> > 
> >   // Mutating access to the wrapped value. `F` is callable as 
> > `void(T&)` and
> >   // applied to the wrapped value immediately. Synchronizes internally.
> >   template  cow (F) { return *this; }
> > };
> > 
> > void example() {
> > cow i(47);
> > assert(*i == 47);
> > 
> > i.mut([](int ) { x = 11; });
> > assert(*i == 11);
> > }

Thanks a lot for the suggestion, Benjamin! There is certainly room for 
improvement. As Ben mentioned above, the boilerplate code regarding use_count 
is brittle. Hide setters and only expose mutation in a controlled fashion is 
the way to go.

But I need some time to think through and evaluate the options. For now,  only 
code inside the realm of resource.cpp (`class Resources` specifically) needs to 
carry that mental burden. Hopefully, this part of the code is only touched 
sparingly. Added a todo and I will make sure to follow up on this.


- Meng


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68490/#review208295
---


On Sept. 5, 2018, 12:17 p.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68490/
> ---
> 
> (Updated Sept. 5, 2018, 12:17 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler and Gastón Kleiman.
> 
> 
> Bugs: MESOS-6765
> https://issues.apache.org/jira/browse/MESOS-6765
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch lets `class Resources` only store shared
> pointers to the underlying resource objects, so that
> read-only filter operations such as `reserved()`,
> `unreserved()` and etc. can avoid making copies of
> the whole resource objects. Instead, only shared pointers
> are copied.
> 
> In write operations, we check if there are more than one
> references to the resource object. If so, a copy is made
> for safe mutation without affecting owners.
> 
> To maintain the usage abstraction that `class Resources`
> still holds resource objects, we utilize
> `boost::indirect_iterator` iterator adapter to deference
> the shared pointers as we iterate.
> 
> 
> Diffs
> -
> 
>   include/mesos/resources.hpp 6f81b14f8bc090a144eeae8f15639c370366166d 
>   include/mesos/v1/resources.hpp 09110530da16678abf6bf6b308906dd8ccc8180a 
>   src/common/resources.cpp 3e63cdedb9261970dbeb9bb9f97eed65819f68a7 
>   src/v1/resources.cpp 3683a331e0859cd6f2ad061db6ba67112ecfcb0d 
> 
> 
> Diff: https://reviews.apache.org/r/68490/diff/2/
> 
> 
> Testing
> ---
> 
> make check
> 
> Did a quick test on Mac with an optimized build, running benchmark 
> `HierarchicalAllocator_BENCHMARK_Test.ResourceLabels`, here are the results 
> of comparing "before" and "after". Note, DVFS is not fixed. And 

Re: Review Request 68490: Optimized `class Resources` with copy-on-write.

2018-09-05 Thread Meng Zhu


> On Sept. 3, 2018, 9:59 p.m., Benjamin Mahler wrote:
> > Thanks! Can you also include some commentary on the rest of the allocation 
> > benchmarks as well as the overhead within the Resources micro-benchmarks?

Posted the updated patches. Will summarize and post more comprehensive results 
soon.


> On Sept. 3, 2018, 9:59 p.m., Benjamin Mahler wrote:
> > include/mesos/resources.hpp
> > Lines 678-679 (patched)
> > 
> >
> > What's this referring to? Reading this I see it already has an rvalue 
> > reference?

Referring to `void add(std::shared_ptr&& that);`. Moved the comment 
down to clarifiy it.


> On Sept. 3, 2018, 9:59 p.m., Benjamin Mahler wrote:
> > include/mesos/resources.hpp
> > Lines 682 (patched)
> > 
> >
> > Why do we need this? It seems like the callers should just be 
> > de-referencing their shared_ptr prior to calling?

Consider adding two non-mergeble `Resources`(e.g. 1 cpu + 1 mem), if we only 
have `Resources::add(const Resource_& that)`, we will always need to 
`make_shared(Resource_)`. If we directly pass `shared_ptr`, we can then just 
`push_back`. For any non-mergable `Resource_`, we will be able to save one 
`make_shared`.

One rule of thumb I figured is that, we should avoid `make_shared` when 
possible (because it makes a copy). This means if we already have a 
`shared_ptr` we should just use it and pass it around. This also explains why I 
use `shared_ptr` for foreach loops.


> On Sept. 3, 2018, 9:59 p.m., Benjamin Mahler wrote:
> > src/common/resources.cpp
> > Line 1471 (original), 1473 (patched)
> > 
> >
> > I'm a little puzzled about these `const shared_ptr<...>` foreach loops. 
> > Why don't the const ones do `const Resource_&` loops?
> > 
> > ```
> > foreach (const Resource_& resource_, that) {
> > ```
> > 
> > Something I'm missing?

See my comment above regarding `add(const std::shared_ptr& that)`.

Actual benefits aside, I also want to make it consistent that inside the `class 
Resources`, we are speaking `shared_ptr` unless we want to make an implicit 
copy as in `push/popReservation`.


> On Sept. 3, 2018, 9:59 p.m., Benjamin Mahler wrote:
> > src/common/resources.cpp
> > Line 1503 (original), 1505 (patched)
> > 
> >
> > It seems ok to slip this in here, but generally please send unrelated 
> > cleanups as separate patches. :)

Got it.


> On Sept. 3, 2018, 9:59 p.m., Benjamin Mahler wrote:
> > src/common/resources.cpp
> > Line 1514 (original), 1516-1518 (patched)
> > 
> >
> > I'm not sure how obvious these will be to readers, maybe a comment on 
> > all of these?
> > 
> > ```
> > // Copy-on-write (if more than 1 reference).
> > if (resource_.use_count() > 1) {
> >   resource_ = make_shared(*resource_);
> > }
> > ```
> > 
> > Maybe something like (since we can't use `mutable`):
> > 
> > ```
> > make_mutable(resource_);
> > ```
> > 
> > Maybe also make it composable?
> > 
> > ```
> > 
> > make_mutable(resource_)->resource.mutable_allocation_info()->set_role(role);
> > ```
> > 
> > Still, seems rather error prone and easy to forget to check for copies 
> > prior to mutating? Wonder if there's a way to make it less brittle.

Added the comment. Added a TODO regarding introducing a more controlled 
mutation interface.


> On Sept. 3, 2018, 9:59 p.m., Benjamin Mahler wrote:
> > src/common/resources.cpp
> > Lines 1667-1670 (patched)
> > 
> >
> > Seems odd, why don't we use `Resource` directly here?
> > 
> > ```
> >   Resource r = *resource_;
> >   r.clear_reservations();
> > 
> >   result.add(std::move(r));
> > ```
> > 
> > (I left a comment above about removing the add that takes a shared_ptr, 
> > since it doesn't seem needed?)

See my comments regarding `add` shared_ptr above.


> On Sept. 3, 2018, 9:59 p.m., Benjamin Mahler wrote:
> > src/common/resources.cpp
> > Line 1676 (original), 1692 (patched)
> > 
> >
> > Feel free to pull some of these unrelated improvements out in front of 
> > this, it would help make the diff easier for reviewers.

Got it.


> On Sept. 3, 2018, 9:59 p.m., Benjamin Mahler wrote:
> > src/common/resources.cpp
> > Line 1954 (original), 1972-1979 (patched)
> > 
> >
> > Why is this needed? Isn't the point of this change that toUnreserved 
> > avoids copying 

Re: Review Request 68490: Optimized `class Resources` with copy-on-write.

2018-09-05 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68490/
---

(Updated Sept. 5, 2018, 12:17 p.m.)


Review request for mesos, Benjamin Mahler and Gastón Kleiman.


Bugs: MESOS-6765
https://issues.apache.org/jira/browse/MESOS-6765


Repository: mesos


Description
---

This patch lets `class Resources` only store shared
pointers to the underlying resource objects, so that
read-only filter operations such as `reserved()`,
`unreserved()` and etc. can avoid making copies of
the whole resource objects. Instead, only shared pointers
are copied.

In write operations, we check if there are more than one
references to the resource object. If so, a copy is made
for safe mutation without affecting owners.

To maintain the usage abstraction that `class Resources`
still holds resource objects, we utilize
`boost::indirect_iterator` iterator adapter to deference
the shared pointers as we iterate.


Diffs (updated)
-

  include/mesos/resources.hpp 6f81b14f8bc090a144eeae8f15639c370366166d 
  include/mesos/v1/resources.hpp 09110530da16678abf6bf6b308906dd8ccc8180a 
  src/common/resources.cpp 3e63cdedb9261970dbeb9bb9f97eed65819f68a7 
  src/v1/resources.cpp 3683a331e0859cd6f2ad061db6ba67112ecfcb0d 


Diff: https://reviews.apache.org/r/68490/diff/2/

Changes: https://reviews.apache.org/r/68490/diff/1-2/


Testing
---

make check

Did a quick test on Mac with an optimized build, running benchmark 
`HierarchicalAllocator_BENCHMARK_Test.ResourceLabels`, here are the results of 
comparing "before" and "after". Note, DVFS is not fixed. And we only did a 
partial run to verify the validity of the patch, full evaluation coming soon.

**Overall, 33% performance improvement for the 1st steup (1000 agents and 1 
frameworks) and 32% improvement for the first 50 iterarions of the 2nd setup 
(1000 agents and 50 frameworks).**

Before:

[ RUN  ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ResourceLabels/0

Using 1000 agents and 1 frameworks
Added 1 frameworks in 1.220022ms
Added 1000 agents in 465.045382ms
round 0 allocate() took 54.803773ms to make 0 offers after filtering 1000 offers
round 1 allocate() took 66.690456ms to make 0 offers after filtering 1000 offers

Using 1000 agents and 50 frameworks
Added 50 frameworks in 1.40774ms
Added 1000 agents in 460.562488ms
round 0 allocate() took 155.500548ms to make 1000 offers after filtering 1000 
offers
round 1 allocate() took 155.194986ms to make 1000 offers after filtering 2000 
offers
round 2 allocate() took 160.706712ms to make 1000 offers after filtering 3000 
offers
round 3 allocate() took 156.701883ms to make 1000 offers after filtering 4000 
offers
round 4 allocate() took 164.246987ms to make 1000 offers after filtering 5000 
offers
round 5 allocate() took 160.791809ms to make 1000 offers after filtering 6000 
offers
round 6 allocate() took 166.064652ms to make 1000 offers after filtering 7000 
offers
round 7 allocate() took 162.177213ms to make 1000 offers after filtering 8000 
offers
round 8 allocate() took 166.620163ms to make 1000 offers after filtering 9000 
offers
round 9 allocate() took 167.970045ms to make 1000 offers after filtering 1 
offers
round 10 allocate() took 166.189463ms to make 1000 offers after filtering 11000 
offers
round 11 allocate() took 170.863462ms to make 1000 offers after filtering 12000 
offers
round 12 allocate() took 172.260473ms to make 1000 offers after filtering 13000 
offers
round 13 allocate() took 170.553911ms to make 1000 offers after filtering 14000 
offers
round 14 allocate() took 235.764593ms to make 1000 offers after filtering 15000 
offers
round 15 allocate() took 247.250433ms to make 1000 offers after filtering 16000 
offers
round 16 allocate() took 276.932824ms to make 1000 offers after filtering 17000 
offers
round 17 allocate() took 248.888469ms to make 1000 offers after filtering 18000 
offers
round 18 allocate() took 193.890556ms to make 1000 offers after filtering 19000 
offers
round 19 allocate() took 195.105346ms to make 1000 offers after filtering 2 
offers
round 20 allocate() took 194.447428ms to make 1000 offers after filtering 21000 
offers
round 21 allocate() took 205.486287ms to make 1000 offers after filtering 22000 
offers
round 22 allocate() took 199.241922ms to make 1000 offers after filtering 23000 
offers
round 23 allocate() took 200.885488ms to make 1000 offers after filtering 24000 
offers
round 24 allocate() took 216.132361ms to make 1000 offers after filtering 25000 
offers
round 25 allocate() took 210.638273ms to make 1000 offers after filtering 26000 
offers
round 26 allocate() took 232.397778ms to make 1000 offers after filtering 27000 
offers
round 27 allocate() took 239.633708ms to make 1000 offers after filtering 28000 
offers
round 28 allocate() took 225.829677ms to make 1000 offers after filtering 29000 
offers

Re: Review Request 68490: Optimized `class Resources` with copy-on-write.

2018-09-04 Thread Benjamin Bannier

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68490/#review208295
---



Have you benchmarked against a dedicated copy-on-write abstraction? I 
understand that the current implementation which only copies when needed might 
be more performant, but it also appears to carry a huge mental overhead to me 
as users of `Resources` need to always pay a lot of attention to how used 
objects could be shared. Using a slightly safer copy-on-write abstraction might 
introduce some overhead, but could IMO be much easier to use and reduced the 
likelihood of subtle bugs. It would also help encapsulate low-level issues like 
`use_count`, see e.g., https://wg21.cmeerw.net/lwg/issue2776, 
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0521r0.html.

I imagine this could be useful elsewhere as well in the future.

Note that implementations like the one presented in 
https://en.wikibooks.org/wiki/More_C%2B%2B_Idioms/Copy-on-write which allow 
getting references to wrapped value are prone to subtle issues in user code. 
Not directly giving users a reference requires a little more determination to 
introduce bugs.

Dummy:

// UNTESTED AND BUGGY. FOR DEMONSTRATION PURPOSES ONLY ;)
template  struct cow {
  // We can only construct from a value we need to wrap. All other 
constructors
  // should be deleted as needed.
  //
  // Verify with e.g., SFINAE that
  //
  // static_assert(is_convertible_v, T>);
  template  explicit cow(U);

  // Allow assignments from a `cow` we own. All other assignment operators
  // should be deleted as needed. These operators need to internally 
synchronize
  // concurrent access to the wrapped value.
  cow =(cow);

  // Non-mutable access to wrapped value.
  const T >() const;
  const T& operator*() const;

  // Mutating access to the wrapped value. `F` is callable as `void(T&)` and
  // applied to the wrapped value immediately. Synchronizes internally.
  template  cow (F) { return *this; }
};

void example() {
cow i(47);
assert(*i == 47);

i.mut([](int ) { x = 11; });
assert(*i == 11);
}

- Benjamin Bannier


On Aug. 24, 2018, 12:13 a.m., Meng Zhu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68490/
> ---
> 
> (Updated Aug. 24, 2018, 12:13 a.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-6765
> https://issues.apache.org/jira/browse/MESOS-6765
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This patch lets `class Resources` only store shared
> pointers to the underlying resource objects, so that
> read-only filter operations such as `reserved()`,
> `unreserved()` and etc. can avoid making copies of
> the whole resource objects. Instead, only shared pointers
> are copied.
> 
> In write operations, we check if there are more than one
> references to the resource object. If so, a copy is made
> for safe mutation without affecting owners.
> 
> To maintain the usage abstraction that `class Resources`
> still holds resource objects, we utilize
> `boost::indirect_iterator` iterator adapter to deference
> the shared pointers as we iterate.
> 
> 
> Diffs
> -
> 
>   include/mesos/resources.hpp 6f81b14f8bc090a144eeae8f15639c370366166d 
>   include/mesos/v1/resources.hpp 09110530da16678abf6bf6b308906dd8ccc8180a 
>   src/common/resources.cpp 0110c0ee3e810ad1c29dfa5507b13ebd5d0222a2 
>   src/v1/resources.cpp 228a7327ffe7934d37b56ee67b8be9ae1e119ca8 
> 
> 
> Diff: https://reviews.apache.org/r/68490/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> Did a quick test on Mac with an optimized build, running benchmark 
> `HierarchicalAllocator_BENCHMARK_Test.ResourceLabels`, here are the results 
> of comparing "before" and "after". Note, DVFS is not fixed. And we only did a 
> partial run to verify the validity of the patch, full evaluation coming soon.
> 
> **Overall, 33% performance improvement for the 1st steup (1000 agents and 1 
> frameworks) and 32% improvement for the first 50 iterarions of the 2nd setup 
> (1000 agents and 50 frameworks).**
> 
> Before:
> 
> [ RUN  ] 
> SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ResourceLabels/0
> 
> Using 1000 agents and 1 frameworks
> Added 1 frameworks in 1.220022ms
> Added 1000 agents in 465.045382ms
> round 0 allocate() took 54.803773ms to make 0 offers after filtering 1000 
> offers
> round 1 allocate() took 66.690456ms to make 0 offers after filtering 1000 
> offers
> 
> Using 1000 agents and 50 frameworks
> Added 50 frameworks in 1.40774ms
> Added 1000 agents in 460.562488ms
> round 0 allocate() took 155.500548ms 

Re: Review Request 68490: Optimized `class Resources` with copy-on-write.

2018-09-03 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68490/#review208286
---



Thanks! Can you also include some commentary on the rest of the allocation 
benchmarks as well as the overhead within the Resources micro-benchmarks?


include/mesos/resources.hpp
Lines 590-591 (patched)


```
  // We use `boost::indirect_iterator` to expose `Resource_` iteration,
  // while actually storing `shared_ptr`.
```



include/mesos/resources.hpp
Lines 592 (patched)


Remove the newline?



include/mesos/resources.hpp
Lines 588-590 (original), 597-599 (patched)


This comment now looks a little weird, how about:

```
  // NOTE: Non-const `begin()` and `end()` intentionally return const
  // iterators to prevent mutable access to the `Resource` objects.
```



include/mesos/resources.hpp
Lines 678-679 (patched)


What's this referring to? Reading this I see it already has an rvalue 
reference?



include/mesos/resources.hpp
Lines 682 (patched)


Why do we need this? It seems like the callers should just be 
de-referencing their shared_ptr prior to calling?



include/mesos/resources.hpp
Line 676 (original), 691-693 (patched)


A comment would be helpful for the reader:

```
// Resources are stored using copy-on-write:
//
//   (1) Copies are done by copying the `shared_ptr`. This
//   makes read-only filtering (e.g. `unreserved()`)
//   inexpensive as we do not have to perform copies
//   of the resource objects.
//
//   (2) When a write occurs:
//  (a) If there's a single reference to the resource
//  object, we mutate directly.
//  (b) If there's more than a single reference to the
//  resource object, we copy first, then mutate the
//  copy.
```



src/common/resources.cpp
Line 1471 (original), 1473 (patched)


I'm a little puzzled about these `const shared_ptr<...>` foreach loops. Why 
don't the const ones do `const Resource_&` loops?

```
foreach (const Resource_& resource_, that) {
```

Something I'm missing?



src/common/resources.cpp
Line 1503 (original), 1505 (patched)


It seems ok to slip this in here, but generally please send unrelated 
cleanups as separate patches. :)



src/common/resources.cpp
Line 1514 (original), 1516-1518 (patched)


I'm not sure how obvious these will be to readers, maybe a comment on all 
of these?

```
// Copy-on-write (if more than 1 reference).
if (resource_.use_count() > 1) {
  resource_ = make_shared(*resource_);
}
```

Maybe something like (since we can't use `mutable`):

```
make_mutable(resource_);
```

Maybe also make it composable?

```
make_mutable(resource_)->resource.mutable_allocation_info()->set_role(role);
```

Still, seems rather error prone and easy to forget to check for copies 
prior to mutating? Wonder if there's a way to make it less brittle.



src/common/resources.cpp
Lines 1543-1544 (patched)


Interesting suggestion!



src/common/resources.cpp
Lines 1667-1670 (patched)


Seems odd, why don't we use `Resource` directly here?

```
  Resource r = *resource_;
  r.clear_reservations();

  result.add(std::move(r));
```

(I left a comment above about removing the add that takes a shared_ptr, 
since it doesn't seem needed?)



src/common/resources.cpp
Line 1676 (original), 1692 (patched)


Feel free to pull some of these unrelated improvements out in front of 
this, it would help make the diff easier for reviewers.



src/common/resources.cpp
Lines 1924-1925 (original), 1940-1941 (patched)


Ditto here and elsewhere for the question above about const shared pointer 
loops vs `const Resource_&` loops



src/common/resources.cpp
Line 1954 (original), 1972-1979 (patched)


Why is this needed? Isn't the point of this change that toUnreserved avoids 
copying now?



src/common/resources.cpp
Lines 1958-1963 (original), 1983-1993 (patched)

Review Request 68490: Optimized `class Resources` with copy-on-write.

2018-08-23 Thread Meng Zhu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68490/
---

Review request for mesos and Benjamin Mahler.


Bugs: MESOS-6765
https://issues.apache.org/jira/browse/MESOS-6765


Repository: mesos


Description
---

This patch lets `class Resources` only store shared
pointers to the underlying resource objects, so that
read-only filter operations such as `reserved()`,
`unreserved()` and etc. can avoid making copies of
the whole resource objects. Instead, only shared pointers
are copied.

In write operations, we check if there are more than one
references to the resource object. If so, a copy is made
for safe mutation without affecting owners.

To maintain the usage abstraction that `class Resources`
still holds resource objects, we utilize
`boost::indirect_iterator` iterator adapter to deference
the shared pointers as we iterate.


Diffs
-

  include/mesos/resources.hpp 6f81b14f8bc090a144eeae8f15639c370366166d 
  include/mesos/v1/resources.hpp 09110530da16678abf6bf6b308906dd8ccc8180a 
  src/common/resources.cpp 0110c0ee3e810ad1c29dfa5507b13ebd5d0222a2 
  src/v1/resources.cpp 228a7327ffe7934d37b56ee67b8be9ae1e119ca8 


Diff: https://reviews.apache.org/r/68490/diff/1/


Testing
---

make check

Did a quick test on Mac with an optimized build, running benchmark 
`HierarchicalAllocator_BENCHMARK_Test.ResourceLabels`, here are the results of 
comparing "before" and "after". Note, DVFS is not fixed. And we only did a 
partial run to verify the validity of the patch, full evaluation coming soon.

**Overall, 33% performance improvement for the 1st steup (1000 agents and 1 
frameworks) and 32% improvement for the first 50 iterarions of the 2nd setup 
(1000 agents and 50 frameworks).**

Before:

[ RUN  ] 
SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.ResourceLabels/0

Using 1000 agents and 1 frameworks
Added 1 frameworks in 1.220022ms
Added 1000 agents in 465.045382ms
round 0 allocate() took 54.803773ms to make 0 offers after filtering 1000 offers
round 1 allocate() took 66.690456ms to make 0 offers after filtering 1000 offers

Using 1000 agents and 50 frameworks
Added 50 frameworks in 1.40774ms
Added 1000 agents in 460.562488ms
round 0 allocate() took 155.500548ms to make 1000 offers after filtering 1000 
offers
round 1 allocate() took 155.194986ms to make 1000 offers after filtering 2000 
offers
round 2 allocate() took 160.706712ms to make 1000 offers after filtering 3000 
offers
round 3 allocate() took 156.701883ms to make 1000 offers after filtering 4000 
offers
round 4 allocate() took 164.246987ms to make 1000 offers after filtering 5000 
offers
round 5 allocate() took 160.791809ms to make 1000 offers after filtering 6000 
offers
round 6 allocate() took 166.064652ms to make 1000 offers after filtering 7000 
offers
round 7 allocate() took 162.177213ms to make 1000 offers after filtering 8000 
offers
round 8 allocate() took 166.620163ms to make 1000 offers after filtering 9000 
offers
round 9 allocate() took 167.970045ms to make 1000 offers after filtering 1 
offers
round 10 allocate() took 166.189463ms to make 1000 offers after filtering 11000 
offers
round 11 allocate() took 170.863462ms to make 1000 offers after filtering 12000 
offers
round 12 allocate() took 172.260473ms to make 1000 offers after filtering 13000 
offers
round 13 allocate() took 170.553911ms to make 1000 offers after filtering 14000 
offers
round 14 allocate() took 235.764593ms to make 1000 offers after filtering 15000 
offers
round 15 allocate() took 247.250433ms to make 1000 offers after filtering 16000 
offers
round 16 allocate() took 276.932824ms to make 1000 offers after filtering 17000 
offers
round 17 allocate() took 248.888469ms to make 1000 offers after filtering 18000 
offers
round 18 allocate() took 193.890556ms to make 1000 offers after filtering 19000 
offers
round 19 allocate() took 195.105346ms to make 1000 offers after filtering 2 
offers
round 20 allocate() took 194.447428ms to make 1000 offers after filtering 21000 
offers
round 21 allocate() took 205.486287ms to make 1000 offers after filtering 22000 
offers
round 22 allocate() took 199.241922ms to make 1000 offers after filtering 23000 
offers
round 23 allocate() took 200.885488ms to make 1000 offers after filtering 24000 
offers
round 24 allocate() took 216.132361ms to make 1000 offers after filtering 25000 
offers
round 25 allocate() took 210.638273ms to make 1000 offers after filtering 26000 
offers
round 26 allocate() took 232.397778ms to make 1000 offers after filtering 27000 
offers
round 27 allocate() took 239.633708ms to make 1000 offers after filtering 28000 
offers
round 28 allocate() took 225.829677ms to make 1000 offers after filtering 29000 
offers
round 29 allocate() took 245.143272ms to make 1000 offers after filtering 3 
offers
round 30 allocate() took