Re: Review Request 18157: Compute task host attribute aggregates once when scheduling tasks.

2014-02-19 Thread Bill Farner

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18157/
---

(Updated Feb. 19, 2014, 9:27 p.m.)


Review request for Aurora, Suman Karumuri and Maxim Khutornenko.


Changes
---

Reopened to attach bug.


Bugs: AURORA-117
https://issues.apache.org/jira/browse/AURORA-117


Repository: aurora


Description
---

Addresses TODO from CachedJobState:
  33  * TODO(wfarner): Take this caching one step further and don't store tasks 
here, but instead store
  34  * pre-computed aggregates of host attributes.  For example, translate 
each IScheduledTask into
  35  * the HostAttributes of the machine it resides on, and count the number 
of times each attribute is
  36  * seen.  This could be represented in a Map>, where
  37  * the outer key is the attribute name, and the inner key is attribute 
values.  So, for a job with
  38  * two tasks on the same rack but different hosts, you could have this 
aggregate:
  39  * 
  40  * { "host": {"hostA": 1, "hostB": 1},
  41  *   "rack": {"rack1": 2}
  42  * }
  43  * 

The bulk of this change is with renaming/rewriting CachedJobState (now called 
AttributeAggregate) and consuming from AttributeFilter.  The rest is ripple in 
wiring.

I expect this will make scheduling tasks in large jobs less expensive, since we 
would previously iterate through all tasks multiple times for each offer.


Diffs
-

  src/main/java/org/apache/aurora/scheduler/async/AsyncModule.java 
e7a5a8377f4c8a42012e7ecb118c597a2804da40 
  src/main/java/org/apache/aurora/scheduler/async/Preemptor.java 
7893e55f83c3285b37a4e20e88a7bd31728253f3 
  src/main/java/org/apache/aurora/scheduler/async/TaskScheduler.java 
61385c6bbcddc6fd03498190708c9e49892d0859 
  
src/main/java/org/apache/aurora/scheduler/events/NotifyingSchedulingFilter.java 
d2be7ace46ca59612368ef70239877a4a1ee72fb 
  src/main/java/org/apache/aurora/scheduler/filter/AttributeAggregate.java 
PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/filter/AttributeFilter.java 
3e74135b1dcff2c42bd10b558e3699e70fd7c6be 
  src/main/java/org/apache/aurora/scheduler/filter/CachedJobState.java 
337a5e7c045b3338056d17541d6aec0b2d84c9cb 
  src/main/java/org/apache/aurora/scheduler/filter/ConstraintFilter.java 
c2b9f047b199aa24849c3a9eb1d8176f13fade07 
  src/main/java/org/apache/aurora/scheduler/filter/SchedulingFilter.java 
ddb34cfc2c476ad6266e2d04a626c391f33d2592 
  src/main/java/org/apache/aurora/scheduler/filter/SchedulingFilterImpl.java 
93355c3e9a38bb581d9080a492ec786509158d96 
  src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java 
3154ef04e211e0078144e14091e0ceb1470231b1 
  src/test/java/org/apache/aurora/scheduler/async/PreemptorImplTest.java 
dc371e0b0ef66c254c6db5b274ddf81ccfbd9ff3 
  src/test/java/org/apache/aurora/scheduler/async/TaskSchedulerImplTest.java 
94647832c01cbf0b2d8d44f7e40fb1b5cc68fd91 
  src/test/java/org/apache/aurora/scheduler/async/TaskSchedulerTest.java 
4112d6133ad26344696237c180f7cc045da4f4f2 
  
src/test/java/org/apache/aurora/scheduler/events/NotifyingSchedulingFilterTest.java
 4ac059cf0eab2370938dc388dbfd86ffb086a248 
  src/test/java/org/apache/aurora/scheduler/filter/AttributeAggregateTest.java 
PRE-CREATION 
  
src/test/java/org/apache/aurora/scheduler/filter/SchedulingFilterImplTest.java 
56de3aae4e08138a66a587e2a43d64a2310079f1 
  src/test/java/org/apache/aurora/scheduler/state/TaskAssignerImplTest.java 
f0fdd571f273be8dad1f9410b0e70e3a11e8133b 

Diff: https://reviews.apache.org/r/18157/diff/


Testing
---

./gradlew build


Thanks,

Bill Farner



Re: Review Request 18157: Compute task host attribute aggregates once when scheduling tasks.

2014-02-19 Thread Suman Karumuri

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18157/#review34897
---

Ship it!


Ship It!

- Suman Karumuri


On Feb. 18, 2014, 7:26 p.m., Bill Farner wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/18157/
> ---
> 
> (Updated Feb. 18, 2014, 7:26 p.m.)
> 
> 
> Review request for Aurora, Suman Karumuri and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Addresses TODO from CachedJobState:
>   33  * TODO(wfarner): Take this caching one step further and don't store 
> tasks here, but instead store
>   34  * pre-computed aggregates of host attributes.  For example, translate 
> each IScheduledTask into
>   35  * the HostAttributes of the machine it resides on, and count the number 
> of times each attribute is
>   36  * seen.  This could be represented in a Map Multiset>, where
>   37  * the outer key is the attribute name, and the inner key is attribute 
> values.  So, for a job with
>   38  * two tasks on the same rack but different hosts, you could have this 
> aggregate:
>   39  * 
>   40  * { "host": {"hostA": 1, "hostB": 1},
>   41  *   "rack": {"rack1": 2}
>   42  * }
>   43  * 
> 
> The bulk of this change is with renaming/rewriting CachedJobState (now called 
> AttributeAggregate) and consuming from AttributeFilter.  The rest is ripple 
> in wiring.
> 
> I expect this will make scheduling tasks in large jobs less expensive, since 
> we would previously iterate through all tasks multiple times for each offer.
> 
> 
> Diffs
> -
> 
>   src/main/java/org/apache/aurora/scheduler/async/AsyncModule.java 
> e7a5a8377f4c8a42012e7ecb118c597a2804da40 
>   src/main/java/org/apache/aurora/scheduler/async/Preemptor.java 
> 7893e55f83c3285b37a4e20e88a7bd31728253f3 
>   src/main/java/org/apache/aurora/scheduler/async/TaskScheduler.java 
> 61385c6bbcddc6fd03498190708c9e49892d0859 
>   
> src/main/java/org/apache/aurora/scheduler/events/NotifyingSchedulingFilter.java
>  d2be7ace46ca59612368ef70239877a4a1ee72fb 
>   src/main/java/org/apache/aurora/scheduler/filter/AttributeAggregate.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/filter/AttributeFilter.java 
> 3e74135b1dcff2c42bd10b558e3699e70fd7c6be 
>   src/main/java/org/apache/aurora/scheduler/filter/CachedJobState.java 
> 337a5e7c045b3338056d17541d6aec0b2d84c9cb 
>   src/main/java/org/apache/aurora/scheduler/filter/ConstraintFilter.java 
> c2b9f047b199aa24849c3a9eb1d8176f13fade07 
>   src/main/java/org/apache/aurora/scheduler/filter/SchedulingFilter.java 
> ddb34cfc2c476ad6266e2d04a626c391f33d2592 
>   src/main/java/org/apache/aurora/scheduler/filter/SchedulingFilterImpl.java 
> 93355c3e9a38bb581d9080a492ec786509158d96 
>   src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java 
> 3154ef04e211e0078144e14091e0ceb1470231b1 
>   src/test/java/org/apache/aurora/scheduler/async/PreemptorImplTest.java 
> dc371e0b0ef66c254c6db5b274ddf81ccfbd9ff3 
>   src/test/java/org/apache/aurora/scheduler/async/TaskSchedulerImplTest.java 
> 94647832c01cbf0b2d8d44f7e40fb1b5cc68fd91 
>   src/test/java/org/apache/aurora/scheduler/async/TaskSchedulerTest.java 
> 4112d6133ad26344696237c180f7cc045da4f4f2 
>   
> src/test/java/org/apache/aurora/scheduler/events/NotifyingSchedulingFilterTest.java
>  4ac059cf0eab2370938dc388dbfd86ffb086a248 
>   
> src/test/java/org/apache/aurora/scheduler/filter/AttributeAggregateTest.java 
> PRE-CREATION 
>   
> src/test/java/org/apache/aurora/scheduler/filter/SchedulingFilterImplTest.java
>  56de3aae4e08138a66a587e2a43d64a2310079f1 
>   src/test/java/org/apache/aurora/scheduler/state/TaskAssignerImplTest.java 
> f0fdd571f273be8dad1f9410b0e70e3a11e8133b 
> 
> Diff: https://reviews.apache.org/r/18157/diff/
> 
> 
> Testing
> ---
> 
> ./gradlew build
> 
> 
> Thanks,
> 
> Bill Farner
> 
>



Re: Review Request 18157: Compute task host attribute aggregates once when scheduling tasks.

2014-02-18 Thread Bill Farner

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18157/
---

(Updated Feb. 18, 2014, 7:26 p.m.)


Review request for Aurora, Suman Karumuri and Maxim Khutornenko.


Repository: aurora


Description
---

Addresses TODO from CachedJobState:
  33  * TODO(wfarner): Take this caching one step further and don't store tasks 
here, but instead store
  34  * pre-computed aggregates of host attributes.  For example, translate 
each IScheduledTask into
  35  * the HostAttributes of the machine it resides on, and count the number 
of times each attribute is
  36  * seen.  This could be represented in a Map>, where
  37  * the outer key is the attribute name, and the inner key is attribute 
values.  So, for a job with
  38  * two tasks on the same rack but different hosts, you could have this 
aggregate:
  39  * 
  40  * { "host": {"hostA": 1, "hostB": 1},
  41  *   "rack": {"rack1": 2}
  42  * }
  43  * 

The bulk of this change is with renaming/rewriting CachedJobState (now called 
AttributeAggregate) and consuming from AttributeFilter.  The rest is ripple in 
wiring.

I expect this will make scheduling tasks in large jobs less expensive, since we 
would previously iterate through all tasks multiple times for each offer.


Diffs (updated)
-

  src/main/java/org/apache/aurora/scheduler/async/AsyncModule.java 
e7a5a8377f4c8a42012e7ecb118c597a2804da40 
  src/main/java/org/apache/aurora/scheduler/async/Preemptor.java 
7893e55f83c3285b37a4e20e88a7bd31728253f3 
  src/main/java/org/apache/aurora/scheduler/async/TaskScheduler.java 
61385c6bbcddc6fd03498190708c9e49892d0859 
  
src/main/java/org/apache/aurora/scheduler/events/NotifyingSchedulingFilter.java 
d2be7ace46ca59612368ef70239877a4a1ee72fb 
  src/main/java/org/apache/aurora/scheduler/filter/AttributeAggregate.java 
PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/filter/AttributeFilter.java 
3e74135b1dcff2c42bd10b558e3699e70fd7c6be 
  src/main/java/org/apache/aurora/scheduler/filter/CachedJobState.java 
337a5e7c045b3338056d17541d6aec0b2d84c9cb 
  src/main/java/org/apache/aurora/scheduler/filter/ConstraintFilter.java 
c2b9f047b199aa24849c3a9eb1d8176f13fade07 
  src/main/java/org/apache/aurora/scheduler/filter/SchedulingFilter.java 
ddb34cfc2c476ad6266e2d04a626c391f33d2592 
  src/main/java/org/apache/aurora/scheduler/filter/SchedulingFilterImpl.java 
93355c3e9a38bb581d9080a492ec786509158d96 
  src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java 
3154ef04e211e0078144e14091e0ceb1470231b1 
  src/test/java/org/apache/aurora/scheduler/async/PreemptorImplTest.java 
dc371e0b0ef66c254c6db5b274ddf81ccfbd9ff3 
  src/test/java/org/apache/aurora/scheduler/async/TaskSchedulerImplTest.java 
94647832c01cbf0b2d8d44f7e40fb1b5cc68fd91 
  src/test/java/org/apache/aurora/scheduler/async/TaskSchedulerTest.java 
4112d6133ad26344696237c180f7cc045da4f4f2 
  
src/test/java/org/apache/aurora/scheduler/events/NotifyingSchedulingFilterTest.java
 4ac059cf0eab2370938dc388dbfd86ffb086a248 
  src/test/java/org/apache/aurora/scheduler/filter/AttributeAggregateTest.java 
PRE-CREATION 
  
src/test/java/org/apache/aurora/scheduler/filter/SchedulingFilterImplTest.java 
56de3aae4e08138a66a587e2a43d64a2310079f1 
  src/test/java/org/apache/aurora/scheduler/state/TaskAssignerImplTest.java 
f0fdd571f273be8dad1f9410b0e70e3a11e8133b 

Diff: https://reviews.apache.org/r/18157/diff/


Testing
---

./gradlew build


Thanks,

Bill Farner



Re: Review Request 18157: Compute task host attribute aggregates once when scheduling tasks.

2014-02-18 Thread Bill Farner


> On Feb. 15, 2014, 1:11 a.m., Maxim Khutornenko wrote:
> > src/main/java/org/apache/aurora/scheduler/filter/AttributeAggregate.java, 
> > line 38
> > 
> >
> > Double space.

(insert snarky comment about how double space after periods is superior)

Removed, here and below.


> On Feb. 15, 2014, 1:11 a.m., Maxim Khutornenko wrote:
> > src/main/java/org/apache/aurora/scheduler/filter/AttributeAggregate.java, 
> > line 48
> > 
> >
> > Having a comment deciphering key/value here would be helpful.

Good call, done.


> On Feb. 15, 2014, 1:11 a.m., Maxim Khutornenko wrote:
> > src/main/java/org/apache/aurora/scheduler/filter/AttributeAggregate.java, 
> > line 51
> > 
> >
> > Typo.

Fixed.


> On Feb. 15, 2014, 1:11 a.m., Maxim Khutornenko wrote:
> > src/main/java/org/apache/aurora/scheduler/filter/AttributeAggregate.java, 
> > line 66
> > 
> >
> > Would it make sense to move getHostAttributes() outside of the Supplier 
> > initialization? I find it hard to follow the logic as it is.

Sure, done.


- Bill


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18157/#review34561
---


On Feb. 15, 2014, 12:21 a.m., Bill Farner wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/18157/
> ---
> 
> (Updated Feb. 15, 2014, 12:21 a.m.)
> 
> 
> Review request for Aurora, Suman Karumuri and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Addresses TODO from CachedJobState:
>   33  * TODO(wfarner): Take this caching one step further and don't store 
> tasks here, but instead store
>   34  * pre-computed aggregates of host attributes.  For example, translate 
> each IScheduledTask into
>   35  * the HostAttributes of the machine it resides on, and count the number 
> of times each attribute is
>   36  * seen.  This could be represented in a Map Multiset>, where
>   37  * the outer key is the attribute name, and the inner key is attribute 
> values.  So, for a job with
>   38  * two tasks on the same rack but different hosts, you could have this 
> aggregate:
>   39  * 
>   40  * { "host": {"hostA": 1, "hostB": 1},
>   41  *   "rack": {"rack1": 2}
>   42  * }
>   43  * 
> 
> The bulk of this change is with renaming/rewriting CachedJobState (now called 
> AttributeAggregate) and consuming from AttributeFilter.  The rest is ripple 
> in wiring.
> 
> I expect this will make scheduling tasks in large jobs less expensive, since 
> we would previously iterate through all tasks multiple times for each offer.
> 
> 
> Diffs
> -
> 
>   src/main/java/org/apache/aurora/scheduler/async/AsyncModule.java 
> e7a5a8377f4c8a42012e7ecb118c597a2804da40 
>   src/main/java/org/apache/aurora/scheduler/async/Preemptor.java 
> 7893e55f83c3285b37a4e20e88a7bd31728253f3 
>   src/main/java/org/apache/aurora/scheduler/async/TaskScheduler.java 
> 61385c6bbcddc6fd03498190708c9e49892d0859 
>   
> src/main/java/org/apache/aurora/scheduler/events/NotifyingSchedulingFilter.java
>  d2be7ace46ca59612368ef70239877a4a1ee72fb 
>   src/main/java/org/apache/aurora/scheduler/filter/AttributeAggregate.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/filter/AttributeFilter.java 
> 3e74135b1dcff2c42bd10b558e3699e70fd7c6be 
>   src/main/java/org/apache/aurora/scheduler/filter/CachedJobState.java 
> 337a5e7c045b3338056d17541d6aec0b2d84c9cb 
>   src/main/java/org/apache/aurora/scheduler/filter/ConstraintFilter.java 
> c2b9f047b199aa24849c3a9eb1d8176f13fade07 
>   src/main/java/org/apache/aurora/scheduler/filter/SchedulingFilter.java 
> ddb34cfc2c476ad6266e2d04a626c391f33d2592 
>   src/main/java/org/apache/aurora/scheduler/filter/SchedulingFilterImpl.java 
> 93355c3e9a38bb581d9080a492ec786509158d96 
>   src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java 
> 3154ef04e211e0078144e14091e0ceb1470231b1 
>   src/test/java/org/apache/aurora/scheduler/async/PreemptorImplTest.java 
> dc371e0b0ef66c254c6db5b274ddf81ccfbd9ff3 
>   src/test/java/org/apache/aurora/scheduler/async/TaskSchedulerImplTest.java 
> 94647832c01cbf0b2d8d44f7e40fb1b5cc68fd91 
>   src/test/java/org/apache/aurora/scheduler/async/TaskSchedulerTest.java 
> 4112d6133ad26344696237c180f7cc045da4f4f2 
>   
> src/test/java/org/apache/aurora/scheduler/events/NotifyingSchedulingFilterTest.java
>  4ac059cf0eab2370938dc388dbfd86ffb086a248 
>   
> src/test/java/org/apache/aurora/scheduler/filter/AttributeAggregateTest.java 
> PRE-CREATION 

Re: Review Request 18157: Compute task host attribute aggregates once when scheduling tasks.

2014-02-14 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18157/#review34561
---

Ship it!



src/main/java/org/apache/aurora/scheduler/filter/AttributeAggregate.java


Double space.



src/main/java/org/apache/aurora/scheduler/filter/AttributeAggregate.java


Having a comment deciphering key/value here would be helpful.



src/main/java/org/apache/aurora/scheduler/filter/AttributeAggregate.java


Typo.



src/main/java/org/apache/aurora/scheduler/filter/AttributeAggregate.java


Would it make sense to move getHostAttributes() outside of the Supplier 
initialization? I find it hard to follow the logic as it is.


- Maxim Khutornenko


On Feb. 15, 2014, 12:21 a.m., Bill Farner wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/18157/
> ---
> 
> (Updated Feb. 15, 2014, 12:21 a.m.)
> 
> 
> Review request for Aurora, Suman Karumuri and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Addresses TODO from CachedJobState:
>   33  * TODO(wfarner): Take this caching one step further and don't store 
> tasks here, but instead store
>   34  * pre-computed aggregates of host attributes.  For example, translate 
> each IScheduledTask into
>   35  * the HostAttributes of the machine it resides on, and count the number 
> of times each attribute is
>   36  * seen.  This could be represented in a Map Multiset>, where
>   37  * the outer key is the attribute name, and the inner key is attribute 
> values.  So, for a job with
>   38  * two tasks on the same rack but different hosts, you could have this 
> aggregate:
>   39  * 
>   40  * { "host": {"hostA": 1, "hostB": 1},
>   41  *   "rack": {"rack1": 2}
>   42  * }
>   43  * 
> 
> The bulk of this change is with renaming/rewriting CachedJobState (now called 
> AttributeAggregate) and consuming from AttributeFilter.  The rest is ripple 
> in wiring.
> 
> I expect this will make scheduling tasks in large jobs less expensive, since 
> we would previously iterate through all tasks multiple times for each offer.
> 
> 
> Diffs
> -
> 
>   src/main/java/org/apache/aurora/scheduler/async/AsyncModule.java 
> e7a5a8377f4c8a42012e7ecb118c597a2804da40 
>   src/main/java/org/apache/aurora/scheduler/async/Preemptor.java 
> 7893e55f83c3285b37a4e20e88a7bd31728253f3 
>   src/main/java/org/apache/aurora/scheduler/async/TaskScheduler.java 
> 61385c6bbcddc6fd03498190708c9e49892d0859 
>   
> src/main/java/org/apache/aurora/scheduler/events/NotifyingSchedulingFilter.java
>  d2be7ace46ca59612368ef70239877a4a1ee72fb 
>   src/main/java/org/apache/aurora/scheduler/filter/AttributeAggregate.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/filter/AttributeFilter.java 
> 3e74135b1dcff2c42bd10b558e3699e70fd7c6be 
>   src/main/java/org/apache/aurora/scheduler/filter/CachedJobState.java 
> 337a5e7c045b3338056d17541d6aec0b2d84c9cb 
>   src/main/java/org/apache/aurora/scheduler/filter/ConstraintFilter.java 
> c2b9f047b199aa24849c3a9eb1d8176f13fade07 
>   src/main/java/org/apache/aurora/scheduler/filter/SchedulingFilter.java 
> ddb34cfc2c476ad6266e2d04a626c391f33d2592 
>   src/main/java/org/apache/aurora/scheduler/filter/SchedulingFilterImpl.java 
> 93355c3e9a38bb581d9080a492ec786509158d96 
>   src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java 
> 3154ef04e211e0078144e14091e0ceb1470231b1 
>   src/test/java/org/apache/aurora/scheduler/async/PreemptorImplTest.java 
> dc371e0b0ef66c254c6db5b274ddf81ccfbd9ff3 
>   src/test/java/org/apache/aurora/scheduler/async/TaskSchedulerImplTest.java 
> 94647832c01cbf0b2d8d44f7e40fb1b5cc68fd91 
>   src/test/java/org/apache/aurora/scheduler/async/TaskSchedulerTest.java 
> 4112d6133ad26344696237c180f7cc045da4f4f2 
>   
> src/test/java/org/apache/aurora/scheduler/events/NotifyingSchedulingFilterTest.java
>  4ac059cf0eab2370938dc388dbfd86ffb086a248 
>   
> src/test/java/org/apache/aurora/scheduler/filter/AttributeAggregateTest.java 
> PRE-CREATION 
>   
> src/test/java/org/apache/aurora/scheduler/filter/SchedulingFilterImplTest.java
>  56de3aae4e08138a66a587e2a43d64a2310079f1 
>   src/test/java/org/apache/aurora/scheduler/state/TaskAssignerImplTest.java 
> f0fdd571f273be8dad1f9410b0e70e3a11e8133b 
> 
> Diff: https://reviews.apache.org/r/18157/diff/
> 
> 
> Testing
> ---
> 
> ./gradlew build
> 
> 
> Thanks,
> 
> Bill Farner
> 
>