-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62956/#review187939
-----------------------------------------------------------




src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java
Lines 67-68 (patched)
<https://reviews.apache.org/r/62956/#comment265006>

    As far as I know, this will filter out this agent entirely for 30 days, 
which comes pretty close to leaking agents: 
https://github.com/apache/mesos/blob/2fe2bb26a425da9aaf1d7cf34019dd347d0cf9a4/src/master/allocator/mesos/hierarchical.cpp#L1207-L1209
    
    This implies the timeout would need to be significantly smaller (e.g. ~3 
minutes) and configurable for operators. At that point, I am no longer sure the 
optimization would help at Twitter-scale clusters.



src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java
Lines 220-224 (patched)
<https://reviews.apache.org/r/62956/#comment265005>

    This won't work for us.
    
    We are using both non-revocable and revocable (CPU & RAM) resources. It is 
crucial for us that we can still use revocable resources on an agent even if 
the non-revocable resources are maxed out. The same applies vice versa. 
    
    This pseudocode should solve it:
    ```
    bool lacksUsefulResources(offer):
        no_revocable = revocable_mem <= mem_threshold || revocable_cpu <= cpu_threshold
        no_non_revocable = mem <= mem_threshold || cpu <= cpu_threshold
        
        return no_revocable && no_non_revocable
    ```
    
    Would that still work for you? 
    
    (As a minor improvement of the heuristic, we could use the minimal executor 
resources as thresholds rather than 0.)
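
    For illustration, the check above could look roughly like this in Java. 
This is only a sketch: `OfferResources`, its fields, and the threshold 
constants are hypothetical names, not Aurora's actual types, and the thresholds 
of 0 are an assumption that could be swapped for the minimal executor resources.

```java
// Hypothetical holder for the resources in one offer (not an Aurora class).
final class OfferResources {
    final double cpus;            // non-revocable CPUs
    final double memMb;           // non-revocable RAM in MB
    final double revocableCpus;   // revocable CPUs
    final double revocableMemMb;  // revocable RAM in MB

    OfferResources(double cpus, double memMb, double revocableCpus, double revocableMemMb) {
        this.cpus = cpus;
        this.memMb = memMb;
        this.revocableCpus = revocableCpus;
        this.revocableMemMb = revocableMemMb;
    }
}

final class OfferFilterSketch {
    // Assumed thresholds; the minimal executor resources would be a tighter choice.
    static final double CPU_THRESHOLD = 0.0;
    static final double MEM_THRESHOLD_MB = 0.0;

    private OfferFilterSketch() {
    }

    // An offer is useless only if neither the revocable nor the
    // non-revocable pool has both CPU and RAM above the thresholds.
    static boolean lacksUsefulResources(OfferResources offer) {
        boolean noRevocable = offer.revocableMemMb <= MEM_THRESHOLD_MB
            || offer.revocableCpus <= CPU_THRESHOLD;
        boolean noNonRevocable = offer.memMb <= MEM_THRESHOLD_MB
            || offer.cpus <= CPU_THRESHOLD;
        return noRevocable && noNonRevocable;
    }
}
```

    With this shape, an offer that only carries revocable CPU and RAM is still 
kept, while an offer with, say, non-revocable CPU but no RAM of either kind is 
rejected.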


- Stephan Erb


On Oct. 13, 2017, 1:18 a.m., Bill Farner wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62956/
> -----------------------------------------------------------
> 
> (Updated Oct. 13, 2017, 1:18 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin and Jordan Ly.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> There's no reason for us to evaluate offers with no CPUs or memory, so reject 
> them early in the offer lifecycle.
> 
> This is an incremental performance optimization, but it may net significant 
> improvements based on observations in some very large clusters.
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/aurora/scheduler/http/Utilization.java 
> 3c77e2983ce00f897f3d5ed106b779cd7f7f0940 
>   src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java 
> e8334310a2a46a0ccb09ee6e4122c515892d3996 
>   
> src/main/java/org/apache/aurora/scheduler/preemptor/PreemptionVictimFilter.java
>  1b1239753f40d7d46d91724def6c25037eb79f1c 
>   src/main/java/org/apache/aurora/scheduler/resources/ResourceBag.java 
> d5db81b88a0369d0b26c8fbf70efab3886ad7695 
>   src/main/java/org/apache/aurora/scheduler/stats/TaskStatCalculator.java 
> b98aaaf48ae60afef19a368ee96abc897300f8fa 
>   src/test/java/org/apache/aurora/scheduler/offers/OfferManagerImplTest.java 
> 2cfdc090ff75a63111ae146c9fe7b3542e7ac83f 
>   src/test/java/org/apache/aurora/scheduler/offers/Offers.java 
> 129b4437315c6ad4ea47ca75d4ae6e28cadd7911 
>   src/test/java/org/apache/aurora/scheduler/resources/ResourceTestUtil.java 
> 765a527acb96997989c920be8b69dfa1113dc302 
> 
> 
> Diff: https://reviews.apache.org/r/62956/diff/2/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Bill Farner
> 
>
