-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20398/#review40705
-----------------------------------------------------------



src/main/java/org/apache/aurora/scheduler/async/RescheduleCalculator.java
<https://reviews.apache.org/r/20398/#comment73815>

    Pull this part up to the previous line:
    
      ITaskEvent activeEvent = Iterables.find(
          arg1,
          arg2);



src/main/java/org/apache/aurora/scheduler/sla/MetricCalculator.java
<https://reviews.apache.org/r/20398/#comment73816>

    This is a lazy iterable - you probably want to call toList() at the end to 
avoid on-the-fly filtering each time you walk over it.



src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java
<https://reviews.apache.org/r/20398/#comment73821>

    Log message would be more useful with the task ID.



src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java
<https://reviews.apache.org/r/20398/#comment73834>

    Have you considered making MedianAlgorithm non-abstract, and have it accept 
a ScheduleStatus in the constructor?  That would eliminate the need for these 
two classes.



src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java
<https://reviews.apache.org/r/20398/#comment73835>

    Can you elaborate or clarify "relative to request time"?



src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java
<https://reviews.apache.org/r/20398/#comment73836>

    The decay comment threw me a bit - this might be improved by using 
consistent terminology "time frame".



src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java
<https://reviews.apache.org/r/20398/#comment73839>

    Can you add a comment explaining the rationale here?  I would have expected 
that this would only look at tasks currently in RUNNING.



src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java
<https://reviews.apache.org/r/20398/#comment73838>

    s/numElements/percentileIndex/?



src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java
<https://reviews.apache.org/r/20398/#comment73837>

    It would be good to extract a function to calculate the Nth percentile of a 
List of numbers.  As it stands, this is duplicated and also inconsistent with 
the median calculation (the other one averages when there's a non-even number 
of elements).
    
    You might even go so far to extract a PercentileAlgorithm, whose 
constructor accepts something like Predicate<IScheduledTask> and 
Function<IScheduledTask, Long>.  I'll leave that decision up to you, it might 
not make sense.



src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java
<https://reviews.apache.org/r/20398/#comment73845>

    The name and doc don't line up with the values returned.  You're returning 
a percentage [0, 100] (higher is better), but everything else suggests it's 
accumulated downtime (implying higher is worse).



src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java
<https://reviews.apache.org/r/20398/#comment73844>

    please doc



src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java
<https://reviews.apache.org/r/20398/#comment73840>

    Given that some functions deal with collections of events that pertain to 
different tasks, it would be helpful to document places like this where the 
events must be for a single task.



src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java
<https://reviews.apache.org/r/20398/#comment73829>

    Use ImmutableList.Builder, to make sure that an immutable list is returned.



src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java
<https://reviews.apache.org/r/20398/#comment73843>

    Please add comments in here, i suspect a newcomer would be pretty unclear 
on what's going on vs what's intended.



src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java
<https://reviews.apache.org/r/20398/#comment73825>

    Can you more thoroughly document the behavior here?



src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java
<https://reviews.apache.org/r/20398/#comment73826>

    In your comment, please be sure to explain why the events from multiple 
tasks are merged and sorted.  Also, please rename TASKS_TO_TASK_EVENTS to 
indicate the sorting behavior.



src/main/java/org/apache/aurora/scheduler/sla/SlaGroup.java
<https://reviews.apache.org/r/20398/#comment73846>

    Please share instance size definitions with SlotSizeCounter to prevent 
drift.



src/main/java/org/apache/aurora/scheduler/sla/SlaGroup.java
<https://reviews.apache.org/r/20398/#comment73847>

    These three classes seem to have a pretty big overlap.  Can you solve this 
with a single implementation that accepts:
    
      Map<String, Range<Double>>
      Function<IScheduledTask, Double>


- Bill Farner


On April 17, 2014, 11:39 p.m., Maxim Khutornenko wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/20398/
> -----------------------------------------------------------
> 
> (Updated April 17, 2014, 11:39 p.m.)
> 
> 
> Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman.
> 
> 
> Bugs: AURORA-293
>     https://issues.apache.org/jira/browse/AURORA-293
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> High level overview:
> - MetricCalculator runs periodically (every minute), pulls all task history, 
> packages it into SlaInstance list and updates stats;
> - Stat calculation is handled by a pair of: SlaAlgorithm and a set of 
> applicable SlaGroups (logical groupings by job, cluster, resource and etc.);
> - Stat name is generated by combining group and algorithm name parts.
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/aurora/scheduler/app/AppModule.java 
> eeafc784e915137cacd5f64df1252ccbaf6c0f6c 
>   src/main/java/org/apache/aurora/scheduler/async/RescheduleCalculator.java 
> bc77bf2e6fbc1ff4159d049b0c28afd6832499ef 
>   src/main/java/org/apache/aurora/scheduler/base/Tasks.java 
> fae2d237d18f945d4dd73ec56cd42981359bea46 
>   src/main/java/org/apache/aurora/scheduler/sla/MetricCalculator.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/sla/SlaGroup.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/sla/SlaModule.java PRE-CREATION 
>   src/test/java/org/apache/aurora/scheduler/sla/MetricCalculatorTest.java 
> PRE-CREATION 
>   src/test/java/org/apache/aurora/scheduler/sla/SlaAlgorithmTest.java 
> PRE-CREATION 
>   src/test/java/org/apache/aurora/scheduler/sla/SlaTestUtil.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/20398/diff/
> 
> 
> Testing
> -------
> 
> gradle build
> gradle run
> 
> Sample from local scheduler:
> 
> sla_cluster_mtta 3271.493
> sla_cluster_mttr 3273.497
> sla_cluster_platform_uptime_percent 100.0
> sla_cpu_small_mtta 3271.493
> sla_cpu_small_mttr 3273.497
> sla_disk_small_mtta 3271.493
> sla_disk_small_mttr 3273.497
> sla_mesos_test_serviceJob0_job_uptime_75_00_sec 0
> sla_mesos_test_serviceJob0_job_uptime_90_00_sec 0
> sla_mesos_test_serviceJob0_job_uptime_95_00_sec 0
> sla_mesos_test_serviceJob0_job_uptime_99_00_sec 0
> sla_mesos_test_serviceJob0_mtta 0.0
> sla_mesos_test_serviceJob0_mttr 0.0
> sla_mesos_test_serviceJob0_platform_uptime_percent 100.0
> sla_mesos_test_serviceJob10_job_uptime_75_00_sec 0
> sla_mesos_test_serviceJob10_job_uptime_90_00_sec 0
> sla_mesos_test_serviceJob10_job_uptime_95_00_sec 0
> sla_mesos_test_serviceJob10_job_uptime_99_00_sec 0
> sla_mesos_test_serviceJob10_mtta 0.0
> sla_mesos_test_serviceJob10_mttr 0.0
> sla_mesos_test_serviceJob10_platform_uptime_percent 100.0
> sla_mesos_test_serviceJob12_job_uptime_75_00_sec 0
> sla_mesos_test_serviceJob12_job_uptime_90_00_sec 0
> sla_mesos_test_serviceJob12_job_uptime_95_00_sec 0
> sla_mesos_test_serviceJob12_job_uptime_99_00_sec 0
> sla_mesos_test_serviceJob12_mtta 0.0
> sla_mesos_test_serviceJob12_mttr 0.0
> sla_mesos_test_serviceJob12_platform_uptime_percent 100.0
> sla_mesos_test_serviceJob14_job_uptime_75_00_sec 0
> sla_mesos_test_serviceJob14_job_uptime_90_00_sec 0
> sla_mesos_test_serviceJob14_job_uptime_95_00_sec 0
> sla_mesos_test_serviceJob14_job_uptime_99_00_sec 0
> sla_mesos_test_serviceJob14_mtta 0.0
> sla_mesos_test_serviceJob14_mttr 0.0
> sla_mesos_test_serviceJob14_platform_uptime_percent 100.0
> sla_mesos_test_serviceJob16_job_uptime_75_00_sec 0
> sla_mesos_test_serviceJob16_job_uptime_90_00_sec 0
> sla_mesos_test_serviceJob16_job_uptime_95_00_sec 0
> sla_mesos_test_serviceJob16_job_uptime_99_00_sec 0
> sla_mesos_test_serviceJob16_mtta 0.0
> sla_mesos_test_serviceJob16_mttr 0.0
> sla_mesos_test_serviceJob16_platform_uptime_percent 100.0
> sla_mesos_test_serviceJob18_job_uptime_75_00_sec 0
> sla_mesos_test_serviceJob18_job_uptime_90_00_sec 0
> sla_mesos_test_serviceJob18_job_uptime_95_00_sec 0
> sla_mesos_test_serviceJob18_job_uptime_99_00_sec 0
> sla_mesos_test_serviceJob18_mtta 0.0
> sla_mesos_test_serviceJob18_mttr 0.0
> sla_mesos_test_serviceJob18_platform_uptime_percent 100.0
> sla_mesos_test_serviceJob2_job_uptime_75_00_sec 0
> sla_mesos_test_serviceJob2_job_uptime_90_00_sec 0
> sla_mesos_test_serviceJob2_job_uptime_95_00_sec 0
> sla_mesos_test_serviceJob2_job_uptime_99_00_sec 0
> sla_mesos_test_serviceJob2_mtta 0.0
> sla_mesos_test_serviceJob2_mttr 0.0
> sla_mesos_test_serviceJob2_platform_uptime_percent 100.0
> sla_mesos_test_serviceJob4_job_uptime_75_00_sec 0
> sla_mesos_test_serviceJob4_job_uptime_90_00_sec 0
> sla_mesos_test_serviceJob4_job_uptime_95_00_sec 0
> sla_mesos_test_serviceJob4_job_uptime_99_00_sec 0
> sla_mesos_test_serviceJob4_mtta 0.0
> sla_mesos_test_serviceJob4_mttr 0.0
> sla_mesos_test_serviceJob4_platform_uptime_percent 100.0
> sla_mesos_test_serviceJob6_job_uptime_75_00_sec 7125
> sla_mesos_test_serviceJob6_job_uptime_90_00_sec 7125
> sla_mesos_test_serviceJob6_job_uptime_95_00_sec 7125
> sla_mesos_test_serviceJob6_job_uptime_99_00_sec 7125
> sla_mesos_test_serviceJob6_mtta 3271.493
> sla_mesos_test_serviceJob6_mttr 3273.497
> sla_mesos_test_serviceJob6_platform_uptime_percent 100.0
> sla_mesos_test_serviceJob8_job_uptime_75_00_sec 0
> sla_mesos_test_serviceJob8_job_uptime_90_00_sec 0
> sla_mesos_test_serviceJob8_job_uptime_95_00_sec 0
> sla_mesos_test_serviceJob8_job_uptime_99_00_sec 0
> sla_mesos_test_serviceJob8_mtta 0.0
> sla_mesos_test_serviceJob8_mttr 0.0
> sla_mesos_test_serviceJob8_platform_uptime_percent 100.0
> sla_ram_small_mtta 3271.493
> sla_ram_small_mttr 3273.497
> 
> 
> File Attachments
> ----------------
> 
> Coverage report
>   
> https://reviews.apache.org/media/uploaded/files/2014/04/16/ffe00c63-bb3a-4b90-95f8-f23878b0fdab__SLA_coverage_report.png
> 
> 
> Thanks,
> 
> Maxim Khutornenko
> 
>

Reply via email to