Github user davies commented on the pull request:

    https://github.com/apache/spark/pull/11018#issuecomment-178783672
  
    @mengxr Thanks for the details, that makes sense.
    
    Ran a few tests; here is the distribution of W (with outliers removed):
    
    
![image](https://cloud.githubusercontent.com/assets/40902/12761087/77cc3f32-c99e-11e5-8c37-f3f84c98e1d0.png)
    
    Because the number we care about most is `(a + W) / (b + W)`, the result becomes more sensitive to W, especially when b is small.
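    For intuition, here is a tiny sketch (the values a = 10, b = 1 are made up, not taken from the benchmark) of how much the ratio swings with W when b is small:

    ```scala
    // Made-up numbers: a = time of the slow side, b = time of the fast side,
    // W = per-run overhead/noise. With b small, the ratio moves a lot with W.
    val a = 10.0
    val b = 1.0
    for (w <- Seq(0.1, 0.5, 1.0)) {
      println(f"W = $w%.1f -> (a + W) / (b + W) = ${(a + w) / (b + w)}%.2f")
    }
    // W = 0.1 -> 9.18, W = 0.5 -> 7.00, W = 1.0 -> 5.50
    ```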
    
    Ran a few tests on this particular case; the relative rates of the first benchmark (range/filter/) are listed here (this is the number we care about most):
    
    ![image 1](https://cloud.githubusercontent.com/assets/40902/12762067/61084486-c9a2-11e5-95c3-2f1d3b295164.png)
    
    It seems that the best time and the median time are much better than the mean time: the variance of the best time (0.21) is a little better than that of the median time (0.33), while the variance of the mean time is 2.4.
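    A minimal sketch of the three aggregations, assuming we only have the per-run wall-clock times (the helper below is illustrative, not the existing `Benchmark` code):

    ```scala
    // Illustrative helper, not part of Spark's Benchmark class.
    def summarize(timesMs: Seq[Double]): (Double, Double, Double) = {
      require(timesMs.nonEmpty, "need at least one run")
      val sorted = timesMs.sorted
      val best = sorted.head                 // minimum time: least affected by W
      val median = sorted(sorted.size / 2)   // upper median for an even number of runs
      val mean = timesMs.sum / timesMs.size  // mean: absorbs all the noise in W
      (best, median, mean)
    }
    ```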
    
    I think we should go with the best time or the median time. cc @rxin @nongli
    


