[GitHub] [beam] robertwb commented on issue #6375: [BEAM-4858] Clean up division in batch size estimator.

GitHub Thu, 27 Sep 2018 08:01:47 -0700

You're right: a and b were switched in the error-term computation when I 
copied this to the PR. As a result, significantly more points were 
considered outliers (though enough were retained to typically give a 
reasonable regression). Unfortunately, even with this fix it is still pretty 
sensitive to multiple outliers...


I'm trying a simpler approach: just assume the points in the top quantile are 
outliers. We have enough data to make this pretty robust. Running experiments 
now. 
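
For illustration only, the trimming idea can be sketched roughly as follows. This is not the code in the PR: the function name, the `drop_fraction` parameter, and the choice to rank points by per-element cost (y / x) are all assumptions made for this example, and the fit is a plain least-squares line y = a + b*x over whatever points remain.

```python
def fit_without_top_quantile(xs, ys, drop_fraction=0.1):
    """Fit y = a + b*x by least squares after dropping presumed outliers.

    Hypothetical helper: points are ranked by y / x (assumed to be the
    per-element cost, so every x must be positive) and the top
    `drop_fraction` of them are treated as outliers and discarded.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    if any(x <= 0 for x in xs):
        raise ValueError("all xs must be positive")

    # Rank by per-element cost and drop the most expensive points.
    pairs = sorted(zip(xs, ys), key=lambda p: p[1] / p[0])
    n_drop = int(len(pairs) * drop_fraction)
    kept = pairs[:len(pairs) - n_drop]
    if len(kept) < 2:
        raise ValueError("need at least two points after trimming")

    # Ordinary least squares on the retained points.
    n = len(kept)
    mean_x = sum(x for x, _ in kept) / n
    mean_y = sum(y for _, y in kept) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in kept)
    if var_x == 0:
        # All retained xs are identical, so the slope is undetermined.
        return mean_y, 0.0
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in kept)
    b = cov_xy / var_x
    a = mean_y - b * mean_x
    return a, b


# Usage: points on y = 2 + 3x, with one wildly slow measurement at x = 5.
xs = list(range(1, 11))
ys = [2 + 3 * x for x in xs]
ys[4] = 1000
a, b = fit_without_top_quantile(xs, ys, drop_fraction=0.1)
```

Unlike the distance-based approach, this needs no per-point influence computation, at the cost of always discarding a fixed share of the data even when nothing is actually an outlier.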

(As for computing h, I used sagemath.)

[ Full content available at: https://github.com/apache/beam/pull/6375 ]