#26035: Streamline sample quantile types used in the various modules
--------------------------------+------------------------------
 Reporter:  karsten             |          Owner:  karsten
     Type:  enhancement         |         Status:  needs_review
 Priority:  High                |      Milestone:
Component:  Metrics/Statistics  |        Version:
 Severity:  Normal              |     Resolution:
 Keywords:                      |  Actual Points:
Parent ID:                      |         Points:
 Reviewer:                      |        Sponsor:  Sponsor13
--------------------------------+------------------------------
Changes (by karsten):

 * status:  assigned => needs_review

Comment:

Replying to [comment:14 iwakeh]:
> Taking you up on your offer from comment:13, so I can concentrate on
> reviews and tickets of CollecTor.

Alright, happy to implement this change.

Please review [https://gitweb.torproject.org/karsten/metrics-web.git/log/?h=task-26035 my task-26035 branch] with three commits:

 - [https://gitweb.torproject.org/karsten/metrics-web.git/commit/?h=task-26035&id=4f92894a1ee5315b9e4a17b38f3cdb229612f0f1 4f92894]
   changes how we compute the median and inter-quartile range in the
   censorship detector code, which is still written in Python. I tested the
   change by running it on our user number estimates. It changes 159 of 2447
   days in our data (6.5%) and leaves the remaining days entirely unchanged.
   This makes sense: with a slightly different median and inter-quartile
   range, a value close to the boundary is either included or excluded as an
   outlier. I'd say we cannot conclude that one of the implementations is
   correct and the other is not. The new implementation will simply be more
   consistent throughout our code base.

 - [https://gitweb.torproject.org/karsten/metrics-web.git/commit/?h=task-26035&id=2685c78f13cbf9402d5ba0b4380df03f246e86e5 2685c78]
   makes the same change to our advertised bandwidth statistics. Obviously,
   this changes results a bit, because we're now interpolating between
   reported advertised bandwidths rather than returning a value that was
   actually reported by one of the relays. Still, for the sake of consistency
   throughout our code base, we should switch.

 - [https://gitweb.torproject.org/karsten/metrics-web.git/commit/?h=task-26035&id=f9c24cab1006bf5999c662e9d06767c59c71a3e6 f9c24ca]
   makes the third change in this series, this time to the connbidirect
   module. The change is quite significant in the years 2011 and 2012, when
   we had just a handful of relays reporting these statistics. With so few
   values, it does make a difference whether we interpolate or not. The same
   argument applies in favor of making the change now.
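
To illustrate why interpolating changes results, here is a minimal Python sketch. It is not the code from the commits above: the function names and the sample values are made up for illustration, and it assumes that "interpolating" means linear interpolation between neighboring order statistics (the definition often called type 7, which is the default in R and NumPy). The ticket text here does not name the exact quantile type, so treat that as an assumption.

```python
import math


def quantile_nearest_rank(values, p):
    """Return an actually observed value (nearest-rank, no interpolation)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("need at least one value")
    # Smallest rank whose cumulative share of the data is at least p.
    rank = max(1, math.ceil(p * len(ordered)))
    return ordered[rank - 1]


def quantile_interpolated(values, p):
    """Interpolate linearly between neighboring order statistics.

    Assumption: this mirrors the "type 7" definition; the real modules
    may use a different interpolating type.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("need at least one value")
    h = (len(ordered) - 1) * p          # fractional 0-based position
    lower = math.floor(h)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (h - lower) * (ordered[upper] - ordered[lower])


def iqr(quantile, values):
    """Inter-quartile range under the given quantile definition."""
    return quantile(values, 0.75) - quantile(values, 0.25)


# Hypothetical sample: four relays reporting a statistic.
sample = [10, 20, 30, 100]
```

With this sample, the nearest-rank median is 20 (a reported value) while the interpolated median is 25.0, and the inter-quartile range grows from 20 to 30.0. With only a handful of values the two definitions can differ noticeably, which matches the observation about 2011 and 2012 above.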

I'm currently re-processing the descriptor archive for updated advbwdist
statistics (second commit above). Re-doing the clients and connbidirect
statistics using the updated code is much simpler. I hope to be ready to
deploy the change in the next few days. Ideally, we'd be done with the
review process by then, too.

Thanks in advance!

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/26035#comment:15>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs