Re: [Wikitech-l] [RFC] performance standards for new mediawiki features
On Thu, Mar 21, 2013 at 10:55 PM, Yuri Astrakhan <yastrak...@wikimedia.org> wrote:

> The API is fairly complex to measure and to set performance targets for. If a
> bot requests 5000 pages in one call, together with all links and categories,
> it might take a very long time (seconds, if not tens of seconds). Comparing
> that to another API request that gets an HTML section of a page, which takes
> a fraction of a second (especially when coming from cache), is not very
> useful.

This is true, and I think we'd want to look at a metric like 99th percentile
latency. There's room for corner cases taking much longer, but they really have
to be corner cases. Standards also have to be flexible, with different
acceptable ranges for different uses. Yet if 30% of requests for an API method
to fetch pages took tens of seconds, we'd likely have to disable it entirely
until its use, or the number of pages per request, could be limited.
Re: [Wikitech-l] [RFC] performance standards for new mediawiki features
Asher, I don't know the actual perf statistics just yet. With the API this has
to be a balance - I would rather have fewer, slower calls than tons of very
fast ones, as the latter consume much more bandwidth and resources (consider
getting all items one item at a time: each request is very quick, but the
whole operation is very inefficient).

On Tue, Mar 26, 2013 at 2:57 PM, Asher Feldman <afeld...@wikimedia.org> wrote:

> This is true, and I think we'd want to look at a metric like 99th percentile
> latency. There's room for corner cases taking much longer, but they really
> have to be corner cases. Standards also have to be flexible, with different
> acceptable ranges for different uses. Yet if 30% of requests for an API
> method to fetch pages took tens of seconds, we'd likely have to disable it
> entirely until its use, or the number of pages per request, could be limited.
Re: [Wikitech-l] [RFC] performance standards for new mediawiki features
These are all good points, and we certainly do need better tooling for
individual developers. There are a lot of things a developer can do on just a
laptop in terms of profiling code that, if done consistently, could go a long
way, even without the environment looking anything like production. Things like
understanding whether algorithms or queries are O(n) or O(2^n), etc., and
thinking about the potential size of the relevant production data set, might be
more useful at that stage than raw numbers. When it comes to gathering numbers
in such an environment, it would be helpful if either the mediawiki profiler
could gain an easy visualization interface appropriate for such environments,
or if we standardized around something like xdebug.

The beta cluster has some potential as a performance test bed if only it could
gain a guarantee that the compute nodes it runs on aren't oversubscribed, or
that the beta virts were otherwise consistently resourced. By running a set of
performance benchmarks against beta and production, we may be able to gain
insight into how new features are likely to perform.

Beyond due diligence while architecting and implementing a feature, I'm
actually a proponent of testing in production, albeit in limited ways. Not as
with test.wikipedia.org, which ran on the production cluster, but by deploying
a feature to 5% of enwiki users, or 10% of pages, or 20% of editors. Once
something is deployed like that, we do indeed have tooling available to gather
hard performance metrics of the sort I proposed, though it can always be
improved upon.

It became apparent that ArticleFeedbackV5 had severe scaling issues after being
enabled on 10% of the articles on enwiki. For that example, I think it could
have been caught in an architecture review, or in local testing by the
developers, that issuing 17 database write statements per submission of an
anonymous text box that would go at the bottom of every wikipedia article was a
bad idea. But it's really great that it was incrementally deployed and we could
halt its progress before the resulting issues got too serious. That rollout
methodology should be considered a great success. If it can become the norm,
perhaps it won't be difficult to get to the point where we can have actionable
performance standards for new features, via a process that actually encourages
getting features into production instead of being a complicated roadblock.

On Fri, Mar 22, 2013 at 1:20 PM, Arthur Richards <aricha...@wikimedia.org> wrote:

> Right now, I think many of us profile locally or in VMs, which can be useful
> for relative metrics or quickly identifying bottlenecks, but doesn't really
> get us the kind of information you're talking about from any sort of
> real-world setting, or in any way that would be consistent from engineer to
> engineer, or even necessarily from day to day. From network topology to
> article counts/sizes/etc. and everything in between, there's a lot we can't
> really replicate or accurately profile against. Are there plans to put
> together and support infrastructure for this? It seems to me that this
> proposal is contingent upon a consistent environment accessible by engineers
> for performance testing.
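A note on the incremental rollout Asher describes: deploying to 5% of enwiki
users or 10% of pages is usually done with deterministic bucketing, so the same
user or page always lands in the same cohort and the resulting metrics stay
comparable across requests. A minimal sketch of that idea, assuming a stable
identifier such as a user ID or page ID is available (the feature name, hashing
scheme, and percentages below are illustrative, not how any particular
extension actually did it):

    import hashlib

    def in_rollout(identifier: str, feature: str, percent: float) -> bool:
        """Deterministically place an identifier into a 0-99.99 bucket and
        enable the feature if the bucket falls below the rollout percentage."""
        # Hash feature name + identifier so different features get independent buckets.
        digest = hashlib.md5(f"{feature}:{identifier}".encode("utf-8")).hexdigest()
        bucket = int(digest[:8], 16) % 10000 / 100.0   # 0.00 .. 99.99
        return bucket < percent

    # Example: enable a hypothetical "article-feedback" feature for ~5% of users.
    for user_id in ["12345", "67890", "24680"]:
        print(user_id, in_rollout(user_id, "article-feedback", 5.0))

Hashing the feature name together with the identifier keeps cohorts independent
across features, so a user in the 5% for one experiment isn't automatically in
the 5% for every other one.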
Re: [Wikitech-l] [RFC] performance standards for new mediawiki features
On Tue, Mar 26, 2013 at 3:58 PM, Asher Feldman <afeld...@wikimedia.org> wrote:

> The beta cluster has some potential as a performance test bed if only it
> could gain a guarantee that the compute nodes it runs on aren't
> oversubscribed, or that the beta virts were otherwise consistently resourced.
> By running a set of performance benchmarks against beta and production, we
> may be able to gain insight into how new features are likely to perform.

This is possible in newer versions of OpenStack, using scheduler hinting. That
said, the instances would still be sharing a host with each other, which can
cause some inconsistencies.

We'd likely not want to use beta itself, but something that has limited access
for performance testing purposes only, as we wouldn't want other unrelated
testing load to skew results. Additionally, we'd want to make sure to avoid
things like /data/project or /home (both of which beta is using), even once
we've moved to a more stable shared storage solution, as it could very heavily
skew results based on load from other projects.

We need to upgrade to the Folsom release or greater for a few other features
anyway, and enabling scheduler hinting is pretty simple. I'd say let's consider
adding something like this to the Labs infrastructure roadmap once the upgrade
happens and we've tested out the hinting feature.

- Ryan
Re: [Wikitech-l] [RFC] performance standards for new mediawiki features
On Tue, Mar 26, 2013 at 8:15 PM, Ryan Lane <rlan...@gmail.com> wrote:

> This is possible in newer versions of OpenStack, using scheduler hinting.
> That said, the instances would still be sharing a host with each other, which
> can cause some inconsistencies. We'd likely not want to use beta itself, but
> something that has limited access for performance testing purposes only, as
> we wouldn't want other unrelated testing load to skew results.

What concerns me in this discussion is insufficient testbed load generation,
and the avoidance of confounding variables in the performance analysis...

--
-george william herbert
george.herb...@gmail.com
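On the load-generation concern: one way to drive a test bed with concurrent
traffic and collect per-request latencies is a small harness along these lines.
This is a rough sketch; the target URL, request count, and concurrency are
placeholders to be replaced with whatever the test environment actually
exposes:

    import time
    import urllib.request
    from concurrent.futures import ThreadPoolExecutor

    URL = "https://example.beta.wmflabs.org/wiki/Main_Page"  # placeholder target
    REQUESTS = 200
    CONCURRENCY = 20

    def timed_fetch(_):
        """Fetch URL once and return the latency in milliseconds (None on error)."""
        start = time.time()
        try:
            urllib.request.urlopen(URL, timeout=30).read()
        except Exception:
            return None  # a real harness would count and report errors separately
        return (time.time() - start) * 1000.0

    with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
        latencies = [ms for ms in pool.map(timed_fetch, range(REQUESTS)) if ms is not None]

    latencies.sort()
    if latencies:
        print(f"requests ok: {len(latencies)}/{REQUESTS}")
        print(f"median: {latencies[len(latencies) // 2]:.0f}ms")
        print(f"p99:    {latencies[max(0, int(len(latencies) * 0.99) - 1)]:.0f}ms")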
Re: [Wikitech-l] [RFC] performance standards for new mediawiki features
Right now, I think many of us profile locally or in VMs, which can be useful
for relative metrics or quickly identifying bottlenecks, but doesn't really get
us the kind of information you're talking about from any sort of real-world
setting, or in any way that would be consistent from engineer to engineer, or
even necessarily from day to day. From network topology to article
counts/sizes/etc. and everything in between, there's a lot we can't really
replicate or accurately profile against.

Are there plans to put together and support infrastructure for this? It seems
to me that this proposal is contingent upon a consistent environment accessible
by engineers for performance testing.

--
Arthur Richards
Software Engineer, Mobile
[[User:Awjrichards]]
IRC: awjr
+1-415-839-6885 x6687
Re: [Wikitech-l] [RFC] performance standards for new mediawiki features
Asher Feldman wrote:
> I'd like to push for a codified set of minimum performance standards that
> new mediawiki features must meet before they can be deployed to larger
> wikimedia sites such as English Wikipedia, or be considered complete. These
> would look like (numbers pulled out of a hat, not actual suggestions): [...]
> Thoughts?

Hi.

Once you have numbers from a non-hat source, please draft an RFC at
https://www.mediawiki.org/wiki/RFC. :-)

MZMcBride
Re: [Wikitech-l] [RFC] performance standards for new mediawiki features
On Thu, Mar 21, 2013 at 6:40 PM, Asher Feldman <afeld...@wikimedia.org> wrote:

> Right now, varying amounts of effort are made to highlight potential
> performance bottlenecks in code review, and engineers are encouraged to
> profile and optimize their own code. But beyond "is the site still up for
> everyone" / "are users complaining on the village pump" / "am I ranting in
> IRC", we've offered no guidelines as to what sort of request latency is
> reasonable or acceptable.
>
> If a new feature (like AFTv5, or Flow) turns out not to meet perf standards
> after deployment, that would be a high-priority bug, and the feature may be
> disabled depending on the impact, or if not addressed in a reasonable time
> frame. Obviously standards like this can't be applied to certain existing
> parts of mediawiki, but systems other than the parser or preprocessor that
> don't meet new standards should at least be prioritized for improvement.
>
> Thoughts?

As a features product manager, I am totally behind this. I don't take adding
another potential blocker lightly, but performance is a feature, and not a
minor one.

For me the hurdle to taking this more seriously, beyond just "is this thing
unusably/annoyingly slow when testing it?", has always been a way to reliably
measure performance, set goals, and a set of guidelines. Like MZ suggests, I
think the place to discuss that is in an RFC on mediawiki.org, but in general I
want to say that I support creating a reasonable set of guidelines based on
data.

Steven
Re: [Wikitech-l] [RFC] performance standards for new mediawiki features
> been a way to reliably measure performance...

The measure part is important. As it stands I have no way of measuring code in
action (sure, I can set up profiling locally, and actually have, but it's not
the same [OTOH I barely ever look at the local profiling I did set up...]).

People throw around words like graphite, but unless I'm mistaken, we non-staff
folks do not have access to whatever that may be.

-bawolff
Re: [Wikitech-l] [RFC] performance standards for new mediawiki features
> People throw around words like graphite, but unless I'm mistaken, we
> non-staff folks do not have access to whatever that may be.

Graphite refers to the cluster performance logger available at:
http://graphite.wikimedia.org/

Anyone with a labs account can view it -- which, as a committer, you have
(it's the same as your Gerrit login).

> OTOH I barely ever look at the local profiling I did set up...

This problem still exists with graphite; you have to look at it for it to do
any good :)

~Matt Walker
Wikimedia Foundation
Fundraising Technology Team
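For those who do have access, Graphite's standard render API can also be
queried programmatically instead of eyeballing dashboards; it returns JSON when
asked. A minimal sketch, with a hypothetical metric name and the host as a
placeholder (actual series names on graphite.wikimedia.org will differ):

    import json
    import urllib.parse
    import urllib.request

    GRAPHITE = "http://graphite.wikimedia.org"        # placeholder host
    METRIC = "MediaWiki.SomeFeature.RequestTime.p99"   # hypothetical metric name

    params = urllib.parse.urlencode(
        {"target": METRIC, "from": "-1hours", "format": "json"})
    with urllib.request.urlopen(f"{GRAPHITE}/render?{params}") as resp:
        series = json.load(resp)

    for s in series:
        # Each datapoint is a [value, unix_timestamp] pair; value may be None.
        values = [v for v, _ in s["datapoints"] if v is not None]
        if values:
            print(s["target"], "latest:", values[-1], "max:", max(values))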
Re: [Wikitech-l] [RFC] performance standards for new mediawiki features
On 2013-03-22 6:46 PM, Matthew Walker <mwal...@wikimedia.org> wrote:

> Graphite refers to the cluster performance logger available at:
> http://graphite.wikimedia.org/
>
> Anyone with a labs account can view it -- which, as a committer, you have
> (it's the same as your Gerrit login).

I've tried. My labs login doesn't work. More generally, since labs accounts are
free to make, what is the point of password-protecting it?

> This problem still exists with graphite; you have to look at it for it to do
> any good :)

That's lame ;)

-bawolff
[Wikitech-l] [RFC] performance standards for new mediawiki features
I'd like to push for a codified set of minimum performance standards that new
mediawiki features must meet before they can be deployed to larger wikimedia
sites such as English Wikipedia, or be considered complete. These would look
like (numbers pulled out of a hat, not actual suggestions):

- p999 (long tail) full page request latency of 2000ms
- p99 page request latency of 800ms
- p90 page request latency of 150ms
- p99 banner request latency of 80ms
- p90 banner request latency of 40ms
- p99 db query latency of 250ms
- p90 db query latency of 50ms
- 1000 write requests/sec (if applicable; write operations must be free from
  concurrency issues)
- guidelines about degrading gracefully
- specific limits on total resource consumption across the stack per request
- etc.

Right now, varying amounts of effort are made to highlight potential
performance bottlenecks in code review, and engineers are encouraged to profile
and optimize their own code. But beyond "is the site still up for everyone" /
"are users complaining on the village pump" / "am I ranting in IRC", we've
offered no guidelines as to what sort of request latency is reasonable or
acceptable.

If a new feature (like AFTv5, or Flow) turns out not to meet perf standards
after deployment, that would be a high-priority bug, and the feature may be
disabled depending on the impact, or if not addressed in a reasonable time
frame. Obviously standards like this can't be applied to certain existing parts
of mediawiki, but systems other than the parser or preprocessor that don't meet
new standards should at least be prioritized for improvement.

Thoughts?

Asher
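To make the percentile targets above concrete, here is a minimal sketch of
computing p90/p99/p999 from a batch of request timings and comparing them to
the hat-pulled thresholds; the sample latencies and metric names are invented
for illustration only:

    def percentile(samples, p):
        """Return the p-th percentile (0-100) of a list of latencies, in ms
        (nearest-rank method)."""
        ordered = sorted(samples)
        rank = max(1, int(round(p / 100.0 * len(ordered))))
        return ordered[rank - 1]

    # Hypothetical request latencies in milliseconds (e.g. parsed from logs).
    latencies_ms = [38, 52, 61, 75, 90, 120, 145, 300, 640, 1900]

    targets = {"p90": (90, 150), "p99": (99, 800), "p999": (99.9, 2000)}
    for name, (p, limit_ms) in targets.items():
        observed = percentile(latencies_ms, p)
        status = "OK" if observed <= limit_ms else "FAIL"
        print(f"{name}: {observed}ms (target {limit_ms}ms) {status}")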
Re: [Wikitech-l] [RFC] performance standards for new mediawiki features
Asher,

Do we know what our numbers are now? That's probably a pretty good baseline to
start with as a discussion.

> p99 banner request latency of 80ms

Fundraising banners? From start of page load, or is this specifically how fast
our API requests run?

On the topic of APIs, we should set similar perf goals for requests to the API
/ jobs. This gets very subjective though, because now we're talking about CPU
time, memory usage, HDD usage, cache key space usage -- are these in your
scope, or are we simply starting the discussion with response times?

Further down the road -- consistency is going to be important (my box will
profile differently than someone else's), so it seems like this is a good
candidate for 'yet another' continuous integration test. I can easily see us
being able to get an initial feel for response times in the CI environment. Or
maybe we should just continuously hammer the alpha/beta servers...

On deployment though -- currently the only way I know of to see how something
is performing is to look directly at graphite -- can icinga/something alert us,
presumably via email? Ideally we would be able to set up new metrics as we go
(obviously start with global page loads, but maybe I want to keep an eye on
banner render time). I would love to get an email about something I've deployed
under-performing.

~Matt Walker
Wikimedia Foundation
Fundraising Technology Team
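Matt's wish for an alert when something deployed is under-performing could be
prototyped as a small check script that a monitoring system (Icinga-style) or a
CI job runs periodically: fetch the latency metric, compare it to the agreed
threshold, and exit non-zero so an email goes out. A sketch under the
assumption that the metric can be obtained as a plain number (fetch_p99_ms is a
hypothetical stand-in for however that number is actually retrieved):

    import sys

    P99_LIMIT_MS = 800.0  # hat-pulled threshold from the proposal

    def fetch_p99_ms():
        """Stand-in: return the current p99 latency in ms for the feature,
        e.g. by querying Graphite or parsing profiler output."""
        return 640.0  # hypothetical value

    def main():
        observed = fetch_p99_ms()
        if observed > P99_LIMIT_MS:
            print(f"CRITICAL: p99 {observed}ms exceeds {P99_LIMIT_MS}ms")
            return 2  # Nagios/Icinga convention: 2 = critical
        print(f"OK: p99 {observed}ms within {P99_LIMIT_MS}ms")
        return 0

    if __name__ == "__main__":
        sys.exit(main())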
Re: [Wikitech-l] [RFC] performance standards for new mediawiki features
From where would you propose measuring these data points? Obviously network
latency will have a great impact on some of the metrics, and a consistent
location would help to define the pass/fail of each test. I do think another
useful benchmark Ops could provide would be a set of latency-to-datacenter
values, but I know that is a much harder task.

Thanks for putting this together.
Re: [Wikitech-l] [RFC] performance standards for new mediawiki features
The API is fairly complex to measure and to set performance targets for. If a
bot requests 5000 pages in one call, together with all links and categories, it
might take a very long time (seconds, if not tens of seconds). Comparing that
to another API request that gets an HTML section of a page, which takes a
fraction of a second (especially when coming from cache), is not very useful.

On Fri, Mar 22, 2013 at 1:32 AM, Peter Gehres <li...@pgehres.com> wrote:

> From where would you propose measuring these data points? Obviously network
> latency will have a great impact on some of the metrics, and a consistent
> location would help to define the pass/fail of each test.
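As an illustration of Yuri's point, one way to make such dissimilar API
requests comparable is to normalize latency by the amount of work requested
(items per call) rather than judging raw request time alone. A toy comparison
with invented numbers:

    # Toy comparison: raw latency vs. latency per requested item.
    # Numbers are invented for illustration only.
    requests = [
        {"desc": "bot fetch, 5000 pages with links/categories", "items": 5000, "ms": 12000},
        {"desc": "single HTML section, served from cache", "items": 1, "ms": 40},
    ]

    for r in requests:
        per_item = r["ms"] / r["items"]
        print(f"{r['desc']}: {r['ms']}ms total, {per_item:.2f}ms per item")

    # The batch call looks "slow" on raw latency but is far cheaper per page,
    # which is why a single latency target for all API methods is misleading.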