On Tue, Dec 1, 2015 at 11:49 AM, Fabio Zadrozny <fabi...@gmail.com> wrote:
>
> On Tue, Dec 1, 2015 at 6:36 AM, Maciej Fijalkowski <fij...@gmail.com> wrote:
>>
>> Hi
>>
>> Thanks for doing the work! I'm one of the pypy devs and I'm very
>> interested in seeing this getting somewhere. I must say I struggle to
>> read the graph - is red good or is red bad for example?
>>
>> I'm keen to help you getting anything you want to run it repeatedly.
>>
>> PS. The intel stuff runs one benchmark in a very questionable manner,
>> so let's maybe not rely on it too much.
>
>
> Hi Maciej,
>
> Great, it'd be awesome having data on multiple Python VMs (my latest target
> is really having a way to compare across multiple VMs/versions easily and
> help each implementation keep a focus on performance). Ideally, a single,
> dedicated machine could be used just to run the benchmarks from multiple VMs
> (one less variable to take into account for comparisons later on, as I'm not
> sure it'd be reliable to normalize benchmark data from different machines --
> it seems Zach was the one to contact for that, but if there's such a
> machine already being used to run PyPy, maybe it could be extended to run
> other VMs too?).
>
> As for the graph, it should be easy to customize (and I'm open to
> suggestions). In this case, as it is, red is slower and blue is faster (so,
> for instance in
> https://www.speedtin.com/reports/1_CPython27x_Performance_Over_Time, the
> fastest CPython version overall was 2.7.3 -- and 2.7.1 was the baseline).
> I've updated the comments to make it clearer (and changed the second graph
> to compare the latest against the fastest version (2.7.rc11 vs 2.7.3) for
> the individual benchmarks).
>
> Best Regards,
>
> Fabio

There is definitely a machine available. I suggest you ask the python-infra
list for access. It can definitely be used to run more than just pypy stuff.

As for normalizing across multiple machines - don't even bother. Different
architectures make A LOT of difference, especially cache sizes and
whatnot, which seem to have a different impact on different loads.

As for the graph - I like the split by benchmark, and a better description
(higher is better) would be good. I have a lot of ideas about
visualizations; pop in on IRC, I'm happy to discuss :-)

Cheers,
fijal