Author: Maciej Fijalkowski <fij...@gmail.com>
Branch: extradoc
Changeset: r5543:51066aac7fba
Date: 2015-07-20 11:44 +0200
http://bitbucket.org/pypy/extradoc/changeset/51066aac7fba/
Log: work on the talk

diff --git a/talk/ep2015/performance/talk.rst b/talk/ep2015/performance/talk.rst
--- a/talk/ep2015/performance/talk.rst
+++ b/talk/ep2015/performance/talk.rst
@@ -9,9 +9,9 @@
 
 - PyPy core devs
 
-- ``pdb++``, ``fancycompleter``, ...
+- ``vmprof``, ``cffi``, ``pdb++``, ``fancycompleter``, ...
 
-- Consultant
+- Consultants
 
 - http://baroquesoftware.com/
 
@@ -19,7 +19,7 @@
 About you
 -------------
 
-- Target audience
+- You are proficient in Python
 
 - Your Python program is slow
 
@@ -37,6 +37,8 @@
 
 - 80% of the time will be spent in 20% of the program
 
+  - 20% of 1 mln is 200 000
+
 * Two golden rules:
 
   1. Identify the slow spots
@@ -49,11 +51,112 @@
 
 * Two parts
 
-  1. PyPy as a tool to make Python faster
+  1. How to identify the slow spots
 
-  2. How to identify the slow spots
+  2. How to address the problems
 
+Part 1
+-------
+
+* profiling
+
+* tools
+
+
+What is performance?
+--------------------
+
+* you need something quantifiable by numbers
+
+* usually, time spent doing task X
+
+* sometimes number of requests, latency, etc.
+
+* some statistical properties about that metric (average, minimum, maximum)
+
+Do you have a performance problem?
+----------------------------------
+
+* define what you're trying to measure
+
+* measure it (production, benchmarks, etc.)
+
+* see if Python is the cause here (if it's not, we can't help you,
+  but I'm sure someone can)
+
+* make sure you can change and test stuff quickly (e.g. benchmarks are better
+  than changing stuff in production)
+
+* same as for debugging
+
+We have a python problem
+------------------------
+
+* tools, timers etc.
+
+* systems are too complicated to **guess** which will be faster
+
+* find your bottlenecks
+
+* 20/80 (but 20% of million lines is 200 000 lines, remember that)
+
+Profilers landscape
+-------------------
+
+* cProfile, runSnakeRun (high overhead) - event based profiler
+
+* plop, vmprof - statistical profiler
+
+* cProfile & vmprof work on pypy
+
+vmprof
+------
+
+* inspired by ``gperftools``
+
+* statistical profiler run by an interrupt (~300Hz on modern linux)
+
+* sampling the C stack
+
+* CPython, PyPy, possibly more virtual machines
+
+why not just use gperftools?
+----------------------------
+
+* C stack does not contain python-level frames
+
+* 90% ``PyEval_EvalFrame`` and other internals
+
+* we want python-level functions
+
+* picture is even more confusing in the presence of the JIT
+
+using vmprof
+------------
+
+* demo
+
+* http://vmprof.readthedocs.org
+
+using vmprof in production
+--------------------------
+
+* low overhead (5-10%), possibly lower in the future
+
+* possibility of realtime monitoring (coming)
+
+vmprof future
+-------------
+
+* profiler as a service
+
+* realtime advanced visualization
+
+Part 2
+------
+
+Make it fast
 
 Tools
 ------
 
@@ -74,6 +177,8 @@
 
   * WARNING: we wrote it, we are biased :)
 
+  * gives you most wins for free (*)
+
 
 
 What is PyPy
@@ -107,6 +212,7 @@
    :scale: 47%
 
 
+
 
 The JIT
 --------
@@ -127,15 +233,12 @@
 .. image:: jit-overview3.pdf
    :scale: 50%
 
-
 JIT overview
 -------------
 
 - Tracing JIT
 
-  * detect and compile "hot" loops
-
-  * (although not only loops)
+  * detect and compile "hot" code
 
 - **Specialization**
 
@@ -224,91 +327,3 @@
 
 * inefficient code
 
-Part 2
--------
-
-* Measure performance
-
-* Identify problems
-
-
-What is performance?
---------------------
-
-* it's a metric
-
-* usually, time spent doing task X
-
-* sometimes number of requests, latency, etc.
-
-* some statistical properties about that metric (average, minimum, maximum)
-
-Do you have a performance problem?
-----------------------------------
-
-* define the metric
-
-* measure it (production, benchmarks, etc.)
-
-* see if Python is the cause here (if it's not, we can't help you,
-  but I'm sure someone help)
-
-* make sure you can change and test stuff quickly (e.g. benchmarks are better
-  than changing stuff in production)
-
-We have a python problem
-------------------------
-
-* tools, timers etc.
-
-* systems are too complicated to **guess** which will be faster
-
-* find your bottlenecks
-
-* 20/80 (but 20% of million lines is 200 000 lines, remember that)
-
-Profilers landscape
--------------------
-
-* cProfile, runSnakeRun (high overhead) - exact profiler
-
-* plop, vmprof - statistical profiler
-
-* cProfile & vmprof work on pypy
-
-vmprof
-------
-
-XXXxxx
-
-using vmprof
-------------
-
-yyyyyyy
-
-interpreting the results
-------------------------
-
-xxxx
-
-using vmprof in production
---------------------------
-
-demo
-----
-
-let's optimize some code
-------------------------
-
-let's optimize some more complex code
--------------------------------------
-
-Extras: what's cool what's not cool on cpython and pypy
-
-CPython vs PyPy
----------------
-
-* very different performance characteristics
-
-* XXX list them
-
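
The slides added in this changeset name ``cProfile`` as an event-based profiler and stress measuring instead of guessing where the time goes. A minimal sketch of that workflow, using only the standard library, might look like the following. The functions ``slow_sum``, ``fast_path``, ``workload`` and ``profile_workload`` are hypothetical stand-ins for illustration and are not taken from the talk.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately wasteful: builds a full list just to sum it."""
    return sum([i * i for i in range(n)])


def fast_path(n):
    """Cheap helper, present so the profile has something to contrast with."""
    return n * 2


def workload():
    """Hypothetical 'task X' whose running time we want to attribute."""
    total = 0
    for _ in range(50):
        total += slow_sum(10000)
        total += fast_path(10)
    return total


def profile_workload():
    """Run workload() under cProfile; return (result, text report)."""
    profiler = cProfile.Profile()
    # runcall() profiles a single call and passes its return value through.
    result = profiler.runcall(workload)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time and keep the top entries, so the
    # functions that dominate the run appear first.
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


if __name__ == "__main__":
    result, report = profile_workload()
    print(report)
```

In the report, ``slow_sum`` should account for nearly all of the cumulative time, which is the kind of evidence the "identify the slow spots" rule asks for. Because ``cProfile`` hooks every call, it adds noticeable overhead; the slides position statistical profilers such as vmprof as the lower-overhead alternative.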