https://github.com/python/cpython/commit/ca2e1a1e1c1f6b6489f6e354984f89bb1f72a7fd
commit: ca2e1a1e1c1f6b6489f6e354984f89bb1f72a7fd
branch: 3.13
author: Miss Islington (bot) <31488909+miss-isling...@users.noreply.github.com>
committer: pablogsal <pablog...@gmail.com>
date: 2025-04-18T13:48:23Z
summary:

[3.13] gh-130861: Add clarification to the perf docs on optimization levels (GH-131098) (#132687)

files:
M Doc/howto/perf_profiling.rst

diff --git a/Doc/howto/perf_profiling.rst b/Doc/howto/perf_profiling.rst
index 4cec8f62aeba2d..b579d776576365 100644
--- a/Doc/howto/perf_profiling.rst
+++ b/Doc/howto/perf_profiling.rst
@@ -254,13 +254,28 @@ files in the current directory which are ELF images for all the JIT trampolines
 that were created by Python.
 
 .. warning::
-   Notice that when using ``--call-graph dwarf`` the ``perf`` tool will take
+   When using ``--call-graph dwarf``, the ``perf`` tool will take
    snapshots of the stack of the process being profiled and save the
-   information in the ``perf.data`` file. By default the size of the stack dump
-   is 8192 bytes but the user can change the size by passing the size after
-   comma like ``--call-graph dwarf,4096``. The size of the stack dump is
-   important because if the size is too small ``perf`` will not be able to
-   unwind the stack and the output will be incomplete. On the other hand, if
-   the size is too big, then ``perf`` won't be able to sample the process as
-   frequently as it would like as the overhead will be higher.
+   information in the ``perf.data`` file. By default, the size of the stack dump
+   is 8192 bytes, but you can change the size by passing it after
+   a comma like ``--call-graph dwarf,16384``.
 
+   The size of the stack dump is important because if the size is too small
+   ``perf`` will not be able to unwind the stack and the output will be
+   incomplete. On the other hand, if the size is too big, then ``perf`` won't
+   be able to sample the process as frequently as it would like as the overhead
+   will be higher.
+
+   The stack size is particularly important when profiling Python code compiled
+   with low optimization levels (like ``-O0``), as these builds tend to have
+   larger stack frames. If you are compiling Python with ``-O0`` and not seeing
+   Python functions in your profiling output, try increasing the stack dump
+   size to 65528 bytes (the maximum)::
+
+     $ perf record -F 9999 -g -k 1 --call-graph dwarf,65528 -o perf.data python -Xperf_jit my_script.py
+
+   Different compilation flags can significantly impact stack sizes:
+
+   - Builds with ``-O0`` typically have much larger stack frames than those with ``-O1`` or higher
+   - Adding optimizations (``-O1``, ``-O2``, etc.) typically reduces stack size
+   - Frame pointers (``-fno-omit-frame-pointer``) generally provide more reliable stack unwinding
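
The stack-dump sizing advice in the documentation change above could be wrapped in a small helper that assembles the ``perf record`` invocation. This is only a sketch and not part of the commit: the function name ``perf_record_command`` and the constants are hypothetical, the 8192 default and 65528 maximum are taken from the documentation text, and the remaining flags are copied verbatim from the example command in the diff.

```python
import shlex

# Values quoted in the documentation text above; treated here as assumptions.
DEFAULT_DUMP_SIZE = 8192   # default stack dump size in bytes
MAX_DUMP_SIZE = 65528      # documented maximum stack dump size in bytes


def perf_record_command(script, dump_size=DEFAULT_DUMP_SIZE, output="perf.data"):
    """Build the argument list for a ``perf record`` run (hypothetical helper).

    The flags mirror the example command in the documentation change; only
    the stack dump size, output file, and script name are parameterized.
    """
    if not 0 < dump_size <= MAX_DUMP_SIZE:
        raise ValueError(
            f"dump_size must be between 1 and {MAX_DUMP_SIZE}, got {dump_size}"
        )
    return [
        "perf", "record", "-F", "9999", "-g", "-k", "1",
        "--call-graph", f"dwarf,{dump_size}",
        "-o", output,
        "python", "-Xperf_jit", script,
    ]


if __name__ == "__main__":
    # For an -O0 build, request the largest stack dump the docs mention.
    print(shlex.join(perf_record_command("my_script.py", MAX_DUMP_SIZE)))
```

Returning a list rather than a string keeps the result usable with ``subprocess.run`` without shell quoting concerns. Starting from the default and raising ``dump_size`` only when Python frames are missing follows the trade-off the text describes between incomplete unwinding and sampling overhead.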