https://github.com/python/cpython/commit/d134bd272f90bfc2dfb55b126d4552c996251fc1
commit: d134bd272f90bfc2dfb55b126d4552c996251fc1
branch: main
author: Pablo Galindo Salgado <[email protected]>
committer: pablogsal <[email protected]>
date: 2025-04-18T14:42:20+01:00
summary:
gh-130861: Add clarification to the perf docs on optimization levels (#131098)
files:
M Doc/howto/perf_profiling.rst
diff --git a/Doc/howto/perf_profiling.rst b/Doc/howto/perf_profiling.rst
index 4cec8f62aeba2d..b579d776576365 100644
--- a/Doc/howto/perf_profiling.rst
+++ b/Doc/howto/perf_profiling.rst
@@ -254,13 +254,28 @@ files in the current directory which are ELF images for all the JIT trampolines
that were created by Python.
.. warning::
- Notice that when using ``--call-graph dwarf`` the ``perf`` tool will take
+ When using ``--call-graph dwarf``, the ``perf`` tool will take
snapshots of the stack of the process being profiled and save the
- information in the ``perf.data`` file. By default the size of the stack dump
- is 8192 bytes but the user can change the size by passing the size after
- comma like ``--call-graph dwarf,4096``. The size of the stack dump is
- important because if the size is too small ``perf`` will not be able to
- unwind the stack and the output will be incomplete. On the other hand, if
- the size is too big, then ``perf`` won't be able to sample the process as
- frequently as it would like as the overhead will be higher.
+ information in the ``perf.data`` file. By default, the size of the stack dump
+ is 8192 bytes, but you can change the size by passing it after
+ a comma like ``--call-graph dwarf,16384``.
+ The size of the stack dump is important because if the size is too small
+ ``perf`` will not be able to unwind the stack and the output will be
+ incomplete. On the other hand, if the size is too big, then ``perf`` won't
+ be able to sample the process as frequently as it would like as the overhead
+ will be higher.
+
+ The stack size is particularly important when profiling Python code compiled
+ with low optimization levels (like ``-O0``), as these builds tend to have
+ larger stack frames. If you are compiling Python with ``-O0`` and not seeing
+ Python functions in your profiling output, try increasing the stack dump
+ size to 65528 bytes (the maximum)::
+
+ $ perf record -F 9999 -g -k 1 --call-graph dwarf,65528 -o perf.data python -Xperf_jit my_script.py
+
+ Different compilation flags can significantly impact stack sizes:
+
+ - Builds with ``-O0`` typically have much larger stack frames than those with ``-O1`` or higher
+ - Adding optimizations (``-O1``, ``-O2``, etc.) typically reduces stack size
+ - Frame pointers (``-fno-omit-frame-pointer``) generally provide more reliable stack unwinding
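
Note (not part of the commit): the guidance in the new warning text can be illustrated with a minimal Python sketch that assembles the ``perf record`` command shown in the diff for a chosen stack dump size. The helper name ``perf_record_command``, its parameters, and the lower-bound check are hypothetical and exist only for illustration; the default (8192) and maximum (65528) values and the command-line flags are taken from the documentation text above.

```python
import shlex

# Values quoted in the documentation text above.
DEFAULT_STACK_DUMP = 8192
MAX_STACK_DUMP = 65528


def perf_record_command(script, stack_dump=DEFAULT_STACK_DUMP, output="perf.data"):
    """Build the argument list for the perf invocation shown in the docs.

    Hypothetical helper: it only assembles the command, it does not run perf.
    """
    # Rejecting non-positive sizes is an assumption made for this sketch;
    # the upper limit is the maximum stated in the documentation.
    if not 0 < stack_dump <= MAX_STACK_DUMP:
        raise ValueError(
            f"stack dump size must be between 1 and {MAX_STACK_DUMP} bytes"
        )
    return [
        "perf", "record", "-F", "9999", "-g", "-k", "1",
        "--call-graph", f"dwarf,{stack_dump}",
        "-o", output,
        "python", "-Xperf_jit", script,
    ]


if __name__ == "__main__":
    # For an -O0 build where Python frames are missing, try the maximum size.
    cmd = perf_record_command("my_script.py", stack_dump=MAX_STACK_DUMP)
    print(shlex.join(cmd))
```

The resulting list could be passed to ``subprocess.run`` on a system where ``perf`` is installed. Starting from the default and raising the size only when Python frames are missing keeps sampling overhead lower, in line with the trade-off described in the warning.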