Re: [Python-Dev] Encoding of PyFrameObject members

2015-02-08 Thread Francis Giraldeau
2015-02-08 4:01 GMT-05:00 Maciej Fijalkowski fij...@gmail.com:

 I'm working on vmprof (github.com/vmprof/vmprof-python) which works
 for both cpython and pypy (pypy has special support, cpython is
 patched on the fly)


This looks interesting. I'm working on a similar profiler, but it is not
based on a timer. Instead, the signal is generated when a hardware
performance counter overflows. It requires a special Linux kernel module,
and the tracepoint is recorded using LTTng-UST.

https://github.com/giraldeau/perfuser
https://github.com/giraldeau/perfuser-modules
https://github.com/giraldeau/python-profile-ust

This is of course very experimental and requires a special setup, and I
don't even know if it's going to produce good results. I'll report the
results in the coming weeks.

Cheers,

Francis Giraldeau
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Encoding of PyFrameObject members

2015-02-06 Thread Francis Giraldeau
2015-02-06 6:04 GMT-05:00 Armin Rigo ar...@tunes.org:

 Hi,

 On 6 February 2015 at 08:24, Maciej Fijalkowski fij...@gmail.com wrote:
  I don't think it's safe to assume f_code is properly filled by the
  time you might read it, depending a bit where you find the frame
  object. Are you sure it's not full of garbage?


 Yes, before discussing how to do the utf8 decoding, we should realize
 that it is really unsafe code starting from the line before.  From a
 signal handler you're only supposed to read data that was written to
 volatile fields.  So even PyEval_GetFrame(), which is done by
 reading the thread state's frame field, is not safe: this is not a
 volatile.  This means that the compiler is free to do crazy things
 like *first* write into this field and *then* initialize the actual
 content of the frame.  The uninitialized content may be garbage, not
 just NULLs.


Thanks for these comments. Of course accessing frames within a signal
handler is racy. I confirm that non-ASCII data is not accessible from the
utf8 buffer pointer. However, a call to PyUnicode_AsUTF8() encodes the data
and caches it in the unicode object. Later accesses return the byte buffer
without memory allocation or re-encoding.

I think it is possible to solve both safety problems by registering a
handler with PyEval_SetProfile(). On function entry, the handler will
call PyUnicode_AsUTF8() on the required frame members to make sure the
utf8-encoded string is available. Then, we increment the refcount of the
frame and assign it to a thread-local pointer. On function return, the
refcount is decremented. These operations occur in the normal context and
they are not racy. The signal handler will use the thread-local frame
pointer instead of calling PyEval_GetFrame(). Does that sound good?

Thanks again for your feedback!

Francis


[Python-Dev] Encoding of PyFrameObject members

2015-02-05 Thread Francis Giraldeau
I need to access frame members from within a signal handler for tracing
purposes. My first attempt to access co_filename was like this (omitting
error checking):

PyFrameObject *frame = PyEval_GetFrame();
PyObject *ob = PyUnicode_AsUTF8String(frame->f_code->co_filename);
char *str = PyBytes_AsString(ob);

However, the function PyUnicode_AsUTF8String() calls PyObject_Malloc(),
which is not reentrant. If the signal handler interrupts PyObject_Malloc()
and calls it again, it causes a segfault, and it could also deadlock.

Instead, I access members directly:
char *str = PyUnicode_DATA(frame->f_code->co_filename);
size_t len = PyUnicode_GET_DATA_SIZE(frame->f_code->co_filename);

Is it safe to assume that unicode objects co_filename and co_name are
always UTF-8 data for loaded code? I looked at the PyTokenizer_FromString()
and it seems to convert everything to UTF-8 upfront, and I would like to
make sure this assumption is valid.
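
A small Python sketch of why the raw buffer cannot be assumed to hold UTF-8.
Under PEP 393 (CPython 3.3 and later), a str is stored with 1, 2 or 4 bytes
per character depending on its highest code point, and only a pure-ASCII
string has raw data that is also valid UTF-8. The helper names storage_kind
and raw_buffer_is_utf8 are hypothetical. They mirror that rule from the
Python side rather than inspecting the object's memory:

```python
def storage_kind(s):
    """Bytes per character CPython uses for s under PEP 393: 1, 2 or 4."""
    highest = max(map(ord, s), default=0)
    if highest < 0x100:
        return 1   # ASCII or Latin-1
    if highest < 0x10000:
        return 2   # UCS-2
    return 4       # UCS-4


def raw_buffer_is_utf8(s):
    """True only when every character is ASCII."""
    return all(ord(c) < 0x80 for c in s)


# A code object whose file name contains one non-ASCII character.
code = compile("pass", "caf\u00e9.py", "exec")
name = code.co_filename

kind = storage_kind(name)             # 1 byte per character (Latin-1)
is_utf8 = raw_buffer_is_utf8(name)    # False: the bytes are Latin-1
latin1 = name.encode("latin-1")       # what the 1-byte storage holds
utf8 = name.encode("utf-8")           # what a UTF-8 consumer expects
```

Here latin1 and utf8 differ in both content and length, so reading the
buffer directly would give the wrong bytes for this file name.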

Thanks!

Francis


[Python-Dev] LTTng-UST support for CPython

2014-12-01 Thread Francis Giraldeau
Here is a working prototype for CPython that records all function calls and
returns using LTTng-UST, a fast tracer.

https://github.com/giraldeau/python-profile-ust

However, there are a few issues and questions:

- I was not able to get PyTrace_EXCEPTION using raise or other error
conditions. How can we trigger this event in Python code
(PyTrace_C_EXCEPTION works)?

- What would be the best way to get the full name of an object (such as
package, module, class and function)? Maybe it's too Java-ish, and it is
better to record file/lineno instead?

- On the C-API side: I wrote a horrible and silly function, show_type(),
that runs every Py*_Check() to determine the type of a PyObject *. What
would be the sane way to do that?
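
For the last two questions, a short Python-level sketch of what the
interpreter already exposes, assuming Python 3.3 or later for __qualname__.
The helpers full_name and type_name and the class Tracer are hypothetical
names used for illustration:

```python
import json


def full_name(obj):
    """Dotted name built from __module__ and __qualname__."""
    qualname = getattr(obj, "__qualname__", None)
    if qualname is None:
        # Plain instances carry no __qualname__; describe their type instead.
        return full_name(type(obj))
    module = getattr(obj, "__module__", None)
    return f"{module}.{qualname}" if module else qualname


def type_name(obj):
    """Name of the object's type, with no chain of isinstance checks."""
    return type(obj).__name__


class Tracer:
    def start(self):
        pass


method_name = full_name(Tracer.start)   # ends with "Tracer.start"
function_name = full_name(json.dumps)   # "json.dumps"
instance_name = full_name(3.5)          # "builtins.float"
kind = type_name([])                    # "list"
```

On the C side, the type name is available through the same route: every
object points to its type, and Py_TYPE(ob)->tp_name holds the name as a C
string.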

Your comments are very valuable. Thanks!

Francis


Re: [Python-Dev] Support for Linux perf

2014-11-23 Thread Francis Giraldeau
2014-11-22 7:44 GMT-05:00 Julian Taylor jtaylor.deb...@googlemail.com:

 On 17.11.2014 23:09, Francis Giraldeau wrote:
  Hi,
  ...
  The PEP-418 is about performance counters, but there is no mention o
  Anyway, I think we must change CPython to support tools such as perf.
  Any thoughts?
 

 there are some patches available adding systemtap and dtrace probes,
 which should at least help getting function level profiles:

 http://bugs.python.org/issue21590


Thanks for these links, the patches look interesting.

As Jonas mentioned, Perf should be able to unwind a Python stack. It does
so at the interpreter level, but the frame info is scattered in virtual
memory and needs to be accessed offline.

I think it could be possible to use the function entry and exit hooks in
the interpreter to save important frame info, such as function name, file
and line number, in a memory map known to perf. Then, we can tell Perf to
record this compact zone of data in the sample as an extra field for
offline use. At analysis time, each ELF interpreter frame could then be
matched with the corresponding Python frame info. I think the perf handler
can't sleep, and the accesses on each function entry/exit will also ensure
the page is present in main memory when the sample is recorded.
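
A rough Python sketch of the compact zone described above, under several
assumptions: sys.setprofile() stands in for the interpreter's entry/exit
hooks, an anonymous mmap stands in for the region that would be exposed to
perf, the program is single-threaded, and the record layout (RECORD,
MAX_DEPTH) is made up for illustration. Nothing here talks to perf itself:

```python
import mmap
import struct
import sys

# Hypothetical fixed-size record: line number, function name, file name.
RECORD = struct.Struct("<I64s128s")
MAX_DEPTH = 256

# Anonymous mapping standing in for the zone a sampler would copy.
region = mmap.mmap(-1, RECORD.size * MAX_DEPTH)
state = {"depth": 0}


def _save_to_region(frame, event, arg):
    if event == "call":
        depth = state["depth"]
        if depth < MAX_DEPTH:
            code = frame.f_code
            # The "s" format pads short values with NUL bytes.
            RECORD.pack_into(region, depth * RECORD.size,
                             frame.f_lineno or 0,
                             code.co_name.encode("utf-8")[:64],
                             code.co_filename.encode("utf-8")[:128])
        state["depth"] = depth + 1
    elif event == "return" and state["depth"] > 0:
        state["depth"] -= 1


def read_record(index):
    """Decode one slot the way an offline analysis step might."""
    lineno, name, filename = RECORD.unpack_from(region, index * RECORD.size)
    return (lineno,
            name.rstrip(b"\0").decode("utf-8", "replace"),
            filename.rstrip(b"\0").decode("utf-8", "replace"))


def inner():
    return state["depth"]


def outer():
    return inner()


sys.setprofile(_save_to_region)
try:
    depth_seen = outer()
finally:
    sys.setprofile(None)
```

Each slot is written on function entry, so the page has been touched before
any sample could be taken, and slot i always describes the frame at depth i.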

Thanks again for your input, I'll post any further developments.

Cheers,

Francis


Re: [Python-Dev] Static checker for common Python programming errors

2014-11-17 Thread Francis Giraldeau
If I may, there is prior work on JavaScript that may be worth
investigating. Formal verification of dynamically typed software is a
challenging endeavour, but it is very valuable for avoiding errors at
runtime, providing the benefits of a strongly typed language without the
rigidity.

http://cs.au.dk/~amoeller/papers/tajs/

Good luck!

Francis

2014-11-17 9:49 GMT-05:00 Stefan Bucur stefan.bu...@gmail.com:

 I'm developing a Python static analysis tool that flags common programming
 errors in Python programs. The tool is meant to complement other tools like
 Pylint (which perform checks at lexical and syntactic level) by going
 deeper with the code analysis and keeping track of the possible control
 flow paths in the program (path-sensitive analysis).

 For instance, a path-sensitive analysis detects that the following snippet
 of code would raise an AttributeError exception:

 if object is None: # If the True branch is taken, we know the object is None
   object.doSomething() # ... so this statement would always fail

 I'm writing first to the Python developers themselves to ask, in their
 experience, what common pitfalls in the language and its standard library
 such a static checker should look for. For instance, here [1] is a list of
 static checks for the C++ language, as part of the Clang static analyzer
 project.

 My preliminary list of Python checks is quite rudimentary, but maybe could
 serve as a discussion starter:

 * Proper Unicode handling (for 2.x)
   - encode() is not called on str object
   - decode() is not called on unicode object
 * Check for integer division by zero
 * Check for None object dereferences

 Thanks a lot,
 Stefan Bucur

 [1] http://clang-analyzer.llvm.org/available_checks.html




[Python-Dev] Support for Linux perf

2014-11-17 Thread Francis Giraldeau
Hi,

PEP 418 is about performance counters, but there is no mention of
performance monitoring unit (PMU) counters, such as cache misses and
instruction counts.

The Linux perf tool aims at recording these samples at the system level. I
ran Linux perf on CPython for profiling. The resulting callstack is inside
libpython.so, mostly recursive calls to PyEval_EvalFrameEx(), because the
tool works at the ELF level. Here is an example with a dummy program
(linux-tools on Ubuntu 14.04):

$ perf record python crunch.py
$ perf report --stdio
# Overhead  Command  Shared Object  Symbol
# ........  .......  .............  ......................
#
    32.37%  python   python2.7      [.] PyEval_EvalFrameEx
    13.70%  python   libm-2.19.so   [.] __sin_avx
     5.25%  python   python2.7      [.] binary_op1.5010
     4.82%  python   python2.7      [.] PyObject_GetAttr

While this may be insightful for the interpreter developers, it is not so
for the average Python developer. The report should display Python code
instead. It seems obvious, but I still haven't found a feature for that.

When a performance counter reaches a given value, a sample is recorded. The
most basic sample only records a timestamp, the thread ID and the program
counter (%rip). In addition, all executable memory maps of libraries are
recorded. For the callstack, frame pointers are traversed, but most of the
time they are optimized out on x86, so perf falls back to unwinding, which
requires saving register values and a chunk of the stack. The memory space
of the process is reconstructed offline.

CPython seems to allocate code and frames on mmap() pages. If the data lies
more than about 1k from the top of the stack, it is not available offline
in the trace. We need some way to reconstitute this memory space of the
interpreter to resolve the symbols, probably by dumping the data to disk.

In Java, there is a small HotSpot agent that spits out the symbols of JIT
code:

https://github.com/jrudolph/perf-map-agent

The problem is that CPython does not JIT code, and executed code is the ELF
library itself. The executed frames are parameters of functions of the
interpreter. I don't think the same approach can be used (maybe this can be
applied to PyPy?).

I looked at how Python frames are handled in GDB
(file cpython/Tools/gdb/libpython.py). A Python frame is detected in
Frame(gdbframe).is_evalframeex() by a C call to PyEval_EvalFrameEx().
However, the traceback accesses PyFrameObject on the heap (at least for
f->f_back = 0xa57460), which is possible in GDB when the program is paused
and the whole memory space is available, but is not recorded for offline
use in perf. Here is an example of a callstack from GDB:

#0  PyEval_EvalFrameEx (f=Frame 0x77f1b060, for file crunch.py, line 7,
    in bar (num=466829), throwflag=0) at ../Python/ceval.c:1039
#1  0x00527877 in fast_function (func=function at remote 0x76ec45a0,
    pp_stack=0x7fffd280, n=1, na=1, nk=0) at ../Python/ceval.c:4106
#2  0x00527582 in call_function (pp_stack=0x7fffd280, oparg=1)
    at ../Python/ceval.c:4041


We could add a kernel module that knows how to take samples of CPython,
but it means Python structures become a sort of ABI, and kernel devs won't
allow a Python interpreter in kernel mode ;-).

What we really want is f_code data and related objects:

(gdb) print (void *)(f->f_code)
$8 = (void *) 0x77e370f0

Maybe we could save these pages every time some code is loaded from the
interpreter? (the memory range is about 1.7MB, but )

Anyway, I think we must change CPython to support tools such as perf. Any
thoughts?

Cheers,

Francis