Re: [Python-Dev] Encoding of PyFrameObject members
2015-02-08 4:01 GMT-05:00 Maciej Fijalkowski fij...@gmail.com:
> I'm working on vmprof (github.com/vmprof/vmprof-python) which works for both cpython and pypy (pypy has special support, cpython is patched on the fly)

This looks interesting. I'm working on a profiler that is similar, but not based on a timer. Instead, the signal is generated when a hardware performance counter overflows. It requires a special Linux kernel module, and the tracepoint is recorded using LTTng-UST.

https://github.com/giraldeau/perfuser
https://github.com/giraldeau/perfuser-modules
https://github.com/giraldeau/python-profile-ust

This is of course very experimental, requires a special setup, and I don't even know if it's going to produce good results. I'll report the results in the coming weeks.

Cheers,
Francis Giraldeau

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
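For comparison, the timer-based sampling used by vmprof-style profilers can be sketched at the Python level with signal.setitimer(). This is only an illustration of the sampling idea, not how vmprof is actually implemented (vmprof samples from C precisely to avoid running Python code inside the handler); the function name `crunch` is a made-up workload:

```python
import signal
import sys
import time

samples = []

def sample_handler(signum, frame):
    # The interrupted frame is passed to the handler; record the
    # innermost function name and file (a real profiler walks f_back
    # to capture the whole stack).
    samples.append((frame.f_code.co_name, frame.f_code.co_filename))

signal.signal(signal.SIGALRM, sample_handler)
signal.setitimer(signal.ITIMER_REAL, 0.01, 0.01)  # fire every 10 ms

def crunch():
    total = 0.0
    deadline = time.monotonic() + 0.2
    while time.monotonic() < deadline:
        total += 1.0
    return total

crunch()
signal.setitimer(signal.ITIMER_REAL, 0, 0)  # disarm the timer
```

Each entry in `samples` is a (function, file) pair captured whenever the timer interrupted the busy loop.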
Re: [Python-Dev] Encoding of PyFrameObject members
2015-02-06 6:04 GMT-05:00 Armin Rigo ar...@tunes.org:
> Hi,
>
> On 6 February 2015 at 08:24, Maciej Fijalkowski fij...@gmail.com wrote:
>> I don't think it's safe to assume f_code is properly filled by the time you might read it, depending a bit where you find the frame object. Are you sure it's not full of garbage?
>
> Yes, before discussing how to do the utf8 decoding, we should realize that it is really unsafe code starting from the line before. From a signal handler you're only supposed to read data that was written to volatile fields. So even PyEval_GetFrame(), which is done by reading the thread state's frame field, is not safe: this is not a volatile. This means that the compiler is free to do crazy things like *first* write into this field and *then* initialize the actual content of the frame. The uninitialized content may be garbage, not just NULLs.

Thanks for these comments. Of course accessing frames within a signal handler is racy. I confirm that code containing non-ASCII characters is not accessible from the utf8 buffer pointer. However, a call to PyUnicode_AsUTF8() encodes the data and caches it in the unicode object. Later accesses return the byte buffer without memory allocation or re-encoding.

I think it is possible to solve both safety problems by registering a handler with PyEval_SetProfile(). On function entry, the handler will call PyUnicode_AsUTF8() on the required frame members to make sure the utf8-encoded string is available. Then, we increment the refcount of the frame and assign it to a thread-local pointer. On function return, the refcount is decremented. These operations occur in the normal context and are not racy. The signal handler will use the thread-local frame pointer instead of calling PyEval_GetFrame(). Does that sound good?

Thanks again for your feedback!
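The bookkeeping proposed above (cache the strings and maintain the current frame in a signal-safe location on entry/exit, instead of walking live frames from the handler) can be mimicked at the Python level with sys.setprofile. This is only a sketch of the idea; the real fix has to be done in C with PyEval_SetProfile(), refcounting, and an actual thread-local pointer:

```python
import sys
import threading

# Shadow stack of (function name, filename) pairs, maintained on
# function entry/exit, so an asynchronous consumer never has to walk
# the live frame chain.
_local = threading.local()

def _stack():
    if not hasattr(_local, "stack"):
        _local.stack = []
    return _local.stack

def profiler(frame, event, arg):
    if event == "call":
        # Entry: cache the strings now, in a normal (non-signal) context.
        _stack().append((frame.f_code.co_name, frame.f_code.co_filename))
    elif event == "return":
        if _stack():
            _stack().pop()

def inner():
    return list(_stack())  # what a signal handler would observe

def outer():
    return inner()

sys.setprofile(profiler)
seen = outer()
sys.setprofile(None)
```

After the run, `seen` holds the shadow stack as observed inside `inner()`: the entries for `outer` and `inner`, already materialized as plain strings.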
Francis
[Python-Dev] Encoding of PyFrameObject members
I need to access frame members from within a signal handler for tracing purposes. My first attempt to access co_filename was like this (omitting error checking):

    PyFrameObject *frame = PyEval_GetFrame();
    PyObject *ob = PyUnicode_AsUTF8String(frame->f_code->co_filename);
    char *str = PyBytes_AsString(ob);

However, the function PyUnicode_AsUTF8String() calls PyObject_Malloc(), which is not reentrant. If the signal handler nests over PyObject_Malloc(), it causes a segfault, and it could also deadlock. Instead, I access the members directly:

    char *str = PyUnicode_DATA(frame->f_code->co_filename);
    size_t len = PyUnicode_GET_DATA_SIZE(frame->f_code->co_filename);

Is it safe to assume that the unicode objects co_filename and co_name always contain UTF-8 data for loaded code? I looked at PyTokenizer_FromString() and it seems to convert everything to UTF-8 upfront, and I would like to make sure this assumption is valid.

Thanks!
Francis
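At the Python level one can at least check what co_filename and co_name contain: they are str objects (fully decoded), and non-ASCII names are legal, so whether the raw internal buffer happens to be UTF-8 is a C-level representation question (under PEP 393 the compact representation is Latin-1/UCS-2/UCS-4, not UTF-8; a cached UTF-8 form only exists once something like PyUnicode_AsUTF8() has been called). A small illustration with made-up names:

```python
# Code object names are str objects; non-ASCII identifiers and
# filenames are legal, so the bytes seen through the C-level buffer
# depend on the internal representation, not necessarily UTF-8.
src = "def caf\u00e9():\n    pass\n"
code = compile(src, "m\u00f3dulo.py", "exec")
ns = {}
exec(code, ns)
fn_code = ns["caf\u00e9"].__code__

print(fn_code.co_filename)   # módulo.py
print(fn_code.co_name)       # café
utf8_name = fn_code.co_name.encode("utf-8")
print(utf8_name)             # b'caf\xc3\xa9'
```

The explicit encode() step is the Python-level analogue of the one-time UTF-8 conversion the C code would have to trigger before the signal handler can safely read a byte buffer.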
[Python-Dev] LTTng-UST support for CPython
Here is a working prototype for CPython to record all function calls/returns using LTTng-UST, a fast tracer: https://github.com/giraldeau/python-profile-ust

However, there are a few issues and questions:

- I was not able to get PyTrace_EXCEPTION using raise or other error conditions. How can we trigger this event from Python code (PyTrace_C_EXCEPTION works)?
- What would be the best way to get the full name of an object (such as package, module, class and function)? Maybe it's too Java-ish, and it is better to record file/lineno instead?
- On the C-API side: I wrote a horrible and silly function show_type() that runs every Py*_Check() to determine the type of a PyObject *. What would be the sane way to do that?

Your comments are very valuable. Thanks!
Francis
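On the first question, one hypothesis worth checking is that PyTrace_EXCEPTION is only delivered to trace functions (PyEval_SetTrace / sys.settrace); profile functions only receive call/return and c_call/c_return/c_exception events, which would explain why PyTrace_C_EXCEPTION works but PyTrace_EXCEPTION never fires. The Python-level equivalent shows the 'exception' event firing for a trace function; `__qualname__` is one answer to the full-name question:

```python
import sys

events = []

def tracer(frame, event, arg):
    if event == "exception":
        exc_type, exc_value, tb = arg
        events.append(exc_type.__name__)
    return tracer  # keep tracing inside this frame

def failing():
    try:
        raise ValueError("boom")
    except ValueError:
        pass

sys.settrace(tracer)
failing()
sys.settrace(None)

# For the naming question: __module__ plus __qualname__ gives a full
# dotted name including the enclosing class for methods.
class Worker:
    def run(self):
        pass

full_name = f"{Worker.run.__module__}.{Worker.run.__qualname__}"
```

Here `events` ends up containing "ValueError", captured via the 'exception' trace event, and `full_name` is a dotted name such as `__main__.Worker.run`.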
Re: [Python-Dev] Support for Linux perf
2014-11-22 7:44 GMT-05:00 Julian Taylor jtaylor.deb...@googlemail.com:
> On 17.11.2014 23:09, Francis Giraldeau wrote:
>> Hi, ... The PEP-418 is about performance counters, but there is no mention o ... Anyway, I think we must change CPython to support tools such as perf. Any thoughts?
>
> there are some patches available adding systemtap and dtrace probes, which should at least help getting function level profiles: http://bugs.python.org/issue21590

Thanks for these links, the patches look interesting. As Jonas mentioned, perf should be able to unwind a Python stack. It does at the interpreter level, but the frame info is scattered in virtual memory and needs to be accessed offline. I think it could be possible to use the function entry and exit hooks in the interpreter to save important frame info, such as function name, file and line number, in a memory map known to perf. Then, we can tell perf to record this compact zone of data in the sample as an extra field for offline use. At analysis time, each ELF interpreter frame could be matched with the corresponding Python frame info. I think the perf handler can't sleep, and accesses on each function entry/exit will also ensure the page is present in main memory when the sample is recorded.

Thanks again for your inputs, I'll post any further developments.

Cheers,
Francis
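The "memory map known to perf" idea already has a minimal convention on the perf side: perf report resolves symbols in otherwise-anonymous executable mappings by reading /tmp/perf-&lt;pid&gt;.map, each line being "start size name" in hex (this is what the perf-map-agent mentioned elsewhere in this thread emits for HotSpot JIT code). A sketch of emitting such a file, with made-up addresses standing in for whatever the interpreter entry/exit hooks would record:

```python
import os

def write_perf_map(symbols, path=None):
    """Write a perf map file: one 'START SIZE name' line per symbol.

    `symbols` is a list of (start_address, size, name) tuples. The
    addresses here are illustrative; a real implementation would
    record them from interpreter entry/exit hooks.
    """
    if path is None:
        path = "/tmp/perf-%d.map" % os.getpid()
    with open(path, "w") as f:
        for start, size, name in symbols:
            f.write("%x %x %s\n" % (start, size, name))
    return path

# Hypothetical sample: pretend these ranges correspond to Python
# functions executed by the interpreter.
path = write_perf_map(
    [(0x7f0000001000, 0x40, "crunch.py:bar"),
     (0x7f0000001040, 0x80, "crunch.py:foo")],
    path="perf-demo.map")

with open(path) as f:
    content = f.read()
```

The open problem, of course, is that CPython frames are not machine-code ranges, so the addresses would have to come from some interpreter-side bookkeeping rather than from JIT output.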
Re: [Python-Dev] Static checker for common Python programming errors
If I may, there is prior work on JavaScript that may be worth investigating. Formal verification of dynamically typed software is a challenging endeavour, but it is very valuable for avoiding errors at runtime, providing the benefits of a strongly typed language without the rigidity. http://cs.au.dk/~amoeller/papers/tajs/

Good luck!
Francis

2014-11-17 9:49 GMT-05:00 Stefan Bucur stefan.bu...@gmail.com:
> I'm developing a Python static analysis tool that flags common programming errors in Python programs. The tool is meant to complement other tools like Pylint (which perform checks at the lexical and syntactic level) by going deeper with the code analysis and keeping track of the possible control flow paths in the program (path-sensitive analysis).
>
> For instance, a path-sensitive analysis detects that the following snippet of code would raise an AttributeError exception:
>
>     if object is None:       # If the True branch is taken, we know the object is None
>         object.doSomething() # ... so this statement would always fail
>
> I'm writing first to the Python developers themselves to ask, in their experience, what common pitfalls in the language and its standard library such a static checker should look for. For instance, here [1] is a list of static checks for the C++ language, as part of the Clang static analyzer project.
> My preliminary list of Python checks is quite rudimentary, but maybe it could serve as a discussion starter:
>
> * Proper Unicode handling (for 2.x)
>   - encode() is not called on a str object
>   - decode() is not called on a unicode object
> * Check for integer division by zero
> * Check for None object dereferences
>
> Thanks a lot,
> Stefan Bucur
>
> [1] http://clang-analyzer.llvm.org/available_checks.html
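A toy version of the None-dereference check (flagging an attribute access on a name that the branch condition just established to be None) can be prototyped with the ast module. This is a deliberately naive, single-pattern sketch, nothing close to the real path-sensitive analysis described above:

```python
import ast

def find_none_derefs(source):
    """Flag `x.attr` uses inside `if x is None:` branches."""
    warnings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.If):
            continue
        test = node.test
        # Match the pattern: if <name> is None:
        if not (isinstance(test, ast.Compare)
                and isinstance(test.left, ast.Name)
                and len(test.ops) == 1
                and isinstance(test.ops[0], ast.Is)
                and isinstance(test.comparators[0], ast.Constant)
                and test.comparators[0].value is None):
            continue
        name = test.left.id
        # Any attribute access on that name in the body must fail.
        for stmt in node.body:
            for sub in ast.walk(stmt):
                if (isinstance(sub, ast.Attribute)
                        and isinstance(sub.value, ast.Name)
                        and sub.value.id == name):
                    warnings.append(
                        "line %d: %r is None here, attribute access "
                        "will raise AttributeError" % (sub.lineno, name))
    return warnings

snippet = (
    "if obj is None:\n"
    "    obj.doSomething()\n"
)
warns = find_none_derefs(snippet)
```

Running it on the snippet from the example above yields one warning for line 2; a real checker would additionally have to track reassignments, aliasing, and the implicit narrowing on the else branch.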
[Python-Dev] Support for Linux perf
Hi,

PEP 418 is about performance counters, but there is no mention of performance monitoring unit (PMU) counters, such as cache misses and instruction counts. The Linux perf tool aims at recording these samples at the system level.

I ran Linux perf on CPython for profiling. The resulting callstack is inside libpython.so, mostly recursive calls to PyEval_EvalFrameEx(), because the tool works at the ELF level. Here is an example with a dummy program (linux-tools on Ubuntu 14.04):

    $ perf record python crunch.py
    $ perf report --stdio
    # Overhead  Command  Shared Object   Symbol
    # ........  .......  ..............  ...................
    #
        32.37%  python   python2.7       [.] PyEval_EvalFrameEx
        13.70%  python   libm-2.19.so    [.] __sin_avx
         5.25%  python   python2.7       [.] binary_op1.5010
         4.82%  python   python2.7       [.] PyObject_GetAttr

While this may be insightful for interpreter developers, it is not so for the average Python developer. The report should display Python code instead. It seems obvious, still I haven't found a feature for that.

When a performance counter reaches a given value, a sample is recorded. The most basic sample records only a timestamp, the thread ID and the program counter (%rip). In addition, all executable memory maps of libraries are recorded. For the callstack, frame pointers are traversed, but most of the time they are optimized out on x86, so there is a fallback to unwinding, which requires saving register values and a chunk of the stack. The memory space of the process is reconstructed offline.

CPython seems to allocate code and frames on mmap() pages. If the data is outside about 1k from the top of stack, it is not available offline in the trace. We need some way to reconstitute this memory space of the interpreter to resolve the symbols, probably by dumping the data on disk. In Java, there is a small HotSpot agent that spits out the symbols of JIT code: https://github.com/jrudolph/perf-map-agent

The problem is that CPython does not JIT code, and the executed code is the ELF library itself.
The executed frames are parameters of functions of the interpreter. I don't think the same approach can be used (maybe this can be applied to PyPy?).

I looked at how Python frames are handled in GDB (file cpython/Tools/gdb/libpython.py). A Python frame is detected in Frame(gdbframe).is_evalframeex() by a C call to PyEval_EvalFrameEx(). However, the traceback accesses the PyFrameObject on the heap (at least for f->f_back = 0xa57460), which is possible in GDB when the program is paused and the whole memory space is available, but is not recorded for offline use in perf. Here is an example of a callstack from GDB:

    #0 PyEval_EvalFrameEx (f=Frame 0x77f1b060, for file crunch.py, line 7, in bar (num=466829), throwflag=0) at ../Python/ceval.c:1039
    #1 0x00527877 in fast_function (func=<function at remote 0x76ec45a0>, pp_stack=0x7fffd280, n=1, na=1, nk=0) at ../Python/ceval.c:4106
    #2 0x00527582 in call_function (pp_stack=0x7fffd280, oparg=1) at ../Python/ceval.c:4041

We could add a kernel module that knows how to take samples of CPython, but it means Python structures become a sort of ABI, and kernel devs won't allow a python interpreter in kernel mode ;-). What we really want is the f_code data and related objects:

    (gdb) print (void *)(f->f_code)
    $8 = (void *) 0x77e370f0

Maybe we could save these pages every time some code is loaded by the interpreter? (the memory range is about 1.7MB, but )

Anyway, I think we must change CPython to support tools such as perf. Any thoughts?

Cheers,
Francis