On 05/14/2011 05:39 AM, Armin Rigo wrote:
> I think the general problem is that you are trying to approach
> debugging PyPy like you approach debugging a C-written program (say,
> CPython).
Note that I am not trying to debug PyPy itself, but an extension
written in C and compiled with cpyext.

> It is a bit like, say, wanting to debug various memory issues in a
> Java program by running the Java VM in gdb.

It is like trying to debug JNI issues, not Java program issues. To
debug JNI you do use a regular C-based debugger like gdb. Some JVMs
also have a checking mode for JNI calls:

http://publib.boulder.ibm.com/infocenter/javasdk/v5r0/index.jsp?topic=/com.ibm.java.doc.diagnostics.50/html/jni_debug.html

> One thing we could add to cpyext is a reference count checker: a
> special mode in which cpyext doesn't immediately free the CPython
> objects whose refcount drops to zero, but instead checks for a while
> if the object is still used, and cleanly complain if it is. This can
> be added to the RPython source code of cpyext; it's not something
> that should be debugged in gdb, for the reasons above.

I'm all for a checking mode. Note that you'll still have to provide
some way of interacting with the debugger - it would be rather
pointless to emit a message saying "pyobject misuse at address
0x12345678" and then exit, since it still needs to be established who
allocated the object and who reused it after death.

> If nevertheless you want to use C-level debugging tools, then it's
> going to be painful:

*Any* debugging tools would be fine, but given that C extensions are
full of C code, you may as well make using C tools on that C code
possible.

>> - Some gdb macros to make debugging easier. For example CPython
>> comes with a .gdbinit with nice macros like 'pyo' so you can see
>> what a PyObject * represents

> This is impossible in general, because the C structures corresponding
> to objects changes depending on e.g. whether you included some
> special compilation options or not. It would be possible, though, to
> come with some macros that work in the common case.
> This might require additional hacks to fix the C name of some helper
> functions, so that they can be called by the macros. In other words
> it's something that someone may come up with at some point (there was
> some work in that direction), but it's work.

The exact same issues apply to CPython. CPython exports a helper
function named _PyObject_Dump(PyObject *) which then calls the
internal Python machinery (str etc). 'pyo' is the only macro I use,
and it is very necessary because you have PyObject * all over the
place and need some idea of what they are.

>> - How to disable all memory optimisations and make it work with
>> valgrind

> That's not possible. The minimark GC is not a "memory optimisation";
> it's a completely different approach to garbage collection than, say,
> CPython's reference counting.

It is actually possible. All valgrind needs to know is which areas of
memory are in use and which aren't. By far the easiest way is to
devolve into C library calls of malloc and free, which is why I
mentioned the refcounting GC: it would come down to malloc and free
calls at the end of the day.

Since PyPy doesn't have a functioning "dumb" memory mode, the only
alternative is to add calls to valgrind. Valgrind has a header you can
use that generates no-side-effect instruction sequences when the
program is run normally, and tells valgrind things when run under
valgrind. See:

http://valgrind.org/docs/manual/manual-core-adv.html

For example, the memory allocation code would need to call
VALGRIND_MALLOCLIKE_BLOCK for each allocated chunk of memory, and the
GC would need to call VALGRIND_FREELIKE_BLOCK on each freed chunk.

>> - How to get deterministic behaviour - ie exactly the same thing
>> happens each time whether you run gdb or not

> I don't know the details, but there are several factors that make it
> impossible. The most important one is, again, the fact that the
> minimark GC triggers collections at times that look random. I don't
> see how to fix that.
I'd be very happy having the GC be triggered at every possible point
and do a complete collection. That is best for a checking mode, since
it means the time between when something is no longer used and when it
is collected will be as short as possible. You could probably trigger
it before and after every cpyext-wrapped call.

> What we are missing is a set of tools that let you locate and debug
> issues with CPython C extension modules, but not necessarily based on
> gdb.

Indeed. At the moment my code works perfectly under CPython, has 99.6%
test coverage, is valgrind-clean under CPython, and only broke the
refcount rules in one internal debug method used for fault injection
in special builds. The latter was fairly easily diagnosed and fixed.

Under PyPy I'm having all sorts of problems, including a PyPy crash,
nonsensical behaviour, and program behaviour that changes just by
running it under gdb. Until PyPy is perfect, I won't be the only one
affected by this!

Roger

_______________________________________________
pypy-dev mailing list
pypy-dev@python.org
http://mail.python.org/mailman/listinfo/pypy-dev