Re: [Valgrind-users] Having some problem with valgrind (on Android Emulator 2.2 API)
On Thu, 2012-02-02 at 17:07 +0500, Mohammad Ali wrote:
> After configuring valgrind, I took the Inst directory (as mentioned in Readme.android) and copied it to two emulators, one running on MacOSx and the other running on Ubuntu 11.10. I created the emulator using the 2.2 API (platform-8). On both platforms, when I run valgrind from the ./adb shell, the kernel stops it:
> + Stopped (signal)
> and the second time I run it, it says Illegal Instruction.

I am also using an Android emulator to do some (limited) Valgrind validations. Note however that the emulated Android is 2.3.3, API level 10.

Assuming we are speaking about the same emulator (delivered as part of the Android SDK/NDK): it emulates an armv5, while Valgrind needs something like an armv7.
=> I have to apply some patches which allow Valgrind to be compiled and run on the emulator with armv5. The patches are mostly found in https://bugs.kde.org/show_bug.cgi?id=276897 (if you are interested, I have a combined patch which I am applying on a 3.8.0 SVN).

Philippe

___
Valgrind-users mailing list
Valgrind-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/valgrind-users
Re: [Valgrind-users] can't start any application on OS X 10.7.3
On Thu, 2012-02-09 at 17:47 +0400, Alexander Potapenko wrote:
> under Valgrind on your machine? If it returns 0, it means that the code you're running is incorrectly assuming AES support on the CPU (this is still a reason to fix AESKEYGENASSIST). Otherwise cpuid is broken under Valgrind.

Testing on a Xeon X5670 (which supports AES instructions), we see that a native run of Alexander's code reports that AES is supported, while the synthetic CPU emulated by Valgrind indicates that AES is not supported (which is the case). See below.

So, it looks like the application you are trying to run does not verify at runtime whether AES is supported. For example, if this is checked at installation time and a different executable is installed depending on this install check, then no luck (until Valgrind supports the AES instructions).

FYI: I am busy working on implementing the AES instructions. Not very advanced yet, but I guess it should arrive in the coming weeks.

Philippe

./cpu_aes 200
philippe@gcc20:~/valgrind/aes_trials$ ~/valgrind/trunk_untouched/vg-in-place ./cpu_aes
==14424== Memcheck, a memory error detector
==14424== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==14424== Using Valgrind-3.8.0.SVN and LibVEX; rerun with -h for copyright info
==14424== Command: ./cpu_aes
==14424==
0
==14424==
==14424== HEAP SUMMARY:
==14424==     in use at exit: 0 bytes in 0 blocks
==14424==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==14424==
==14424== All heap blocks were freed -- no leaks are possible
==14424==
==14424== For counts of detected and suppressed errors, rerun with: -v
==14424== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 4 from 4)
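Alexander's test program is not shown in this thread. As a rough sketch of the kind of runtime check being discussed, the following asks CPUID leaf 1 whether the AES bit (ECX bit 25) is set, using the `__get_cpuid` helper from the GCC/Clang `<cpuid.h>` header. The function names `cpu_has_aes` and `choose_cipher_backend` are made up for illustration, and the backend strings are placeholders.

```c
#include <stddef.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper header providing __get_cpuid() */
#endif

/* Returns 1 if CPUID reports AES instruction support, 0 otherwise
   (including on non-x86 targets, where CPUID does not exist). */
int cpu_has_aes(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;                 /* CPUID leaf 1 not available */
    return (int)((ecx >> 25) & 1u);   /* leaf 1, ECX bit 25 = AES */
#else
    return 0;
#endif
}

/* Hypothetical dispatch: take the AES path only when the CPU reports it. */
const char *choose_cipher_backend(void)
{
    return cpu_has_aes() ? "aes-ni" : "portable";
}
```

An application that makes this decision at run time would fall back to the portable path under a Valgrind whose synthetic CPU does not advertise AES; one that decides at install time would not.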
Re: [Valgrind-users] identifying error numbers for --vgdb-error
On Fri, 2012-02-10 at 16:01 +0000, Rob wrote:
> Thanks for the patch. I have manually applied it to 3.7.0 (not svn) and it is a big improvement. The number seems to be offset by 1 from what I would expect though, e.g. --vgdb-error=5 stops after detecting 6 errors.

Thanks for the feedback. I found the reason for the off-by-one you have seen. I will dig deeper into the difference between the number of errors found and the number of errors shown. A better patch will follow.

> One reason for printing the error number in the output would be to avoid having to manually count them if there are many. Personally I think it would be nicer to always have the errors numbered to help navigating large amounts of output.

I understand this: it is indeed not very easy for the user to count the errors. However, printing an error number changes the behaviour for all tools reporting errors.
=> changing this needs some advice/feedback from others.
Julian, Bart: an opinion?

Philippe
Re: [Valgrind-users] can't start any application on OS X 10.7.3
On Thu, 2012-02-09 at 13:40 +0000, Tom Hughes wrote:
> On 09/02/12 13:00, Eliot Moss wrote:
>> 0x66 0x0F 0x3A 0xDF appears to be AESKEYGENASSIST. Someone else will have to address that (if at all).
> There's a bug for that already: https://bugs.kde.org/show_bug.cgi?id=290655

I just attached to the bug a patch on 3.8.0 SVN implementing the 6 AES instructions. The patch includes a test, but it would be nice if someone could test it with a real application using AES (e.g. firefox) and report whether it works properly.

Thanks

Philippe
Re: [Valgrind-users] identifying error numbers for --vgdb-error
On Tue, 2012-02-14 at 13:54 +0000, Rob wrote:
>> One thing that might be relevant is that errors already have a 32 bit value that identifies them uniquely.
>> struct _Error :: unique
>> You can see them in the XML output, e.g.
>> ./vg-in-place --xml=yes --xml-fd=1 memcheck/tests/errs1
>> I would prefer to use them, rather than add yet another kind of error-counter mechanism. But the problem is now to show the user

There is in fact already a mechanism which counts the number of printed errors, used among other things to tell the user
  More than %d errors detected. Subsequent errors\n will still be recorded, but in less detail than before.\n
The idea behind the current --vgdb-error was to use this counting mechanism. The manual for --vgdb-error=<number> says:
  Tools that report errors will wait for "number" errors to be reported before freezing the program and waiting for you to connect with GDB. It follows that a value of zero will cause the gdbserver to be started before your program is executed. This is typically used to insert GDB breakpoints before execution, and also works with tools that do not report errors, such as Massif.
I have a preference for this "number errors to be reported", but if the above is still not convincing, it is fine for me to use the unique concept instead.

>> what --vgdb-error value is required for each error. The simple thing to do is to print a line
>> Use --vgdb-error=unique to make the GDB server stop at this error
>> Problem is I don't really want to add printing of such lines by default. Is it possible that we can make printing of them conditional on some other command line option that must be present in order to use the gdbserver?

It is difficult to find a mandatory command line option, as --vgdb=yes is the default. So, a new option would be needed, e.g.
  --print-unique-error-nr=no|yes   print error nr for each error
or maybe
  --print-vgdb-error=no|yes        print error nr to use for --vgdb-error

> Yes, allowing the unique numbers in place of a count would be good. It wouldn't necessarily have to print the full usage method above as this could be documented in the manual. How about appending the numbers at the end of each error line, either by default or with an option?
> ==14600== Syscall param write(buf) points to uninitialised byte(s) (uid=0x2f9)

To make the link with --vgdb-error, maybe
  ==14600== Syscall param write(buf) points to uninitialised byte(s) (--vgdb-error=761)
(I would in any case not use hexadecimal for this, so as to match the way integer options are read.)

Numbering the errors (either with n_errs_shown or with unique) will for sure help. However, with multi-threaded applications, the order and numbering of errors might not be easily reproduced from one run to another. At work, a user is doing a nasty trick to survive in such a case: he writes a suppression file for all errors preceding the one he is interested in. Not very easy, but it can be made better by having a new command line option:
  --vgdb-error-list=filename   invoke gdbserver for each error described in filename
The big advantage of this scheme is that it is not sensitive to scheduling, numbering, etc. With this, gdbserver would be invoked either when the error nr is >= --vgdb-error or when the error matches one described in --vgdb-error-list. The --vgdb-error-list=filename file would use the same format as a suppression file.

Philippe
Re: [Valgrind-users] identifying error numbers for --vgdb-error
On Tue, 2012-02-14 at 23:52 +0100, Julian Seward wrote:
> Hmm, this doesn't sound like it's going to be simple to fix in a clean way. For the moment, can we do the incremental fix of taking Philippe's patch (with the off-by-one fixed)? That's a very simple and uncontroversial patch. (Maybe should also backport it for 3.7.1?)

I will prepare a patch (and indeed, this looks like a good candidate for 3.7.1 backporting).

Assuming the idea
  --vgdb-error-list=filename   invoke gdbserver for each error described in filename
is deemed a good one, it is for sure not for 3.7.x.

Philippe
Re: [Valgrind-users] Valgrind getting stuck or hung
On Fri, 2012-02-17 at 16:01 -0800, Richa Mehta wrote:
> Hi, I am a valgrind user, and while running valgrind on my program, it got stuck on one of my library files. Following is the output of the running valgrind:
> --2990-- Considering /usr/lib/mylib.so.debug ..
> --2990-- .. CRC is valid
> The above statements are the last 2 statements of the whole output. Valgrind gets stuck here every time and doesn't move forward. Please help with the issue.

With the above info, it is very difficult to see what is happening. Which version of Valgrind are you using? On which OS? If you do not use a recent Valgrind (3.7.0), it is always better to try with the latest version. If it still blocks, try to run it with -v -v -v -d -d -d to get some more trace, and/or add some more debug flags (see valgrind --help-debug).

Philippe
Re: [Valgrind-users] about array out of bounds
On Thu, 2012-02-23 at 17:59 +0800, 张昭 wrote:
> #include <stdlib.h>   /* for EXIT_SUCCESS */
>
> static int b[60];
>
> int main ( int argc, char *argv[] )
> {
>     int a[50];
>     int ret;
>     /*a[50] = 1; a[51] = 1;*/
>     b[59] = 1;
>     b[60] = 2;   /* out of bounds */
>     return EXIT_SUCCESS;
> }
>
> and I find this result; why? I clearly have an array out of bounds access, but valgrind doesn't find it. Help

See in the user manual the description of the heuristic used by sgcheck: http://www.valgrind.org/docs/manual/sg-manual.html#sg-manual.overview (section 11.3).

Basically, if the *same* instruction first accesses memory inside an array, and then outside that array, sgcheck will detect the out-of-bounds access. In the above case, each instruction is executed only once, and so the heuristic cannot find it.

Philippe
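To make the heuristic described above concrete, here is a small sketch in which a single store instruction inside a loop touches the array repeatedly. The helper name `fill_b` and the constant `B_LEN` are made up for this illustration. Whether a given Valgrind version actually reports the overrun depends on the tool and on how the compiler generates the loop (building with -g -O0 keeps it as one store in a loop), so treat this as an assumption to verify rather than a guaranteed result.

```c
#include <stddef.h>

#define B_LEN 60
static int b[B_LEN];

/* Writes 1 into b[0..n-1] using one store instruction executed in a loop,
   and returns the number of elements written. Calling it with n > B_LEN is
   a deliberate out-of-bounds write: the same store first lands inside b and
   later outside it, which is the access pattern the heuristic is described
   as tracking. */
size_t fill_b(size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        b[i] = 1;
    return i;
}
```

Calling `fill_b(B_LEN)` stays in bounds; `fill_b(B_LEN + 1)` reproduces the `b[60]` overrun from the original program, but through an instruction that has already accessed `b` legitimately.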
Re: [Valgrind-users] Behavior different with valgrind without valgrind
On Thu, 2012-02-23 at 21:33 +0800, jee wrote:
> like this: in gdb:
> 137 while(fgets(buff, PATH_MAX, p_file)){
> (gdb) //when i press n, .. there's no n, it's dead -_-!

What happens if you press Control-c? Control-c should tell the Valgrind gdbserver to give control back to gdb, and let you do backtrace, info threads, etc. to see where it is hanging.

Philippe
Re: [Valgrind-users] Behavior different with valgrind without valgrind
On Fri, 2012-02-24 at 09:46 +0800, jee wrote:
> the same, can not pause.
>
> 2012/2/24 Philippe Waroquiers philippe.waroqui...@skynet.be
>> On Thu, 2012-02-23 at 21:33 +0800, jee wrote:
>>> like this: in gdb:
>>> 137 while(fgets(buff, PATH_MAX, p_file)){
>>> (gdb) //when i press n, .. there's no n, it's dead -_-!
>> What happens if you press Control-c? Control-c should indicate to the Valgrind gdbserver to give back the control to gdb, and let you do backtrace, info threads, etc ... to see where it is hanging.

When it is deadlocked, strace should indicate what is being done.

An alternative: you might try to understand the difference between a native run and a run under Valgrind by using one gdb on the native run and another gdb connected to the Valgrind gdbserver (using --vgdb=full), and then using the si gdb command in both from a certain point onwards.

Philippe
Re: [Valgrind-users] How to use valgrind to debug memory leak when the jvm is runing ?
On Fri, 2012-03-30 at 10:52 +0800, 田东云 wrote:
> -bash-3.2$ vgdb -d -d -d help
> 1329272585.803212 searching pid in directory /tmp/ format /tmp/vgdb-pipe-from-vgdb-to-
> 1329272585.803691 check_trial 0
> ...
> 1329272585.811478 trying /tmp/vgdb-pipe-from-vgdb-to-3380-by-weblogic-on-tdy218
> 1329272585.811872 trying /tmp/vgdb-pipe-from-vgdb-to-3380-by-weblogic-on-tdy218
> ...
> vgdb error: no FIFO found and no pid given

Normally, one would expect to find two lines like the above two, but with 11535 replacing 3380.

Can you redo the experiment with valgrind -v -v -v -d -d -d, then do ls -l /tmp/*x* (replace x with the process id as found in the valgrind output; this ls command should show files such as the above), and then re-do the vgdb -d -d -d help?

If the ls command does not show anything, then the valgrind debug log should contain a trace related to deleting these files.

Thanks
Re: [Valgrind-users] vgpreload_core-x86-linux.so from LD_PRELOAD cannot be preloaded
On Tue, 2012-04-03 at 18:48 +0200, Peter Toft wrote:
>>> I do not have any $LD_PRELOAD set. Should I?
>> Not AFAIK. J
> What is the Valgrind error-message then telling me?

This looks like bug https://bugs.kde.org/show_bug.cgi?id=286270, which is solved in 3.8.0 SVN. If this is the same bug, then the message has no consequence.

Philippe
Re: [Valgrind-users] Using custom memory allocators with valgrind?
On Tue, 2012-04-03 at 22:06 +0200, ольга крыжановская wrote:
> How do I use custom memory allocators with valgrind? We'd like to use the memory allocators from ATT libast but also like to use valgrind for (automated) error checking. Is there any howto document how to do this?
> Olga

http://www.valgrind.org/docs/manual/mc-manual.html#mc-manual.mempools describes what Valgrind provides for real custom allocators.

Now, if libast provides a set of malloc/free/... compatible functions, then you should rather modify Valgrind to intercept these replacements. See bug https://bugs.kde.org/show_bug.cgi?id=219156 for a patch implementing replacements for the tcmalloc library.

Philippe
Re: [Valgrind-users] Using custom memory allocators with valgrind?
On Tue, 2012-04-03 at 22:40 +0200, ольга крыжановская wrote:
> Philippe, libast provides both malloc replacement and a complex allocation system based on io streams, e.g. sort of stdio with disciplines on steroids where even string buffers, or lists and trees of (nested) string buffers, can be a io stream or memory buffer.
> Is it really wise to modify valgrind just to intercept the custom malloc/calloc/memalign/free implementation? It sounds like an overkill and might be problematic because we would have to wait until each Linux vendor has updated to the new valgrind version.

If Valgrind is not informed that libast has malloc/free/... replacements, then many/most Valgrind tools are less functional (e.g. memcheck will not detect leaks). So, if your application is using the malloc/free/... of libast rather than the standard malloc/free, I think it is a good idea to have the replacements done.

Now, I do not understand why you would have to wait until each Linux vendor has updated to the new Valgrind version. Just do the changes, and have your Valgrind compiled in a corner :). (In addition, I am not sure a libast replacement patch for Valgrind would be committed.)

As for the complex allocation system: if this system directly uses the libast malloc/free/..., then the replacements above will be good enough. If this complex allocation system mallocs or mmaps a big block, and then carves its own small blocks out of these big blocks, then you have to add Valgrind mempool client requests in the libast code so that Valgrind understands this complex allocation system.

Philippe
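As an illustration of the second case (an allocator that carves small blocks out of one big block), here is a minimal sketch of a bump allocator annotated with the mempool client requests from valgrind.h and memcheck.h. It is not libast code: the `struct arena` type and the `arena_*` functions are invented for this example, and the no-op fallback macros exist only so that the sketch builds where the Valgrind headers are not installed.

```c
#include <stddef.h>
#include <stdlib.h>

/* Use the real client-request macros when the Valgrind headers are
   installed; otherwise fall back to no-ops so the sketch still builds. */
#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/valgrind.h>
#    include <valgrind/memcheck.h>
#    define HAVE_VALGRIND_HEADERS 1
#  endif
#endif
#ifndef HAVE_VALGRIND_HEADERS
#  define VALGRIND_CREATE_MEMPOOL(pool, rzB, is_zeroed) ((void)0)
#  define VALGRIND_DESTROY_MEMPOOL(pool)                ((void)0)
#  define VALGRIND_MEMPOOL_ALLOC(pool, addr, size)      ((void)0)
#  define VALGRIND_MAKE_MEM_NOACCESS(addr, len)         ((void)0)
#endif

struct arena {
    char  *base;   /* one big block obtained from malloc */
    size_t size;   /* total bytes in the big block */
    size_t used;   /* bytes already handed out */
};

/* Returns 0 on success, -1 if the big block cannot be allocated. */
int arena_init(struct arena *a, size_t size)
{
    a->base = malloc(size);
    if (a->base == NULL)
        return -1;
    a->size = size;
    a->used = 0;
    /* The address of the arena struct serves as the pool handle. */
    VALGRIND_CREATE_MEMPOOL(a, 0, 0);
    /* Nothing has been handed out yet: mark the block inaccessible. */
    (void)VALGRIND_MAKE_MEM_NOACCESS(a->base, size);
    return 0;
}

/* Hands out n bytes from the big block, or NULL when it is exhausted. */
void *arena_alloc(struct arena *a, size_t n)
{
    size_t aligned = (n + 15u) & ~(size_t)15u;   /* round up to 16 bytes */
    void *p;
    if (aligned < n || aligned > a->size - a->used)
        return NULL;
    p = a->base + a->used;
    a->used += aligned;
    /* Tell Valgrind that [p, p+n) is now an allocated chunk of the pool. */
    VALGRIND_MEMPOOL_ALLOC(a, p, n);
    return p;
}

void arena_destroy(struct arena *a)
{
    VALGRIND_DESTROY_MEMPOOL(a);
    free(a->base);
    a->base = NULL;
    a->size = a->used = 0;
}
```

An allocator that can release individual chunks would pair each release with VALGRIND_MEMPOOL_FREE(pool, addr); the bump allocator above has no such operation, so it only announces allocations and the destruction of the whole pool.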
Re: [Valgrind-users] [vgdb - gdbserver] FIFO name mismatch...
On Sun, 2012-04-15 at 00:26 +0100, Plug Gulp wrote:
> I tried starting apache webserver and gdb from root login. Did not help, because the valgrind is still launched from under apache user. So I set the LOGNAME and HOST to ??? as suggested above and started the webserver and gdb from root login. I also changed the permissions of the FIFOs using a+rw. Here is the output of the commands:

Here is another solution to try:
* set the needed env vars (LOGNAME, HOST)
* make a copy of the vgdb executable
* change the owner of this vgdb file to be the user apache
* set the user and group id bits (you must be root or apache for that, I guess):
    chmod u+s vgdb
    chmod g+s vgdb

The above worked for me to then connect a root gdb + this vgdb to a Valgrind running under my user (I replaced apache by my own user).

3 notes:
1. Once you have an apache vgdb, I believe the root user should not be needed anymore. You should be able to use your own user, as the launched vgdb will belong to apache.
2. With the above hack, it seems some ptrace syscalls done by vgdb are still not authorised. So, you might have to disable the ptrace things in vgdb using --max-invoke-ms=0 (see description and consequences in the Valgrind manual).
3. IMPORTANT: of course, anybody else having access to this apache vgdb can then connect to any Valgrind gdbserver belonging to apache. In other words, other users on the same computer have full debuggability of all your apache Valgrind gdbservers as long as this apache vgdb copy exists.

> Hope this helps ...

Hope this (really) helps (now) ...

Philippe
Re: [Valgrind-users] system crash
On Sun, 2012-04-15 at 19:17 +0200, Folkert van Heusden wrote:
> Hi, I'm trying to debug this (opengl) application I'm writing. Now something odd happens: when I run it in gdb, it occasionally sigsegvs which is not ok but expected, but when run under valgrind the whole system crashes. I think it starts swapping like hell, but far more than the usual out of memory situation because not even the mouse cursor moves. So what I would like to know: is it possible to let valgrind limit the amount of memory the 'guest application' uses? Could not find this in the man page.

Which Valgrind version are you using? 3.7.0 contains a fix related to memory usage (bug 250101). 3.8.0 SVN also has some improvements related to memory usage. If you still have a problem, ulimit -d will limit the total memory used by Valgrind and the guest application.

Philippe
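For completeness, the shell's ulimit -d corresponds to the RLIMIT_DATA resource limit, which can also be set from a small C launcher before starting Valgrind. The sketch below is only an illustration: `set_data_limit` and `run_with_data_limit` are made-up names, the limit is given in bytes here (the shell builtin typically takes kilobytes), and exactly which allocations RLIMIT_DATA covers depends on the kernel version.

```c
#include <sys/resource.h>
#include <unistd.h>

/* Lowers the soft RLIMIT_DATA limit of the calling process to `bytes`
   (clamped to the hard limit). Returns 0 on success, -1 on failure. */
int set_data_limit(rlim_t bytes)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_DATA, &rl) != 0)
        return -1;
    if (rl.rlim_max != RLIM_INFINITY && bytes > rl.rlim_max)
        bytes = rl.rlim_max;
    rl.rlim_cur = bytes;
    return setrlimit(RLIMIT_DATA, &rl);
}

/* Applies the limit, then replaces this process with the given command,
   e.g. argv = { "valgrind", "./prog", NULL }. Returns only on failure. */
int run_with_data_limit(rlim_t bytes, char *const argv[])
{
    if (set_data_limit(bytes) != 0)
        return -1;
    execvp(argv[0], argv);
    return -1;   /* execvp failed */
}
```

Because resource limits are inherited across exec, the limit set by the launcher applies to Valgrind and therefore to the guest application running inside it.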
Re: [Valgrind-users] system crash
On Tue, 2012-04-17 at 20:16 +0200, Folkert van Heusden wrote:
> From what I read on wikipedia, Valgrind runs things in a virtual machine and from my experience (wrote an MSX (z80) emulator once, no twice) you can emulate everything, maybe a tad slow.

Valgrind provides a simulated CPU, but not a simulated OS, simulated MMU, etc. In other words, Valgrind runs a unix application process on top of a virtual CPU; it does not provide a virtual machine like kvm or Xen.

Philippe
Re: [Valgrind-users] system crash
On Tue, 2012-04-17 at 21:02 +0200, Folkert van Heusden wrote:
>> Valgrind provides a simulated cpu, but not a simulated OS and simulated mmu etc etc. In other words, Valgrind runs a unix application process on top of a virtual cpu, Valgrind does not provide a virtual machine like kvm or Xen or ...
> hmmm ok. it seems it can't handle corruptions that nicely:

Not too sure I understand. The messages from Valgrind below indicate (probable/possible) bugs. Apart from reporting the error, the behaviour is (usually) not influenced too much (compared to a native execution). malloc-fill might cause bigger differences, in case uninitialised memory is used.

Or do you mean the stack trace is not that good/clear? Maybe gdb + the Valgrind gdbserver will give better stack traces?

Philippe

> ==21521==    at 0x6A39957: ioctl (syscall-template.S:82)
> ==21521==    by 0x40A8B44: ukiCreateContext (in /usr/lib/x86_64-linux-gnu/libatiuki.so.1.0)
> ==21521==    by 0xF808A35: ??? (in /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so)
> ==21521==    by 0x1016A3AF: ???
> ==21521==    by 0x10169467: ???
> ==21521==    by 0x1016956F: ???
> ==21521==    by 0xF862E0F: ??? (in /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so)
> ==21521==    by 0xF: ???
> ==21521==  Address 0x7feff2528 is on thread 1's stack
> ==21521==  Uninitialised value was created by a stack allocation
> ==21521==    at 0xDEE4168: ??? (in /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so)
Re: [Valgrind-users] system crash
On Tue, 2012-04-17 at 21:55 +0200, Folkert van Heusden wrote:
> The problem I see is that the stacktraces seem to be incorrect.

The gdb unwinder might work better
=> try with the Valgrind gdbserver (give the --vgdb-error=0 arg to Valgrind, follow the instructions to attach gdb, and then 'continue' your process till the error is encountered).

> For example:
> ==21521== Invalid write of size 1
> ==21521==    at 0x402A788: memcpy (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==21521==    by 0xF867F95: ??? (in /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so)
> ==21521==    by 0x1DFF: ???
> ==21521==    by 0x1EFF: ???
> ==21521==    by 0x1: ???
> ==21521==    by 0x205: ???
> ==21521==    by 0x2: ???
> ==21521==    by 0x2: ???
> ==21521==  Address 0x7f19655e45ab is not stack'd, malloc'd or (recently) free'd
> This happened on a system with an ati card with the fglrx driver. On my laptop with intel video chipset it does not. Hmmm.
Re: [Valgrind-users] can't start any application on OS X 10.7.3
On Tue, 2012-04-24 at 14:06 -0400, Matt Broadstone wrote:
> Not sure how to post to this thread having just signed up for the list, but hopefully this routes correctly..
> Hi, I wanted to confirm that the aes changes in trunk do indeed solve that unrecognized instruction issue, however, I am still experiencing immediate termination whenever I use valgrind with the following output:
> ==66368== valgrind: Unrecognised instruction at address 0x3a36b8c.
> ==66368==    at 0x3A36B8C: __abort (in /usr/lib/system/libsystem_c.dylib)
> ==66368==    by 0x3A36AAA: abort (in /usr/lib/system/libsystem_c.dylib)
> ==66368==    by 0x3D79431: _SCSessionUniverseByUIDAcquireAndLock (in

The above does not match the symptoms of an unrecognised AES instruction (see e.g. bug 290655).

From the above, I am guessing that _SCSessionUniverseByUIDAcquireAndLock encounters a problem and calls abort. Abort might be implemented via an illegal instruction. You might verify that by writing a small executable that just calls abort and checking whether it gives the same behaviour. Otherwise, disassemble the instructions at 0x3a36b8c.

Philippe
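The simplest form of the suggested test is a program whose main just calls abort(). The variant sketched below, which is only one possible way to do it, forks a child that calls abort() and reports which signal terminated it, so that a native run and a run under Valgrind can be compared. The function name `abort_termination_signal` is made up for this example.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that calls abort() and returns the signal that killed it,
   or -1 if the child could not be created or did not die from a signal. */
int abort_termination_signal(void)
{
    int status;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        abort();       /* the call under investigation */
        _exit(127);    /* not reached */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

If the same "Unrecognised instruction" message appears when this runs under Valgrind, that would support the guess that abort itself, rather than the application, executes the instruction Valgrind does not handle.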
Re: [Valgrind-users] can't start any application on OS X 10.7.3
On Thu, 2012-04-26 at 09:29 -0400, Matt Broadstone wrote:
> As for doing a db-attach, that seems to have failed as well - I never make it to a gdb session. Here is the full output of a db-attach valgrind run on TextEdit.app:
> ==76980== Attach to debugger ? --- [Return/N/n/Y/y/C/c] Y
> valgrind: m_debugger.c:238 (ptrace_setregs): Assertion 'Unimplemented functionality' failed.

The above assert indicates that --db-attach is not implemented on Darwin. You could however try the Valgrind gdbserver, which is supposed to work (at least, I manually tested it on Darwin something like one year ago on a 3.7.0 SVN).

You could try to investigate why abort is called by using 2 GDBs to debug:
* a native run
* a run under Valgrind
and see at which point/instruction their executions diverge (e.g. put a breakpoint in _SCSessionUniverseByUIDAcquireAndLock and then use stepi or similar).

Philippe
Re: [Valgrind-users] can't start any application on OS X 10.7.3
On Thu, 2012-04-26 at 14:17 -0400, Matt Broadstone wrote:
> and then:
> (gdb) target remote | /usr/local/bin/vgdb
> | /usr/local/bin/vgdb: Undefined error: 0

You must have a version of gdb that is recent enough (I believe >= 6.5), otherwise GDB does not understand the | target.

Two alternatives:
* compile + install a recent GDB (there is a kind of magic security signing which is needed).
* alternatively:
    valgrind --vgdb-error=0 prog
    # and then in another shell, run:
    vgdb --port=1234
    # in a third shell:
    gdb prog
    (gdb) target remote :1234
  (NB: with this technique, there is no security: anybody who has access to your system can connect to the vgdb port nr.)

Philippe
Re: [Valgrind-users] core dump improvements - fix order of threads
On Fri, 2012-04-27 at 11:49 +0200, Matthias Schwarzott wrote:
> Hi there!
>
> Comparing the output from gdb attached to valgrind gdbserver and the core file valgrind creates, the thread order is inverted. As I have more minor issues with gdb and valgrind core files, I do not know if this is always the case.

I do not think that there is a consistent order (inverted or not) between the list of gdbserver threads reported to gdb and the list of threads in the VG thread array. The valgrind gdbserver maintains a linked list of threads, updated as new threads appear in the array and old threads disappear from it.

I believe (not checked) that if you have:
  create thread a
  create thread b
  create thread c
  delete thread b
  create thread d
then the VG array will contain a d c while the gdbserver linked list will contain d c a.

If the above is correct, then the changes below will not guarantee the order is the same. Also, I am not too sure what gdb does with the list of threads it receives from the gdbserver (maybe gdb sorts them?).

Just to understand: why do you need to make the link between the V core thread list and the V gdbserver thread list? Is it because you obtain a core dump that you then try to understand with V gdbserver in another run?

> For exactly this problem I have two possible solutions:
>
> A. Change the loop over all threads to be reversed:
> - for(i = 1; i < VG_N_THREADS; i++) {
> + for(i = VG_N_THREADS - 1; i >= 1; i--) {
>
> B. Change the function add_note (or related notes processing code) to output the notes in the order add_note is called, and not backward.
>
> I wonder which approach is better, but I tend to approach B, as then the code creates the notes in the order they appear in the final core file.
>
> Regards
> Matthias

--
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats.
http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
___
Valgrind-users mailing list
Valgrind-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/valgrind-users
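The create/delete scenario in the reply above can be checked with a toy model. This is only a sketch under two assumptions that the message itself marks as unchecked: the thread array reuses the lowest free slot, and the gdbserver list inserts new threads at its head. The names `model_create`, `model_delete` and `N_SLOTS` are hypothetical and are not Valgrind identifiers.

```c
#include <assert.h>
#include <string.h>

#define N_SLOTS 8   /* hypothetical stand-in for VG_N_THREADS; slot 0 is unused */

struct model {
    char slots[N_SLOTS];  /* slot-reuse array, '\0' marks a free slot */
    char list[N_SLOTS];   /* newest-first list, kept as a NUL-terminated string */
};

void model_init(struct model *m) { memset(m, 0, sizeof *m); }

/* Create a thread: take the lowest free slot and prepend it to the list.
   Returns the slot index, or -1 when every slot is in use. */
int model_create(struct model *m, char name)
{
    assert(name != '\0');
    for (int i = 1; i < N_SLOTS; i++) {
        if (m->slots[i] == '\0') {
            m->slots[i] = name;
            /* Shift the existing list right by one, terminator included. */
            memmove(m->list + 1, m->list, strlen(m->list) + 1);
            m->list[0] = name;
            return i;
        }
    }
    return -1;
}

/* Delete a thread: free its slot and unlink it from the list. */
void model_delete(struct model *m, char name)
{
    assert(name != '\0');
    for (int i = 1; i < N_SLOTS; i++)
        if (m->slots[i] == name)
            m->slots[i] = '\0';
    char *p = strchr(m->list, name);
    if (p != NULL)
        memmove(p, p + 1, strlen(p + 1) + 1);
}

/* Write the names of the live threads in array-index order into out. */
void model_array_order(const struct model *m, char out[N_SLOTS])
{
    int n = 0;
    for (int i = 1; i < N_SLOTS; i++)
        if (m->slots[i] != '\0')
            out[n++] = m->slots[i];
    out[n] = '\0';
}
```

Under these assumptions, creating a, b, c, deleting b and creating d leaves the array order as "adc" and the list as "dca". Neither is the reverse of the other, so reversing the loop alone would not make the two orders agree.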
Re: [Valgrind-users] core dump improvements - fix order of threads
> Maybe there is some other piece missing (see my posting before this topic). My gdb is not sure about the threads:
>
> (gdb) info threads
> Cannot find new threads: generic error
>
> But:
>
> (gdb) thread apply 1-2 bt
> Thread 1 (LWP 6):
> #0 0x003e93a329b5 in raise () from /lib64/libc.so.6
> #1 0x003e93a33d5a in abort () from /lib64/libc.so.6
> #2 0x0040082b in main ()
> Thread 2 (LWP 7):
> #0 0x003e93ab56dd in nanosleep () from /lib64/libc.so.6
> #1 0x003e93ab5589 in sleep () from /lib64/libc.so.6
> #2 0x0040079a in th ()
> #3 0x003e94607006 in start_thread () from /lib64/libpthread.so.0
> #4 0x003e93ae780d in clone () from /lib64/libc.so.6
>
> But I have no idea how to get into this problem. Is it necessary to debug or trace gdb+bfd for that?

It is not clear to me what the above is. I suppose that info threads does not work on a core dump, and so reports that it cannot find new threads.

Philippe
Re: [Valgrind-users] Theoretical question snapshotting
On Fri, 2012-05-04 at 17:32 +, Oliver Schneider wrote:
> Hi folks,
>
> I've got a question about Valgrind and its Memcheck tool. Is it possible to take a snapshot of a program under Valgrind, kinda similar to the way a fork() clones the process space, and then continue again from that snapshot with Valgrind? Could fork() perhaps be the answer?

This is not possible with Valgrind. A fully general snapshot solution looks close to impossible, e.g. you have to re-create the exact system state:
  opened files and seek positions
  tcp/ip connections
  pwd
  ...

The closest thing to what you describe that I know of is the unexec feature of emacs: emacs is first compiled with no lisp loaded. As part of the build, it then loads a whole bunch of lisp files and unexecs itself (i.e. creates a dumped executable). After that, the dumped file is the one which is installed, with the loaded lisp files being part of the initialised data.

So, I guess you had better work in that direction (or have a data structure that you can e.g. dump to a file and just mmap at startup).

Philippe
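The last suggestion in the reply above (dump a data structure to a file, then mmap it at startup) could look roughly like the following. This is a minimal POSIX sketch, assuming the data is a flat array without pointers; the helper names `dump_ints` and `map_ints` are hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Write count ints to path. Returns 0 on success, -1 on failure. */
int dump_ints(const char *path, const int *values, size_t count)
{
    assert(path != NULL && values != NULL);
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    size_t bytes = count * sizeof *values;
    ssize_t written = write(fd, values, bytes);
    int rc = (written == (ssize_t)bytes) ? 0 : -1;
    if (close(fd) != 0)
        rc = -1;
    return rc;
}

/* Map the dumped file read-only. Returns NULL on failure; on success
   stores the number of ints in *count. The mapping stays valid after
   the descriptor is closed. */
const int *map_ints(const char *path, size_t *count)
{
    assert(path != NULL && count != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return NULL;
    *count = (size_t)st.st_size / sizeof(int);
    return p;
}
```

A structure containing pointers would need them converted to offsets before dumping, because the file is not guaranteed to be mapped at the same address in the next run.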
Re: [Valgrind-users] valgrind prints out a lot of error messages pointing to the standard library
On Thu, 2012-05-03 at 23:37 -0400, Zheng Da wrote:
> Is this normal? Is it because my program is written in C++? How do I suppress these errors very effectively? Or are these errors actually caused by some bugs in my program?

C++ is supported by Valgrind. Valgrind reports some errors in glibc which are normally suppressed using a suppression file.

> ==32701== Conditional jump or move depends on uninitialised value(s)
> ==32701==    at 0x4FB3D9: fillin_rpath (in /home/zhengda/Dropbox/research/read-test/rand-read)
> ==32701==    by 0x4FDBCB: _dl_init_paths (in /home/zhengda/Dropbox/research/read-test/rand-read)
> ==32701==    by 0x4CCC58: _dl_non_dynamic_init (in /home/zhengda/Dropbox/research/read-test/rand-read)
> ==32701==    by 0x4CD762: __libc_init_first (in /home/zhengda/Dropbox/research/read-test/rand-read)
> ==32701==    by 0x47F795: (below main) (in /home/zhengda/Dropbox/research/read-test/rand-read)
> ==32701==

The above error, for example, looks like it somewhat matches a suppression in glibc-2.3.supp. It is however not clear why all these errors are not being suppressed.

Note that, usually, having more info such as the Valgrind version, OS and distribution version, cpu etc. can only help to guess what it is :).

If you have an old version of Valgrind, you could try to upgrade to a newer one.

Philippe
Re: [Valgrind-users] Valgrind+MPICH2: wrong libmpi.so used?
On Sat, 2012-05-05 at 14:25 +0200, Martin Kalany wrote:
> Hello,
>
> I'm trying to use valgrind to debug an mpich2 program. Unfortunately, I get the following error:
>
> libmpi.so.0: cannot open shared object file: No such file or directory
>
> I found out that libmpich.so.1.0 should be linked to instead (see libmpiwrap.c). The Valgrind documentation states that "The MPI functions to be wrapped are assumed to be in an ELF shared object with soname matching libmpi.so*. This is known to be correct at least for Open MPI and Quadrics MPI, and can easily be changed if required."
>
> How do I change that?

Is the 'cannot open' error only there when running under Valgrind?

The Z encoding used in libmpiwrap.c is a pattern which matches one or the other library:

#define I_WRAP_FNNAME_U(_name) \
        I_WRAP_SONAME_FNNAME_ZU(libmpiZaZdsoZa,_name)

i.e. it is libmpi*.so*.

So, I guess your problem is not the Valgrind wrapping. Maybe a problem related to the dynamic loader?

Philippe
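To see why libmpiZaZdsoZa stands for libmpi*.so*, the escapes can be decoded by hand. The sketch below handles only a subset of the Z-encoding (Za, Zd, Zu, Zh, ZZ), as recalled from the comments in Valgrind's pub_tool_redir.h; treat that table as an assumption and check the header for the full list. The functions `z_decode` and `soname_matches` are hypothetical helpers, and POSIX fnmatch() is used here only as a stand-in for Valgrind's own pattern matcher.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>
#include <string.h>

/* Decode a subset of Valgrind's Z-encoding into out.
   Returns 0 on success, -1 on an unknown escape or a too-small buffer. */
int z_decode(const char *in, char *out, size_t out_size)
{
    size_t n = 0;
    while (*in != '\0') {
        char c = *in++;
        if (c == 'Z') {
            switch (*in) {
            case 'a': c = '*'; break;
            case 'd': c = '.'; break;
            case 'u': c = '_'; break;
            case 'h': c = '-'; break;
            case 'Z': c = 'Z'; break;
            default:  return -1;  /* unknown escape, or 'Z' at end of input */
            }
            in++;
        }
        if (n + 1 >= out_size)
            return -1;
        out[n++] = c;
    }
    if (out_size == 0)
        return -1;
    out[n] = '\0';
    return 0;
}

/* Returns 1 if soname matches the Z-encoded pattern, 0 if it does not,
   and -1 if the pattern cannot be decoded. */
int soname_matches(const char *z_pattern, const char *soname)
{
    char pattern[128];
    if (z_decode(z_pattern, pattern, sizeof pattern) != 0)
        return -1;
    return fnmatch(pattern, soname, 0) == 0 ? 1 : 0;
}
```

With this decoding, both libmpi.so.0 and libmpich.so.1.0 match the pattern, which is consistent with the point made in the reply that the wrapper's soname pattern is not what keeps libmpich from being used.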
Re: [Valgrind-users] Valgrind+MPICH2: wrong libmpi.so used?
On Sat, 2012-05-05 at 18:14 +0200, Martin Kalany wrote:
>> Is the 'cannot open' error only there when running under Valgrind ?
> Yes. When I use mpirun, it's fine. What I think is strange is that valgrind apparently tries to load libmpi.so, although it should load libmpich.so.1.0
>
>> Maybe a problem related to the dynamic loader ?
> I'm rather new to MPI so I'm not sure about this.

Valgrind is not supposed to change which shared libs are used: the dynamic loader is executed by Valgrind and should behave the same (and so load the same shared libs) as in a native run.

I know nothing about MPI and so might not understand what you are doing. But I believe mpirun is a shell script which might be needed to set up some required env variables. So, to be sure, mpirun should also be used when using Valgrind, e.g.

valgrind --trace-children=yes mpirun

(or is this what you are doing already?)

Philippe
Re: [Valgrind-users] valgrind prints out a lot of error messages pointing to the standard library
On Sat, 2012-05-05 at 14:45 -0400, Zheng Da wrote:
> The corresponding code is shown below. I don't understand which variable isn't initialized?

If you upgrade to Valgrind 3.7.0, you can use gdb to debug your program under Valgrind. With this, you have GDB monitor commands to ask whether an address is initialised or not. See the user manual, sections 3.2 "Debugging your program using Valgrind gdbserver and GDB" and 4.6 "Memcheck Monitor Commands".

This might make it easier to understand where the problem is coming from.

Philippe
Re: [Valgrind-users] Valgrind+MPICH2: wrong libmpi.so used?
On Sun, 2012-05-06 at 00:24 +0200, Martin Kalany wrote:
> The Valgrind documentation states that "The MPI functions to be wrapped are assumed to be in an ELF shared object with soname matching libmpi.so*. This is known to be correct at least for Open MPI and Quadrics MPI, and can easily be changed if required."

Note that the documentation is slightly out of date, as the code contains the pattern libmpi*.so* (so as to match, among others, libmpich.so.1.0).

Philippe
Re: [Valgrind-users] Valgrind+MPICH2: wrong libmpi.so used?
On Mon, 2012-05-07 at 21:15 +0200, Martin Kalany wrote:
> Nevertheless, valgrind doesn't print anything similar to
>
> valgrind MPI wrappers 31901: Active for pid 31901
> valgrind MPI wrappers 31901: Try MPIWRAP_DEBUG=help for possible options
>
> as stated in the documentation. How do I know whether or not the mpi wrappers now work?

If you add the option --trace-redir=yes to your Valgrind args, Valgrind will trace all the actions related to redirection/wrapping:
* it will trace the creation of the redir specifications (e.g. when loading the libmpiwrap which is part of Valgrind)
* it will trace the resulting active redirections or wrappings.

As for the original problem: I understand it happened because Valgrind was configured with a different mpi than the one you are using, and that created a mixup in the libs. Is that the explanation?

Philippe
Re: [Valgrind-users] about massif
On Mon, 2012-05-21 at 10:01 +0800, 王?? wrote:
> Hello everyone.
>
> Massif is a heap and stack profiler, and I have a question about using it. I can’t get the output log (massif.out.pid) until the tested program is over. If the tested program is something like an endless while loop, how can I get the massif log? Thank you.

If you have Valgrind 3.7.0, you can do an on-demand massif snapshot using:

vgdb snapshot

or

vgdb detailed_snapshot

For more info, see the user manual, e.g. http://www.valgrind.org/docs/manual/ms-manual.html#ms-manual.monitor-commands

Alternatively, you can take a snapshot on demand from a gdb connected to the Valgrind gdbserver.

Philippe
Re: [Valgrind-users] libpthread: recursive write lock granted on mutex/wrlock which does not support recursion
On Mon, 2012-06-04 at 14:14 +0200, Christoph Bartoschek wrote:
> On 04.06.2012 at 14:00, Tom Hughes wrote:
>> On 04/06/12 12:27, Christoph Bartoschek wrote:
>>> how should one interpret the following report:
>>>
>>> Thread #11: Bug in libpthread: recursive write lock granted on mutex/wrlock which does not support recursion
>>> ==00:13:17:12.428 20623==    at 0x4C2D18D: pthread_spin_lock (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so)
>>>
>>> Is there a bug in libpthread that does something strange? Or is there a bug in my program that tries to lock a lock twice?
>>
>> It's saying that the program is trying to lock a mutex which is (a) already locked and (b) not marked as a recursive mutex. So yes, something is trying to lock the same mutex twice. Whether it is your program at fault is hard to say for sure without seeing the rest of the stack trace.

The pthread_mutex_lock manual says: "If the mutex type is PTHREAD_MUTEX_DEFAULT, attempting to recursively lock the mutex results in undefined behavior."

So, one of the possible outcomes of the undefined behaviour is that the lock is granted. Valgrind then reports this as a bug in the pthread library (even if this is not formally a bug, according to the manual). But of course, the above only applies if your program really does try to recursively lock a non-recursive mutex.

> however I wonder that the message does not mention where the lock was acquired for the first time.

You might try --tool=drd --trace-mutex=yes --trace-rwlock=yes to have more details about what is happening.

Philippe
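Independently of Valgrind, one way to make a recursive lock attempt visible is to use an error-checking mutex, for which POSIX specifies an EDEADLK return instead of undefined behaviour. This is a minimal sketch of that technique for pthread mutexes only; the report above involves pthread_spin_lock, which has no such checking type. The function name `relock_errorcheck_mutex` is hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Lock an error-checking mutex twice from the same thread and return
   the result of the second attempt (EDEADLK per POSIX), or -1 if the
   mutex could not be set up. */
int relock_errorcheck_mutex(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t m;
    int rc;

    if (pthread_mutexattr_init(&attr) != 0)
        return -1;
    if (pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK) != 0 ||
        pthread_mutex_init(&m, &attr) != 0) {
        pthread_mutexattr_destroy(&attr);
        return -1;
    }
    pthread_mutexattr_destroy(&attr);

    if (pthread_mutex_lock(&m) != 0) {
        pthread_mutex_destroy(&m);
        return -1;
    }
    rc = pthread_mutex_lock(&m);  /* second lock by the owning thread */

    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return rc;
}
```

On older glibc versions the program may need to be linked with -pthread.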
Re: [Valgrind-users] compiling with MPI support on OS X Lion
On Mon, 2012-06-18 at 17:05 -0400, Brian Helenbrook wrote:
> I don't know why it tries to switch to the i386 architecture.

I also have no idea (and no MacOS system to play with). Maybe

./configure --enable-only64bit

will help/bypass the problem?

Philippe
Re: [Valgrind-users] Valgrind support for MIPS
On Mon, 2012-06-18 at 15:34 -0700, Ajay Kalambur wrote:
> Hi
>
> Is the support for MIPS checked into trunk already?
> https://bugs.kde.org/show_bug.cgi?id=270777
> This bug seems to suggest the patches are in the latest trunk.

Yes, the trunk contains the MIPS patches (see the commit revision numbers in comment 154). So, if you are interested, just check out the latest trunk.

Philippe
Re: [Valgrind-users] --num-callers limit of 50 is not enough
On Wed, 2012-06-20 at 16:09 -0400, Bob Rossi wrote:
> Hi,
>
> I'm using valgrind with a C++ program that embeds python. The stack trace is pretty deep, greater than 50 at almost every point valgrind finds a memory leak. Unfortunately, I can't see the stack frame that started this chain because 50 is not enough. Is there a way to increase this limit?

It should be enough (this is theory, untested) to change the below in coregrind/pub_core_execontext.h and recompile.

/* The maximum number of calls we're prepared to save in an ExeContext. */
#define VG_DEEPEST_BACKTRACE 50
Re: [Valgrind-users] Using Valgrind on Android to detect deadlocks
On Mon, 2012-06-25 at 11:03 +0200, Julian Seward wrote:
>> Is there any way to 'iterate' over all the current threads (let us say that we know their thread id's - we do) and print their stack traces? That would help us a lot.
>
> One hack that might be worth a try is this. Your SIGTERM is sent by the kernel first to Valgrind, which then sends it onwards to the app. It's easier if you deal with it on the Valgrind side, before it gets forwarded to the app.
>
> In coregrind/m_signals.c there is async_signalhandler(). In there, add a test for sigNo==15 (sigterm) and if so make a call to VG_(show_sched_status), which is in m_libcassert.c. This shows the stacks for all threads, which is what you want.

If you use Valgrind version >= 3.7.0, you can also print the list of threads and their stack traces from a shell command line, using:

vgdb v.info scheduler

You need to activate the Valgrind gdbserver for that, using the option --vgdb=yes (and possibly give --vgdb-prefix= to point at a file system supporting FIFOs).

Philippe
Re: [Valgrind-users] Using Valgrind on Android to detect deadlocks
> You need to activate the Valgrind gdbserver for that, using the option --vgdb=yes (and possibly give --vgdb-prefix= to point at a file system supporting FIFOs).

Note that you should also be able to obtain the stack traces of all threads using the standard gdbserver that is part of the android system, and inside gdb, do:

thread apply all bt full

to see the stack traces of all threads, and all local variables (this might be a lot, or full might give a problem; in that case just do it without full).

There is no need to use Valgrind and its gdbserver specifically to obtain these stack traces.

Philippe
Re: [Valgrind-users] Using Valgrind on Android to detect deadlocks
On Fri, 2012-06-29 at 17:46 +0400, Alexander Potapenko wrote:
> It may not be that easy to guess which locks are taken when the deadlock has already occurred. However a Valgrind-like tool is really overkill for deadlock detection: a small library that interposes pthread_mutex_* (or other locking primitives) and keeps track of the locking order is quite enough.

helgrind has such logic to detect an inconsistent lock ordering.

However, just tracking the locking order to detect (potential) deadlocks does not properly cover some cases. Among others, the simple lock ordering verification can give a false positive when the application is using a guard lock. try lock is also not properly taken into account (e.g. if there is a mixture of try lock and lock operations on a set of locks).

Properly covering these cases (i.e. not giving a false positive with the guard lock, and not giving a false negative for the try lock) does not seem straightforward to me. Some algorithms have been developed to cover the guard lock case (search e.g. for the goodlock algorithm). I have however not found anything which properly covers the try lock cases.

Philippe
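The guard lock false positive mentioned above can be reproduced with a very small lock-order tracker. This is only a toy sketch of the simple ordering check discussed in the thread, not helgrind's implementation and not the goodlock algorithm; all names (`tracker_acquire`, `MAX_LOCKS`, and so on) are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define MAX_LOCKS   8   /* hypothetical limits for this toy model */
#define MAX_THREADS 4

struct tracker {
    unsigned char before[MAX_LOCKS][MAX_LOCKS]; /* before[a][b]: b was taken while a was held */
    unsigned char held[MAX_THREADS][MAX_LOCKS]; /* held[t][l]: thread t currently holds lock l */
    int inversions;                             /* number of order inversions reported */
};

void tracker_init(struct tracker *t) { memset(t, 0, sizeof *t); }

/* Record that thread takes lock, reporting an inversion whenever the
   opposite order was seen earlier for any lock the thread holds. */
void tracker_acquire(struct tracker *t, int thread, int lock)
{
    assert(thread >= 0 && thread < MAX_THREADS);
    assert(lock >= 0 && lock < MAX_LOCKS);
    for (int h = 0; h < MAX_LOCKS; h++) {
        if (!t->held[thread][h] || h == lock)
            continue;
        if (t->before[lock][h])
            t->inversions++;        /* lock-before-h was recorded earlier */
        t->before[h][lock] = 1;     /* remember h-before-lock */
    }
    t->held[thread][lock] = 1;
}

void tracker_release(struct tracker *t, int thread, int lock)
{
    assert(thread >= 0 && thread < MAX_THREADS);
    assert(lock >= 0 && lock < MAX_LOCKS);
    t->held[thread][lock] = 0;
}
```

If thread 0 takes G, A, B and thread 1 takes G, B, A, the tracker reports one inversion between A and B, although both threads hold the guard lock G throughout and therefore cannot deadlock on A and B. Handling that case, or try lock operations, needs more than this pairwise check.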
Re: [Valgrind-users] Installation on OS X 10.7
On Sat, 2012-06-30 at 18:00 +0100, Josh Reese wrote:
> ...
> ld: symbol(s) not found for architecture i386
> collect2: ld returned 1 exit status
> make[2]: *** [libmpiwrap-x86-darwin.so] Error 1
> make[1]: *** [all-recursive] Error 1
> make: *** [all] Error 2
>
> If anyone has any experience with this I would be most appreciative.

See http://comments.gmane.org/gmane.comp.debugging.valgrind/12289

If I understood correctly, the mpi wrappers are only supported in 64 bits. So, either configure without the mpi wrappers (if you do not need them), or alternatively configure only for 64 bits (--enable-only64bit).

Philippe
Re: [Valgrind-users] Compiling valgrind with non-default gcc
On Sat, 2012-07-07 at 03:19 +0300, Zvi Vered wrote:
> If I run gcc --version (on my gcc) I get:
> .i686-nptl-linux-gnu-gcc (crosstool-NG-1.5.2) 4.3.2
> Copyright (C) 2008 Free Software Foundation, Inc.
>
> If I run gcc --version on the default gcc installed on my Centos 5.3 I get:
> gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-52)
> Copyright (C) 2006 Free Software Foundation, Inc.
>
> With the default gcc, valgrind is built OK. I found a similar question in google but the fix in ./configure did not help. Can you help?

Modify configure.in around line 130:

case ${is_clang}-${gcc_version} in

to add a pattern matching what your gcc reports (or, somewhat more kludgy, edit configure and replace an existing pattern by a *).

I have tested none of the above suggestions ...

Philippe
Re: [Valgrind-users] configure reporting incorrect primary build arch
On Thu, 2012-07-12 at 14:42 -0700, Jacob Goldstein wrote:
> Primary build arch: amd64
> Secondary build arch: x86
>
> I'm running on a MacBook Pro with an Intel i7 CPU (running OS X 10.7.4), so I'm not sure why the primary build arch is amd64.

amd64 denotes the 64-bit Intel architecture. If I am not wrong, it is AMD that defined the 64-bit extension of x86, and that is one (the?) reason to call it amd64. See also http://www.valgrind.org/info/platforms.html

So, no problem.

Philippe
Re: [Valgrind-users] Vlgrind vgdb doesn't stop after memory error
On Sat, 2012-07-14 at 00:44 +, Anup wrote:
> Hi,
>
> I am trying to debug memory corruption in a program by attaching it to valgrind gdbserver and GDB (valgrind version 3.7.0).
>
> Command: valgrind --tool=memcheck --vgdb=full --vgdb-error=0 prog

With --vgdb-error=0, Valgrind should stop before starting prog, to let a GDB attach and e.g. set breakpoints. It should also stop on all errors.

Which OS are you using? Otherwise, can you add the options -v -d and show the trace between "vgdb me ..." and "Continuing"?

Philippe

> I am expecting that execution will stop when valgrind detects a memory error, which will allow me to debug it using GDB. However, it continues to execute after reporting the error. As seen in the following snapshot, the action on error is to continue. I want execution to stop at this instance, giving control to GDB.
>
> ==31272== Use of uninitialised value of size 8
> ==31272==    at 0x783AA6C: __mpn_lshift (lshift.S:57)
> ==31272==    by 0x7851E22: __printf_fp (printf_fp.c:668)
> ==31272==    by 0x784C19F: vfprintf (vfprintf.c:1616)
> ==31272==    by 0x7855659: printf (printf.c:35)
> ==31272==    by 0x4EF2901: gpgpu_sim::cycle() (gpu-sim.cc:1447)
> ==31272==    by 0x4F77D7D: gpgpu_sim_thread_concurrent(void*) (gpgpusim_entrypoint.cc:135)
> ==31272==    by 0x7B8E9C9: start_thread (pthread_create.c:300)
> ==31272==    by 0x78EBCDC: clone (clone.S:112)
> ==31272==
> ==31272== (action on error) vgdb me ...
> ==31272== Continuing ...
>
> Is this the expected behavior? How can I get valgrind to stop at such points? I remember using the above command earlier, where valgrind stopped on detecting a memory error, giving control back to GDB. But I am not able to get it working again. Is such support available in valgrind at this point?
>
> Any help is appreciated.
>
> Thanks,
> Anup
Re: [Valgrind-users] Vlgrind vgdb doesn't stop after memory error
On Wed, 2012-07-18 at 01:22 +, Anup wrote:
> --4562:1:gdbsrv getpkt (C14); [no ack]
> --4562:1:gdbsrv set_desired_inferior use_general 0 found 0x402514BE0 tid 1 lwpid 4562
> --4562:1:gdbsrv resume_info thread 4562 leave_stopped 0 step 0 sig 17 stepping 0
> --4562:1:gdbsrv stop pc is 0x7B0969D
> --4562:1:gdbsrv stop_pc 0x7AAC2CF changed to be resume_pc 0x7B0969D: ??? (syscall-template.S:82)
> ==4562== Continuing ...

This looks similar to bug 297078, which is solved in 3.8.0 SVN. Can you try with the latest SVN version? For how to get, configure and compile it, see http://www.valgrind.org/downloads/repository.html

If it does not work with 3.8.0 SVN, then the best is to file a bug, redo the -v -d trace and attach the complete trace (from the beginning) to the bug.

Thanks

Philippe
Re: [Valgrind-users] Using System Call in a Valgrind Tool
On Thu, 2012-08-02 at 18:21 -0400, Wonjoon Song wrote:
> ...
> valgrind: the 'impossible' happened: Killed by fatal signal
> ==24479==    at 0x38059A25: vgPlain_do_syscall (m_syscall.c:72)
> ==24479==    by 0x3808E5E1: handle_syscall (scheduler.c:1057)
> ...
>
> I tried to search for examples using system calls in tools but it seems they (memcheck, massif, lackey) don't use them.

It is difficult to see what is going wrong without looking at (some of) the code. For sure, tools are e.g. calling VG_(getpid) without crashing :).

As for mmap, you should not use the mmap syscall directly. Instead, you should use the interface of the Valgrind address space manager (i.e. pub_tool_aspacemgr.h).

> My question is, is it allowed to use system calls in a valgrind tool? Is it recommended not to use system calls? If it is allowed to use system calls in a tool, what should I do to make this thing work?

You can use syscalls, but you must have the interface defined for them. See e.g. m_libcproc.c and similar files to see how the VG_(xx) functions provide the equivalent of the xx syscall.

Philippe
Re: [Valgrind-users] Errors for empty Xcode Cocoa app
On Tue, 2012-08-07 at 17:12 -0700, John Reiser wrote:
> On 08/07/2012 04:43 PM, Jacob Goldstein wrote:
>> --22925:0:schedule VG_(sema_down): read returned -4
>
> That unexpected behavior of sema_down should be investigated.

The code explicitly retries in case the read on the sema returns -4 (i.e. -VKI_EINTR).

Philippe
Re: [Valgrind-users] Invalid read raised with stpncpy
On Sun, 2012-08-12 at 18:10 +0200, Francis Giraldeau wrote:
> Of course you are right! Though, I just tested without updating dst and the problem is raised anyway. The error is not caused in the printf(), but in the stpncpy().

String functions might be optimised very specially by the compiler and then might not be redirected by Valgrind. There have been some changes in the area of sse instructions, and some replacements of string functions, in 3.8.0.
=> It would be good to try with 3.8.0.

If the problem is still there, you might try to compile telling gcc not to use string builtins or similar. You can also (with or without the previous builtin trial) examine the output of Valgrind (-v) to see if the stpncpy function is properly redirected to the valgrind one.

From what I can see, there is no replacement for stpncpy. So, the optimised code is not replaced. And as it is heavily optimised, it can give false positives (this is one of the reasons why some string functions are replaced; see memcheck/mc_replace_strmem.c).

Philippe
Re: [Valgrind-users] Invalid read raised with stpncpy
On Sun, 2012-08-12 at 19:15 +0200, Francis Giraldeau wrote:
> I confirm this too. I did a trivial implementation of stpcpy/stpncpy with strcpy and it does not raise the issue. Maybe they can be a basis for an additional REDIR?

The best is to file a bug on bugzilla, with the test program to reproduce the false positive and additional details such as gcc version, glibc version, distro version, and compilation options used.

Thanks

Philippe
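As an illustration of what a "trivial implementation" of stpncpy can look like, here is a byte-at-a-time sketch. It is neither Francis's code nor a Valgrind replacement function; the name simple_stpncpy is made up. Because it never reads past the terminating NUL of src, it avoids the wide loads that an optimised libc version may perform.

```c
#include <stddef.h>

/* Hypothetical byte-at-a-time stpncpy.
   Copies at most n bytes from src to dst, pads the remainder of dst
   with '\0', and returns a pointer to the first '\0' written to dst,
   or dst + n when src holds n or more characters. */
static char *simple_stpncpy(char *dst, const char *src, size_t n)
{
    size_t i = 0;
    while (i < n && src[i] != '\0') {
        dst[i] = src[i];
        i++;
    }
    char *end = dst + i;
    while (i < n) {
        dst[i++] = '\0';
    }
    return end;
}
```

A test program built with such a function in place of the libc one can help to tell whether a reported invalid read comes from the caller or from the optimised library routine.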
Re: [Valgrind-users] shmat parameters are modified by valgrind
On Tue, 2012-08-14 at 10:20 +, Adishesh wrote:
> Hi,
> I have used below program for testing. Below are the steps I have followed.
> Compile: gcc -g3 test_shm.c -o test_shm
> Create shared memory: ./test_shm -c
> Get shared memory without valgrind: ./test_shm -g (this works fine)
> Get with valgrind: /usr/bin/valgrind --tool=memcheck --leak-check=full --track-origins=yes --log-file=/tmp/val_log /home/rtp99/test_shm -g
> With valgrind shmat command fails.

Can you try again after applying the attached patch?

Thanks

Philippe

Index: coregrind/m_syswrap/syswrap-generic.c
===
--- coregrind/m_syswrap/syswrap-generic.c (revision 12872)
+++ coregrind/m_syswrap/syswrap-generic.c (working copy)
@@ -1700,7 +1700,7 @@
 /* -- */

 static
-UInt get_shm_size ( Int shmid )
+SizeT get_shm_size ( Int shmid )
 {
 #ifdef __NR_shmctl
 #  ifdef VKI_IPC_64
@@ -1725,7 +1725,7 @@
    if (sr_isError(__res))
       return 0;

-   return buf.shm_segsz;
+   return (SizeT) buf.shm_segsz;
 }

 UWord
@@ -1733,7 +1733,7 @@
                              UWord arg0, UWord arg1, UWord arg2 )
 {
    /* void *shmat(int shmid, const void *shmaddr, int shmflg); */
-   UInt segmentSize = get_shm_size ( arg0 );
+   SizeT segmentSize = get_shm_size ( arg0 );
    UWord tmp;
    Bool ok;
    if (arg1 == 0) {
@@ -1768,7 +1768,7 @@
                   UWord res,
                   UWord arg0, UWord arg1, UWord arg2 )
 {
-   UInt segmentSize = VG_PGROUNDUP(get_shm_size(arg0));
+   SizeT segmentSize = VG_PGROUNDUP(get_shm_size(arg0));
    if ( segmentSize > 0 ) {
       UInt prot = VKI_PROT_READ|VKI_PROT_WRITE;
       Bool d;
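The patch widens the segment size from UInt (32 bits) to SizeT. The thread does not state how large the failing segment was, so the following is only a sketch of the presumed failure mode: a size that does not fit in 32 bits loses its upper bits when narrowed. The two function names are made up, and uint64_t stands in for SizeT on a 64-bit host.

```c
#include <stdint.h>

/* Hypothetical stand-in for the pre-patch path: the segment size is
   narrowed to 32 bits, as a UInt return type would do. */
static uint32_t shm_size_as_uint(uint64_t shm_segsz)
{
    return (uint32_t)shm_segsz;
}

/* Hypothetical stand-in for the patched path: the full width is kept
   (uint64_t plays the role of SizeT on a 64-bit host). */
static uint64_t shm_size_as_sizet(uint64_t shm_segsz)
{
    return shm_segsz;
}
```

Under that assumption a 5 GiB segment would be seen as 1 GiB, and a segment of exactly 4 GiB as size 0, so the address space reserved for the shmat would be too small or empty.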
Re: [Valgrind-users] shmat parameters are modified by valgrind
On Thu, 2012-08-16 at 12:51 +0530, Adishesh M wrote:
> Hi Philippe,
> After applying patch shmat is working fine. Will this patch be included in the next valgrind release?

The patch has been committed (revision 12874), so it will be in the next release.

Philippe
Re: [Valgrind-users] Large application SIGSEGV when run in valgrind
On Tue, 2012-10-02 at 12:48 -0700, John Reiser wrote:
> In any case, please run cat /proc/PID/maps (where PID is the numerical process ID) and show us what the mappings look like for addresses 0xFE00 and above, when the program hits the memcheck error (or shortly before.)

The easiest way to obtain this info when the segv is triggered is to start with --vgdb-error=1, then attach with gdb/vgdb when the error is reported. In gdb, you can then use:
   monitor v.info memory aspacemgr
      # this will show the set of mappings as seen by Valgrind
   shell cat /proc/<pid>/maps
      # this will show the mappings as seen by the kernel

Philippe
Re: [Valgrind-users] Addresses marked as ??? in Valgrind stack trace
On Thu, 2012-10-04 at 13:19 -0500, Kerrick Staley wrote:
> I can't track down the error since the stack trace doesn't indicate which shared object and function it occurs in. According to http://valgrind.org/docs/manual/faq.html#faq.unhelpful, if a shared object is unloaded before the program terminates, ??? entries will appear in the stack trace, so I'm guessing that Mono is dynamically unloading the shared object after the segfault. I'm unsure as to whether this hunch even makes sense, though. Is there anything I can do on either the Valgrind or the Mono side to get more information from the stack trace?

To my knowledge, ??? can only appear for stacktraces which are produced after the object is unloaded (e.g. stacktraces for leaks). Your stacktraces are for errors which are reported directly. I suppose that a shared object is not unloaded while it is being executed (i.e. is on the call stack), as I guess this would not behave properly.

I know close to 0 about C# but, IIUC, C# is typically run in a JITted environment. If that is indeed the case, have you given the argument --smc-check=all (or =all-non-file)? One of these two is for sure mandatory in a JITted environment (on x86/amd64 at least).

Otherwise, you might always try using gdb/vgdb to connect to the process under Valgrind when the error is raised: gdb might help to see what is going on.

Philippe
Re: [Valgrind-users] Valgrind + Wine
On Thu, 2012-10-04 at 11:34 -0700, Dan Kegel wrote:
> On Thu, Oct 4, 2012 at 11:31 AM, Don Rosengrant drosengr...@westbrooktech.com wrote:
>> I’ve read where Windows programs can be run under Valgrind with some effort with Wine. What is involved in doing this?

Otherwise, if you are courageous and/or can help develop in a Windows environment, you could take a look at http://sourceforge.net/projects/valgrind4win/

Philippe
Re: [Valgrind-users] [Mono-list] Addresses marked as ??? in Valgrind stack trace
On Tue, 2012-10-09 at 11:40 -0500, Kerrick Staley wrote:
>> Otherwise, you might always try using gdb/vgdb to connect to the process under Valgrind when the error is raised : gdb might maybe help to see what is going on.
> You mean I should use --db-attach=yes (as Greg suggested)?

Since Valgrind 3.7.0, Valgrind contains an embedded gdbserver, to which you connect from gdb using vgdb as a relay application. The advantage of gdb/vgdb compared to --db-attach is that you get all the usual gdb commands (e.g. breakpoints, continue, jump, inferior function calls, ...) plus interactive calls of Valgrind functionality (e.g. searching for memory leaks when a breakpoint is reached). vgdb also allows you to look at a multi-threaded application. You can also start to debug your application under Valgrind from the beginning (so, before an error has been reported).

To use it, give the argument --vgdb-error=0 to Valgrind, and follow the instructions to connect your gdb using vgdb.

Now, it is not clear that this will help you a lot :(. Maybe there is a function in the mono environment to map a JITted program counter to a source line? Then you could call it from gdb/vgdb to translate these ??? addresses to the source lines that were JITted to these instructions.

Philippe
Re: [Valgrind-users] helgrind markup for lock-ordering
On Tue, 2012-11-06 at 13:43 +0100, David Faure wrote:
> On Monday 05 November 2012 23:19:42 Philippe Waroquiers wrote:
>> On Mon, 2012-11-05 at 18:59 +0100, David Faure wrote:
>>> The testcase http://www.davidfaure.fr/2012/qmutex_trylock.cpp (from https://bugs.kde.org/show_bug.cgi?id=243232) shows that an optimization inside Qt leads to a helgrind warning about wrong lock ordering, making the use of that feature impossible for detecting actual problems elsewhere (i.e. I have to use --track-lockorders=no all the time).
>>> Technically if we ignore the distinction between lock and tryLock, helgrind is right, we did lock the mutexes in reverse order. But because it's a tryLock, it couldn't possibly deadlock.
>>> Should helgrind simply ignore all pthread_mutex_trylock calls, for the lockorder detection, even if they succeed? I think so, actually (by definition they couldn't deadlock, which is what track-lockorders is all about).
>>
>> A deadlock can appear if there is a mixture of trylock and lock on the same lock. So, trylock cannot just be ignored. E.g.
>>   Thread 1: trylock mutex1
>>             lock mutex2
>>   Thread 2: lock mutex2
>>             lock mutex1
>> might deadlock.
>
> True. This means that only trylock in second place should be ignored. More on this below.

More generally, I guess that you mean trylock in last place should be ignored (rather than the special case of 2nd place). This might be difficult to implement, as each time a lock is taken, helgrind checks for an order violation. I suspect a later lock operation might then transform a trylock in last place into a trylock which is no longer in last place. But of course, when the trylock operation has just been done, this trylock is in last place, and so if we ignored it, this would be similar to always ignoring the trylock, which is not ok.

Currently, helgrind maintains a graph of lock order. I suspect we might need different graph node types and/or edge types to cope with trylock. For sure, more investigation is needed, looking in depth at the current algorithm.
>> Even without mixture, isn't the below slightly bizarre/dangerous ?
>>   Thread 1: trylock mutex1
>>             trylock mutex2
>>   Thread 2: trylock mutex2
>>             trylock mutex1
>
> No deadlock can ever happen here.

Yes, no deadlock can occur. However, this is a really doubtful construction. The question is: should helgrind report a lock order warning for such constructs?

>> If the 2nd trylock fails, what is the plan B ?
>
> If the program then accesses shared data, a race condition will happen and will be detected by helgrind anyway. So ignoring the ordering of these trylocks is ok, I would think. Of course helgrind must record we got the lock, for the race condition detection feature, but it shouldn't warn about the wrong order of the locking, since it can't possibly deadlock.

The idea of helgrind is that it detects lock order problems and/or race condition problems *even* if no deadlock happens and/or if no race condition really happened. Maybe it is very unlikely that the trylock fails. It would still be nice to discover the bug. And if the trylock does not fail, then the race condition will not be detected by helgrind.

> Not that the above would be good programming practice, of course, but helgrind can't say anything about it if all the locks were acquired. It will warn in another run, where some trylock fails, and a race ensues.

>> It seems that a task must unlock all locks and restart from scratch in the above case.
>
> Yes this is exactly what QOrderedMutexLocker does.
>   Thread 1: lock mutex1
>             lock mutex2
>   Thread 2: lock mutex2
>             trylock mutex1
>             if that failed, unlock mutex2
>                             lock mutex1
>                             lock mutex2

Does QOrderedMutexLocker really retry locking in another order than the operations done by Thread 2? Or is that a typo?

> If the trylock fails (because thread1 was first), then it unlocks and restarts from scratch. I can't see a deadlock risk with that, so ideally helgrind shouldn't warn.
>> I guess we might need an option such as:
>>   --trylock-logic={normal-lock|local-retry|full-retry}
>>   normal-lock = current behaviour
>>   local-retry means the task would re-trylock
>>   full-retry means that the plan B is to unlock all locks and retry everything.
>
> I don't see how this can be a global option. Some piece of code (like QOrderedMutexLocker) might have the full retry logic above, but other pieces of code might do something different - e.g. something wrong. It doesn't make sense to me to tell helgrind this is what all the code paths are going to do about tryLock, that's impossible to predict in a complex program.

For sure, generally, an application can do a big variety of behaviours. I suspect however that an application might (this is sane at least) try
Re: [Valgrind-users] Thread names in valgrind and vgdb
On Wed, 2012-11-07 at 16:54 +0100, Matthias Schwarzott wrote:

Printing thread names is not a bad idea (not too sure that a lot of applications are using pthread_setname or prctl, but never mind). In any case, I see several subtleties to look at.

> Using plain gdb, info threads also lists these user-defined names. But valgrind and gdb+vgdb only show the thread-ids.

I tried with plain gdb 7.5 on fedora 12 (kernel 2.6.32.26-175.fc12.i686.PAE). The prctl get/set name syscalls are working, but I could not persuade (gdb) info threads or the ps command to show these names. With which versions have you succeeded in having info threads show the thread names?

> The only place in valgrind that already handles thread names is a helgrind client request. Which is unimplemented in helgrind, but implemented by drd.

There will be a need to decide which one(s) to show if both prctl and the DRD client request were used to give a name to a thread.

> I think it would be a good idea if valgrind could also show the thread name. My first attempt is to add more code to POST(sys_prctl) and in case of VKI_PR_SET_NAME is called, store the set name into a new field of ThreadState. Then in every user visible place where a threadid is printed, also print the name. How could this look like?

I think that in some cases, an error can reference a terminated thread (e.g. for helgrind or drd errors). For such a case, you cannot retrieve the thread name from the ThreadState, as this ThreadState could have been emptied or re-used by another thread. I think that helgrind maintains a unique thread data structure (even for terminated threads). So, you will have to copy the thread name to this unique helgrind data structure. (In other words, it looks like printing the thread name cannot be done centrally in the coregrind error manager, at least for helgrind.) Maybe/probably drd does a similar thing.

> The other option would be to call prctl(PR_GET_NAME) at every place that needs a thread name.
This would not work when a thread prints an error which references another thread.

Otherwise, if you can retrieve the name of a non-terminated thread from somewhere (i.e. the ThreadState solution), then it is straightforward to have the Valgrind gdbsrv give this info back to GDB. See the handling of packet qThreadExtraInfo, in file server.c.

Philippe
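For reference, this is roughly what the client side of the prctl naming calls discussed above looks like on Linux. The two wrapper names are made up for this sketch; PR_SET_NAME and PR_GET_NAME are the real prctl options, and the kernel keeps at most 15 characters plus the terminating NUL.

```c
#include <sys/prctl.h>

/* Hypothetical wrapper (Linux only): set the name of the calling thread.
   Names longer than 15 characters are silently truncated by the kernel. */
static int set_thread_name(const char *name)
{
    return prctl(PR_SET_NAME, (unsigned long)name, 0, 0, 0);
}

/* Hypothetical wrapper (Linux only): fetch the name of the calling
   thread. buf must provide room for at least 16 bytes. */
static int get_thread_name(char buf[16])
{
    return prctl(PR_GET_NAME, (unsigned long)buf, 0, 0, 0);
}
```

PR_GET_NAME only reports the name of the thread that makes the call, which matches the objection above: it cannot be used by one thread to name another thread referenced in an error.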
Re: [Valgrind-users] helgrind markup for lock-ordering
On Wed, 2012-11-07 at 10:51 +0100, David Faure wrote:
>> The idea of helgrind is that it detects lock order problems and/or race condition problems *even* if no deadlock happens and/or if no race condition really happened. Maybe it is very unlikely that the trylock fails. Still would be nice to discover the bug. And if the trylock does not fail, then the race condition will then not be detected by helgrind.
>
> The simpler construct that can lead to this problem is
>   * trylock mutex1
>   * access shared data

"discover the bug" is related to the doubtful construct, not to a race condition (as said above, if the trylock does not fail, there is no way to detect the race condition that would happen if the application does not properly handle the trylock failure).

Note that currently, laog is producing messages which should be considered as lock order warnings, not as "for sure there is a deadlock order problem". The trylock is one case of a warning (not an error) which could/should be improved. But there are others, e.g. laog does not understand the concept of guard locks, which is (IIUC): each thread can acquire a single lock in a set of locks. If a thread wants to acquire more than one lock (in any order then), it first has to acquire the guard lock, and then can lock any number of locks in the lock set, in any order. With this guard lock, it is not possible to have a deadlock, but for sure this is not understood by the current helgrind laog algorithm. Properly handling guard locks implies a more sophisticated algorithm than the current laog (search for "good lock algorithm").

> This is too monolithic thinking :) An application that uses a given framework (e.g. Qt) could very well be doing things differently than the framework does internally. As a developer debugging a large application written by other people, using a framework written by other people, how can I guarantee helgrind that all trylock uses follow a single design pattern?

Of course, this cannot be guaranteed.
But the idea is not that the command line option(s) cover the full range of possible design patterns. The idea is that they should cover 'out of the box' a reasonable range of such patterns. This is true e.g. for memcheck (out of the box, it supports malloc compatible libs, but otherwise you need annotations). Helgrind should also support some 'out of the box' locking logics. The difficult question is what should be covered by the command line options 'out of the box' (the nirvana being an algorithm that would work for everything without annotations).

> At the time of the trylock, it is the last one - no warning at that precise moment. This sounds like a simple enough change in the current algorithm? Basically adding one if() ... if only I knew where ;)

To avoid doing a lock order warning when doing the trylock is easy I believe: in hg_main.c:3697, put a condition 'if (!is_a_trylock)' before:
   other = laog__do_dfs_from_to(lk, thr->locksetA);
   if (other) {
(where is_a_trylock has to be given by the caller). I think it is almost mechanical work to add arguments to the *POST event handlers and corresponding requests to transfer the "is a trylock" information from the helgrind interception to line 3697.

But I suspect that the insertion of a trylock in the graph might later on cause a 'wrong' cycle to be detected. E.g. (L = lock, T = trylock, L and T followed by lock nr)
   threadA L1 T2
   threadB L2 L3
   threadC L3 L1
cannot deadlock (I think :) if threadA releases lock 1 when T2 fails. But when L3 L1 is inserted, a cycle will be found via T2 (if the graph has not remembered that this is a trylock). So, I am still (somewhat intuitively) thinking that we need to have nodes and/or edges marked with "this is a trylock", and have the graph exploration take these markings into account so as not to generate such false warnings. Far from being a mathematical proof, I know :).

Philippe
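To make the "edges marked as trylock" idea from the message above concrete, here is a small sketch of a lock order graph whose cycle check can skip trylock edges. It is not helgrind's laog code: every name below is made up, the graph is a fixed-size matrix for brevity, and an edge is marked by how the later lock was acquired. Whether skipping such edges is always safe depends on what the thread does when the trylock fails, as discussed in the thread.

```c
#include <stdbool.h>

#define MAX_LOCKS 8

/* How the second lock of an ordered pair was acquired. */
enum edge_kind { NO_EDGE = 0, BLOCKING_EDGE, TRYLOCK_EDGE };

/* edge[a][b] records that some thread acquired b while holding a. */
struct lock_graph {
    enum edge_kind edge[MAX_LOCKS][MAX_LOCKS];
};

/* Depth-first search: is `to` reachable from `from`?
   Trylock edges are only followed when follow_trylock is true. */
static bool reachable(const struct lock_graph *g, int from, int to,
                      bool follow_trylock, bool *visited)
{
    if (from == to)
        return true;
    visited[from] = true;
    for (int next = 0; next < MAX_LOCKS; next++) {
        enum edge_kind k = g->edge[from][next];
        if (k == NO_EDGE || visited[next])
            continue;
        if (k == TRYLOCK_EDGE && !follow_trylock)
            continue;
        if (reachable(g, next, to, follow_trylock, visited))
            return true;
    }
    return false;
}

/* Would acquiring `wanted` while holding `held` close a cycle,
   i.e. is `held` already reachable from `wanted`? */
static bool closes_cycle(const struct lock_graph *g, int held, int wanted,
                         bool follow_trylock)
{
    bool visited[MAX_LOCKS] = { false };
    return reachable(g, wanted, held, follow_trylock, visited);
}
```

With the three-thread example (edge 1 to 2 marked as trylock, edge 2 to 3 blocking), the check for threadC taking L1 while holding L3 reports a cycle when trylock edges are followed and none when they are skipped.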
Re: [Valgrind-users] helgrind markup for lock-ordering
On Thu, 2012-11-08 at 00:18 +0100, David Faure wrote:
> On Wednesday 07 November 2012 23:00:51 Philippe Waroquiers wrote:
>> discover the bug is related to the doubful construct, not to a race condition
>
> If there's no race condition and no deadlock, I'm not sure what bug you want to detect :-)

The doubtful construct. Such a construct is in my opinion better removed, and it is better to detect it via a tool.

>> Note that currently, laog is producing messages which should be considered as lock order warning, not as for sure there is a deadlock order problem.
>
> Yes, but why does it warn about lock order? Because it could cause deadlocks.

Not necessarily. A doubtful construct like the above could cause e.g. livelocks if the inverted (failing) trylocks are not properly handled. So, even if a wrong order does not necessarily cause a deadlock, it can be a good idea to detect these doubtful constructions and remove them.

> I agree, this is about potential deadlocks, not actual deadlocks. But trylock 1 + trylock 2 vs trylock2 + trylock1 (the case we're talking about in this part of the mail) is not even a potential deadlock. It can't ever deadlock. So there's nothing to warn about.

A compiler produces warnings and errors. Warnings are useful to detect potential problems. Similarly, a wrong lock order (even if not deadlocking) might be something you do not want. And so, having e.g. an option to continue to warn for such a case (i.e. keeping the current helgrind behaviour) would be useful for many users.

>> The trylock is one case of warning, not an error which could/should be improved. But there are others e.g. laog does not understand the concept of guard locks which is (IIUC): each thread can acquire a single lock in a set of locks. If a thread wants to acquire more than one lock (in any order then), it first has to acquire the guard lock, and then can lock in any order any nr of locks in the lock set.
>> With this guard lock, not possible to have a deadlock, but for sure this is not understood by the current helgrind laog algorithm.
>
> Right. That one definitely needs annotations in the source code, I would think. There's no way for the tool to detect that these mutexes all go together.

Search on the web for "good lock algorithm", which solves the problem of guard lock detection in deadlock lock graph analysis.

> I gave it a try, but I'm hitting a problem with exactly that, passing isTryLock to that code. isTryLock is set in HG_PTHREAD_MUTEX_LOCK_PRE and similar, while the above code is called from HG_PTHREAD_MUTEX_LOCK_POST and similar. If I understand correctly, adding an argument to the _POST variant would break source compatibility for the existing userland macros?

I do not think it breaks compatibility, for two reasons:
1. I believe these are system client requests, executed by helgrind interception code, not by client code.
2. It is possible to add an argument, because the requests all have a fixed number of arguments, with unused arguments passed as 0. As long as using such a previously unused argument keeps the same semantics for the 0 value, it is ok to add such arguments. (E.g. in 3.7.0 I added a new argument to the memcheck leak search client request to do a delta leak search, without breaking compatibility: 0 means full leak search, 1 means delta leak search.)

> I'll finish the patch then, but only if you agree with the approach, otherwise this would be dead code, i.e. a wasted effort. At this point you don't seem fully convinced :)

Because I believe the "3 threads, 3 locks" case below is the counter-example which shows that simply adding the trylock in the graph does not work. Note that convincing me is neither a sufficient nor a necessary condition to have a complex helgrind patch accepted :).

>> But I suspect that the insertion of a trylock in the graph might later on cause a 'wrong' cycle to be detected. E.g.
>> (L = lock, T = trylock, L and T followed by lock nr)
>>   threadA L1 T2
>>   threadB L2 L3
>>   threadC L3 L1
>> cannot deadlock (I think :) if threadA releases lock 1 when T2 fails.
>
> Well, if T2 fails then we have no cycle, and if it succeeds we have a real potential deadlock. The question is whether we want to remember the T2 attempt (and warn later) even when T2 fails. I would say, if it failed, it's like it didn't happen.

If T2 succeeds, there is of course no deadlock: in this case, threadA can do the required work, and then release the locks. So, with T2, the above cannot deadlock (if we assume the heuristic of threadA is to release L1 when T2 fails). If the heuristic of threadA is rather to just retry T2, then we have a deadlock. So, in summary:
   if T2 succeeds, for sure no deadlock (at least for this run :).
   if T2 fails, then depending on the threadA heuristic, we might have a deadlock (probably better to call it a livelock, as threadA will continue to burn CPU
Re: [Valgrind-users] Thread names in valgrind and vgdb
On Fri, 2012-11-09 at 07:52 -0700, Tom Tromey wrote:
> Matthias> Using plain gdb, info threads also lists these user-defined names.
> Matthias> But valgrind and gdb+vgdb only show the thread-ids.
>
> Currently I don't think there is a way for vgdb to report this back to gdb.

The thread name could be reported via packet qThreadExtraInfo, which is shown in 'info threads' (this is how Valgrind gdbsrv shows e.g. the Valgrind thread state and id). This extra info is just a string, and so GDB will not be able to understand that it contains a thread name.

Philippe
Re: [Valgrind-users] Fw:valgrind on cortrex-a9 problem
On Thu, 2012-11-15 at 07:49 -0800, John Reiser wrote:
>> ==15447== error 22 Invalid argument
>> ==15447== error VG_(am_shared_mmap_file_float_valgrind) /tmp/vgdb-pipe-shared-mem-vgdb-15447-by-root-on-???
>
> Run with valgrind --trace-syscalls=yes ./maintest (or use strace) to find the system call which gives the error, and perhaps a hint about what is wrong. What is the page size on this system? 4KiB ? 16KiB ?

A possible cause might be the /tmp file system not allowing files mapped in shared memory. If you do not need vgdb (i.e. you are not using vgdb directly and/or the valgrind gdbserver and/or callgrind_control and similar), you can probably bypass the problem by giving --vgdb=no.

It would in any case be worth investigating what is wrong, following the very relevant suggestions given by John.

Philippe
Re: [Valgrind-users] Fw:valgrind on cortrex-a9 problem
On Fri, 2012-11-16 at 09:09 +0800, lchquan wrote:
> By using --vgdb=no, it works .

Good that it bypasses the problem. Still, there is a latent bug in the mmap area which would be nice to understand. I am amazed that --trace-syscalls=yes did not give any output (at least on my setup, it gives a lot of output before it reaches the gdbsrv code). Also, strace -f valgrind ./maintest should be able to tell why the mmap fails.

Philippe
Re: [Valgrind-users] massif only produces one snapshot with 0 memory use
On Mon, 2012-11-26 at 12:46 -0800, Wiser, Tyson wrote:
> Does anyone have any idea what I am doing wrong? I am new to valgrind so I'm sure it is something simple that I have missed.

Not too sure it is very simple. Normally, massif should work with default args. Maybe there is a problem with malloc replacement? Can you try (with 3.8.1) to launch
   valgrind --tool=massif --stats=yes --trace-malloc=yes ... your app ...
and then the same using ls as the app, and see if there are some relevant differences (like no indication that malloc is replaced in your app).

Philippe
Re: [Valgrind-users] massif only produces one snapshot with 0 memory use
On Tue, 2012-11-27 at 23:35 +0100, Philippe Waroquiers wrote:
> On Mon, 2012-11-26 at 12:46 -0800, Wiser, Tyson wrote:
>> Does anyone have any idea what I am doing wrong? I am new to valgrind so I'm sure it is something simple that I have missed.

I just saw your follow-up telling that you have a statically linked library. From Valgrind 3.8.1 onwards, Valgrind can properly work with statically linked malloc libraries thanks to the option
   --soname-synonyms=somalloc=NONE
This option can also be used to support alternative malloc libraries such as tcmalloc. See the user manual for details. I will update the Valgrind FAQ with the above information.

Philippe
Re: [Valgrind-users] massif only produces one snapshot with 0 memory use
On Wed, 2012-11-28 at 11:06 -0800, Wiser, Tyson wrote:
> I tried it with 3.8.1 that I built locally and got the same result (i.e. no profile). The command I used was:
>   valgrind --tool=massif --soname-synonyms=somalloc=NONE ./MyProg

The above is supposed to properly replace a static malloc. I tried this with the Valgrind regression test, and it works. When adding the -v option, successful replacements give lines such as:
   --13571-- REDIR: 0x80483c4 (malloc) redirected to 0x4005fc5 (malloc)
   --13571-- REDIR: 0x80483e3 (free) redirected to 0x40059ec (free)
Can you try with -v and/or with --trace-redir=yes? That might shed some light on the problem.

Philippe
[Valgrind-users] RFC: more flexible way to show or count as error or suppress leak kinds
Currently, Valgrind does not provide a fully flexible way to indicate which leak kinds to show, which leak kinds to consider as an error, and which leak kinds to suppress. This is described, among other places, in bugs 284540 and 307465.

For example, the current options (--show-reachable=yes|no --show-possibly-lost=yes|no) do not allow specifying that reachable blocks should be considered as an error. There is also no way to indicate that possibly lost blocks are not an error (whatever the value of --show-possibly-lost).

Leak suppression entries also currently catch all leak kinds. For example, if you have possibly lost blocks which you want to suppress, the suppression entry will also suppress definitely lost blocks allocated at the same stack trace, thereby hiding real leaks.

The patch attached to bug 307465 implements a flexible way to specify on the command line which leak kinds to show and which leak kinds to consider as an error. It also provides a way to have a leak suppression entry matching only a specific set of leak kinds.

Here are the new command line args:

    --show-leak-kinds=kind1,kind2,..        which leak kinds to show? [definite,possible]
    --errors-for-leak-kinds=kind1,kind2,..  which leak kinds are errors? [definite,possible]
        where kind is one of definite indirect possible reachable all none

(note: old arguments are kept for backward compatibility).

With the patch, a suppression entry now also has an optional line indicating which leak kind(s) are matched by this suppression. For example:

    {
       insert_a_suppression_name_here
       Memcheck:Leak
       match-leak-kinds: possible
       fun:malloc
       fun:mk
       fun:f
       fun:main
    }

(where the optional match-leak-kinds: line can specify leak kinds similarly to the command line options).

When using --gen-suppressions=yes, the match-leak-kinds: line will be produced to match the reported leak kind.

This is not committed (yet); any feedback about the approach is welcome.
Philippe
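To make the proposed syntax a bit more concrete, here is a small Python sketch that expands a kind1,kind2,.. string and renders a suppression entry with the optional match-leak-kinds: line. This is not the patch code: the function names are hypothetical, and how the real option parser treats "all" or "none" mixed with other kinds is an assumption (here each token is simply handled on its own).

```python
LEAK_KINDS = ("definite", "indirect", "possible", "reachable")


def parse_leak_kinds(spec):
    """Expand a 'kind1,kind2,..' string into a set of leak kinds.

    Assumption: 'all' adds every kind and 'none' adds nothing; the
    behaviour of the real parser for mixed lists may differ.
    """
    kinds = set()
    for token in spec.split(","):
        token = token.strip()
        if token == "all":
            kinds.update(LEAK_KINDS)
        elif token == "none":
            continue
        elif token in LEAK_KINDS:
            kinds.add(token)
        else:
            raise ValueError("unknown leak kind: %r" % token)
    return kinds


def render_suppression(name, frames, match_leak_kinds=None):
    """Render a Memcheck:Leak suppression entry as text.

    frames is the list of function names, innermost first. The
    match-leak-kinds: line is optional, as in the proposal.
    """
    lines = ["{", "   " + name, "   Memcheck:Leak"]
    if match_leak_kinds is not None:
        parse_leak_kinds(match_leak_kinds)  # reject unknown kinds early
        lines.append("   match-leak-kinds: " + match_leak_kinds)
    lines.extend("   fun:" + frame for frame in frames)
    lines.append("}")
    return "\n".join(lines)
```

For example, render_suppression("my_supp", ["malloc", "mk", "f", "main"], "possible") gives an entry of the same shape as the one shown in the RFC.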
Re: [Valgrind-users] RFC: more flexible way to show or count as error or suppress leak kinds
On Thu, 2012-11-29 at 08:44 +0100, David Faure wrote:
>> Here are the new command line args:
>>     --show-leak-kinds=kind1,kind2,..        which leak kinds to show? [definite,possible]
>>     --errors-for-leak-kinds=kind1,kind2,..  which leak kinds are errors? [definite,possible]
>>         where kind is one of definite indirect possible reachable all none
> This sounds good, but I'm missing one piece of information: what will the default values be?

The default values are indicated in [] in the --help above. These default values are backward compatible with the current default values.

> It would be good for this to have sane defaults, so that most users don't actually need to specify these options. Would this mean show for possible and error for definite?

It is expected that keeping the same default behaviour as today is the sane default.

Philippe
Re: [Valgrind-users] RFC: more flexible way to show or count as error or suppress leak kinds
On Thu, 2012-11-29 at 06:25 -0800, John Reiser wrote:
> This is good as far as it goes. The presentation in the output from valgrind --help will matter, and so will the explanation given in the user manual. Just finding and understanding the new options is a significant barrier to usability. Try to write things so that applying grep gives good hints about where to read further. (This may include rewriting _other_ pieces in order to reduce false positive usage of key words.)

The patch contains the --help and the updates to the manual. Feedback welcome ...

> More generally, why isn't this controllable by a loadable Python module, complete with defaults (including a complete default error handling module) and introspection? There should be ways to find all existing suppressions, how many times each one has been applied so far, the current traceback, the type of the current error, etc. If coregrind doesn't want to deal with Python, then have gdb do it.

Integrating a Python interpreter inside Valgrind seems to be quite a lot of work. It is not clear to me whether the possible usage(s) would justify it. Using the Python interpreter in GDB (via the Valgrind gdbsrv) is ok as long as it accesses the guest process data. I do not know a way to persuade GDB that the process also has Valgrind tool data (and code).

Philippe
Re: [Valgrind-users] massif only produces one snapshot with 0 memory use
On Thu, 2012-11-29 at 07:30 -0800, Wiser, Tyson wrote:
>> Can you try with -v and/or with --trace-redir=yes ? That might give some lights about the problem ?
> I used both options and it produced the following output. Thanks for taking the time to look at this.

There is an unexpected (or rather missing) behaviour in the trace you have provided. You should find lines such as:

  --3143-- Reading syms from /home/philippe/valgrind/valgrind-3.8.1/install/lib/valgrind/vgpreload_core-amd64-linux.so
  ...
  --3143-- Reading syms from /home/philippe/valgrind/valgrind-3.8.1/install/lib/valgrind/vgpreload_massif-amd64-linux.so

Without these, there is no chance that replacements are done. Maybe an installation problem? Can you try:

  valgrind --tool=massif --soname-synonyms=somalloc=NONE --trace-redir=yes \
      -v -v -v -d -d -d ./MyProg 2>&1 | grep -i preload

This should give something like:

  --31179:2:initimg preload_string:
  --31179:2:initimg /home/philippe/valgrind/valgrind-3.8.1/install/lib/valgrind/vgpreload_core-amd64-linux.so:/home/philippe/valgrind/valgrind-3.8.1/install/lib/valgrind/vgpreload_massif-amd64-linux.so
  --31179-- Reading syms from /home/philippe/valgrind/valgrind-3.8.1/install/lib/valgrind/vgpreload_core-amd64-linux.so
  --31179--TOPSPECS of soname NONE filename /home/philippe/valgrind/valgrind-3.8.1/install/lib/valgrind/vgpreload_core-amd64-linux.so
  --31179-- Reading syms from /home/philippe/valgrind/valgrind-3.8.1/install/lib/valgrind/vgpreload_massif-amd64-linux.so

You should then check that the files referenced by initimg exist and have correct permissions (typically -rwxr-xr-x). If initimg is correct and the files exist with correct permissions, then the mystery deepens. You could try the same with memcheck and see if the preload for memcheck is working (and malloc replacement is properly done).

You could also verify that the regression test for the static malloc replacement works by doing:

  cd memcheck/tests/
  make static_malloc
  valgrind --tool=massif --soname-synonyms=somalloc=NONE --trace-redir=yes \
      -v -v -v -d -d -d ./static_malloc 2>&1 | grep -i preload

Philippe
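The check of the preload files can be scripted. The following Python sketch assumes the debug line has the shape of the initimg line shown in this message (a colon-separated list of absolute paths, possibly wrapped in double quotes, which is an assumption); the helper names are invented for the example.

```python
import os
import re
import stat

# Matches e.g.  --31179:2:initimg /path/vgpreload_core-...so:/path/vgpreload_massif-...so
INITIMG_RE = re.compile(r'^--\d+:\d+:initimg\s+"?(/[^"\s]+)"?$')


def preload_paths(debug_output):
    """Return the list of preload files named on the first matching initimg line."""
    for line in debug_output.splitlines():
        match = INITIMG_RE.match(line.strip())
        if match:
            return match.group(1).split(":")
    return []


def check_preload_files(paths):
    """Return [(path, mode string or None)]; None means the file cannot be stat'ed."""
    report = []
    for path in paths:
        try:
            mode = stat.filemode(os.stat(path).st_mode)
        except OSError:
            mode = None
        report.append((path, mode))
    return report
```

An entry whose mode is None points to a missing or inaccessible preload object; otherwise the mode string can be compared with the typical -rwxr-xr-x.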
Re: [Valgrind-users] mmap fail when running valgrind massif
On Thu, 2012-11-29 at 17:31 +0100, Pedro Larroy wrote:
> Without valgrind everything works fine, it tries to map a file of 20GB or so, might this be the reason?

Yes, it might be the reason. Try to reduce the size to e.g. 1GB and see if it works. The -v -v -v -d -d -d args will also activate tracing for the Valgrind address space manager. This trace could explain why 20GB are not mappable. Also, you should try with the latest version (3.8.1).

Philippe
Re: [Valgrind-users] massif only produces one snapshot with 0 memory use
On Fri, 2012-11-30 at 06:56 -0800, Wiser, Tyson wrote:
> I'm not sure how to interpret these results. Does all this look OK?

Yes, it looks similar to what I see here. To see that the whole replacement thing is working, the command

  valgrind --tool=memcheck --soname-synonyms=somalloc=NONE ./static_malloc

should produce lines telling that the heap was used, e.g.

  ==14445==   total heap usage: 2 allocs, 1 frees, 133 bytes allocated
  ...
  ==14445==    definitely lost: 10 bytes in 1 blocks

If the same lines appear in your case, it means the memcheck replacement works on static_malloc. If the replacement is not done (e.g. by using --soname-synonyms=somalloc=FOO), then you rather have

  ==29278==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated

With a correct replacement, massif should produce a massif.out.x file telling that some memory was allocated. If that works with the static_malloc program but not with your program, then better file a bug in bugzilla, attaching the full output of -v -v -v -d -d -d --trace-redir=yes.

You could also verify that your program is doing a call to malloc (or a similar function) and find the address of this function. A correct redirection for malloc will look like

  --24545-- REDIR: 0x4004e4 (malloc) redirected to 0x4c25c59 (malloc)

with the original address of malloc being found by:

  nm static_malloc | grep malloc
  004004e4 T malloc

Philippe
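The comparison between the nm address and the REDIR line can be done by a few lines of Python. This is only a sketch based on the two outputs quoted above; the function names are hypothetical, and it assumes the addresses printed by nm are the run-time addresses (which should hold for a statically linked, non position independent executable, but not necessarily in other cases).

```python
import re


def nm_address(nm_output, symbol):
    """Return the address of a text symbol from `nm` output, or None."""
    for line in nm_output.splitlines():
        fields = line.split()
        # Expected shape: "004004e4 T malloc"
        if len(fields) == 3 and fields[2] == symbol and fields[1] in ("T", "t", "W", "w"):
            return int(fields[0], 16)
    return None


def redirected_addresses(verbose_log):
    """Collect the original addresses of all REDIR lines in the -v output."""
    pattern = re.compile(r"REDIR:\s+(0x[0-9a-fA-F]+)\s+\(")
    return {int(match.group(1), 16) for match in pattern.finditer(verbose_log)}


def is_redirected(nm_output, verbose_log, symbol="malloc"):
    """True if the address nm reports for symbol appears as a redirected address."""
    address = nm_address(nm_output, symbol)
    return address is not None and address in redirected_addresses(verbose_log)
```

Feeding it the output of nm static_malloc and the -v log of the run should then tell whether the program's own malloc was the one redirected.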
Re: [Valgrind-users] Debugging vmware process
On Wed, 2012-12-05 at 11:57 -0800, John Reiser wrote:
> On 12/05/2012 09:18 AM, Simon Bonello wrote:
>> I am trying to track a memory leak in a vmware service and I tried to use valgrind to track the leak. Unfortunately it is stopping for the following instruction.
>>   vex x86->IR: unhandled instruction bytes: 0xF 0xB 0xFF 0x85
>> I am using a busybox 1.20.0. Help would be highly appreciated.
> Use plain text for posting, not HTML. Which version of valgrind? What was the shell command that you invoked in order to get the error?
> 0x0f 0x0b is 'ud2', the blessed-by-Intel opcode for undefined instruction. It means that the compiler's code generator thought that it was impossible to get to that point. The cause of the real error happened some time ago. So look at the traceback, or sometimes even farther back than that. (Is the traceback from glibc, uClibc, or the app itself?)

You might also look at the thread http://comments.gmane.org/gmane.comp.debugging.valgrind.devel/17911 which speaks about specific vmware stuff and how it was finally supported (without a patch in Valgrind, instead using client requests).

Philippe
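As a small aid when triaging such reports, the following Python sketch extracts the bytes from an "unhandled instruction bytes" message and tells whether they start with the ud2 opcode (0x0f 0x0b) mentioned above. The helper names are invented here, and the regular expression only assumes the message wording quoted in this thread.

```python
import re

UD2 = (0x0F, 0x0B)  # the 'ud2' opcode, as explained in the quoted reply

BYTES_RE = re.compile(r"unhandled instruction bytes:\s*((?:0x[0-9A-Fa-f]+\s*)+)")


def unhandled_bytes(message):
    """Return the instruction bytes of the message as a tuple, or None."""
    match = BYTES_RE.search(message)
    if match is None:
        return None
    return tuple(int(token, 16) for token in match.group(1).split())


def starts_with_ud2(message):
    """True if the reported bytes begin with the ud2 opcode."""
    data = unhandled_bytes(message)
    return data is not None and data[:2] == UD2
```

When starts_with_ud2() is true, the interesting part is usually the traceback leading to that point rather than the instruction itself, as the quoted reply says.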
Re: [Valgrind-users] Can we know more about condition variable being destroyed?
On Mon, 2012-12-17 at 21:03 +0900, ISHIKAWA,chiaki wrote:
> During running mozilla thunderbird mail client under helgrind, I got the following message:
>   ==13832== Thread #1: pthread_cond_destroy: destruction of condition variable being waited upon
> (See mozilla bugzilla entry https://bugzilla.mozilla.org/show_bug.cgi?id=819445 ~nsHTTPListener() destroys condition variable on which other threads are blocked.)
> I wonder if we can learn WHICH THREADs, maybe the thread ids, were waiting on the said condition variable when this message is printed.

Assuming you have a recent version of Valgrind, you can activate the embedded gdbserver and then use GDB to examine the state of all the other threads when the above error is reported. You will then see which threads are waiting on this cond var.

> It would be at least great to learn which thread (maybe tid or its starting address or whatever) is waiting on the condition variable which is being destroyed, and also, it would be insanely great, if we can learn WHICH condition variable exactly is talked about (maybe its address?) is known. I think it is definitely worthwhile if we can print the address of the condition variable being destroyed even if symbolic information is not available because in the same log I often see something about destruction of unknown cond var also. I wonder if we can correlate the addresses printed by these warning messages to see if one thread is prematurely destroying a cond variable which other threads really assume to continue to exist, etc.

Destruction of unknown cond var is probably/maybe bug https://bugs.kde.org/show_bug.cgi?id=307082

> By looking at helgrind/hg_main.c starting at line 2153 (I am quoting the function map_cond_to_CVInfo_delete ( ThreadId tid, void* cond ) below), I think printing the value of 'cond' as address in hexadecimal format would be enough to print the address of condition variable (I am not familiar how to print the symbolic information.)

I have in a corner a patch for helgrind which prints symbolic information for the lock addresses. The patch is not finished yet. It would be worth filing a wish bug in bugzilla telling that helgrind could use --read-var-info=yes to show more info about cond var addresses, lock addresses, etc.

Philippe
Re: [Valgrind-users] Can we know more about condition variable being destroyed?
On Tue, 2012-12-18 at 21:00 +0900, ISHIKAWA,chiaki wrote:
> (2012/12/18 8:07), Philippe Waroquiers wrote:
>> Destruction of unknown cond var is probably/maybe bug https://bugs.kde.org/show_bug.cgi?id=307082
> I have produced a patch to take care of the issue. But before that, I have a question.
> Q1: Why does valgrind not complain if I compile and link Marc's code (in the bug entry which was given as a reminder that unknown cond var may be a bug or false positive.) in the following manner,
>   cc -o /tmp/a.out marc.c

No idea. Maybe a problem of redirection caused by static linking?

> Q2: I have produced a work-in-progress patch to take care of this issue. I wonder if the developers in the know can take a look and improve it. The patch is posted to the bug entry https://bugs.kde.org/show_bug.cgi?id=307082

I took a quick look at the patch; the approach looks ok to me. No time to look more in depth at this now however :(.

> Not sure, though if it works with the initialized data as in
>   pthread_mutex_t mut = PTHREAD_MUTEX_INITIALIZER;
>   pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

I assume there is no function call for the above, so there is no way for Valgrind to know that it is ok to destroy cond.

> [Note: No it can't be obviously. This is because the mapping of cond var to CVInfo structure can not be done explicitly using the timing of pthread_cond_init(). I tested it to confirm this observation by a slight modification of marc's code.]
> Q3:
>> I have in a corner a patch for helgrind which print symbolic information for the lock addresses. Patch not finished yet. Would be worth filing a wish bug in bugzilla telling that helgrind could use --read-var-info=yes to show more info about cond var addresses, lock addresses, etc. Philippe
> You mean helgrind can't use the information obtained by --read-var-info=yes (!?). That is tough, indeed.

helgrind uses --read-var-info=yes to report details about the address involved in a race condition. It does not use it to describe locks, cond vars, etc. For this, it might be good to file a wish bug.

Philippe
Re: [Valgrind-users] Can we know more about condition variable being destroyed?
> helgrind can't really know which task is being removed from the waiting list and so decrementing nWaiters is all it does (I think).

I think it does a lot more (otherwise helgrind could not follow at all what happens with cond variables). See e.g. pthread_cond_wait_WRK

> Also, does anyone have a clever idea about how to debug this situation?

As mentioned previously, if you use vgdb, it should be trivial to find which thread is doing what. E.g. do in a GDB attached to the Valgrind embedded gdbserver:

  thread apply all bt

The stack traces will allow you to determine which thread is waiting on a cond var. You can then examine which cond var is being waited upon.

Philippe
Re: [Valgrind-users] Execution of a dirty helper: atomic?
On Sun, 2012-12-30 at 22:45 +0100, Emilio Coppa wrote:
> Thank both of you for your answers.
>> Each CPU core may switch logical threads only at a superblock boundary, but mutual exclusion between threads on different CPU cores is not guaranteed. For some purposes this will look like interleaving.
> I will be very interested on how memcheck will approach this interleaving :)

Currently, the really multi-threaded valgrind (see https://bugs.kde.org/show_bug.cgi?id=301830) is blocked. There are many global data structures in Valgrind which are not thread safe, but I think most of them can easily be made thread safe (typically by a mutex). However, the main memcheck data structure (which maintains the V-bits) is accessed so often that it is not acceptable (perf wise) to use a mutex. It is not even ok to use an atomic instruction: first tests have shown that having one atomic instruction on this path makes a multi-threaded Valgrind slower than a serialised Valgrind. So, in summary, there is no solution for a multi-threaded memcheck. Ideas welcome :). (A prototype of the none tool worked reasonably well multi-threaded, but that is quite useless.)

Philippe
Re: [Valgrind-users] Valgrind 3.8.0 can not open cache simulation output file(callgrind)
On Tue, 2013-01-15 at 15:15 +0100, Josef Weidendorfer wrote:
> Am 15.01.2013 07:36, schrieb Steph:
>> root@bt:~# ==1975== ==1975== Error: can not open cache simulation output file `/root/callgrind.out.1975'
> Are you allowed to create a file in /root ? Anyway, you should not run valgrind as root. The file is written into the directory where you start callgrind.

Running as root is indeed not ideal. Alternatively, --callgrind-out-file can be used to give the file another directory/name.

Philippe
Re: [Valgrind-users] Helgrind and stack unavailable
On Thu, 2013-01-17 at 19:00 +, Phil Longstaff wrote:
> What is the cause of “stack unavailable”? This error message doesn’t give me enough information to go on, since it doesn’t tell me anything about what the first incorrectly locked mutex is or about the stacks establishing the correct locking order.

I suspect this can happen when the (invalid) locking chain is complex (e.g. sequences more complex than lockA,lockB and lockB,lockA). I have not dug deeper. There is a small helgrind regression test which also causes this output (helgrind/tests/tc14_laog_dinphils.vgtest). It would be nice to investigate further ...

Philippe
Re: [Valgrind-users] Fwd: Running valgrind on WebSphere Server process
On Sun, 2013-01-20 at 01:07 -0700, vijay singh wrote:
> I have a suspected native memory leak in a Java Web Application deployed on WebSphere App Server. We are trying to use valgrind to debug the same. Can anyone help me with the commands to be used to start the server process under valgrind.

What have you tried? What problems have you encountered? You must very probably activate the option related to self modifying code.

Philippe
Re: [Valgrind-users] time-stamp of memory allocation
On Mon, 2013-02-04 at 16:26 +, Rehrmann, Robin wrote:
> printing the result. Since memory leaks can only be detected at the end of a program, these are printed out at the end of the program, so

If you want to find which test specifically leaks some memory (i.e. loses the last pointer to a piece of memory), you can launch a leak search between each test. This can be done either from your program (by calling a client request) or from an external program (e.g. a shell, or the python test driver), using vgdb. The leak report can be incremental (i.e. showing the delta compared with the previous leak search).

> I am fine with the relative time-stamp; I do not need an absolute one!

Otherwise, as explained in another mail, MC_Chunk is one data structure to modify. The time stamp however will be per allocated block, while the leak check results regroup the leaked blocks in loss records. A single error is produced for each loss record. So, there will be (potentially) several leaked blocks mapped to one single loss record (and to one reported error).

At my work, we use the incremental leak report between each test.

Philippe
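A rough sketch of what such an external Python test driver could look like is given below. It is an illustration, not the driver used at my work: the function names are hypothetical, the pid of the process running under Valgrind is assumed to be known, and the monitor command spelling (leak_check full increased, sent through the standalone vgdb) should be checked against the memcheck manual of your Valgrind version.

```python
import subprocess


def leak_check_command(pid, delta="increased"):
    """Build the vgdb command line asking memcheck for a leak search.

    delta selects what is reported relative to the previous leak search.
    """
    if delta not in ("increased", "changed", "any"):
        raise ValueError("unexpected delta mode: %r" % delta)
    return ["vgdb", "--pid=%d" % pid, "leak_check", "full", delta]


def run_tests_with_leak_checks(tests, pid, run=subprocess.run):
    """Run each (name, callable) test, then trigger a leak search.

    `run` is injectable so the driver logic can be exercised without
    a Valgrind process; by default it is subprocess.run.
    """
    for name, test in tests:
        test()
        # Each report then covers what changed since the previous search,
        # so new leaks can be attributed to the test that just ran.
        run(leak_check_command(pid), check=True)
```

With delta="increased", each leak search shows the loss records that increased since the previous search, which is what allows mapping a leak to a specific test.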
Re: [Valgrind-users] [PATCH] Improve errors for use-after-free on memory pools
On Thu, 2013-02-14 at 07:21 +0100, Matthias Schwarzott wrote:
> I will create a bug ticket to track this.

No time for the moment to look at your patch, but it is a good idea to enter a bug in bugzilla with the patch and the before/after diffs for the test.

Philippe
Re: [Valgrind-users] Assertion 'cfsi.len < 5000000' failed
On Wed, 2013-02-20 at 07:09 -0800, Greg Czajkowski wrote:
> If this assertion is not tied to any possible damage it may cause, can it be removed or perhaps turned into a warning?

A warning would be consistent with similar cases in Valgrind. E.g. a warning is produced when the permission of a large address range is changed.

> BTW. After removing the assertion, the process runs much further, but eventually (after a day) valgrind hangs somewhere and stops consuming the CPU. First time ever I have seen such behaviour, at the same time our processes have always stretched valgrind.

You might use gdb+vgdb to connect to Valgrind and examine the state of your process and of the Valgrind scheduler.

Philippe
Re: [Valgrind-users] can't find massif's output file
On Fri, 2013-02-22 at 20:02 -0500, Konstantine Bogach wrote:
> Hi, I am on 3.8.1 now and I can not get massif to produce an output file, neither default name nor specifying it on command line. I terminate my program by sending TERM signal to valgrind process. That worked on 3.4.1 (yes, I don't use valgrind often :) . I would appreciate if someone could enlighten me on how to get that file written.

It works here (f12/x86), for e.g.

  valgrind --tool=massif sleep 100

(both when letting valgrind terminate normally and when terminating it with kill -TERM).

Philippe
Re: [Valgrind-users] valgrind INTERNAL ERROR received a signal 11 (SIGSEGV)
On Wed, 2013-02-27 at 16:29 +0100, Nils Köhler wrote:
>   si_code=80; Faulting address: 0x0; sp: 0x62a9d8f8
>   valgrind INTERNAL ERROR received a signal 11 (SIGSEGV) - exiting
> I have hundreds of these messages with different sp: addresses. Is it an issue in my program or in valgrind? Does anyone have an idea...?

Which version of Valgrind? On which platform?

Philippe
Re: [Valgrind-users] can't find massif's output file
On Wed, 2013-02-27 at 09:58 -0800, John Reiser wrote:
>> I am on 3.8.1 now and I can not get massif to produce an output file, neither default name nor specifying it on command line. I terminate my program by sending TERM signal to valgrind process. That worked on 3.4.1 (yes, I don't use valgrind often :)
>> [snip]
>> it works with sleep or ls but it does not work when I run my program.
> Run valgrind under strace and inspect all file operations:
>   strace -f -o strace.out -e trace=file,close,dup,dup2 \
>       valgrind options ./my_app args
> Look for an extra close(), etc.

You can also start valgrind with some additional tracing options and compare the difference between a kill -TERM of a sleep 1000 under Valgrind and a kill -TERM of your application. For example, do:

  valgrind --tool=massif -v -v -v -d -d -d --trace-signals=yes sleep 1000

Maybe this sheds some light on what goes wrong ...

Philippe
Re: [Valgrind-users] Show where a reference was lost in addition to where it was allocated
On Thu, 2013-02-28 at 14:53 -0800, Kyle Mahan wrote:
> Hi all, I'm wondering if it's possible for memcheck to show the last place that some memory was accessible before being leaked. For example, I would like to see the line numbers for both allocated here and leaked here in the example below.
>
>   int* g() {
>     int* x = new int[256];   // <-- allocated here
>     ...
>     return x;
>   }
>   int f() {
>     int* x = g();
>     ...
>     return 0;                // <-- leaked here
>   }

No, this is not supported. IIRC, there was an experimental tool (omega ?) that was trying to do that. To my knowledge, there were some conceptual problems in it but I do not know the details.

Philippe
Re: [Valgrind-users] floating point print error in ada
On Mon, 2013-03-04 at 22:08 +0100, Roland Mainz wrote:
> On Mon, Mar 4, 2013 at 9:29 PM, Philippe Waroquiers philippe.waroqui...@skynet.be wrote:
>> GNAT runtime is implementing various features (e.g. float images) by using long_long_float, which are 80 bits floats. As these are not properly supported by Valgrind, this can introduce numerical differences/errors.
> Erm... are there any plans to fix this ? AFAIK Ada is not the only user of |long long double| ... some parts of JDK, perl and ksh93 rely on 80-bit IEEE754 math on x86 ...

No, there is no plan to fix this. See https://bugs.kde.org/show_bug.cgi?id=197915#c9

>> See e.g. https://bugs.kde.org/show_bug.cgi?id=130358
> Do you have a full list of bugs for this issue ?

https://bugs.kde.org/show_bug.cgi?id=197915 seems to track a lot of duplicate bugs. But there are probably some not yet marked as duplicate.

Philippe
Re: [Valgrind-users] floating point print error in ada
On Tue, 2013-03-05 at 18:54 +0100, Lionel Cons wrote:
> (1) in https://bugs.kde.org/show_bug.cgi?id=197915#c9 is a joke:
>> Julian Seward 2010-07-12 15:58:25 UTC
>> As per comment #0, adding support for 80-bit floats is low priority, because (1) AIUI the majority of floating point code is portable and restricts itself to 64-bit values,
> The majority of _consumer_ software uses double (aka 64-bit float), but the majority of _scientific_ software (for example the whole NIH bioinformatics software stack or 99.9% of CERN's simulation software) relies on long long double aka 80-bit or 128-bit floats (depending on platform, AMD64 uses 80 bits). valgrind is useless for such software.

I am not too sure about the proportion of consumer software versus scientific software. Assuming there is more consumer software, note (1) above is not such a joke in the end :).

Reading the gcc manual, wouldn't it be a good idea to have the scientific software rewritten (or at least compilable) so as to use sse? The gcc manual says for `sse':
  The resulting code should be considerably faster in the majority of cases and avoid the numerical instability problems of 387 code, but may break some existing code that expects temporaries to be 80bit. This is the default choice for the x86-64 compiler.
And as a bonus, you can run it under Valgrind on x86/amd64 :).

If this scientific code is fully portable, then you could also decide to run it under Valgrind e.g. on ppc32/ppc64 systems.

Note: at my work, we are using Ada/gnat/gcc on x86/amd64. The application code is compiled with sse. We contemplated recompiling and/or changing the Ada runtime to use sse only and fully avoid the 80-bit floats. However, as in the end we found very little impact (at least for our apps) of running the 80-bit runtime on Valgrind, we have kept the default gnat runtime (which uses 80-bit floats here and there). As long as these 80-bit computations still give acceptable results when they are in reality done with 64-bit floats, there is not much impact. YMMV.

Note that we have not got any indication that the original problem of Bob is linked to 80-bit floats. Duncan's trials were successful; I also tried at work and at home, and always obtained the expected answer.

Philippe
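Editor's illustration: a minimal C sketch of the kind of computation that is sensitive to the 64-bit versus 80-bit difference discussed above. The function names are hypothetical and not taken from the thread. It adds 2^-60 to 1.0, which is rounded away in a 53-bit double significand but survives in a 64-bit x87 significand. Per the discussion above, a run under Valgrind on x86/amd64 would presumably behave like the 64-bit case for both functions; that expectation is an assumption, not a measured result.

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>

/* True if adding 2^-60 to 1.0 is visible in double precision.
   A double has a 53-bit significand, so the increment is rounded away. */
bool double_sees_increment(void)
{
    assert(FLT_RADIX == 2);             /* the reasoning assumes binary floats */
    volatile double one = 1.0;          /* volatile forces real 64-bit stores */
    volatile double sum = one + 0x1p-60;
    return sum != one;
}

/* Same test in long double. With the 80-bit x87 format (64-bit significand)
   the increment is representable, so the sum differs from 1.0. */
bool long_double_sees_increment(void)
{
    assert(FLT_RADIX == 2);
    volatile long double one = 1.0L;
    volatile long double sum = one + 0x1p-60L;
    return sum != one;
}
```

Code whose results do not depend on the second function returning true is the kind of code that, as noted above, sees little impact from 80-bit operations being carried out in 64 bits.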
Re: [Valgrind-users] helgrind bug in pthread_cond_destroy (testcase)
On Thu, 2013-03-14 at 18:48 +0100, David Faure wrote:
> The attached testcase (which is simply pthread_cond_init + pthread_cond_destroy), leads to an error in helgrind:
> pthread_cond_destroy: destruction of unknown cond var

Looks like this is https://bugs.kde.org/show_bug.cgi?id=307082, which contains an analysis and has an attached patch. Looking at this patch is on my list of things to do.

> I've seen this forever with helgrind, but it's time to clean this up :) However my debugging got stuck. I found out that
> 1) the call is given a valid condition variable pointer, and it actually succeeds, outside and inside helgrind.
> 2) the error message comes from this line of code: DO_CREQ_v_W(_VG_USERREQ__HG_PTHREAD_COND_DESTROY_PRE, pthread_cond_t*,cond); (hg_intercepts.c:940).
> How do I debug this further? This looks like a hook to me, the actual call is the next line, CALL_FN_W_W(ret, fn, cond), isn't it?

There are two possible levels at which you can debug. You can debug at the application level (guest process), using gdb+vgdb. This debugs the virtual cpu emulated by Valgrind. Or you can debug at the Valgrind level (using gdb directly on the process). This debugs the real cpu.

A client request is issued by the guest process on the virtual cpu, but switches to the real cpu to do the real work (which is then Valgrind code executed by the real cpu). It is not easy to debug both simultaneously, but not impossible: you need two GDBs, one debugging the real cpu, one debugging the virtual cpu. You might need to e.g. avoid vgdb using ptrace syscalls. See README_DEVELOPERS for more details.

Philippe
Re: [Valgrind-users] Potentially lost memory
On Thu, 2013-03-14 at 19:21 +, Phil Longstaff wrote:
> Memcheck will report that memory is potentially lost if there is no pointer to the beginning of a block, but there is an internal pointer. One valid use of an internal pointer is a pointer to a base class in C++. How hard would it be for memcheck to not report a block as being potentially lost if the internal pointer could be a pointer to a base class? Is there sufficient info in the debug information?

No, I do not think so. I think the debug info can only describe stack and global variables, but cannot be used to map a malloc-ed/new-ed memory ptr to a class. IIRC, another leak checker tool (maybe DrMemory?) had a heuristic to guess that an interior pointer was pointing inside such an OO type.

Philippe
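Editor's illustration: a simplified C sketch of the distinction discussed above between a start pointer and an interior pointer. This is not Memcheck's code; the type and function names are hypothetical, and the three states are a reduction of Memcheck's actual leak kinds. A block is treated as reachable if any scanned word equals its start address, as possibly lost if only interior pointers are found, and as lost otherwise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { PTR_NONE, PTR_START, PTR_INTERIOR } ptr_kind;
typedef enum { BLOCK_REACHABLE, BLOCK_POSSIBLY_LOST, BLOCK_LOST } block_state;

/* Classify one candidate word against the block [block_start, block_start + block_size). */
ptr_kind classify_pointer(uintptr_t block_start, size_t block_size, uintptr_t candidate)
{
    assert(block_size > 0);
    if (candidate == block_start)
        return PTR_START;
    if (candidate > block_start && candidate - block_start < block_size)
        return PTR_INTERIOR;
    return PTR_NONE;
}

/* Scan a set of words (e.g. the contents of stacks, globals and live blocks)
   and derive a simplified leak state for one block. */
block_state classify_block(uintptr_t block_start, size_t block_size,
                           const uintptr_t *words, size_t n_words)
{
    block_state state = BLOCK_LOST;
    for (size_t i = 0; i < n_words; i++) {
        switch (classify_pointer(block_start, block_size, words[i])) {
        case PTR_START:
            return BLOCK_REACHABLE;       /* a start pointer settles it */
        case PTR_INTERIOR:
            state = BLOCK_POSSIBLY_LOST;  /* keep looking for a start pointer */
            break;
        case PTR_NONE:
            break;
        }
    }
    return state;
}
```

A pointer to a base-class subobject that does not sit at offset 0 falls in the PTR_INTERIOR case, which is why such blocks end up in the "possibly lost" category.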
Re: [Valgrind-users] memory access outside allocated areas is not detected
On Thu, 2013-03-14 at 13:47 -0700, John Reiser wrote:
> In the NEWS section, Release 3.8.0 (10 August 2012), TOOL CHANGES,
>   * Non-libc malloc implementations are now supported. This is useful for tools that replace malloc (Memcheck, Massif, DRD, Helgrind). Using the new option --soname-synonyms, such tools can be informed that the malloc implementation is either linked statically into the executable, or is present in some other shared library different from libc.so. This makes it possible to process statically linked programs, and programs using other malloc libraries, for example TCMalloc or JEMalloc.
> So valgrind-3.8.0 says that it can handle static linking of malloc, but the user must help.

A little addition: the user must help if malloc is statically linked. However, at least at this moment, the application cannot be *fully* statically linked. The reason is that the Valgrind code that triggers the replacement is itself a shared object, which is LD_PRELOAD-ed.

I still have on my list of low-priority things to do to see if this last limitation can be bypassed. For example, if the tool detects that the replacement code was not LD_PRELOAD-ed, then the tool might mmap the code rwx. This trick might help fully static executables to be better supported. At this stage, it is just a brainstorming idea that will probably explode in pieces once looked at in more depth.

Philippe
Re: [Valgrind-users] Potentially lost memory
On Thu, 2013-03-14 at 14:18 -0700, Patrick J. LoPresti wrote:
> On Thu, Mar 14, 2013 at 1:58 PM, Philippe Waroquiers philippe.waroqui...@skynet.be wrote:
>> On Thu, 2013-03-14 at 19:21 +, Phil Longstaff wrote:
>>> How hard would it be for memcheck to not report a block as being potentially lost if the internal pointer could be a pointer to a base class? Is there sufficient info in the debug information?
>> No, I do not think so.
> Probably correct in general. But... For polymorphic C++ classes -- presumably a common case when you have a pointer to an internal base class -- dynamic_cast<Derived>() has to work somehow. So I would imagine Valgrind could use the same RTTI mechanism. In theory.

The problem is that Valgrind only has a piece of memory. It does not know if this piece of memory is a dynamically allocated C++ object, a dynamically allocated string or a dynamically allocated array of integers. Assuming this piece of memory is a C++ object and starting RTTI on it implies heuristically guessing whether the memory piece looks like a C++ object. Valgrind cannot be sure of that. In other words, Valgrind would do RTTI by doing an unchecked cast of any piece of memory to which it finds an interior pointer.

IIUC, the DrMemory leak checker uses a heuristic: it assumes the V-table pointer is located at the beginning of the piece of memory, and confirms this is a V-table pointer by checking whether it points to an array of words which themselves point into the text segment of the application. I do not have a good knowledge of C++, multiple inheritance v-tables and similar, so I am not sure I properly understand all the above.

Such a heuristic might create false negatives. There are however already false negatives (e.g. any integer might look like a start pointer). E.g. in a 32-bit application which allocates a lot of memory, filled in with a lot of different integers, there is a significant probability of false negatives.
Philippe
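Editor's illustration: a rough C sketch of the V-table heuristic as it is described in the message above. It is not DrMemory's or Valgrind's code. The names addr_range and looks_like_vtable_object, and the idea of passing the readable data range and the text range as parameters, are assumptions made to keep the example self-contained. The block's first word is accepted as a V-table pointer only if it points into readable data and the first few table slots all point into the text segment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Half-open address range [start, end). */
typedef struct { uintptr_t start; uintptr_t end; } addr_range;

/* True if [addr, addr + len) lies inside r. */
static bool range_contains(addr_range r, uintptr_t addr, size_t len)
{
    return addr >= r.start && addr <= r.end && len <= r.end - addr;
}

/* Heuristic: does the block start with something that looks like a V-table pointer?
   readable_data: where a V-table may legitimately live (assumed known by the caller).
   text:          the application's code segment.
   slots_to_check: how many table entries must point into text. */
bool looks_like_vtable_object(const void *block, size_t block_size,
                              addr_range readable_data, addr_range text,
                              size_t slots_to_check)
{
    assert(block != NULL);
    assert(slots_to_check > 0);

    if (block_size < sizeof(uintptr_t))
        return false;

    uintptr_t first_word;
    memcpy(&first_word, block, sizeof first_word);

    /* A V-table pointer is word aligned and must be safe to dereference. */
    if (first_word % sizeof(uintptr_t) != 0)
        return false;
    if (!range_contains(readable_data, first_word, slots_to_check * sizeof(uintptr_t)))
        return false;

    const uintptr_t *table = (const uintptr_t *)first_word;
    for (size_t i = 0; i < slots_to_check; i++) {
        if (!range_contains(text, table[i], 1))
            return false;   /* slot does not point at code */
    }
    return true;
}
```

As the message notes, this remains a guess: an array of integers whose first word happens to satisfy the checks would be misclassified, and the sketch does not attempt to handle multiple inheritance layouts.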
Re: [Valgrind-users] Feature request - backtrace on warning set address range perms
On Tue, 2013-03-26 at 14:30 +0100, Jonatan Wallmander wrote:
> Feature-request: Add backtrace to output for these warnings:
> Warning: set address range perms: large range [0x4c339040, 0x206094130) (undefined)

Probably not difficult to implement, however, see below ...

> Explanation: This was a large allocation which ate up all my memory when debugging with valgrind, making it hard for me to find where it happened. The reason was an undefined integer in a class (uninitialized memory).

However, the root cause is that Valgrind should report an error when malloc is called with an undefined size argument.

> Valgrind only gives this one line warning (which is good), but it would also be nice if it would give a backtrace for this... Might have helped me to track down this bug sooner :)

The 'large range' message was produced thanks to some luck. The best would be to file a bug in bugzilla, for the false negative caused by code such as:
  { size_t undef; char *p = malloc (undef); }

Thanks

Philippe
Re: [Valgrind-users] Feature request - backtrace on warning set address range perms
On Tue, 2013-03-26 at 22:52 +0100, Philippe Waroquiers wrote:
> The best would be to file a bug in bugzilla, for the false negative caused by code such as:
>   { size_t undef; char *p = malloc (undef); }

Too late to file a bug in bugzilla :). An improvement has been committed in revision 13361. False negatives in the replaced malloc library functions should be solved now. E.g. a call to malloc with an undefined size will now cause an error to be detected by memcheck.

Philippe
Re: [Valgrind-users] Debugging a Tcl extension with a multi-thread program
On Fri, 2013-04-12 at 11:03 +0200, MOULINIER Luc (UDS) wrote:
> As my program is multi-threaded, a normal run of valgrind doesn't work.

Valgrind is able to run a multi-threaded program (even if it serialises the execution, i.e. even on a multi-cpu machine, only one thread runs at any given time). However, if you want to debug your program when Valgrind encounters an error, then indeed you have to use GDB+vgdb.

> gdb /usr/local/ActiveTcl/bin/tclsh
> and then
> target remote | vgdb
> but then I can't give a
> run /home/moumou/ordali/src/ordali.tcl exe ThrSco
> command to gdb! run is not allowed using vgdb

Indeed, the GDB command run cannot be used with the Valgrind gdbserver: you must first launch your program using
  valgrind --vgdb-error=0 your_program your_program_args
then connect GDB to the Valgrind gdbserver using vgdb. You then use the GDB commands continue, next or step to allow the execution of your_program to proceed.

Philippe
Re: [Valgrind-users] Valgrind failure report
On Wed, 2013-04-17 at 13:07 +0800, Ice Frog wrote:
> I'm using Valgrind for profiling, but it reported failures (lots of failures with the same error message) as below:
> Thread 295: status = VgTs_WaitSys

There is very little info here with which help can be provided. Are you using callgrind or cachegrind for cpu profiling? Or massif or dhat for heap profiling?

The above msg looks to be an extract from a bigger set of information: looking in the valgrind code, I am guessing that the only piece of code that can produce the above 'Thread status=' message is the function 'VG_(show_sched_status)'. This function is called either explicitly on user request (which I guess is not your case) or when some kind of internal error is detected by Valgrind, causing an abort of Valgrind. In such cases, an error msg explaining what the internal error is gets produced before or after the status of all threads.

The best way to provide some info is to attach the full output of your run (as is) and/or of another run with some additional tracing arguments (e.g. -v -v -v -d -d -d). Or at least examine the surroundings of these 'Thread status' messages to see if there is an indication of an internal error. Just having a thread status as shown above gives no idea about what problem has been encountered.

Philippe
Re: [Valgrind-users] multiple cores being used?
On Thu, 2013-04-18 at 08:50 -0700, Brian Budge wrote:
> Hi Paul. I am at 20% of memory use. I should also note that I followed Julian's advice for increasing vg_n_segments and memory size to 128 GB. Does valgrind itself do anything multithreaded? My program uses all cores on the machine at various stages of the program running. But my understanding was that it was always serialized while running memcheck.

In fact, any tool (almost fully) serialises the execution: only one thread can execute code at a time. The only parallelism is with the threads that are executing a system call.

Some prototyping was done of a non-serialised valgrind. See https://bugs.kde.org/show_bug.cgi?id=301830 and the MTV branch in svn. This prototype is not usable in its current state: only the none tool was used. There are still many thread-unsafe things. The performance of the memcheck data structures is, among others, a difficult problem to look at.

Philippe
Re: [Valgrind-users] multiple cores being used?
On Tue, 2013-04-23 at 13:08 -0700, Brian Budge wrote:
>> Some prototyping was done of a non serialised valgrind. See https://bugs.kde.org/show_bug.cgi?id=301830 and the MTV branch in svn. This prototype is not usable in its current state: only the none tool was used. There are still many thread unsafe things. The performance of the memcheck data structures is a.o. a difficult problem to look at.
>> Philippe
> Hi Philippe - I can definitely understand that :) I wasn't really suggesting that it should do so (though, of course, any performance improvement would be great). I was more just confused that memcheck was using so many cores concurrently.

Yes, it is not easy to explain, unless you have a lot of threads, with many of them executing (heavy) system calls (but I am still amazed that you could obtain something like 1000% of cpu).

Philippe
Re: [Valgrind-users] Valgrind on custom stack, without libc and signal delivering
On Sun, 2013-05-05 at 18:20 +0400, Anton Kozlov wrote:
> So, the question is, what mechanism can be used to make bare version act like libc one? I've tried to do STACK_REGISTER, but it brought no success.

Increase the size of the stack. If you put 0x1 instead of 0x1000, then both versions are working (with or without libc).

From what I understood debugging with GDB+vgdb, the difference in behaviour is due to the different way the char stack[STACK_SZ]; array is mapped, and to the value of the stack ptr when SIGCHLD is received. It looks like the char stack array is partially located in a file rw mapping, and partially in an anon mapping. When the signal is received, if the sigframe overlaps with the part of the stack in the file mapping, then Valgrind will believe it has to grow the stack. With the libc version, the sigframe can be fully in the anon segment, not with the bare version. (I think the sigframe is about 1800 bytes.) (I was amazed to see that one single array can be mapped in two different segments.) But the below seems to indicate that quite clearly:

bare:
**3740** stack: id=1, begin=0x80493A0, end=0x804A3A0
--3740:0:aspacem1: 000400-0008047fff 64m
--3740:0:aspacem2: file 0008048000-0008048fff 4096 r-xT- d=0xfd00 i=705272 o=0 (1)
--3740:0:aspacem3: file 0008049000-0008049fff 4096 rw--- d=0xfd00 i=705272 o=0 (1)
--3740:0:aspacem4: anon 000804a000-000804afff 4096 rw---
--3740:0:aspacem5: anon 000804b000-000804bfff 4096 rwx--
--3740:0:aspacem6: RSVN 000804c000-000884afff 8384512 - SmLower
(gdb) p $sp /// when SIGCHLD rcvd
$1 = (void *) 0x804a368 <stack+4040>

with libc:
**3721** stack: id=1, begin=0x8049860, end=0x804A860
--3721:0:aspacem 18: 0004028000-0008047fff 64m
--3721:0:aspacem 19: file 0008048000-0008048fff 4096 r-xT- d=0xfd00 i=705463 o=0 (1)
--3721:0:aspacem 20: file 0008049000-0008049fff 4096 rw--- d=0xfd00 i=705463 o=0 (1)
--3721:0:aspacem 21: anon 000804a000-000804afff 4096 rw---
--3721:0:aspacem 22: anon 000804b000-000804bfff 4096 rwx--
--3721:0:aspacem 23: RSVN 000804c000-000884afff 8384512 - SmLower
--3721:0:aspacem 24: 000884b000-0037ff759m
(gdb) p $sp /// when SIGCHLD rcvd
$1 = (void *) 0x804a854 <stack+4084>

If you slightly extend the stack needed with the libc version, then you get the same behaviour:
==3894== Can't extend stack to 0x8049fa0 during signal delivery for thread 1:
==3894== no stack segment
(I have introduced a function void bpause() { char truc[400]; pause(); } and main calls bpause instead of pause.)

Philippe
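Editor's illustration: the arithmetic behind the explanation above, as a small C sketch. The function name sigframe_fits is hypothetical, and the 1800-byte sigframe size is the approximate figure given in the message, not a measured value. Using the stack pointers from the two traces and the start of the anon segment (0x804a000), the bare version's frame would start at 0x804a368 - 1800 = 0x8049c60, inside the file mapping, while the libc version's frame would start at 0x804a854 - 1800 = 0x804a14c, still inside the anon segment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if a signal frame of frame_size bytes pushed below sp stays at or
   above segment_start, i.e. entirely inside the segment that contains sp. */
bool sigframe_fits(uintptr_t sp, uintptr_t frame_size, uintptr_t segment_start)
{
    assert(sp >= frame_size);   /* the subtraction must not wrap around */
    return sp - frame_size >= segment_start;
}
```

Subtracting a further 400 bytes from the libc stack pointer (roughly the effect of the char truc[400] array in bpause) gives 0x804a6c4 - 1800 = 0x8049fbc, which again falls below 0x804a000. That is in the same area as the 0x8049fa0 reported in the "Can't extend stack" message; the small difference would be explained by the approximate frame size.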
Re: [Valgrind-users] Fwd: __malloc_initialize_hook is deprecatedco. warnings while building ast-open.2013-04-22 ...
On Tue, 2013-05-14 at 04:28 +0200, Roland Mainz wrote:
> On Thu, Apr 25, 2013 at 1:42 PM, Sebastian Feld sebastian.n.f...@gmail.com wrote:
>> On Wed, Apr 24, 2013 at 11:10 PM, Roland Mainz roland.ma...@nrubsig.org wrote:
>>> On Wed, Apr 24, 2013 at 10:14 PM, Roland Mainz roland.ma...@nrubsig.org wrote:
>>>> On Wed, Apr 24, 2013 at 12:45 AM, John Reiser jrei...@bitwagon.com wrote:
>>> $ valgrind --allocator-sym-redirect=sh_malloc=malloc,sh_free=free,sh_calloc=calloc ... #
>>> would instruct valgrind to take function |sh_malloc()| as an alternative |malloc()|, |sh_free()| as alternative |free()| version etc. etc. The only issue is that if multiple allocators are active within a single process we may need some kind of grouping to explain to valgrind that memory allocated by |sh_malloc()| can not be freed by |tcfree()| or |_ast_free()| ... maybe it could be done using '{'- and '}'-pairs, e.g.
>>> $ valgrind --allocator-sym-redirect={sh_malloc=malloc,sh_free=free,sh_calloc=calloc},{_ast_malloc=malloc,_ast_free=free,_ast_calloc=calloc} ... #
>> The idea of (finally!) providing such an option sounds like a very good idea. Until now the only way to probe python and bash4 via valgrind is to poke in the valgrind sources (which should never happen).

I think it would not be very difficult to extend the command line option

  --soname-synonyms=syn1=pattern1,syn2=pattern2,...
      synonym soname specify patterns for function wrapping or replacement.
      To use a non-libc malloc library that is
        in the main exe: --soname-synonyms=somalloc=NONE
        in libxyzzy.so:  --soname-synonyms=somalloc=libxyzzy.so

so that it also supports giving a synonym for the function part of a redirection. Now that I understand all this area better, it should be relatively easy to allow giving synonyms for any (existing) redirection (library part or function part). In other words, to make --soname-synonyms generic.

>> I also think the idea to let valgrind detect mixing of different allocators is a very valuable feature since this has been a source of more and more bugs. It usually happens in complex projects which use many different shared libraries, all with their own memory allocators.

However, this part is not as easy. It implies changing the basic way the malloc interception is done, by adding an additional grouping parameter and storing it in each memory chunk managed by memcheck. That means more impact on memory and on the interface between the core and the tools replacing malloc, and it is a lot more difficult to make generic. I suspect this will also imply some possibly heavy changes to the core redirection logic.

> Uhm... was there any feedback yet for that idea?

Some feedback above :).

Philippe
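Editor's illustration: a toy C sketch of the "grouping parameter stored in each memory chunk" idea discussed above. This is not how Memcheck tracks chunks; chunk_table, record_alloc, check_free and the integer group ids are all hypothetical names introduced for the example. Each recorded allocation remembers which allocator group produced it, and a free through a different group is reported as a mismatch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { MAX_CHUNKS = 64 };

typedef struct {
    const void *addr;   /* start address of the chunk */
    int group;          /* allocator group that created it */
    bool live;          /* false once freed, slot can be reused */
} chunk_record;

typedef struct {
    chunk_record chunks[MAX_CHUNKS];
    size_t count;
} chunk_table;

typedef enum { FREE_OK, FREE_UNKNOWN_BLOCK, FREE_GROUP_MISMATCH } free_result;

/* Remember that addr was allocated by the given allocator group.
   Returns false if the fixed-size table is full. */
bool record_alloc(chunk_table *t, const void *addr, int group)
{
    assert(t != NULL && addr != NULL);
    for (size_t i = 0; i < t->count; i++) {
        if (!t->chunks[i].live) {               /* reuse a freed slot */
            t->chunks[i] = (chunk_record){ addr, group, true };
            return true;
        }
    }
    if (t->count == MAX_CHUNKS)
        return false;
    t->chunks[t->count++] = (chunk_record){ addr, group, true };
    return true;
}

/* Check a free of addr performed through the given allocator group.
   On a mismatch the chunk stays live, so a later correct free still succeeds. */
free_result check_free(chunk_table *t, const void *addr, int group)
{
    assert(t != NULL);
    for (size_t i = 0; i < t->count; i++) {
        chunk_record *c = &t->chunks[i];
        if (c->live && c->addr == addr) {
            if (c->group != group)
                return FREE_GROUP_MISMATCH;
            c->live = false;
            return FREE_OK;
        }
    }
    return FREE_UNKNOWN_BLOCK;
}
```

The extra group field per chunk is the memory cost mentioned in the message; the harder parts it describes (the core/tool interface and the redirection logic) are outside the scope of this sketch.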
Re: [Valgrind-users] Valgrind shows Invalid write os size 4 for memory allocated for the stack
On Mon, 2013-06-10 at 01:23 -0700, mnaret wrote:
> Hello, Recently I'm getting lots of invalid read/invalid write valgrind errors which point at memory allocated for the stack. However the code doesn't crash and finishes running successfully. I'm trying to understand where the error comes from, and will be grateful for any help with this issue.

Do you have a small (compilable) reproducer?

Philippe