Re: [Valgrind-users] FW: Helgrind to-do list
On Friday 27 September 2013 17:51:19 Phil Longstaff wrote: From: David Faure [mailto:fa...@kde.org] Sent: Friday, September 27, 2013 11:18 AM To: Phil Longstaff Cc: valgrind-users@lists.sourceforge.net Subject: Re: [Valgrind-users] FW: Helgrind to-do list On Friday 27 September 2013 15:01:54 Phil Longstaff wrote: I was thinking about this one last night, and it's trickier than I first thought. L = lock, T = trylock Thread1: L1 L2 Thread2: L2 T1 Not a deadlock because the trylock will just fail. However, suppose we have: Thread1: L1 L2 Thread2: L2 T1 And then later: Thread 3: L1 L2 When helgrind handles L2, it would already find the graph edge L1 - L2 so wouldn't it just return since that is the correct order? David sent me some past e-mail and I saw some comments about putting lock vs trylock into the graph. Seems to me that when processing T2, helgrind would not report a problem, but would add the T2 - L1 link, and would also need to ensure that if L1 - L2 happens in the future, it is reported. A failing trylock cannot create a dead lock. Only a succesful one, can. So the question is whether we can assume a failed trylock could have succeeded in creating a deadlock if it hadn't failed. In your particular example, we can. In many other cases, we can't know that what happened after the failed trylock would have happened too, if it had succeded. There could be an if() statement :-) So IMHO it's much easier to just drop failed trylocks and only remember successful ones, but yes, one can refine that for the case above, i.e. when the failed-trylock is the last thing in the chain. If anything happens *after* a failed trylock, then one can't store a T1 - anything link. I agree. My question is what we should store after a *successful* trylock. Suppose helgrind sees these threads: Thr1: L1 L2 So, it should add L1 - L2 to the graph Thr2: L1 L2 Assume thr1 has unlocked both L1 and L2. Since L1 - L2 already exists, does helgrind do anything? Surely not, since L1 - L2 is already there? Thr3: L2 T1 (unsuccessful) No deadlock can occur, no edge should be added, no report. Right. Thr3: L2 T1 (successful) So, is your suggestion that L2 - L1 should be added here, should be indistinguishable from L1 - L2 added previously, and that the report should happen here, even though this operation would not be the one that causes the deadlock? You're right. The whole point of trylock is to not deadlock :) What thread 3 is doing, is valid. The difference between L2 L1 and L2 T1(successful) is that the first one should lead to a report and the second one shouldn't. But once that step is done, in both cases we want L2 - L1 in the graph. So that Thr 1: L2 T1 (successful) Thr 2: L1 L2 (after thread 1) gives a report of wrong lock order. -- David Faure, fa...@kde.org, http://www.davidfaure.fr Working on KDE, in particular KDE Frameworks 5 -- October Webinars: Code for Performance Free Intel webinars can help you accelerate application performance. Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from the latest Intel processors and coprocessors. See abstracts and register http://pubads.g.doubleclick.net/gampad/clk?id=60133471iu=/4140/ostg.clktrk ___ Valgrind-users mailing list Valgrind-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/valgrind-users
Re: [Valgrind-users] FW: Helgrind to-do list
I was thinking about this one last night, and it's trickier than I first thought. L = lock, T = trylock Thread1: L1 L2 Thread2: L2 T1 Not a deadlock because the trylock will just fail. However, suppose we have: Thread1: L1 L2 Thread2: L2 T1 And then later: Thread 3: L1 L2 When helgrind handles L2, it would already find the graph edge L1 - L2 so wouldn't it just return since that is the correct order? David sent me some past e-mail and I saw some comments about putting lock vs trylock into the graph. Seems to me that when processing T2, helgrind would not report a problem, but would add the T2 - L1 link, and would also need to ensure that if L1 - L2 happens in the future, it is reported. -Original Message- From: David Faure [mailto:fa...@kde.org] Sent: Friday, September 27, 2013 9:38 AM To: valgrind-users@lists.sourceforge.net Cc: Phil Longstaff Subject: Re: [Valgrind-users] FW: Helgrind to-do list On Wednesday 25 September 2013 18:24:59 Phil Longstaff wrote: * Don't update the lock-order graph, and don't check for errors, when a try-style lock operation happens (e.g. pthread_mutex_trylock). Such calls do not add any real restrictions to the locking order, since they can always fail to acquire the lock, resulting in the caller going off and doing Plan B (presumably it will have a Plan B). Doing such checks could generate false lock-order errors and confuse users. Assuming this one is what you numbered #4 (i.e. that you started to count at 1 and not 0) :-), then it's something I had started a long time ago, details are in the archive for this mailing-list (Nov 2012) (actually let me attached the mails here for convenience), and the testcase is at https://bugs.kde.org/show_bug.cgi?id=243232 I would be very very glad if you could take over, I lack time and valgrind knowledge. I'm also attaching my very preliminary very old patch for it. IIRC it needs to be updated to actually implement what was discussed in the attached emails, in addition to making it work in the first place... -- David Faure, fa...@kde.org, http://www.davidfaure.fr Working on KDE, in particular KDE Frameworks 5 -- October Webinars: Code for Performance Free Intel webinars can help you accelerate application performance. Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from the latest Intel processors and coprocessors. See abstracts and register http://pubads.g.doubleclick.net/gampad/clk?id=60133471iu=/4140/ostg.clktrk ___ Valgrind-users mailing list Valgrind-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/valgrind-users
Re: [Valgrind-users] FW: Helgrind to-do list
On Friday 27 September 2013 15:01:54 Phil Longstaff wrote: I was thinking about this one last night, and it's trickier than I first thought. L = lock, T = trylock Thread1: L1 L2 Thread2: L2 T1 Not a deadlock because the trylock will just fail. However, suppose we have: Thread1: L1 L2 Thread2: L2 T1 And then later: Thread 3: L1 L2 When helgrind handles L2, it would already find the graph edge L1 - L2 so wouldn't it just return since that is the correct order? David sent me some past e-mail and I saw some comments about putting lock vs trylock into the graph. Seems to me that when processing T2, helgrind would not report a problem, but would add the T2 - L1 link, and would also need to ensure that if L1 - L2 happens in the future, it is reported. A failing trylock cannot create a dead lock. Only a succesful one, can. So the question is whether we can assume a failed trylock could have succeeded in creating a deadlock if it hadn't failed. In your particular example, we can. In many other cases, we can't know that what happened after the failed trylock would have happened too, if it had succeded. There could be an if() statement :-) So IMHO it's much easier to just drop failed trylocks and only remember successful ones, but yes, one can refine that for the case above, i.e. when the failed-trylock is the last thing in the chain. If anything happens *after* a failed trylock, then one can't store a T1 - anything link. -- David Faure, fa...@kde.org, http://www.davidfaure.fr Working on KDE, in particular KDE Frameworks 5 -- October Webinars: Code for Performance Free Intel webinars can help you accelerate application performance. Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from the latest Intel processors and coprocessors. See abstracts and register http://pubads.g.doubleclick.net/gampad/clk?id=60133471iu=/4140/ostg.clktrk ___ Valgrind-users mailing list Valgrind-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/valgrind-users
Re: [Valgrind-users] FW: Helgrind to-do list
On Fri, 2013-09-27 at 15:01 +, Phil Longstaff wrote: I was thinking about this one last night, and it's trickier than I first thought. L = lock, T = trylock Thread1: L1 L2 Thread2: L2 T1 Not a deadlock because the trylock will just fail. However, suppose we have: Thread1: L1 L2 Thread2: L2 T1 And then later: Thread 3: L1 L2 When helgrind handles L2, it would already find the graph edge L1 - L2 so wouldn't it just return since that is the correct order? David sent me some past e-mail and I saw some comments about putting lock vs trylock into the graph. Seems to me that when processing T2, helgrind would not report a problem, but would add the T2 - L1 link, and would also need to ensure that if L1 - L2 happens in the future, it is reported. Did not see much of the attached mails or discussions, so I guess the reference to the discussions on valusers is: http://thread.gmane.org/gmane.comp.debugging.valgrind/12616 Philippe -- October Webinars: Code for Performance Free Intel webinars can help you accelerate application performance. Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from the latest Intel processors and coprocessors. See abstracts and register http://pubads.g.doubleclick.net/gampad/clk?id=60133471iu=/4140/ostg.clktrk ___ Valgrind-users mailing list Valgrind-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/valgrind-users
[Valgrind-users] FW: Helgrind to-do list
Travis has commented about the need for #2 for a few suppressions. I have at least 1 case of #1 in the PDB tree. From: Phil Longstaff Sent: Wednesday, September 25, 2013 2:24 PM To: valgrind-users@lists.sourceforge.net Subject: Helgrind to-do list The bottom of the helgrind section of the manual has: * For lock order errors, print the complete lock cycle, rather than only doing for size-2 cycles as at present. * The conflicting access mechanism sometimes mysteriously fails to show the conflicting access' stack, even when provided with unbounded storage for conflicting access info. This should be investigated. * Document races caused by GCC's thread-unsafe code generation for speculative stores. In the interim see http://gcc.gnu.org/ml/gcc/2007-10/msg00266.html and http://lkml.org/lkml/2007/10/24/673. * Don't update the lock-order graph, and don't check for errors, when a try-style lock operation happens (e.g. pthread_mutex_trylock). Such calls do not add any real restrictions to the locking order, since they can always fail to acquire the lock, resulting in the caller going off and doing Plan B (presumably it will have a Plan B). Doing such checks could generate false lock-order errors and confuse users. * Performance can be very poor. Slowdowns on the order of 100:1 are not unusual. There is limited scope for performance improvements. I'm interested in solutions for a couple of those (#1, #2 and #4). Is any work being done on them? If I were to try to tackle them, can someone at least point me in the right place in the code? Phil -- October Webinars: Code for Performance Free Intel webinars can help you accelerate application performance. Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from the latest Intel processors and coprocessors. See abstracts and register http://pubads.g.doubleclick.net/gampad/clk?id=60133471iu=/4140/ostg.clktrk___ Valgrind-users mailing list Valgrind-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/valgrind-users