Re: [Valgrind-users] mysterious failures when using valgrind

2013-09-27 Thread Julian Seward
On 09/26/2013 04:47 PM, John Reiser wrote:
 The likely cause is __float128 operations being performed as double
 precision of two __float80 by the Intel math library for x86_64.
 Memcheck-3.8.1 implements __float80 operations as __float64 (ordinary
 IEEE-754 'double'.)
 
 Thanks for analyzing this! I assume this means that a fix will be rather
 complex?
 
 Nearly every user whose programs utilize 80-bit x86 floating point
 is disappointed by memcheck's 64-bit implementation of 80-bit operations.
 This situation is many years old.  The fix requires a major effort
 of design and implementation.

I'd say it would take about 2-3 weeks for a developer who is familiar
with the VEX IR and the x86_64 front and back ends to do this.  It
is complex in that it requires changes to the front end, back end, and
to register allocation.  It would also be necessary to check that the
changes don't cause performance regressions for (real) 64-bit FP insns
on x86_64.

So it's not impossible, but given the number and urgency of some of the
other bugs we're faced with, it has so far been difficult to make a case
for allocating developer resources to this.

J


--
October Webinars: Code for Performance
Free Intel webinars can help you accelerate application performance.
Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from 
the latest Intel processors and coprocessors. See abstracts and register 
http://pubads.g.doubleclick.net/gampad/clk?id=60133471iu=/4140/ostg.clktrk
___
Valgrind-users mailing list
Valgrind-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/valgrind-users


Re: [Valgrind-users] mysterious failures when using valgrind

2013-09-27 Thread Peter van Hoof
Hi John,

 The likely cause is __float128 operations being performed as double
 precision of two __float80 by the Intel math library for x86_64.
 Memcheck-3.8.1 implements __float80 operations as __float64 (ordinary
 IEEE-754 'double'.)

 Thanks for analyzing this! I assume this means that a fix will be rather
 complex?

 Nearly every user whose programs utilize 80-bit x86 floating point
 is disappointed by memcheck's 64-bit implementation of 80-bit operations.
 This situation is many years old.  The fix requires a major effort
 of design and implementation.

If you don't mind me saying so, this is a pretty incomprehensible design 
decision. This is virtually guaranteed to change the behavior of the 
code, which I would think is a big no-no for a debugging tool. But I 
guess we need to deal with what we have now...
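
To illustrate how the behavior can change, here is a minimal sketch. It
assumes an x86_64 compiler where long double is the 80-bit x87 format, and
the helper name measured_mant_digits is made up for this example. It counts
how many significand bits long double arithmetic actually delivers. Run
natively, the result should match LDBL_MANT_DIG (64 on that platform). Under
an emulation that computes 80-bit operations in 64-bit double, one would
expect about 53 instead.

```c
#include <float.h>

/* Count the significand bits that long double arithmetic really provides.
   eps is halved until adding eps/2 to 1 no longer changes the result.
   The variables are volatile so each step is rounded to long double. */
static int measured_mant_digits(void)
{
    volatile long double one = 1.0L;
    volatile long double eps = 1.0L;
    volatile long double sum;
    int digits = 1;

    for (;;) {
        sum = one + eps / 2.0L;
        if (sum == one)
            break;              /* eps/2 is below the last significand bit */
        eps = eps / 2.0L;
        digits++;
    }
    return digits;              /* compare against LDBL_MANT_DIG */
}
```

A unit test that compares measured_mant_digits() with LDBL_MANT_DIG is one
cheap way to detect that extended precision is not being honored.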

So I see only two options:

- disable the unit tests that fail when running under valgrind.
- switch to gcc's libquadmath. A casual inspection suggests that this is 
based on gmp. It may well be slower than Intel's implementation 
though... I would also need to test if it is mature enough by now.
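
The first option can be sketched as follows. RUNNING_ON_VALGRIND is the
client-request macro from valgrind/valgrind.h; everything else here (the
function names and the fallback definition for builds without the header) is
an assumption made for this example, not part of any existing test suite.

```c
/* Use the real macro when the Valgrind headers are installed. */
#if defined(__has_include)
#  if __has_include(<valgrind/valgrind.h>)
#    include <valgrind/valgrind.h>
#  endif
#endif
#ifndef RUNNING_ON_VALGRIND
#  define RUNNING_ON_VALGRIND 0   /* assumed fallback: header not available */
#endif

/* Returns 1 when the process runs under Valgrind, 0 otherwise. */
static int skip_extended_precision_tests(void)
{
    return RUNNING_ON_VALGRIND != 0;
}

/* Hypothetical test wrapper: returns 1 if the test body ran, 0 if skipped. */
static int run_quad_precision_test(void)
{
    if (skip_extended_precision_tests())
        return 0;               /* skipped: 80-bit results are not reliable */
    /* ... the precision-sensitive assertions would go here ... */
    return 1;
}
```

Skipping at run time keeps a single binary for native and Valgrind runs, so
the remaining tests are still checked by memcheck.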

 If all of your use of 80-bit operations on x86 is indirect as the result
 of __float128, then perhaps you could run on s390, where memcheck has
 good support for the 128-bit hardware floating point.

Unfortunately I do not have access to such a platform. I am very 
surprised that there even is hardware support for 128-bit FP. I always 
thought that that would be too much of a fringe market to be profitable.


Cheers,

Peter.

-- 
Peter van Hoof
Royal Observatory of Belgium
Ringlaan 3
1180 Brussel
Belgium
http://homepage.oma.be/pvh



[Valgrind-users] does valgrind support ARM9 based processor

2013-09-27 Thread Ratheendran R
Hi All,

I am new to this group.

I would like to use valgrind on an ARM926 Samsung board and an ARM Cortex-A9
i.MX6 Freescale board, so kindly let me know what valgrind support is
available for the ARM family.

Thanks in advance.
Ratheendran


Re: [Valgrind-users] FW: Helgrind to-do list

2013-09-27 Thread Phil Longstaff
I was thinking about this one last night, and it's trickier than I first 
thought.

L = lock, T = trylock
Thread1: L1 L2
Thread2: L2 T1

Not a deadlock because the trylock will just fail.  However, suppose we have:

Thread1: L1 L2
Thread2: L2 T1

And then later:

Thread 3: L1 L2

When helgrind handles L2, it would already find the graph edge L1 -> L2 so 
wouldn't it just return since that is the correct order?  David sent me some 
past e-mail and I saw some comments about putting lock vs trylock into the 
graph.  Seems to me that when processing T2, helgrind would not report a 
problem, but would add the T2 -> L1 link, and would also need to ensure that if 
L1 -> L2 happens in the future, it is reported.

-----Original Message-----
From: David Faure [mailto:fa...@kde.org] 
Sent: Friday, September 27, 2013 9:38 AM
To: valgrind-users@lists.sourceforge.net
Cc: Phil Longstaff
Subject: Re: [Valgrind-users] FW: Helgrind to-do list

On Wednesday 25 September 2013 18:24:59 Phil Longstaff wrote:
 * Don't update the lock-order graph, and don't check for errors,
 when a try-style lock operation happens (e.g. pthread_mutex_trylock).
 Such calls do not add any real restrictions to the locking order, 
 since they can always fail to acquire the lock, resulting in the 
 caller going off and doing Plan B (presumably it will have a Plan B). 
 Doing such checks could generate false lock-order errors and confuse users.

Assuming this one is what you numbered #4 (i.e. that you started to count at 1 
and not 0) :-), then it's something I had started a long time ago. Details are 
in the archive for this mailing list (Nov 2012) (actually let me attach the 
mails here for convenience), and the testcase is at
https://bugs.kde.org/show_bug.cgi?id=243232

I would be very very glad if you could take over, I lack time and valgrind 
knowledge.

I'm also attaching my very preliminary and very old patch for it. IIRC it needs 
to be updated to actually implement what was discussed in the attached emails, 
in addition to making it work in the first place...

--
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5



Re: [Valgrind-users] FW: Helgrind to-do list

2013-09-27 Thread David Faure
On Friday 27 September 2013 15:01:54 Phil Longstaff wrote:
 I was thinking about this one last night, and it's trickier than I first
 thought.
 
 L = lock, T = trylock
 Thread1: L1 L2
 Thread2: L2 T1
 
 Not a deadlock because the trylock will just fail.  However, suppose we
 have:
 
 Thread1: L1 L2
 Thread2: L2 T1
 
 And then later:
 
 Thread 3: L1 L2
 
 When helgrind handles L2, it would already find the graph edge L1 -> L2 so
 wouldn't it just return since that is the correct order?  David sent me
 some past e-mail and I saw some comments about putting lock vs trylock into
 the graph.  Seems to me that when processing T2, helgrind would not report
 a problem, but would add the T2 -> L1 link, and would also need to ensure
 that if L1 -> L2 happens in the future, it is reported.

A failing trylock cannot create a deadlock.
Only a successful one can.

So the question is whether we can assume a failed trylock could have succeeded 
in creating a deadlock if it hadn't failed.
In your particular example, we can.
In many other cases, we can't know that what happened after the failed 
trylock would have happened too, if it had succeeded. There could be an if() 
statement :-)

So IMHO it's much easier to just drop failed trylocks and only remember 
successful ones, but yes, one can refine that for the case above, i.e. when 
the failed-trylock is the last thing in the chain.

If anything happens *after* a failed trylock, then one can't store a
T1 -> anything link.
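
The simple policy can be sketched as a toy model. This is not Helgrind's
code: all names (edge, acq_kind, thread_state, on_acquire) are invented for
the example, locks are small integers, and only a direct inverse edge is
checked rather than a full cycle search. It also assumes one reading of
"remember successful ones": a successful trylock joins the held set but is
neither checked nor recorded as the target of an edge, since a trylock can
never wait.

```c
#include <stdbool.h>

#define MAX_LOCKS 8

/* edge[a][b] is set once a thread did a blocking lock of b while holding a. */
static bool edge[MAX_LOCKS][MAX_LOCKS];

typedef enum { ACQ_LOCK, ACQ_TRYLOCK_OK, ACQ_TRYLOCK_FAILED } acq_kind;

typedef struct {
    int held[MAX_LOCKS];        /* locks currently held by this thread */
    int nheld;
} thread_state;

/* Returns true if the acquisition should be reported as a lock-order error. */
static bool on_acquire(thread_state *t, int lock, acq_kind kind)
{
    bool violation = false;

    if (kind == ACQ_TRYLOCK_FAILED)
        return false;           /* dropped: nothing is held, nothing recorded */

    if (kind == ACQ_LOCK) {
        /* Only a blocking acquire can wait forever, so only it is checked
           and only it adds "held -> lock" ordering edges. */
        for (int i = 0; i < t->nheld; i++) {
            int h = t->held[i];
            if (edge[lock][h])
                violation = true;   /* someone locked h while holding lock */
            edge[h][lock] = true;
        }
    }

    if (t->nheld < MAX_LOCKS)
        t->held[t->nheld++] = lock;  /* now held, by lock or by trylock */
    return violation;
}
```

With this model, "Thread1: L1 L2" followed by "Thread2: L2 T1" (failed) and
"Thread 3: L1 L2" reports nothing, while a thread that obtains L2 through a
successful trylock and then blocks on L1 is reported.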

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5




Re: [Valgrind-users] FW: Helgrind to-do list

2013-09-27 Thread Philippe Waroquiers
On Fri, 2013-09-27 at 15:01 +, Phil Longstaff wrote:
 I was thinking about this one last night, and it's trickier than I first 
 thought.
 
 L = lock, T = trylock
 Thread1: L1 L2
 Thread2: L2 T1
 
 Not a deadlock because the trylock will just fail.  However, suppose we have:
 
 Thread1: L1 L2
 Thread2: L2 T1
 
 And then later:
 
 Thread 3: L1 L2
 
 When helgrind handles L2, it would already find the graph edge L1 -> L2 so
 wouldn't it just return since that is the correct order?  David sent me
 some past e-mail and I saw some comments about putting lock vs trylock into
 the graph.  Seems to me that when processing T2, helgrind would not report
 a problem, but would add the T2 -> L1 link, and would also need to ensure
 that if L1 -> L2 happens in the future, it is reported.

Did not see much of the attached mails or discussions,
so I guess the reference to the discussions on valusers is:
http://thread.gmane.org/gmane.comp.debugging.valgrind/12616

Philippe



