Re: [Valgrind-users] RFC: changing Cachegrind default to `--cache-sim=no`

2023-04-04 Thread David Faure
On mardi 4 avril 2023 12:11:18 CEST Nicholas Nethercote wrote:
> On Tue, 4 Apr 2023 at 19:24, David Faure  wrote:
> > But then, with no cache simulation and no call stacks, what's left in
> > `cachegrind --cache-sim=no`?
> 
> From the email that started this thread:
> 
> If you run with `--cache-sim=no` then the cache simulation is disabled and
> > you just get one event: Ir. (This is "instruction cache reads", which is
> > equivalent to "instructions executed".)

Ah, right, sorry.

So to summarize the big picture:
cachegrind -> instruction counts, without call stacks; useful for overall 
numbers or with cg_annotate
callgrind -> instruction counts, with call stacks; best viewed in kcachegrind


I wish those two could do cycles and not just instructions, but I guess this 
requires a good cache simulator again, back to square one ;)
(perf does cycles, but doesn't give exact number of method calls, that's one 
benefit of cachegrind/callgrind)

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE Frameworks 5







Re: [Valgrind-users] RFC: changing Cachegrind default to `--cache-sim=no`

2023-04-04 Thread David Faure
On lundi 3 avril 2023 23:46:46 CEST Nicholas Nethercote wrote:
> On Mon, 3 Apr 2023 at 21:36, David Faure  wrote:
> > But then, what's the difference between `cachegrind --cache-sim=no`
> > and `callgrind`?
> > 
> > https://accu.org/journals/overload/20/111/floyd_1886/ says
> > "The main differences are that Callgrind has more information about the
> > callstack whilst cachegrind gives more information about cache hit rates."
> > 
> > Wouldn't one want callstacks? (if this means stack traces).
> > I know I must be missing something, thanks for enlightening me.
> 
> Callgrind is a forked and extended version of Cachegrind. It also simulates
> a cache, with a slightly different simulation to Cachegrind's. The fact
> that both tools exist is due to historical reasons; if starting from
> scratch today you wouldn't deliberately split them.

Thanks for the information. This is indeed confusing - like anything that is 
"due to historical reasons" ;-)

> Call stacks are often useful (I regularly use Callgrind as well as
> Cachegrind) but they aren't always necessary. Without them, Cachegrind runs
> faster than Callgrind and produces smaller data files. Cachegrind also
> supports diffing and merging different files, while Callgrind does not.

OK. I thought call stacks were mandatory for any tool to be useful
(they certainly are for KCachegrind (*)), but I now found the documentation on 
cg_annotate.

But then, with no cache simulation and no call stacks, what's left in 
`cachegrind --cache-sim=no`?


(*) This naming adds to the confusion: kcachegrind requires callgrind, it 
can't work with cachegrind... I know, historical reasons :-)

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE Frameworks 5







Re: [Valgrind-users] RFC: changing Cachegrind default to `--cache-sim=no`

2023-04-03 Thread David Faure
[removing valgrind-developers, since I guess I can't post there]

On lundi 3 avril 2023 11:29:25 CEST Nicholas Nethercote wrote:
> I have been using `--cache-sim=no` almost exclusively for a long time. The
> cache simulation done by Valgrind is an approximation of the memory
> hierarchy of a 2002 AMD Athlon processor. Its accuracy for a modern memory
> hierarchy with three levels of cache, prefetching, non-LRU replacement, and
> who-knows-what-else is likely to be low. If you want to accurately know
> about cache behaviour you'd be much better off using hardware counters via
> `perf` or some other profiler.
> 
> But `--cache-sim=no` is still very useful because instruction execution
> counts are still very useful.
> 
> Therefore, I propose changing the default to `--cache-sim=no`. Does anyone
> have any objections to this?

I agree that simulating a cache from 2002 isn't very useful.

But then, what's the difference between `cachegrind --cache-sim=no`
and `callgrind`?

https://accu.org/journals/overload/20/111/floyd_1886/ says
"The main differences are that Callgrind has more information about the 
callstack whilst cachegrind gives more information about cache hit rates."

Wouldn't one want callstacks? (if this means stack traces).
I know I must be missing something, thanks for enlightening me.

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE Frameworks 5







Re: [Valgrind-users] Helgrind causing data race warning in mutex_destroy_WRK?

2021-08-17 Thread David Faure
On mardi 17 août 2021 09:59:23 CEST Schmidt, Adriaan wrote:
> Hi.
> 
> Running Helgrind (Valgrind 3.17.0) on arm32 (Linux 4.14.139), glibc 2.31,
> and an application using Poco 1.10.1, I see the following:
> 
> ==17922== Possible data race during read of size 1 at 0x64BD4C4 by thread #97
> ==17922== Locks held: 1, at address 0x1C134CC
> ==17922==    at 0x48536D8: my_memcmp (hg_intercepts.c:220)
> ==17922==    by 0x4853BBF: mutex_destroy_WRK (hg_intercepts.c:859)
> ==17922==    by 0x48572F7: pthread_mutex_destroy (hg_intercepts.c:882)
> ==17922==    by 0x5705F23: Poco::EventImpl::~EventImpl() (Event_POSIX.cpp:96)
> ==17922==    by 0x5706393: Poco::Event::~Event() (Event.cpp:40)
> ==17922==    by 0x578099B: Poco::Timer::~Timer() (Timer.cpp:34)
> ==17922==    by 0x5470827: Poco::Data::SessionPool::~SessionPool() (SessionPool.cpp:40)
> ==17922==    by 0x54708A7: Poco::Data::SessionPool::~SessionPool() (SessionPool.cpp:50)
> ==.==    [ ... ]
> ==17922==
> ==17922== This conflicts with a previous write of size 4 by thread #7
> ==17922== Locks held: none
> ==17922==    at 0x57F8998: __pthread_mutex_unlock_usercnt (pthread_mutex_unlock.c:52)
> ==17922==    by 0x4854273: mutex_unlock_WRK (hg_intercepts.c:1106)
> ==17922==    by 0x4857337: pthread_mutex_unlock (hg_intercepts.c:1124)
> ==17922==    by 0x5781153: setImpl (Event_POSIX.h:61)
> ==17922==    by 0x5781153: set (Event.h:101)
> ==17922==    by 0x5781153: Poco::Timer::run() (Timer.cpp:216)
> ==17922==    by 0x577ACAF: Poco::PooledThread::run() (ThreadPool.cpp:199)
> ==17922==    by 0x57765B3: Poco::ThreadImpl::runnableEntry(void*) (Thread_POSIX.cpp:345)
> ==17922==    by 0x48562FF: mythread_wrapper (hg_intercepts.c:398)
> ==17922==    by 0x57F4143: start_thread (pthread_create.c:477)
> 
> To me it seems that Helgrind itself is causing the warning when calculating
> mutex_is_init (hg_intercepts.c:859).

Isn't this rather a race between unlocking a mutex and destroying that mutex?
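
For illustration, a minimal sketch (not taken from the Poco code) of the
ordering that would make this safe: the thread doing the last unlock is
joined before pthread_mutex_destroy() runs, so the destroy happens-after
the unlock.

#include <pthread.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *)
{
    pthread_mutex_lock(&m);
    // ... work protected by m ...
    pthread_mutex_unlock(&m);    // last use of m in this thread
    return 0;
}

int main()
{
    pthread_t t;
    pthread_create(&t, 0, worker, 0);
    pthread_join(t, 0);          // synchronizes with the worker's unlock
    pthread_mutex_destroy(&m);   // only now is destroying the mutex safe
    return 0;
}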

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE Frameworks 5







Re: [Valgrind-users] How to find memory leaks from an running application

2019-06-20 Thread David Faure
On jeudi 20 juin 2019 12:07:19 CEST Howard Chu wrote:
> David Chapman wrote:
> > On 6/19/2019 10:34 PM, subhasish Karmakar wrote:
> >> Hi,
> >> 
> >> I have an application running on embedded linux.
> >> After running for hours my application causes out of memory issue.
> >> How can I get memory leak report periodically when the application is
> >> running? It's a process with "while(1)" loop.
> 
> If your only problem is memory leaks and no other concerns (such as out of
> bounds accesses, corruptions, etc.) then you should give
> https://github.com/hyc/mleak/ a try. It is faster than all other memory
> leak detectors out there, fast enough to run in production. And you can
> send it a periodic signal to get a snapshot of currently allocated memory.

Or, for another LD_PRELOAD-based tool with a really nice user interface on 
top, https://github.com/KDAB/heaptrack

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE Frameworks 5







Re: [Valgrind-users] Valgrind Finds More Dynamic Allocations than Inte Pin

2019-02-08 Thread David Faure
LOL that was the risk, getting a third, completely different, number ;)

Well, you mention that your tool only looks at "each loaded image",
while heaptrack and valgrind look at ALL allocations.


On vendredi 8 février 2019 18:32:01 CET Ahmad Nouralizadeh wrote:
> Thanks David,
> But heaptrack even reports a larger number: 153 MB!
> 
> On Fri, Feb 8, 2019 at 8:09 PM David Faure  wrote:
> > On vendredi 8 février 2019 16:32:50 CET Ahmad Nouralizadeh wrote:
> > > Hi,
> > > I wrote a really simple Pin tool to calculate the number of dynamically
> > > allocated bytes in a program. I instrumented GIMP with this tool and it
> > > reported 77 MB of allocations. I did the same experiment with Valgrind
> > > which reported 117 MB.
> > > My Pin tool is similar to the example in Pin. It searches for malloc(),
> > > calloc() and memalign() in each loaded image and adds instructions
> > > before
> > > them to calculate the total size of the allocations.
> > > I am really confused and need help!
> > 
> > If you're on Linux, I recommend using heaptrack for this :-)
> > https://github.com/KDAB/heaptrack
> > 
> > This doesn't really answer your question, sorry about that, but you might
> > want
> > to see which of those tools heaptrack agrees with, it might help finding
> > out
> > who is wrong...
> > 
> > --
> > David Faure, fa...@kde.org, http://www.davidfaure.fr
> > Working on KDE Frameworks 5


-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE Frameworks 5







Re: [Valgrind-users] Valgrind Finds More Dynamic Allocations than Inte Pin

2019-02-08 Thread David Faure
On vendredi 8 février 2019 16:32:50 CET Ahmad Nouralizadeh wrote:
> Hi,
> I wrote a really simple Pin tool to calculate the number of dynamically
> allocated bytes in a program. I instrumented GIMP with this tool and it
> reported 77 MB of allocations. I did the same experiment with Valgrind
> which reported 117 MB.
> My Pin tool is similar to the example in Pin. It searches for malloc(),
> calloc() and memalign() in each loaded image and adds instructions before
> them to calculate the total size of the allocations.
> I am really confused and need help!

If you're on Linux, I recommend using heaptrack for this :-)
https://github.com/KDAB/heaptrack

This doesn't really answer your question, sorry about that, but you might want 
to see which of those tools heaptrack agrees with, it might help finding out 
who is wrong...

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE Frameworks 5







[Valgrind-users] Suggestion for vgdb

2018-11-19 Thread David Faure
When using vgdb (e.g. `valgrind --vgdb-error=0 myprog`)
and there's a valgrind warning for an uninitialized read, on a line like
if (a || b)

The question then, of course, is whether it was a or b that was uninitialized.
If one uses vgdb to print the values of a and b, it won't necessarily be
obvious (e.g. two bools that both happen to show as "false", with only one of
them actually uninitialized). This makes me wonder: wouldn't it be possible
for vgdb to output a warning when doing "print a" or "print b" from gdb and
the value is marked as uninitialized?

If I understand the architecture correctly, this should be possible to 
implement, right?
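
For illustration, a hypothetical snippet of the situation I mean; memcheck
flags the `if` line, but from gdb both operands look like ordinary bools:

#include <cstdio>

int main()
{
    bool a = false;   // initialized
    bool b;           // deliberately left uninitialized
    // valgrind: "Conditional jump or move depends on uninitialised value(s)"
    // on the next line -- but was it a or b? "print a" and "print b" in gdb
    // both show plausible values.
    if (a || b)
        std::printf("taken\n");
    return 0;
}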

-- 
David Faure | david.fa...@kdab.com | Managing Director KDAB France
KDAB (France) S.A.S., a KDAB Group company
Tel. France +33 (0)4 90 84 08 53, http://www.kdab.fr
KDAB - The Qt, C++ and OpenGL Experts







Re: [Valgrind-users] helgrind and c++ atomic_flag

2018-09-25 Thread David Faure
On mardi 25 septembre 2018 16:17:48 CEST John Perry wrote:
> > On Sep 25, 2018, at 7:02 AM, Tom Hughes  wrote:
> > 
> > I don't believe helgrind makes any attempt to observe atomic
> > operations so it is entirely unaware of them and of any effect
> > they might have on the thread correctness of a program.
> > 
> > It would be hard to do because where the compiler is able to
> > generate direction instructions for the atomic there will be no
> > function call to intercept, and as there won't necessarily be a
> > one-one mapping from atomic operations to CPU instructions it
> > is hard to determine what the original operation was by
> > observing the instruction stream.
> 
> Thank you! This comes as a huge relief, because I first noticed the issue in 
> a program I was writing where I used that approach and worried I was doing 
> something very, very bad. Now I can rest easy. Or at least easier.

You want to use TSAN (thread-sanitizer) instead (preferably with clang and 
libc++, in my experience), which supports atomic operations.

Sorry for advertising a competing solution on the valgrind mailing-list ;-)
I admit I'm much less of a helgrind fan since tsan started to work well.

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE Frameworks 5







Re: [Valgrind-users] Valgrind hangs when generating supression for Qt5 menus

2017-07-14 Thread David Faure
On vendredi 14 juillet 2017 21:10:50 CEST Nathan Bahr wrote:
> Hi,
> 
> I made a simple Qt5 application with a single menu item and valgrind is
> configured to print suppression code.
> 
> If I open and close a menu, valgrind holds onto the QMenu object and
> prompts to print supression code. This causes the whole setup to hang where
> I cannot interact with the application or command line interface. I can
> force-quit the application, which exits out of the command line. The
> application does not hang if gen-suppressions=no.

This is probably due to QMenu grabbing keyboard/mouse on X11?
You can disable this by passing -nograb to the application.

> Is there any way to set valgrind to automatically generate suppression code
> instead of prompting? 

Yes, --gen-suppressions=all

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE Frameworks 5




Re: [Valgrind-users] Many false positives "Mismatched free() / delete / delete []"

2016-12-22 Thread David Faure
I found it.

Using "step" in gdb showed that the new calls that valgrind complains about
go into qtwebengine/src/3rdparty/chromium/base/allocator/allocator_shim.cc

void* ShimCppNew(size_t size) {
  const allocator::AllocatorDispatch* const chain_head = GetChainHead();
  void* ptr;
  do {
    ptr = chain_head->alloc_function(chain_head, size);
  } while (!ptr && CallNewHandler());
  return ptr;
}

Indeed chromium's allocator_shim_override_cpp_symbols.h says
SHIM_ALWAYS_EXPORT void* operator new(size_t size)
SHIM_ALIAS_SYMBOL(ShimCppNew);

This is why it didn't happen in smaller testcases, it only happens when 
including some qtwebengine headers.

=> No valgrind bug, sorry for the noise. I am now going to yell at the 
qtwebengine/chromium people for polluting applications with their custom 
operator new...

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE Frameworks 5




Re: [Valgrind-users] Many false positives "Mismatched free() / delete / delete []"

2016-12-22 Thread David Faure
On jeudi 22 décembre 2016 21:06:04 CET Philippe Waroquiers wrote:
> To be sure: if you just replace in the above setup valgrind 3.13 SVN
> by valgrind 3.12 release, then you do not have the problem anymore ?

Good point. I just tried with /usr/bin/valgrind, which is 3.11, and the same 
thing happens!

On jeudi 22 décembre 2016 21:28:32 CET pa...@free.fr wrote:
> It doesn't much look like it, but there could be calls to new [] in the
> QBoxLayoutPrivate ctor, or its parent classes.

I don't think so, and again: this is a -O0 -g build, no inlining is happening,
so these frames would show in the stack.

> Do you know if global new/delete are replaced

I wonder how to find out.

To make matters more complex, a simple QVBoxLayout testcase doesn't show the 
issue. Neither do small size autotests with dialogs and layouts. Only the 
bigger test program with lots of memory allocations hits this.

I've seen it before in other programs though so it's not specific to that test 
either.

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE Frameworks 5




Re: [Valgrind-users] Many false positives "Mismatched free() / delete / delete []"

2016-12-22 Thread David Faure
On jeudi 22 décembre 2016 06:46:44 CET David Chapman wrote:
> If this is new valgrind behavior, I wouldn't discount a bug in its code

It certainly looks like one :)

> but the developers (not me) would need to know what the QVBoxLayout
> constructor is doing.  If it's inlined, the call stack might point
> fingers at the calling function rather than the true offender.

It is not inline, and my call stack is from a non-optimized debug build 
anyway.

> Does the QVBoxLayout constructor allocate any memory inside?

Yes but not with new[].

QVBoxLayout::QVBoxLayout(QWidget *parent)
    : QBoxLayout(TopToBottom, parent)
{
}

QBoxLayout::QBoxLayout(Direction dir, QWidget *parent)
    : QLayout(*new QBoxLayoutPrivate, 0, parent)
{
    d->dir = dir;
}



-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE Frameworks 5




[Valgrind-users] Many false positives "Mismatched free() / delete / delete []"

2016-12-22 Thread David Faure
There seems to be a regression in valgrind SVN, where it thinks new[] was used, 
while in fact a simple new was used.
I see this all over the place when running valgrind on Qt code.

==4799== Mismatched free() / delete / delete []
==4799==at 0x4C2A65D: operator delete(void*) (vg_replace_malloc.c:576)
==4799==by 0x6CF853D: QVBoxLayout::~QVBoxLayout() (qboxlayout.cpp:1354)
==4799==by 0x6D1CE90: QWidget::~QWidget() (qwidget.cpp:1594)
==4799==by 0x6F631A1: QDialog::~QDialog() (qdialog.cpp:352)
==4799==by 0x5152C85: 
Akonadi::EmailAddressSelectionDialog::~EmailAddressSelectionDialog() 
(emailaddressselectiondialog.cpp:92)
==4799==by 0x401876: main (emailaddressselectiondialogtest.cpp:35)
==4799==  Address 0x279546e0 is 0 bytes inside a block of size 32 alloc'd
==4799==at 0x4C29D78: operator new[](unsigned long) 
(vg_replace_malloc.c:423)
==4799==by 0x5152DB7: 
Akonadi::EmailAddressSelectionDialog::Private::Private(Akonadi::EmailAddressSelectionDialog*,
 QAbstractItemModel*) (emailaddressselectiondialog.cpp:40)
==4799==by 0x5152B22: 
Akonadi::EmailAddressSelectionDialog::EmailAddressSelectionDialog(QWidget*) 
(emailaddressselectiondialog.cpp:82)
==4799==by 0x401681: main (emailaddressselectiondialogtest.cpp:35)

emailaddressselectiondialog.cpp:40 says
 QVBoxLayout *mainLayout = new QVBoxLayout(q);

And this is just one example, it happens in many many places, it's nothing 
special about this particular file.

Any idea why this is happening?

gcc (SUSE Linux) 4.8.5
valgrind-3.13.0.SVN
glibc-2.22-3.7.x86_64
`uname -a` = Linux 4.4.36-8-default #1 SMP Fri Dec 9 16:18:38 UTC 2016 
(3ec5648) x86_64 x86_64 x86_64 GNU/Linux
OpenSuSE Leap 42.2

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE Frameworks 5




[Valgrind-users] helgrind: false race positive with static variable in function

2016-03-30 Thread David Faure
This simple testcase :

int foo() {
    struct Foo {
        int *i;
        Foo() { i = new int(42); }
    };
    static Foo f;
    return *(f.i);
}

called from two threads (http://www.davidfaure.fr/2016/testcase_ogoffart.cpp)
leads to this race warning in helgrind :

==16440== Possible data race during read of size 8 at 0x602070 by thread #3
==16440== Locks held: none
==16440==at 0x400B34: foo() (testcase_ogoffart.cpp:10)
==16440==by 0x400B48: threadStart(void*) (testcase_ogoffart.cpp:16)
==16440==by 0x4C3005E: mythread_wrapper (hg_intercepts.c:389)
==16440==by 0x4E430A3: start_thread (pthread_create.c:309)
==16440==by 0x595D00C: clone (clone.S:111)
==16440== 
==16440== This conflicts with a previous write of size 8 by thread #2
==16440== Locks held: none
==16440==at 0x400AF2: foo()::Foo::Foo() (testcase_ogoffart.cpp:7)
==16440==by 0x400B1B: foo() (testcase_ogoffart.cpp:9)
==16440==by 0x400B48: threadStart(void*) (testcase_ogoffart.cpp:16)
==16440==by 0x4C3005E: mythread_wrapper (hg_intercepts.c:389)
==16440==by 0x4E430A3: start_thread (pthread_create.c:309)
==16440==by 0x595D00C: clone (clone.S:111)
==16440==  Address 0x602070 is 0 bytes inside data symbol "_ZZ3foovE1f"

with both clang and gcc.
(and without -Wno-threadsafe-statics)

I assume that both compilers are not buggy. More likely, helgrind doesn't 
recognize
the atomic operations used by the compilers to protect the initialization of 
Foo ?

  :
  _5 = __cxa_guard_acquire (&_ZGVZ3foovE1f);
  if (_5 != 0)
goto ;
  else
goto ;

  :
  foo()::Foo::Foo ();
  __cxa_guard_release (&_ZGVZ3foovE1f);

Isn't cxa_guard_acquire implemented with mutexes? (Google seems to say so)
Why then does helgrind see a race? It doesn't catch the mutexes used internally 
in libstdc++?

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE Frameworks 5




Re: [Valgrind-users] helgrind, clang and thread_local

2016-03-22 Thread David Faure
On Monday 21 March 2016 14:56:09 Alexander Potapenko wrote:
> > Helgrind bug, or is clang silently ignoring thread_local?
> Clang documentation (http://clang.llvm.org/cxx_status.html) says
> thread_local support requires libc++abi 3.6+ or libsupc++ 4.8+.
> Does your binary use any of those?

Oh. I use libstdc++, that's why, then.

I had no idea that the feature check for clang could pass and then the
feature would fail at runtime due to using another c++ standard library. Tricky.

libc++ and libsupc++ don't seem to be packaged for OpenSUSE 13.2,
(while clang is) so I'll ignore this for now. Thanks!

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE Frameworks 5




[Valgrind-users] helgrind, clang and thread_local

2016-03-19 Thread David Faure
The following code (from Qt5's qlogging.cpp)

static thread_local bool msgHandlerGrabbed = false;

static bool grabMessageHandler()
{
    if (msgHandlerGrabbed)
        return false;

    msgHandlerGrabbed = true;
    return true;
}

static void ungrabMessageHandler()
{
    msgHandlerGrabbed = false;
}

(purpose: avoiding recursion)

when compiled with clang 3.5.0, and called from multiple threads, leads to this 
helgrind warning:

==1218== Possible data race during write of size 1 at 0x1E1F86F0 by thread #17
==1218== Locks held: none
==1218==at 0x783637B: grabMessageHandler() (qlogging.cpp:1543)
==1218==by 0x783640B: qt_message_print(QtMsgType, QMessageLogContext 
const&, QString const&) (qlogging.cpp:1571)
==1218==by 0x783658A: qt_message_output(QtMsgType, QMessageLogContext 
const&, QString const&) (qlogging.cpp:1622)
==1218==by 0x798DC32: QDebug::~QDebug() (qdebug.cpp:150)
==1218==by 0x4F24EF6: CompletionThread::done() (kurlcompletion.cpp:232)
==1218==by 0x4F1ECCD: DirectoryListThread::run() (kurlcompletion.cpp:368)
==1218==by 0x784B0B6: QThreadPrivate::start(void*) (qthread_unix.cpp:340)
==1218==by 0x4C3005E: mythread_wrapper (hg_intercepts.c:389)
==1218==by 0x9DAE0A3: start_thread (pthread_create.c:309)
==1218==by 0x869100C: clone (clone.S:111)
==1218== 
==1218== This conflicts with a previous write of size 1 by thread #16
==1218== Locks held: none
==1218==at 0x7836399: ungrabMessageHandler() (qlogging.cpp:1549)
==1218==by 0x78364BC: qt_message_print(QtMsgType, QMessageLogContext 
const&, QString const&) (qlogging.cpp:1579)
==1218==by 0x783658A: qt_message_output(QtMsgType, QMessageLogContext 
const&, QString const&) (qlogging.cpp:1622)
==1218==by 0x798DC32: QDebug::~QDebug() (qdebug.cpp:150)
==1218==by 0x4F24EF6: CompletionThread::done() (kurlcompletion.cpp:232)
==1218==by 0x4F1ECCD: DirectoryListThread::run() (kurlcompletion.cpp:368)
==1218==by 0x784B0B6: QThreadPrivate::start(void*) (qthread_unix.cpp:340)
==1218==by 0x4C3005E: mythread_wrapper (hg_intercepts.c:389)
==1218==  Address 0x1e1f86f0 is in a rw- anonymous segment

Helgrind bug, or is clang silently ignoring thread_local?

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE Frameworks 5




Re: [Valgrind-users] memory barriers support

2014-08-19 Thread David Faure
On Tuesday 19 August 2014 21:00:58 Philippe Waroquiers wrote:
> On Tue, 2014-08-19 at 16:46 +0200, Roland Mainz wrote:
> > > ThreadSanitizer won't comprehend the fence instructions inserted by
> > > urcu.
> > > I believe even Helgrind won't, because these instructions do not imply
> > > any happens-before relation.
> > 
> > Is there any opensource or commercial tool which might help in such
> > situations (e.g. problems with memory barriers) ?
> 
> helgrind or drd or ThreadSanitizer could still be used for race
> condition detection but you would have to annotate either the rcu
> library or the calling code to describe the happens before
> relationships.

Are such annotations documented somewhere?

I'm still trying to find a way to annotate threadsafe-statics so that helgrind 
doesn't complain about them.

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE Frameworks 5




Re: [Valgrind-users] helgrind and threadsafe-statics

2014-06-20 Thread David Faure
On Monday 09 June 2014 08:39:45 Patrick J. LoPresti wrote:
> Interesting! So on x86 and similar, they implement thread-safe Meyers
> singletons via the double-checked locking anti-pattern... Which is
> actually safe thanks to Intel's not-exactly-relaxed memory model.

Interesting indeed.

> > If g++ would be modified such that the if (!guard.first_byte) test can
> > be skipped at run-time then it would become possible for Helgrind and
> > DRD to recognize static initialization
> 
> If I understand you correctly (?), you plan to ask the g++ maintainers
> to tamper with their fast path to make life easier for Helgrind (?)

Yeah, I can't really see that happening, either.
Unless some of you have a really good relationship with the gcc maintainers :-)
> 
> If so, I would suggest having a Plan B...
> 
> Would it make sense to re-think the happens-before/happens-after
> annotation macros for C++11?

Not sure what you mean exactly, so at the risk of asking the same question:

would it be possible for me to annotate a global static with some special 
macros that make helgrind understand what's happening?

I know, annotating all global statics one by one sounds horrible, but actually 
in my case they're already encapsulated in a macro, and I just need a way to 
remove all these false positives in order for helgrind to be usable.
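
To make it concrete, here is a rough sketch of the kind of annotation I have
in mind, assuming the ANNOTATE_HAPPENS_BEFORE / ANNOTATE_HAPPENS_AFTER client
requests from <valgrind/helgrind.h> are the right tool for this; the singleton
accessor is only an illustration, not our real macro:

#include <valgrind/helgrind.h>

struct Singleton
{
    Singleton() : value(42) {}
    int value;
};

static Singleton *create_singleton()
{
    Singleton *s = new Singleton;
    ANNOTATE_HAPPENS_BEFORE(s);     // constructor finished, object "published"
    return s;
}

Singleton *instance()
{
    // The compiler's threadsafe-statics guard runs create_singleton() exactly
    // once; the annotations add the happens-before edge that helgrind cannot
    // see through __cxa_guard_acquire/release.
    static Singleton *ptr = create_singleton();
    ANNOTATE_HAPPENS_AFTER(ptr);    // every caller acknowledges the publish
    return ptr;
}

I haven't tried this yet, so take it as a starting point only.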

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE Frameworks 5




[Valgrind-users] helgrind and threadsafe-statics

2014-06-08 Thread David Faure
Hello,

I'm using helgrind quite a lot these days, and I love it.

However I wonder if it doesn't give me false positives for the case of reading 
a value from a static object, which was set in the constructor.

Given that gcc does indeed implement threadsafe statics as per C++11 (but 
even before C++11 came out), one can assume that gcc does something like a 
mutex around the creation of the object, and therefore that there is a 
happens before relation between the end of the constructor and the use of 
this object later on, right?

In that case it would seem that helgrind needs to learn that, to avoid many 
false positives.

Testcase attached.

The assembly code says
call__cxa_guard_acquire
testl   %eax, %eax
je  .L3
.loc 1 16 0 discriminator 2
movl$_ZZ11threadStartPvE9singleton, %edi
call_ZN9SingletonC1Ev
movl$_ZGVZ11threadStartPvE9singleton, %edi
call__cxa_guard_release
.L3:

IIRC __cxa_guard_acquire/release is the protection around the static, but I'm 
not sure exactly what this means. Is there an actual happens-before relation 
here?

helgrind log:

==31469== Possible data race during read of size 4 at 0x602068 by thread #3
==31469== Locks held: none
==31469==at 0x400ADF: threadStart(void*) (testcase_local_static.cpp:17)
==31469==by 0x4C2D151: mythread_wrapper (hg_intercepts.c:233)
==31469==by 0x4E3C0DA: start_thread (pthread_create.c:309)
==31469==by 0x595B90C: clone (clone.S:111)
==31469== 
==31469== This conflicts with a previous write of size 4 by thread #2
==31469== Locks held: none
==31469==at 0x400BC6: Singleton::Singleton() (testcase_local_static.cpp:9)
==31469==by 0x400AD4: threadStart(void*) (testcase_local_static.cpp:16)
==31469==by 0x4C2D151: mythread_wrapper (hg_intercepts.c:233)
==31469==by 0x4E3C0DA: start_thread (pthread_create.c:309)
==31469==by 0x595B90C: clone (clone.S:111)


-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE Frameworks 5
#include <pthread.h>
#include <stdio.h>

// gcc is supposed to have threadsafe statics

class Singleton
{
public:
    Singleton() : value(42) {}

    int value;
};

void * threadStart(void *)
{
    static Singleton singleton;
    printf("%d\n", singleton.value);
    printf("%d\n", singleton.value);
    printf("%d\n", singleton.value);
    printf("%d\n", singleton.value);
    printf("%d\n", singleton.value);
    return 0;
}

int main( int , char** ) {
    pthread_t thread1;
    if ( pthread_create(&thread1, 0, threadStart, 0) )
        return 1;
    pthread_t thread2;
    if ( pthread_create(&thread2, 0, threadStart, 0) )
        return 1;

    void* v;
    pthread_join(thread1, &v);
    pthread_join(thread2, &v);
    return 0;
}



Re: [Valgrind-users] Beginner hg question: mapping address to the variable

2014-01-18 Thread David Faure
On Friday 17 January 2014 12:36:32 Raghu Reddy wrote:
> My question is, how do I find the variable located at the address 0x420A080?
> The code was ready compiled with -g option, so I was wondering why it was
> unable to point me to the variable.  What can I do to get the actual
> variable name or the location where if this is happening?

Are you sure that libiomp5.so itself is compiled with -g, not just your 
application?

If yes, maybe try without optimizations (-O0 or nothing instead of -O2), in 
case inlining got in the way. But this looks more to me like a missing -g in 
the first place.

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5




Re: [Valgrind-users] valgrind with shell scripts

2014-01-14 Thread David Faure
On Tuesday 14 January 2014 07:18:05 Samuel Quiring wrote:
> Greetings,
> 
> Normally I invoke my app using a shell script:
> 
>   ./run.sh  –e  run log

[I don't understand the run log part of it, but let's ignore that]

> If I invoke this script using valgrind:
> 
>   valgrind ./run.sh –e  run log
> 
> Will valgrind evaluate the shell script or my app or both?

The shell script. Better modify the shell script itself to call valgrind :)

But you can also use the above command with --trace-children=yes
and then it will trace all children, including your app.

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5




Re: [Valgrind-users] Best valgrind options for finding corrupt memory

2014-01-14 Thread David Faure
On Tuesday 14 January 2014 08:03:14 Samuel Quiring wrote:
> Greetings,
> 
> I suspect my program is corrupting (overwriting) memory, e.g., malloc'ing 16
> bytes for a string that is 17 bytes when you count the nul, then copying 17
> bytes into the 16 byte area.  What are the best valgrind options for
> detecting memory corruption?

The default options :-)

(memcheck tool)
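
For a bug like the one you describe, a plain "valgrind ./yourprog" run already
catches it. Reduced to an illustrative example (not your code, obviously),
memcheck reports an invalid write one byte past the 16-byte block, together
with the allocation stack:

#include <cstdlib>
#include <cstring>

int main()
{
    const char *src = "0123456789abcdef";            // 16 chars + '\0' = 17 bytes
    char *dst = static_cast<char *>(std::malloc(16));
    std::strcpy(dst, src);                           // copies 17 bytes: heap overrun
    std::free(dst);
    return 0;
}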

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5




Re: [Valgrind-users] Helgrind 3.9.0: false positive with pthread_mutex_destroy

2013-11-07 Thread David Faure
On Thursday 07 November 2013 16:22:56 Saurabh T wrote:
> Helgrind seems to be reporting false positive data race when
> pthread_mutex_destroy is called in a different thread from
> pthread_mutex_unlock. Unfortunately I cannot make a test case, sorry. But
> here's the relevant output:
> 
> ==15996== Possible data race during read of size 1 at 0x4DA7F90 by thread #1
> ==15996== Locks held: none
> ==15996==    at 0x4A08D79: my_memcmp (hg_intercepts.c:165)
> ==15996==    by 0x4A0906F: pthread_mutex_destroy (hg_intercepts.c:473)
> <snip>
> ==15996==
> ==15996== This conflicts with a previous write of size 4 by thread #52
> ==15996== Locks held: none
> ==15996==    at 0x34EF80D5E2: __lll_unlock_wake (in /lib64/libpthread-2.5.so)
> ==15996==    by 0x34EF80A0E6: _L_unlock_766 (in /lib64/libpthread-2.5.so)
> ==15996==    by 0x34EF80A04C: pthread_mutex_unlock (in /lib64/libpthread-2.5.so)
> ==15996==    by 0x4A097E0: pthread_mutex_unlock (hg_intercepts.c:635)
> <snip>
> ==15996==
> ==15996== Address 0x4DA7F90 is 0 bytes inside a block of size 40 alloc'd
> ==15996==    at 0x4A08BE5: operator new(unsigned long) (vg_replace_malloc.c:319)
> <snip>

Can you prove that the destroy cannot happen during the unlock?

> This was not a problem with 3.8.1 so appears to be a regression or new bug.

... or a fix, which detects an actual problem in the code :)

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5




Re: [Valgrind-users] Limiting helgrind to finding only write-write conflicts

2013-10-13 Thread David Faure
On Tuesday 08 October 2013 19:34:03 Phil Longstaff wrote:
> I don't see any command line options which limit helgrind to ignoring
> read/write or write/read conflicts and only displaying write/write
> conflicts.  Would that be hard to add?
> 
> We have situations where we think only 1 thread updates some counter, and
> periodically, another thread reads the value for display.  If there's a
> conflict, we don't really care whether the display shows the before or the
> after value.  However, we are interested in knowing if some other code path
> tries to update the value i.e. we care about write/write conflicts.

A read-write conflict is still a race condition, which leads to undefined 
behavior according to the C++11 standard. On some non-x86 platforms, you could 
end up with the displayed value always being 0, for instance, if the CPU 
running that thread keeps getting it from its outdated cache. Without mutexes 
or atomic operations, there's no guarantee that cache will ever get updated.
My advice is to use std::atomic_int -- which provides exactly the behavior 
you're hoping to get (we don't care whether the display shows the before or 
the after value).
This being said, helgrind might be missing suppressions for atomic_int (the 
same way I'm using suppressions for Qt's QAtomicInt) - since on the special 
case of x86, the two can't be distinguished by helgrind.
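
Concretely, something like this C++11 sketch implements the counter/display
pattern you describe without a data race in the language's sense (though, as
said above, helgrind may still need suppressions to stop reporting it on x86):

#include <atomic>
#include <thread>
#include <cstdio>

std::atomic<int> counter{0};

void updater()                       // the single writer thread
{
    for (int i = 0; i < 1000; ++i)
        counter.fetch_add(1, std::memory_order_relaxed);
}

void display()                       // the periodic reader
{
    // May see a slightly stale value, but never a torn one, and updates are
    // guaranteed to become visible without any extra locking.
    std::printf("counter = %d\n", counter.load(std::memory_order_relaxed));
}

int main()
{
    std::thread t(updater);
    display();                       // some value between 0 and 1000
    t.join();
    display();                       // 1000
    return 0;
}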

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5




Re: [Valgrind-users] FW: Helgrind to-do list

2013-09-28 Thread David Faure
On Friday 27 September 2013 17:51:19 Phil Longstaff wrote:
> > From: David Faure [mailto:fa...@kde.org]
> > Sent: Friday, September 27, 2013 11:18 AM
> > To: Phil Longstaff
> > Cc: valgrind-users@lists.sourceforge.net
> > Subject: Re: [Valgrind-users] FW: Helgrind to-do list
> > 
> > On Friday 27 September 2013 15:01:54 Phil Longstaff wrote:
> > > I was thinking about this one last night, and it's trickier than I
> > > first thought.
> > > 
> > > L = lock, T = trylock
> > > Thread1: L1 L2
> > > Thread2: L2 T1
> > > 
> > > Not a deadlock because the trylock will just fail.  However, suppose we
> > > have:
> > > 
> > > Thread1: L1 L2
> > > Thread2: L2 T1
> > > 
> > > And then later:
> > > 
> > > Thread 3: L1 L2
> > > 
> > > When helgrind handles L2, it would already find the graph edge L1 -> L2
> > > so wouldn't it just return since that is the correct order?
> > > David sent me some past e-mail and I saw some comments about putting
> > > lock vs trylock into the graph.  Seems to me that when processing T2,
> > > helgrind would not report a problem, but would add the T2 -> L1 link,
> > > and would also need to ensure that if L1 -> L2 happens in the future, it
> > > is reported.
> > 
> > A failing trylock cannot create a deadlock.
> > Only a successful one can.
> > 
> > So the question is whether we can assume a failed trylock could have
> > succeeded in creating a deadlock if it hadn't failed. In your particular
> > example, we can.
> > In many other cases, we can't know that what happened after the failed
> > trylock would have happened too, if it had succeeded. There could be an
> > if() statement :-)
> > 
> > So IMHO it's much easier to just drop failed trylocks and only remember
> > successful ones, but yes, one can refine that for the case above, i.e.
> > when the failed-trylock is the last thing in the chain.
> > 
> > If anything happens *after* a failed trylock, then one can't store a
> > T1 -> anything link.
> 
> I agree.  My question is what we should store after a *successful* trylock.
> 
> Suppose helgrind sees these threads:
> 
> Thr1: L1   L2
> 
> So, it should add L1 -> L2 to the graph
> 
> Thr2: L1  L2
> 
> Assume thr1 has unlocked both L1 and L2.  Since L1 -> L2 already exists,
> does helgrind do anything?

Surely not, since L1 -> L2 is already there?

> Thr3: L2  T1 (unsuccessful)
> 
> No deadlock can occur, no edge should be added, no report.

Right.

> Thr3: L2 T1 (successful)
> 
> So, is your suggestion that L2 -> L1 should be added here, should be
> indistinguishable from L1 -> L2 added previously, and that the report
> should happen here, even though this operation would not be the one that
> causes the deadlock?

You're right. The whole point of trylock is to not deadlock :)
What thread 3 is doing is valid.

The difference between L2 L1 and L2 T1 (successful) is that the first one
should lead to a report and the second one shouldn't.
But once that step is done, in both cases we want L2 -> L1 in the graph.

So that
Thr 1: L2 T1 (successful)
Thr 2: L1 L2 (after thread 1)
gives a report of wrong lock order.
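
In code, the two orderings we're discussing look roughly like this (sketch):

#include <pthread.h>

pthread_mutex_t L1 = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t L2 = PTHREAD_MUTEX_INITIALIZER;

void *threadA(void *)                // L1 then L2: establishes the edge L1 -> L2
{
    pthread_mutex_lock(&L1);
    pthread_mutex_lock(&L2);
    pthread_mutex_unlock(&L2);
    pthread_mutex_unlock(&L1);
    return 0;
}

void *threadB(void *)                // L2 then trylock(L1): cannot deadlock
{
    pthread_mutex_lock(&L2);
    if (pthread_mutex_trylock(&L1) == 0) {
        // Success means we effectively took L2 -> L1, the reverse order, but
        // only because nobody was holding L1 at that moment.
        pthread_mutex_unlock(&L1);
    }
    pthread_mutex_unlock(&L2);
    return 0;
}

int main()
{
    pthread_t a, b;
    pthread_create(&a, 0, threadA, 0);
    pthread_create(&b, 0, threadB, 0);
    pthread_join(a, 0);
    pthread_join(b, 0);
    return 0;
}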

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5



Re: [Valgrind-users] FW: Helgrind to-do list

2013-09-27 Thread David Faure
On Friday 27 September 2013 15:01:54 Phil Longstaff wrote:
> I was thinking about this one last night, and it's trickier than I first
> thought.
> 
> L = lock, T = trylock
> Thread1: L1 L2
> Thread2: L2 T1
> 
> Not a deadlock because the trylock will just fail.  However, suppose we
> have:
> 
> Thread1: L1 L2
> Thread2: L2 T1
> 
> And then later:
> 
> Thread 3: L1 L2
> 
> When helgrind handles L2, it would already find the graph edge L1 -> L2 so
> wouldn't it just return since that is the correct order?  David sent me
> some past e-mail and I saw some comments about putting lock vs trylock into
> the graph.  Seems to me that when processing T2, helgrind would not report
> a problem, but would add the T2 -> L1 link, and would also need to ensure
> that if L1 -> L2 happens in the future, it is reported.

A failing trylock cannot create a deadlock.
Only a successful one can.

So the question is whether we can assume a failed trylock could have succeeded 
in creating a deadlock if it hadn't failed.
In your particular example, we can.
In many other cases, we can't know that what happened after the failed 
trylock would have happened too, if it had succeeded. There could be an if() 
statement :-)

So IMHO it's much easier to just drop failed trylocks and only remember 
successful ones, but yes, one can refine that for the case above, i.e. when 
the failed-trylock is the last thing in the chain.

If anything happens *after* a failed trylock, then one can't store a
T1 -> anything link.

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5




[Valgrind-users] helgrind: race with no other thread ?

2013-06-22 Thread David Faure
Hello,

I have just had another case of a helgrind data race report with only one 
backtrace instead of two...
How can there be a race with only one contestant?

$ helgrind /d/kde/inst/kde4.10/bin/kactivitymanagerd --nocrashhandler --nofork
[snip debug output]
==11756== 
==11756== 
==11756== Lock at 0x1191B6B0 was first observed
==11756==at 0x4C2C548: QMutex_constructor_WRK (hg_intercepts.c:2246)
==11756==by 0x4C309F8: QMutex::QMutex(QMutex::RecursionMode) 
(hg_intercepts.c:2252)
==11756==by 0x7238F2A: QMutexPool::createMutex(int) (qmutexpool.cpp:138)
==11756==by 0x7239051: QMutexPool::get(void const*) (qmutexpool_p.h:76)
==11756==by 0x7390A14: signalSlotLock(QObject const*) (qobject.cpp:111)
==11756==by 0x7397518: QMetaObjectPrivate::connect(QObject const*, int, 
QObject const*, int, QMetaObject const*, int, int*) (qobject.cpp:3162)
==11756==by 0x7396085: QObject::connect(QObject const*, char const*, 
QObject const*, char const*, Qt::ConnectionType) (qobject.cpp:2650)
==11756==by 0x4200D3: Resources::Resources(QObject*) (Resources.cpp:305)
==11756==by 0x4160D4: Resources* runInQThreadResources() 
(Application.cpp:57)
==11756==by 0x415D40: Application::Private::Private() (Application.cpp:88)
==11756==by 0x4162E3: kamd::utils::d_ptrApplication::Private::d_ptr() 
(d_ptr_implementation.h:29)
==11756==by 0x4140F0: Application::Application() (Application.cpp:107)
==11756==by 0x415019: Application::self() (Application.cpp:208)
==11756==by 0x4152C2: main (Application.cpp:242)
==11756== 
==11756== Possible data race during read of size 8 at 0x11923F40 by thread #1
==11756== Locks held: 1, at address 0x1191B6B0
==11756==at 0x7398617: QMetaObject::activate(QObject*, QMetaObject const*, 
int, void**) (qobject.cpp:3498)
==11756==by 0x57B5924: KWindowSystem::activeWindowChanged(unsigned long) 
(kwindowsystem.moc:156)
==11756==by 0x57B105C: KWindowSystemPrivate::x11Event(_XEvent*) 
(kwindowsystem_x11.cpp:197)
==11756==by 0x566F049: KEventHackWidget::publicX11Event(_XEvent*) 
(ksystemeventfilter.cpp:43)
==11756==by 0x566ECBB: KSystemEventFilterPrivate::filterEvent(void*) 
(ksystemeventfilter.cpp:102)
==11756==by 0x566EC29: _k_eventFilter(void*) (ksystemeventfilter.cpp:91)
==11756==by 0x73680F5: QAbstractEventDispatcher::filterEvent(void*) 
(qabstracteventdispatcher.cpp:542)
==11756==by 0x7CC5EFE: 
QEventDispatcherX11::processEvents(QFlagsQEventLoop::ProcessEventsFlag) 
(qeventdispatcher_x11.cpp:128)
==11756==by 0x7376C97: 
QEventLoop::processEvents(QFlagsQEventLoop::ProcessEventsFlag) 
(qeventloop.cpp:149)
==11756==by 0x7376E2B: 
QEventLoop::exec(QFlagsQEventLoop::ProcessEventsFlag) (qeventloop.cpp:204)
==11756==by 0x7379FD5: QCoreApplication::exec() (qcoreapplication.cpp:1221)
==11756==by 0x7BE4E97: QApplication::exec() (qapplication.cpp:3823)
==11756==by 0x4152C7: main (Application.cpp:242)
==11756== 
==11756== Address 0x11923F40 is 16 bytes inside a block of size 104 alloc'd
==11756==at 0x4C2BA9B: operator new(unsigned long) (vg_replace_malloc.c:319)
==11756==by 0x7239F6B: QThreadPrivate::QThreadPrivate(QThreadData*) 
(qthread.cpp:190)
==11756==by 0x723A148: QThread::QThread(QObject*) (qthread.cpp:408)
==11756==by 0x416D66: Resources* 
runInQThreadResources()::Thread::Thread(Resources*) (Application.cpp:62)
==11756==by 0x4160F4: Resources* runInQThreadResources() 
(Application.cpp:75)
==11756==by 0x415D40: Application::Private::Private() (Application.cpp:88)
==11756==by 0x4162E3: kamd::utils::d_ptrApplication::Private::d_ptr() 
(d_ptr_implementation.h:29)
==11756==by 0x4140F0: Application::Application() (Application.cpp:107)
==11756==by 0x415019: Application::self() (Application.cpp:208)
==11756==by 0x4152C2: main (Application.cpp:242)
==11756== 

valgrind-3.9.0.SVN, freshly updated.

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5




Re: [Valgrind-users] Helgrind data race question

2013-05-16 Thread David Faure
On Tuesday 14 May 2013 20:18:44 Phil Longstaff wrote:
> int* my_ptr = new int;
> *my_ptr = 10;
> pthread_mutex_lock(lock);
> shared_ptr = my_ptr;
> pthread_mutex_unlock(lock);
> 
> Thread 2:
> pthread_mutex_lock(lock);
> int* my_ptr = shared_ptr;
> pthread_mutex_unlock(lock);
> ... = *my_ptr;

You're reading a region of memory outside mutex protection, and that region of 
memory was written to, outside mutex protection. That's the basic definition 
of a data race.

Getting the address of that region of memory within the mutex doesn't change 
that.

You see it as non-racy because how could *my_ptr ever be something else than 
10 ... but if you think about a multi-processor system, the write of the 
value 10 might not get propagated to the cache of the other processor where 
the read happens, since the system had no reason to perform that 
synchronisation.
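
For completeness, a sketch of a variant that doesn't have the problem: the
pointee is written and read while the same lock is held, so the mutex orders
both accesses and the propagation question goes away.

#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int *shared_ptr = 0;

void *thread1(void *)
{
    int *my_ptr = new int;
    pthread_mutex_lock(&lock);
    *my_ptr = 10;                 // write the pointee under the lock
    shared_ptr = my_ptr;          // publish the pointer under the same lock
    pthread_mutex_unlock(&lock);
    return 0;
}

void *thread2(void *)
{
    int value = 0;
    pthread_mutex_lock(&lock);
    if (shared_ptr)
        value = *shared_ptr;      // read the pointee under the same lock
    pthread_mutex_unlock(&lock);
    // ... use 'value' outside the lock ...
    return 0;
}

int main()
{
    pthread_t t1, t2;
    pthread_create(&t1, 0, thread1, 0);
    pthread_create(&t2, 0, thread2, 0);
    pthread_join(t1, 0);
    pthread_join(t2, 0);
    return 0;
}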

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5




Re: [Valgrind-users] Fw: valgrind on armv71 does not show point of allocation and deallocation

2013-04-03 Thread David Faure
On Wednesday 03 April 2013 20:59:23 Ganapathy Vijay wrote:
> I have compiled this code with the arm linux cross compiler (of course with
> -g).

Did you also disable optimizations? Make sure -O2 isn't in there.
Otherwise the call to free() can get inlined, for instance.

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5




Re: [Valgrind-users] Helgrind doesn't handle bit field correctly

2013-03-22 Thread David Faure
On Friday 22 March 2013 14:54:09 Will Deng wrote:
> Hi,
> 
> In our application, we use bit fields a lot. For example:
> 
> class data {
> unsigned int the_thread_1_data:1;
> unsigned int the_thread_2_data:3;
> ...
> };
> 
> When one thread is writing to the_thread_1_data, and another thread is
> reading or writing to the_thread_2_data, helgrind will flag racing
> condition. Is this a known issue in helgrind? Is there a way to get around
> this? I believe the operation is multithread safe.

The C++11 standard says this is only safe if you insert
  unsigned int separator:0;
between the two lines, in order to make these separate memory locations.

I have no idea how compilers are supposed to implement this though.
Maybe like gcc's __sync_fetch_and_{or|and} which is an atomic operation...

Anyway -- this requires a (compliant) C++11-enabled compiler.
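
Concretely, the layout change would look like this (sketch; note that a
zero-width bit-field has to be unnamed):

class data {
    unsigned int the_thread_1_data:1;
    unsigned int :0;   // zero-width, unnamed: ends the current allocation unit
    unsigned int the_thread_2_data:3;
    // ...
    // The two flags are now distinct memory locations in the C++11 sense, so
    // one thread touching the first and another touching the second is no
    // longer a data race.
};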

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5




[Valgrind-users] helgrind bug in pthread_cond_destroy (testcase)

2013-03-14 Thread David Faure
The attached testcase (which is simply pthread_cond_init + 
pthread_cond_destroy), leads to an error in helgrind:
pthread_cond_destroy: destruction of unknown cond var

I've seen this forever with helgrind, but it's time to clean this up :)

However my debugging got stuck. I found out that 1) the call is given a valid 
condition variable pointer, and it actually succeeds, outside and inside 
helgrind. 2) the error message comes from this line of code:

DO_CREQ_v_W(_VG_USERREQ__HG_PTHREAD_COND_DESTROY_PRE,
pthread_cond_t*,cond);

(hg_intercepts.c:940).
How do I debug this further? This looks like a hook to me, the actual call is 
the next line,  CALL_FN_W_W(ret, fn, cond), isn't it?

Output from helgrind (with debug output added by me)

==4741== Helgrind, a thread error detector
==4741== Copyright (C) 2007-2012, and GNU GPL'd, by OpenWorks LLP et al.
==4741== Using Valgrind-3.9.0.SVN and LibVEX; rerun with -h for copyright info
==4741== Command: ./bin/testcase_pthread_cond
==4741== 
pthread_cond_init(0xffefff390) said 0
cond = 0xffefff390
==4741== ---Thread-Announcement--
==4741== 
==4741== Thread #1 is the program's root thread
==4741== 
==4741== 
==4741== 
==4741== Thread #1: pthread_cond_destroy: destruction of unknown cond var
==4741==at 0x4C2EB28: pthread_cond_destroy_WRK (hg_intercepts.c:940)
==4741==by 0x4C2FA44: pthread_cond_destroy@* (hg_intercepts.c:958)
==4741==by 0x400AFC: main (testcase_pthread_cond.cpp:21)
==4741== 
pthread_cond_destroy 0xffefff390 in helgrind, AFTER DO_ and before CALL_
pthread_cond_destroy(0xffefff390) said 0


-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5
#include <pthread.h>
#include <stdio.h>

// Modelled after QWaitCondition, to debug helgrind's "pthread_cond_destroy: destruction of unknown cond var"

void qt_initialize_pthread_cond(pthread_cond_t *cond)
{
    pthread_condattr_t condattr;
    pthread_condattr_init(&condattr);
    int ok = pthread_cond_init(cond, &condattr);
    fprintf(stderr, "pthread_cond_init(%p) said %d\n", cond, ok);
    pthread_condattr_destroy(&condattr);
}

int main( int argc, char** argv ) {
    pthread_cond_t cond;
    qt_initialize_pthread_cond(&cond);
    fprintf(stderr, "cond = %p\n", &cond);
    int ok = pthread_cond_destroy(&cond);
    fprintf(stderr, "pthread_cond_destroy(%p) said %d\n", &cond, ok);
    return 0;
}



Re: [Valgrind-users] Qt5 support in helgrind

2013-01-23 Thread David Faure
On Wednesday 23 January 2013 12:24:30 Julian Seward wrote:
  ==9070== Lock at 0xD81F6F8 was first observed
  ==9070==at 0x4C3077F: QMutex::QMutex(QMutex::RecursionMode)
  (hg_intercepts.c:2186) ==9070==by 0x4C307A4:
  QMutex::QMutex(QMutex::RecursionMode) (hg_intercepts.c:2192) ==9070==   
  by 0x585A9CE: QPostEventList::QPostEventList() (qthread_p.h:110) [...]
  
  Should I just duplicate the code of the wrappers instead?
  Or is there a more clever solution I'm missing?
 
 One alternative approach -- which doesn't really solve the problem, but
 which you might want to look at -- is to put the real code in a worker
 function and call that from all entry points.  For example, see how
 sem_wait_WRK is used in hg_intercepts.c.  That doesn't get rid of the
 second stack frame, but it does at least avoid the impression that
 there's some kind of strange recursion going on.  Personally I quite
 like this scheme, in that it doesn't duplicate code.

OK.

 If you turned the __attribute__((noinline)) into
 __attribute__((always_inline)) then I think you'd get rid of the extra
 frame without having to duplicate the code by hand.

I tried, and it works, but it makes gcc warn:
hg_intercepts.c:2048:13: warning: always_inline function might not be 
inlinable [-Wattributes]

  What I don't know is whether some implementations might should differ from
  the Qt4 implementation...
 
 Right.  That's the real problem.  I don't think there's an easy way to
 figure this out, short of comparing the Qt4 and Qt5 implementations of
 the relevant functions.

Yeah. But in fact, the change in QMutex's own implementation doesn't matter
for helgrind's intercepts. The API is the same, so the expected behavior is
the same, so helgrind can react the same. So I think we're fine for now.

 Once you have a patch you're satisfied with, I'd be happy to commit it.

Please find the patch attached.

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5
Index: hg_intercepts.c
===
--- hg_intercepts.c	(revision 13107)
+++ hg_intercepts.c	(working copy)
@@ -2033,9 +2033,15 @@
ret_ty I_WRAP_SONAME_FNNAME_ZU(libQtCoreZdsoZa,f)(args); \
ret_ty I_WRAP_SONAME_FNNAME_ZU(libQtCoreZdsoZa,f)(args)
 
+// soname is libQt5Core.so.4 ; match against libQt5Core.so*
+#define QT5_FUNC(ret_ty, f, args...) \
+   ret_ty I_WRAP_SONAME_FNNAME_ZU(libQt5CoreZdsoZa,f)(args); \
+   ret_ty I_WRAP_SONAME_FNNAME_ZU(libQt5CoreZdsoZa,f)(args)
+
 //---
 // QMutex::lock()
-QT4_FUNC(void, _ZN6QMutex4lockEv, void* self)
+//__attribute__((always_inline)) // works, but makes gcc warn...
+static void QMutex_lock_WRK(void* self)
 {
OrigFn fn;
VALGRIND_GET_ORIG_FN(fn);
@@ -2056,9 +2062,16 @@
}
 }
 
+QT4_FUNC(void, _ZN6QMutex4lockEv, void* self) {
+QMutex_lock_WRK(self);
+}
+QT5_FUNC(void, _ZN6QMutex4lockEv, void* self) {
+QMutex_lock_WRK(self);
+}
+
 //---
 // QMutex::unlock()
-QT4_FUNC(void, _ZN6QMutex6unlockEv, void* self)
+static void QMutex_unlock_WRK(void* self)
 {
OrigFn fn;
VALGRIND_GET_ORIG_FN(fn);
@@ -2080,10 +2093,17 @@
}
 }
 
+QT4_FUNC(void, _ZN6QMutex6unlockEv, void* self) {
+QMutex_unlock_WRK(self);
+}
+QT5_FUNC(void, _ZN6QMutex6unlockEv, void* self) {
+QMutex_unlock_WRK(self);
+}
+
 //---
 // bool QMutex::tryLock()
 // using 'long' to mimic C++ 'bool'
-QT4_FUNC(long, _ZN6QMutex7tryLockEv, void* self)
+static long QMutex_tryLock_WRK(void* self)
 {
OrigFn fn;
long   ret;
@@ -2110,10 +2130,17 @@
return ret;
 }
 
+QT4_FUNC(long, _ZN6QMutex7tryLockEv, void* self) {
+return QMutex_tryLock_WRK(self);
+}
+QT5_FUNC(long, _ZN6QMutex7tryLockEv, void* self) {
+return QMutex_tryLock_WRK(self);
+}
+
 //---
 // bool QMutex::tryLock(int)
 // using 'long' to mimic C++ 'bool'
-QT4_FUNC(long, _ZN6QMutex7tryLockEi, void* self, long arg2)
+static long QMutex_tryLock_int_WRK(void* self, long arg2)
 {
OrigFn fn;
long   ret;
@@ -2141,6 +2168,12 @@
return ret;
 }
 
+QT4_FUNC(long, _ZN6QMutex7tryLockEi, void* self, long arg2) {
+return QMutex_tryLock_int_WRK(self, arg2);
+}
+QT5_FUNC(long, _ZN6QMutex7tryLockEi, void* self, long arg2) {
+return QMutex_tryLock_int_WRK(self, arg2);
+}
 
 //---
 // It's not really very clear what the args are here.  But from
@@ -2151,9 +2184,7 @@
 // is that of the mutex and the second is either zero or one,
 // probably being the recursion mode, therefore.
 // QMutex::QMutex(QMutex::RecursionMode)  (C1ENS variant)
-QT4_FUNC(void*, _ZN6QMutexC1ENS_13RecursionModeE,
- void* mutex,
- long  recmode)
+static void* QMutex_constructor_WRK(void* mutex, long recmode

Re: [Valgrind-users] helgrind and atomic operations

2013-01-23 Thread David Faure
On Monday 27 August 2012 15:25:14 Marc Mutz wrote:
 If atomic loads and stores on x86 are implemented with a volatile cast,
 then  the compiler can't reorder stuff around them, either. Not more than
 with a std::atomic, at least. QAtomic does that. For load-relaxed, Thiago
 thinks that a normal read (non-volatile) is correct and I couldn't prove
 him wrong.

I was talking to Julian about this again today, and he pointed me to this 
writeup:

http://software.intel.com/en-us/blogs/2007/11/30/volatile-almost-useless-for-multi-threaded-programming

We're looking at how to silence valgrind about Qt atomic ops, but before that,
it would actually be good to be sure that what Qt does is correct, on x86.

Where does the claim that "a volatile cast means the compiler can't reorder
stuff around it" come from?

In the Qt source code I see qatomic_gcc.h (which is unused, unfortunately)
calling __sync_synchronize() in loadAcquire(). Shouldn't Qt's code call that,
when compiled with gcc?

This does lead to different assembly, too, on x86.
So the claim that "x86 doesn't need memory barriers" seems wrong?

--- testcase_atomic_ops_helgrind.s.orig 2013-01-23 15:04:20.889417624 +0100
+++ testcase_atomic_ops_helgrind.s  2013-01-23 15:07:06.938422071 +0100
@@ -380,6 +380,7 @@ _ZN10QAtomicOpsIiE11loadAcquireERKi:
     movq    -24(%rbp), %rax
     movl    (%rax), %eax
     movl    %eax, -4(%rbp)
+    mfence
     movl    -4(%rbp), %eax
     popq    %rbp
     .cfi_def_cfa 7, 8
@@ -403,6 +404,7 @@ _ZN10QAtomicOpsIiE12storeReleaseERii:
     movq    -8(%rbp), %rax
     movl    -12(%rbp), %edx
     movl    %edx, (%rax)
+    mfence
     popq    %rbp
     .cfi_def_cfa 7, 8
     ret
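
For comparison, a rough sketch of a loadAcquire() built on __sync_synchronize()
-- which is what I understand the unused qatomic_gcc.h path to be doing; this
is my illustration, not a copy of the Qt source:

template <typename T>
static inline T loadAcquireWithBarrier(const T &_q_value)
{
    T tmp = *static_cast<const volatile T *>(&_q_value);
    __sync_synchronize(); // full memory barrier; on x86 this is where the mfence above comes from
    return tmp;
}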


-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5




Re: [Valgrind-users] helgrind and atomic operations

2013-01-19 Thread David Faure
For a different approach to this issue: if helgrind can't reliably detect 
atomic operations, as discussed before, and if VALGRIND_HG_ENABLE_CHECKING is 
broken (see my previous email in this thread), then a simple solution is to 
add suppressions.

{
   QBasicAtomicPointer_load
   Helgrind:Race
   fun:_ZNK19QBasicAtomicPointer*loadAcquireEv
}
has cut down the noise considerably in my tests.
Plus a few similar ones, like QBasicAtomicInt.
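
A similar entry for the integer case, following the same wildcard pattern as
the pointer one above (my sketch; the exact mangled name depends on the Qt
build), would be:

{
   QBasicAtomicInteger_loadAcquire
   Helgrind:Race
   fun:_ZNK19QBasicAtomicInteger*loadAcquireEv
}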

Historically we had Qt-related suppressions in a suppression file shipped with 
KDE, but there are many more Qt users than the KDE developers. Would it be OK
to ship a qt5.supp file within valgrind, and load it unconditionally, like 
xfree-4.supp is currently handled?

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5




Re: [Valgrind-users] helgrind/drd annotations for one statement

2012-12-05 Thread David Faure
On Wednesday 05 December 2012 16:39:47 Leif Walsh wrote:
 The important synchronization point isn't the rdunlock, it's the wrlock. 

Well, you need two locks, for a happens-before relationship to be established.
If you remove the rdlock/rdunlock completely (since it could basically be a 
no-op for a write operation anyway, as others pointed out earlier), then this 
will be more clear: this write might never become visible to the other thread.
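
To make the pattern under discussion concrete, here is my reconstruction of it
(the original code isn't quoted in this thread, so the names are made up):

#include <pthread.h>

pthread_rwlock_t rwlock = PTHREAD_RWLOCK_INITIALIZER;
int shared_value = 0;

void writer_thread() // writes while holding only a *read* lock
{
    pthread_rwlock_rdlock(&rwlock);
    shared_value = 42;
    pthread_rwlock_unlock(&rwlock);
}

void reader_thread() // reads later, under a write lock
{
    pthread_rwlock_wrlock(&rwlock);
    int v = shared_value; // the debated question: is the write above guaranteed to be visible here?
    (void)v;
    pthread_rwlock_unlock(&rwlock);
}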

 I have a hard time believing that you can take a pthread write lock and then
 look at a value some other processor wrote before you took the lock and not
 get that value.

You say "before", but this assumes a global ordering, which you don't get
when not using atomics or the proper locks.
Each CPU can have a different notion of "before" without the correct
synchronization primitives.

I recommend reading "C++ Concurrency in Action" by Anthony Williams; it taught
me a lot about how all this works... definitely not a simple topic.

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5




Re: [Valgrind-users] helgrind/drd annotations for one statement

2012-12-05 Thread David Faure
On Wednesday 05 December 2012 16:52:42 Leif Walsh wrote:
 Rdunlock happens before wrlock.

... and is supposed to be about reading only, so why would CPUs bother to
propagate data modified inside that lock to other CPUs?

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5




Re: [Valgrind-users] RFC: more flexible way to show or count as error or suppress leak kinds

2012-11-30 Thread David Faure
On Thursday 29 November 2012 23:35:03 Philippe Waroquiers wrote:
 On Thu, 2012-11-29 at 08:44 +0100, David Faure wrote:
   Here are the new command lines args:
   --show-leak-kinds=kind1,kind2,..        which leak kinds to show?    [definite,possible]
   --errors-for-leak-kinds=kind1,kind2,..  which leak kinds are errors? [definite,possible]
   where kind is one of definite indirect possible reachable all none
  
  This sounds good, but I'm missing one piece of information: what will the
  default values be?
 
 The default values are indicated in [] in the --help above.
 These default values are backward compatible with the current default
 values.

Ah, I thought this was the list of possible values in that field. OK.

  It would be good for this to have sane defaults, so that most users don't
  actually need to specify these options.
  Would this mean "show" for possible and "error" for definite?
 
 It is expected that keeping the same default behaviour as today
 is the sane default.

I thought this was a good opportunity to improve upon today's behavior and not
treat possible leaks as errors, only show them. But I'm no expert on the matter.

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5




Re: [Valgrind-users] RFC: more flexible way to show or count as error or suppress leak kinds

2012-11-28 Thread David Faure
On Wednesday 28 November 2012 23:56:55 Philippe Waroquiers wrote:
 Currently, Valgrind does not provide a fully flexible
 way to indicate which leak kinds to show,
 which leak kinds to consider as an error,
 and which leak kinds to suppress.
 This is a.o. described in bugs 284540 and 307465.
 
 For example, the current options
 (--show-reachable=yes|no --show-possibly-lost=yes|no)
 do not allow to indicate that reachable blocks should
 be considered as an error.
 
 There is also no way to indicate that possibly lost
 blocks are not an error (whatever the value of --show-possibly-lost).
 
 Leak suppression entries are also currently catching all leak kinds.
 For example, if you have possibly lost blocks which you want
 to suppress, the suppression entry will also suppress
 definitely lost blocks allocated at the same stack trace,
 thereby hiding/suppressing real leaks.
 
 
 The patch attached to bug 307465 implements a flexible way to specify
 on the command line which leak kinds to show and which
 leak kinds to consider as an error.
 It also provides a way to have a leak suppression entry
 matching only a specific set of leak kinds.
 
 Here are the new command lines args:
 --show-leak-kinds=kind1,kind2,..        which leak kinds to show?    [definite,possible]
 --errors-for-leak-kinds=kind1,kind2,..  which leak kinds are errors? [definite,possible]
 where kind is one of definite indirect possible reachable all none

This sounds good, but I'm missing one piece of information: what will the 
default values be?

It would be good for this to have sane defaults, so that most users don't 
actually need to specify these options.
Would this mean "show" for possible and "error" for definite?
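
For example (assuming the patch goes in with the syntax quoted above), one
would then be able to run something like:

  valgrind --leak-check=full --show-leak-kinds=definite,indirect --errors-for-leak-kinds=definite ./myapp

where ./myapp stands for whatever program is under test.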

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5




Re: [Valgrind-users] helgrind markup for lock-ordering

2012-11-07 Thread David Faure
On Tuesday 06 November 2012 22:56:32 Philippe Waroquiers wrote:
 On Tue, 2012-11-06 at 13:43 +0100, David Faure wrote:
  On Monday 05 November 2012 23:19:42 Philippe Waroquiers wrote:
   On Mon, 2012-11-05 at 18:59 +0100, David Faure wrote:
The testcase http://www.davidfaure.fr/2012/qmutex_trylock.cpp
(from https://bugs.kde.org/show_bug.cgi?id=243232)
shows that an optimization inside Qt leads to a helgrind warning about
wrong lock ordering, making the use of that feature impossible for
detecting actual problems elsewhere (i.e. I have to use
--track-lockorders=no all the time).

Technically if we ignore the distinction between lock and tryLock,
helgrind is right, we did lock the mutexes in reverse order. But
because
it's a tryLock, it couldn't possibly deadlock.

Should helgrind simply ignore all pthread_mutex_trylock calls, for the
lockorder detection, even if they succeed? I think so, actually (by
definition they couldn't deadlock, which is what track-lockorders is
all
about).
   
   A deadlock can appear if there is a mixture of trylock and lock on
   the same lock. So, trylock cannot just be ignored.
   E.g.
   
 Thread 1:    trylock mutex1
              lock mutex2
 Thread 2:    lock mutex2
              lock mutex1
   
   might deadlock.
  
  True.
  This means that only trylock in second place should be ignored. More on
  this below.
 
 More generally, I guess that you mean trylock in last place should
 be ignored (rather than the special case of 2nd place).

Right. 

 This might be difficult to implement as each time a lock is taken,
 helgrind checks for order violation. I suspect a later lock operation
 might then transform a trylock in last place to a trylock which is
 now not anymore in last place.

Right. So let's record it, but let's not warn at the precise time the 
successful trylock happens.
If another lock happens later, out of order due to the successful trylock, 
then yes, let's warn.

 But of course, when the trylock operation has just been done,
 this trylock is last place and so if we would ignore it, then
 this would be similar to always ignore the trylock, which is not ok.

Right, not ignore it as if it didn't happen, but ignore it as in don't warn 
right now, and still record it.

 Currently, helgrind maintains a graph of lock order.
 I suspect we might need different graph node types and/or edge types
 to cope with trylock. For sure, more investigations needed looking
 in depth at the current algorithm.

I think it would be good enough to add the successful trylock to the graph, 
just without a warning at that point.

   Even without mixture, isn't the below slightly bizarre/dangerous ?
   
   Thread 1:  trylock mutex1
              trylock mutex2
   Thread 2:  trylock mutex2
              trylock mutex1
  
  No deadlock can ever happen here.
 
 Yes, no deadlock can occur. However, this is really a really doubtful
 construction. The question is: should helgrind report a lock order
 warning for such constructs?

I don't think so, due to the valid use cases I'm showing here.

 The idea of helgrind is that it detects lock order problems and/or
 race condition problems *even* if no deadlock happens and/or if no
 race condition really happened.
 Maybe it is very unlikely that the trylock fails. Still would be nice
 to discover the bug. And if the trylock does not fail, then the race
 condition will then not be detected by helgrind.

The simpler construct that can lead to this problem is
*  trylock mutex1
*  access shared data

If the trylock fails, then we have a race condition.
If it succeeds, then there is no problem.
I don't see how you can detect the potential race condition with helgrind when 
the trylock succeeds, unless you go as far as saying all trylocks are bad
(which was actually my thinking until reading the code of QOrderedMutexLocker
which has a valid use case for trylock -- maybe there are more, of course).

But this seems impossible to detect. I mean, the code could say
  if (trylock succeeds)
      // do something
  else
      // do something else
You're working on a runtime analysis tool, not on a static analysis tool,
so you can't know anything about the branch that is not being taken in a given 
run of helgrind. Therefore I see no way of saying there could have been a 
race if this trylock had failed, but it actually succeeded in this run.

So I don't disagree on "it would be nice to discover the bug", but if it's
impossible then let's drop the idea and come back to what is actually possible
:)

  Yes this is exactly what QOrderedMutexLocker does.

 Thread 1:  lock mutex1
            lock mutex2
 Thread 2:  lock mutex2
            trylock mutex1
            if that failed,
                unlock mutex2
                lock mutex1
                lock mutex2

Re: [Valgrind-users] helgrind markup for lock-ordering

2012-11-07 Thread David Faure
On Wednesday 07 November 2012 23:00:51 Philippe Waroquiers wrote:
 On Wed, 2012-11-07 at 10:51 +0100, David Faure wrote:
   The idea of helgrind is that it detects lock order problems and/or
   race condition problems *even* if no deadlock happens and/or if no
   race condition really happened.
   Maybe it is very unlikely that the trylock fails. Still would be nice
   to discover the bug. And if the trylock does not fail, then the race
   condition will then not be detected by helgrind.
  
  The simpler construct that can lead to this problem is
  *  trylock mutex1
  *  access shared data
 
 "discover the bug" is related to the doubtful construct, not
 to a race condition

If there's no race condition and no deadlock, I'm not sure what bug you want 
to detect :-)

 Note that currently, laog is producing messages which should
 be considered as lock order warning, not as
 for sure there is a deadlock order problem.

Yes, but why does it warn about lock order? Because it could cause deadlocks.

I agree, this is about potential deadlocks, not actual deadlocks. But
trylock 1 + trylock 2 vs trylock2 + trylock1 (the case we're talking about in 
this part of the mail) is not even a potential deadlock. It can't ever 
deadlock. So there's nothing to warn about.

 The trylock is one case of warning, not an error which could/should
 be improved.
 But there are others e.g. laog does not understand the concept of
 guard locks which is (IIUC): each thread can acquire a single lock
 in a set of locks. If a thread wants to acquire more than one lock
 (in any order then), it first has to acquire the guard lock,
 and then can lock in any order any nr of locks in the lock set.
 With this guard lock, not possible to have a deadlock,
 but for sure this is not understood by the current helgrind
 laog algorithm.

Right. That one definitely needs annotations in the source code, I would think.
There's no way for the tool to detect that these mutexes all go together.

  At the time of the trylock, it is the last one - no warning at that
  precise moment. This sounds like a simple enough change in the current
  algorithm? Basically adding one if() ... if only I knew where ;)
 
 To avoid doing a lock order warning when doing the trylock is easy
 I believe:
 in hg_main.c:3697, put a condition
 'if (!is_a_try_lock)'
 before:
    other = laog__do_dfs_from_to(lk, thr->locksetA);
if (other) {

 (where is_a_trylock has to be given by the caller).

I gave it a try, but I'm hitting a problem with exactly that, passing 
isTryLock to that code. isTryLock is set in HG_PTHREAD_MUTEX_LOCK_PRE and 
similar, while the above code is called from HG_PTHREAD_MUTEX_LOCK_POST and 
similar. If I understand correctly, adding an argument to the _POST variant 
would break source compatibility for the existing userland macros?

 I think it is almost mechanical work to add arguments to *POST event
 handlers and corresponding requests to transfer the is a try lock
 from the helgrind interception to the line 3697).

Ah OK, so you don't seem to see a problem with adding an argument :)
So breaking source compatibility for the user request macros is OK?

I'll finish the patch then, but only if you agree with the approach, otherwise 
this would be dead code, i.e. a wasted effort. At this point you don't seem 
fully convinced :)

 But I suspect that the insertion of a trylock in the graph
 might later on cause a 'wrong' cycle to be detected.
 E.g. (L = lock, T = trylock, L and T followed by lock nr)
   threadA L1 T2
   threadB L2 L3
   threadC L3 L1
 cannot deadlock (I think :) if threadA releases lock 1
 when T2 fails.

Well, if T2 fails then we have no cycle, and if it succeeds we have a real 
potential deadlock. The question is whether we want to remember the T2 attempt 
(and warn later) even when T2 fails. I would say, if it failed, it's like it 
didn't happen.
Do you actually store failed trylocks currently?

 But when L3 L1 will be inserted, a cycle will be found
 via T2 (if the graph has not remembered this is a trylock).
 So, I am still (somewhat intuitively) thinking that we need
 to have nodes and/or edges marked with this is a trylock
 and have the graph exploration taking these marking into account
 to not generate such false warnings.

My idea is rather that a successful trylock is just like a lock (except that 
it shouldn't warn at that precise moment, but if any later real-lock happens 
out of order with it, then we should warn, this is why successful trylocks 
should be recorded), and a failed trylock should NOT be recorded.
So at this point I don't see a need for remembering this was a trylock. I 
don't see how this could matter later on.

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5



Re: [Valgrind-users] helgrind markup for lock-ordering

2012-11-06 Thread David Faure
On Monday 05 November 2012 23:19:42 Philippe Waroquiers wrote:
 On Mon, 2012-11-05 at 18:59 +0100, David Faure wrote:
  The testcase http://www.davidfaure.fr/2012/qmutex_trylock.cpp
  (from https://bugs.kde.org/show_bug.cgi?id=243232)
  shows that an optimization inside Qt leads to a helgrind warning about
  wrong lock ordering, making the use of that feature impossible for
  detecting actual problems elsewhere (i.e. I have to use
  --track-lockorders=no all the time).
  
  Technically if we ignore the distinction between lock and tryLock,
  helgrind is right, we did lock the mutexes in reverse order. But because
  it's a tryLock, it couldn't possibly deadlock.
  
  Should helgrind simply ignore all pthread_mutex_trylock calls, for the
  lockorder detection, even if they succeed? I think so, actually (by
  definition they couldn't deadlock, which is what track-lockorders is all
  about).
 A deadlock can appear if there is a mixture of trylock and lock on
 the same lock. So, trylock cannot just be ignored.
 E.g.
  Thread 1:    trylock mutex1
               lock mutex2
  Thread 2:    lock mutex2
               lock mutex1
 might deadlock.

True.
This means that only trylock in second place should be ignored. More on this 
below.

 Even without mixture, isn't the below slightly bizarre/dangerous ?
   Thread 1:  trylock mutex1
              trylock mutex2
   Thread 2:  trylock mutex2
              trylock mutex1

No deadlock can ever happen here.

 If the 2nd trylock fails, what is the plan B ?

If the program then accesses shared data, a race condition will happen and 
will be detected by helgrind anyway. So ignoring the ordering of these 
trylocks is ok, I would think. Of course helgrind must record we got the 
lock, for the race condition detection feature, but it shouldn't warn about 
the wrong order of the locking, since it can't possibly deadlock.
Not that the above would be good programming practice, of course, but helgrind 
can't say anything about it if all the locks were acquired. It will warn in 
another run, where some trylock fails, and a race ensues.

 It seems that a task must unlock all locks and restart
 from scratch in the above case.

Yes, this is exactly what QOrderedMutexLocker does.

   Thread 1:  lock mutex1
              lock mutex2
   Thread 2:  lock mutex2
              trylock mutex1
              if that failed,
                  unlock mutex2
                  lock mutex1
                  lock mutex2

If the trylock fails (because thread1 was first), then it unlocks and restarts 
from scratch. I can't see a deadlock risk with that, so ideally helgrind 
shouldn't warn.
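
For concreteness, here is a minimal sketch of that back-off-and-retry scheme
with plain pthread mutexes -- my own illustration of the pattern described
above, not the actual QOrderedMutexLocker code:

#include <pthread.h>

// m1 comes before m2 in the global locking order; this is the "thread 2" side.
static void lock_both_in_wrong_order(pthread_mutex_t *m1, pthread_mutex_t *m2)
{
    pthread_mutex_lock(m2);
    if (pthread_mutex_trylock(m1) != 0) {
        // Someone else got m1 first: back off and restart in the canonical order.
        pthread_mutex_unlock(m2);
        pthread_mutex_lock(m1);
        pthread_mutex_lock(m2);
    }
    // Both mutexes are now held. Blocking acquisition only ever happened in
    // the order m1 -> m2, so no deadlock is possible.
}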

 I guess we might need an option such as:
--trylock-logic={normal-lock|local-retry|full-retry}
 normal-lock = current behaviour
 local-retry means the task would re-trylock
 full-retry means that the plan B is to unlock all locks
 and retry everything.

I don't see how this can be a global option. Some piece of code (like 
QOrderedMutexLocker) might have the full retry logic above, but other pieces 
of code might do something different - e.g. something wrong. It doesn't make 
sense to me to tell helgrind this is what all the code paths are going to do 
about tryLock, that's impossible to predict in a complex program.

Let me present another case:

   Thread 1:  lock mutex1
              lock mutex2
   Thread 2:  lock mutex2
              trylock mutex1
              if that fails, unlock mutex2 and give up

This could happen for a non-critical operation that can be canceled if it 
can't be done immediately. Again, no deadlock possible, so helgrind shouldn't 
warn about a successful trylock being out of order. And yet this isn't a 
full retry, so I don't think --trylock-logic=full-retry is the solution.

Deadlock can only happen if both threads use normal locking as their
second operation. A trylock as the second operation doesn't deadlock.

 Or maybe we would need the same info but with an annotation
 of the lock operation and/or of the lock itself ?

Sounds like an annotation of the trylock operation, unless we agree on my next 
statement:

 I am quite sure the simple logic trylock can be ignored
 is not ok for all cases.

Right. Let me refine that: trylock can be ignored as the second operation, 
i.e. helgrind shouldn't issue the out-of-order-locking-warning at the precise 
moment it's seeing a successful out-of-order trylock.
If we can agree on that, then there's no need for an annotation in the code.

  PS: see my previous email about VALGRIND_HG_ENABLE_CHECKING not working.
  Is this known? Should I report a bug?
 
 Always good to report a bug if you have found a bug :).
 Mail will be forgotten, bug entries are less likely to be lost.

OK, will do. On the other hand I'm getting more reaction about the tryLock 
issue here than on bug 243232 :-). Anyway, thanks for your input, much 
appreciated.

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE

Re: [Valgrind-users] helgrind and atomic operations

2012-10-24 Thread David Faure
Hi, I'm finally coming back to this issue.

On Sunday 26 August 2012 12:49:27 Julian Seward wrote:
  I'm not sure what can be done then, to avoid a helgrind warning.
 
 If you can guarantee that // some calculation goes here touches only
 thread-local state, then there is only a race on sharedInt itself.  In
 which case, use VALGRIND_HG_{DISABLE,ENABLE}_CHECKING to disable reporting
 on the relevant specific instances of the sharedInt.

This seems to be what I need.
VALGRIND_HG_DISABLE_CHECKING(&_q_value, sizeof(_q_value));
in loadAcquire silences the warning.

Surprisingly, VALGRIND_HG_ENABLE_CHECKING doesn't appear to work, though.
All races are suppressed, even obvious races that warn if disable+enable was 
never used before. Testcase attached, see the call to oops(). In my tests, 
ENABLE_CHECKING basically behaves like DISABLE_CHECKING (for instance if you 
simply put a ENABLE_CHECKING at the beginning of loadAcquire and nothing else, 
then there's no warning at all anymore).


 --
 
 My understanding of this is that it is in violation of the C++11 memory
 model, which says that the implementation may deliver stores from one
 core to another in any order, in the absence of any C++11 mandated inter-
 thread synchronisation.
 
 You can argue that for x86 the hardware's TSO guarantees make this
 harmless ...
 
  Marc Mutz said 
  The standard says it's racy, but the implementation of
 
 ... but AIUI, the implementation also includes the compiler, and I
 believe it has been observed that GCC will indeed break your code in
 unexpected ways, in some cases.  In short you need to prove that not
 only the hardware won't reorder stores between cores -- which for x86
 happens to be true -- but also the compiler won't.

Yes but AFAIU that's what the volatile does -- prevent the compiler from 
reordering.

However valgrind can't possibly find out that volatile was used, if all that 
does is disable compiler optimizations, so I agree that this cannot all work 
out of the box, valgrind-specific markup is definitely needed in the Qt atomics 
class. Current version of my patch attached -- no re-ENABLE for now, since it 
doesn't work anyway ;)

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5
#include <pthread.h>
#include <stdio.h>
#include "/d/other/inst/include/valgrind/helgrind.h"

template <typename T> struct QAtomicOps // qgenericatomic.h
{
    static inline
    T loadAcquire(const T &_q_value)
    {
        T tmp = *static_cast<const volatile T *>(&_q_value);
        return tmp;
    }

    static inline
    void storeRelease(T &_q_value, T newValue)
    {
        *static_cast<volatile T *>(&_q_value) = newValue;
    }
};

class QBasicAtomicInt // qbasicatomic.h
{
public:
    typedef QAtomicOps<int> Ops;
    int _q_value;

    int loadAcquire() const {
        VALGRIND_HG_DISABLE_CHECKING(&_q_value, sizeof(_q_value));
        const int ret = Ops::loadAcquire(_q_value);
        VALGRIND_HG_ENABLE_CHECKING(&_q_value, sizeof(_q_value));
        return ret;
    }
    void storeRelease(int newValue) { Ops::storeRelease(_q_value, newValue); }

    void oops() { _q_value = 63; }
};

// Modelled after qt_metatype_id()
static int onDemandNumber() {
    static QBasicAtomicInt sharedInt = { 0 };
    if (!sharedInt.loadAcquire()) {
        // some calculation goes here
        sharedInt.storeRelease(41);
    }
    sharedInt.oops(); // ### surely this should warn!
    return sharedInt.loadAcquire();
}

void * threadStart(void *)
{
    printf("%d\n", onDemandNumber());
    printf("%d\n", onDemandNumber());
    printf("%d\n", onDemandNumber());
    printf("%d\n", onDemandNumber());
    printf("%d\n", onDemandNumber());
    printf("%d\n", onDemandNumber());
    return 0;
}


int main( int argc, char** argv ) {
    pthread_t thread1;
    if ( pthread_create(&thread1, 0, threadStart, 0) )
        return 1;
    pthread_t thread2;
    if ( pthread_create(&thread2, 0, threadStart, 0) )
        return 1;

    void* v;
    pthread_join(thread1, &v);
    pthread_join(thread2, &v);
    return 0;
}




commit 605d6ce526063105cba69c1cddfbb9fd7833a47e
Author: David Faure fa...@kde.org
Date:   Tue Oct 23 22:19:39 2012 +0200

Use helgrind macros to silence its warnings in QBasicAtomicInt/Pointer

Change-Id: I5134106e8ac0e5d124226b563bb892d725723ba4

diff --git a/src/corelib/thread/qbasicatomic.h b/src/corelib/thread/qbasicatomic.h
index 3e9c72b..b3d5a55 100644
--- a/src/corelib/thread/qbasicatomic.h
+++ b/src/corelib/thread/qbasicatomic.h
@@ -98,6 +98,19 @@
 #  error Qt has not been ported to this platform
 #endif
 
+#if defined(Q_OS_LINUX) || defined(Q_OS_MAC)
+#define QTCORE_USE_HELGRIND
+#else
+#undef QTCORE_USE_HELGRIND
+#endif
+
+#ifdef QTCORE_USE_HELGRIND
+#include "helgrind_p.h"
+#define QT_HG_DISABLE_CHECKING(start, len) VALGRIND_HG_DISABLE_CHECKING(start, len)
+#else
+#define QT_HG_DISABLE_CHECKING(start, len)
+#endif
+
 // Only include if the implementation has been ported to QAtomicOps

Re: [Valgrind-users] closing bracket of destructor

2012-10-02 Thread David Faure
On Tuesday 02 October 2012 14:55:57 jody wrote:
 Th puzzling thing is that WorldTile.cpp:267 and TDWorker.cpp:130 refer
 to the closing brackets of the respective destructors.
 What does that mean?

This is where the local variables (the ones created on the stack) get deleted, 
obviously.

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5




Re: [Valgrind-users] Large application SIGSEGV when run in valgrind

2012-10-02 Thread David Faure
On Tuesday 02 October 2012 12:27:12 Pierre-Luc Provencal wrote:
 ==11125==by 0x8188CED: __static_initialization_and_destruction_0(int, 
 int) (Base64.cpp:29)
 ==11125==by 0x8188D2F: global constructors keyed to Base64.cpp 
 (Base64.cpp:121)

Greetings from Provence :-)

What does the code at these two lines, say?

Unless valgrind is wrong, something funky is being done by that code, so it 
might help to see what it's doing.

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5




[Valgrind-users] Fwd: Re: helgrind and atomic operations

2012-08-28 Thread David Faure

--  Forwarded Message  --

Subject: Re: [Valgrind-users] helgrind and atomic operations
Date: Monday 27 August 2012, 15:25:14
From: Marc Mutz marc.m...@kdab.com
To: David Faure fa...@kde.org
CC: valgrind-users@lists.sourceforge.net

[I can't post to valgrind-users, please fwd]

On Monday August 27 2012, David Faure wrote:
 On Sunday 26 August 2012 12:53:41 Marc Mutz wrote:
  On Sunday August 26 2012, David Faure wrote:
[...]
   static int onDemandNumber() {
       static QBasicAtomicInt sharedInt = { 0 };
       if (!sharedInt.loadAcquire()) {
           // some calculation goes here
           sharedInt.storeRelease(42);
       }
       return sharedInt.loadAcquire();
   }
 
  If sharedInt is a std::atomic<int>, then the above is not a C++11 data
  race,

 No, it's a bare int, just like Qt5's QBasicAtomicInt does by default on
 x86. You missed the initial email with the testcase, here it is
 (testcase_atomic_ops_helgrind.cpp).

The point I was trying to make was that unless Q*Atomic has the same semantics 
as std::atomic, we don't need to continue talking about this, so let's assume 
that it should have, and check what's missing, then. And, according to the 
threads here:
  http://www.decadent.org.uk/pipermail/cpp-threads/2008-December/
the implementation of std::atomic would (except for IA-64 load) look exactly 
the same as QAtomic, save that the compiler must magically know it's dealing 
with an atomic operation, and not reorder them. AFAIU, this is ensured by the 
use of asm() in the QAtomic code, or alternatively with a volatile cast.

I'm still not sure that Helgrind will be able to detect atomic operations from 
assembler instructions alone, because at least on x86, all loads and stores 
are already atomic (because of E in MESI), unless they're misaligned and/or 
cross cache-line boundaries.

What's worse, memory barriers on x86 are no-ops, unless you go for 
std::memory_order_seq_cst, which QAtomics don't implement.

So, IMO, QAtomic needs to mark up the code such that valgrind can see that 
*this* MOV comes from an atomic, and thus never races against another 
*atomic* MOV on the same memory location, but races against a MOV that's not 
emitted from std::atomic or QAtomic: the identical assembler code could mean 
a C++11 data race or not; the semantics have been lost at the compilation 
stage.

  so Helgrind shouldn't warn.

 Helgrind warns, hence my mail here.

  If 'some calculation goes here' doesn't write
  to memory global memory, you could even drop the memory ordering. Atomics
  don't participate in data races, but they might also not establish the
  happens-before that you need elsewhere.

 This is obviously a reduced testcase ;)
 The calculation, in the example I took, is to register the metatype, which
 is global stuff, but which is somewhat safe if done twice (it will return
 the same number). Inside QMetaType::registerNormalizedType there's a mutex.
 The whole point, though, as I see it, is to avoid locking in the very
 common case where the metatype has been registered already.

It's the job of QMetaType to ensure visibility. All data that could be thought 
of being associated with the ID is private to QMetaType and both in the 
current as well as the new one proposed in 
https://codereview.qt-project.org/#change,30559 contain additional fences to 
order writes to and reads from the custom metatype storage. So the memory fences in 
qt_metatype_id() are indeed unnecessary.

[...]
  If it writes to a shared memory location, the code contains a C++11 data
  race, and you need more complex synchronisation

 Yes, which is there, but that's not the point. From what you say, the code
 is fine, so helgrind shouldn't warn about sharedInt itself. However, Julian
 says the code is not fine, the compiler could generate code that doesn't
 work correctly. I don't know who to trust at this point, I'm caught in the
 middle

 :-).

 If this is safe, helgrind should be fixed. If this is not safe, Qt should
 be fixed. But it sounds to me like different people have different
 interpretations on what is safe, in such code :-)

There's only one interpretation: C++11. Under C++11, QAtomic has a data race, 
because it's not using the std atomic operations. OTOH, those very same std 
operations should produce identical assembler instruction sequences compared 
with QAtomic, so there's no actual data race. What's more, a C++11-data-racy 
access to a memory location is indistinguishable from an atomic, and 
therefore race-free, access. -> need for markup

Damn x86 and its sequential consistency :)

 I just realized one thing though: I was wrong when I said it is racy
 because the value could be 0 or 42, that is not a race. Writing the same
 code with a mutex around the int leads to exactly this behavior (0 or 42),
 which is fine. See testcase_int_mutex_helgrind.cpp. No helgrind warning,
 and I think everyone agrees that this code is ok. So if ints are read and
 written atomically

Re: [Valgrind-users] helgrind and atomic operations

2012-08-27 Thread David Faure
On Sunday 26 August 2012 12:53:41 Marc Mutz wrote:
 On Sunday August 26 2012, David Faure wrote:
  On Sunday 26 August 2012 11:28:06 Julian Seward wrote:
   On Sunday, August 26, 2012, David Faure wrote:
Thiago expects that helgrind can't autodetect this case and that
helgrind- macros markup is needed in Qt, I'm fine with adding that if
you guys agree -- after you show me how by modifying the attached
example :-)
   
   IIUC then, QBasicAtomicInt::{loadAcquire,storeRelease} just does normal
   loads and stores of an int.  Right?
  
  Yep (on x86). This whole API exists so that other architectures can get
  another implementation.
  
   What atomicity properties are you expecting onDemandNumber() to have?
   Viz, how is onDemandNumber supposed to behave when you have multiple
   threads doing it at the same time on the same location?
  
  static int onDemandNumber() {
      static QBasicAtomicInt sharedInt = { 0 };
      if (!sharedInt.loadAcquire()) {
          // some calculation goes here
          sharedInt.storeRelease(42);
      }
      return sharedInt.loadAcquire();
  }
 
 If sharedInt is a std::atomic<int>, then the above is not a C++11 data race,

No, it's a bare int, just like Qt5's QBasicAtomicInt does by default on x86.
You missed the initial email with the testcase, here it is 
(testcase_atomic_ops_helgrind.cpp).

 so Helgrind shouldn't warn.

Helgrind warns, hence my mail here.

 If 'some calculation goes here' doesn't write
 to memory global memory, you could even drop the memory ordering. Atomics
 don't participate in data races, but they might also not establish the
 happens-before that you need elsewhere.

This is obviously a reduced testcase ;)
The calculation, in the example I took, is to register the metatype, which is 
global stuff, but which is somewhat safe if done twice (it will return the same 
number). Inside QMetaType::registerNormalizedType there's a mutex. The whole 
point, though, as I see it, is to avoid locking in the very common case where 
the metatype has been registered already.

  I think the point is that the first call to loadAcquire() should either
  return 0 or 42, but never some intermediate value due to a write in
  progress.
 
 Correct. What's more: for any thread, any read of an atomic variable must
 return a value previously read, or a value written later by another thread.
 IOW: once a thread sees 42, it can't magically see 0 the next time.
 According to Hans Boehm, this may happen on IA-64 because the architecture
 allows to reorder reads from the same memory location.

Which is why the atomic stuff expands to different code on IA-64, so no problem.

 If it writes to a shared memory location, the code contains a C++11 data
 race, and you need more complex synchronisation

Yes, which is there, but that's not the point. From what you say, the code is 
fine, so helgrind shouldn't warn about sharedInt itself. However, Julian says 
the code is not fine, the compiler could generate code that doesn't work 
correctly. I don't know who to trust at this point, I'm caught in the middle 
:-).

If this is safe, helgrind should be fixed. If this is not safe, Qt should be 
fixed. But it sounds to me like different people have different interpretations 
on what is safe, in such code :-)

I just realized one thing though: I was wrong when I said it is racy because
the value could be 0 or 42; that is not a race. Writing the same code with a
mutex around the int leads to exactly this behavior (0 or 42), which is fine.
See testcase_int_mutex_helgrind.cpp. No helgrind warning, and I think everyone
agrees that this code is ok. So if ints are read and written atomically, then
the helgrind warning in the initial testcase is wrong? (ok I guess it's more
complicated, if the compiler can reorder stuff in the first testcase and not
the second).
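
(testcase_int_mutex_helgrind.cpp is not attached to this archived message; the
idea is simply the same on-demand number, but with a plain int protected by a
mutex -- roughly, my reconstruction:)

#include <pthread.h>

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

// Reconstruction of the idea, not the original file.
static int onDemandNumberLocked() {
    static int sharedInt = 0;
    pthread_mutex_lock(&mutex);
    if (!sharedInt) {
        // some calculation goes here
        sharedInt = 42;
    }
    int ret = sharedInt;
    pthread_mutex_unlock(&mutex);
    return ret;
}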

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Sponsored by Nokia to work on KDE, incl. KDE Frameworks 5
#include <pthread.h>

template <typename T> struct QAtomicOps // qgenericatomic.h
{
    static inline
    T loadAcquire(const T &_q_value)
    {
        T tmp = *static_cast<const volatile T *>(&_q_value);
        return tmp;
    }

    static inline
    void storeRelease(T &_q_value, T newValue)
    {
        *static_cast<volatile T *>(&_q_value) = newValue;
    }
};

class QBasicAtomicInt // qbasicatomic.h
{
public:
    typedef QAtomicOps<int> Ops;
    int _q_value;

    int loadAcquire() const { return Ops::loadAcquire(_q_value); }
    void storeRelease(int newValue) { Ops::storeRelease(_q_value, newValue); }
};

// Modelled after qt_metatype_id()
static int onDemandNumber() {
    static QBasicAtomicInt sharedInt = { 0 };
    if (!sharedInt.loadAcquire()) {
        // some calculation goes here
        sharedInt.storeRelease(42);
    }
    return sharedInt.loadAcquire();
}

void * threadStart(void *)
{
    onDemandNumber();
    onDemandNumber();
    onDemandNumber();
onDemandNumber

Re: [Valgrind-users] helgrind and atomic operations

2012-08-26 Thread David Faure
On Sunday 26 August 2012 01:30:43 David Faure wrote:
 On Saturday 25 August 2012 22:43:58 Julian Seward wrote:
  Or maybe Qt really is racey
 
 Bingo :)

OK, maybe not. Turns out the code runs as intended by the Qt developers.

Marc Mutz said 
The standard says it's racy, but the implementation of 
std::atomic::load(memory_order_acquire) won't look different. Simple reads 
and writes on x86 are already sequentially consistent. Think MESI cache 
coherency. Before a CPU writes to a memory location it needs to acquire 
exclusive ownership (E) of the cache line, the well-known hardware mutex on 
a cache line that produces False Sharing, too. This seems to hold for all 
architectures, cf. threads re: Brief example ... at 
http://www.decadent.org.uk/pipermail/cpp-threads/2008-December/thread.html


I attached a pure C++ testcase of the issue. Compile it with
g++ testcase_atomic_ops_helgrind.cpp -o testcase_atomic_ops_helgrind -lpthread

Thiago expects that helgrind can't autodetect this case and that helgrind-
macros markup is needed in Qt, I'm fine with adding that if you guys agree -- 
after you show me how by modifying the attached example :-)

Thanks.

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Sponsored by Nokia to work on KDE, incl. KDE Frameworks 5
#include <pthread.h>

template <typename T> struct QAtomicOps // qgenericatomic.h
{
    static inline
    T loadAcquire(const T &_q_value)
    {
        T tmp = *static_cast<const volatile T *>(&_q_value);
        return tmp;
    }

    static inline
    void storeRelease(T &_q_value, T newValue)
    {
        *static_cast<volatile T *>(&_q_value) = newValue;
    }
};

class QBasicAtomicInt // qbasicatomic.h
{
public:
    typedef QAtomicOps<int> Ops;
    int _q_value;

    int loadAcquire() const { return Ops::loadAcquire(_q_value); }
    void storeRelease(int newValue) { Ops::storeRelease(_q_value, newValue); }
};

// Modelled after qt_metatype_id()
static int onDemandNumber() {
    static QBasicAtomicInt sharedInt = { 0 };
    if (!sharedInt.loadAcquire()) {
        // some calculation goes here
        sharedInt.storeRelease(42);
    }
    return sharedInt.loadAcquire();
}

void * threadStart(void *)
{
    onDemandNumber();
    onDemandNumber();
    onDemandNumber();
    onDemandNumber();
    return 0;
}


int main( int argc, char** argv ) {
    pthread_t thread1;
    if ( pthread_create(&thread1, 0, threadStart, 0) )
        return 1;
    pthread_t thread2;
    if ( pthread_create(&thread2, 0, threadStart, 0) )
        return 1;

    void* v;
    pthread_join(thread1, &v);
    pthread_join(thread2, &v);
    return 0;
}






Re: [Valgrind-users] helgrind and atomic operations

2012-08-26 Thread David Faure
On Sunday 26 August 2012 11:28:06 Julian Seward wrote:
 On Sunday, August 26, 2012, David Faure wrote:
  Thiago expects that helgrind can't autodetect this case and that helgrind-
  macros markup is needed in Qt, I'm fine with adding that if you guys agree
  -- after you show me how by modifying the attached example :-)
 
 IIUC then, QBasicAtomicInt::{loadAcquire,storeRelease} just does normal
 loads and stores of an int.  Right?

Yep (on x86). This whole API exists so that other architectures can get 
another implementation.

 What atomicity properties are you expecting onDemandNumber() to have?
 Viz, how is onDemandNumber supposed to behave when you have multiple
 threads doing it at the same time on the same location?

static int onDemandNumber() {
    static QBasicAtomicInt sharedInt = { 0 };
    if (!sharedInt.loadAcquire()) {
        // some calculation goes here
        sharedInt.storeRelease(42);
    }
    return sharedInt.loadAcquire();
}

I think the point is that the first call to loadAcquire() should either return 
0 or 42, but never some intermediate value due to a write in progress.

Hmm, I see the point though. This *is* racy by design, since you don't know if 
you'll get 0 or 42, if another thread is running at the same time. The only 
thing is, we know it's fine to get either one, since we'll simply do the 
calculation twice if two threads get 0 at the same time. Overall this is more 
performant than using a mutex every single time this is called.

I'm not sure what can be done then, to avoid a helgrind warning.

I presume the only solution is to annotate onDemandNumber() itself, not 
QBasicAtomicInt. I.e. annotate all uses of QBasicAtomicInt where the overall 
logic makes it safe, in order to still be able to detect unsafe uses...
(like load, increment, store).
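
For instance, a minimal sketch of such call-site markup, using the existing
VALGRIND_HG_DISABLE_CHECKING client request from valgrind's helgrind.h --
whether annotating here is really the right approach is exactly the open
question:

#include <valgrind/helgrind.h>

static int onDemandNumber() {
    static QBasicAtomicInt sharedInt = { 0 };
    // Tell helgrind not to track this particular int: the surrounding logic
    // makes the race benign (the calculation may simply run twice).
    VALGRIND_HG_DISABLE_CHECKING(&sharedInt._q_value, sizeof(sharedInt._q_value));
    if (!sharedInt.loadAcquire()) {
        // some calculation goes here
        sharedInt.storeRelease(42);
    }
    return sharedInt.loadAcquire();
}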

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Sponsored by Nokia to work on KDE, incl. KDE Frameworks 5




[Valgrind-users] helgrind and atomic operations

2012-08-25 Thread David Faure
How do I tell helgrind that some atomic operations on integers are OK?

A suppression like this works, but it would make helgrind more useful to all Qt 
users if either atomic operations were handled automatically, or if this was 
added to valgrind's own suppressions.

{
   qt5_basic_atomic_integer
   Helgrind:Race
   fun:_ZNK19QBasicAtomicIntegerIiE4loadEv
}

Also, this only works for integers, not for pointers...

==8489== Possible data race during read of size 8 at 0x8B9C968 by thread #3
==8489==at 0x57549C0: QBasicAtomicPointer<QTextCodec>::loadAcquire() const 
(qgenericatomic.h:110)
==8489==by 0x5753A26: QTextCodec::codecForLocale() (qtextcodec.cpp:683)
==8489==by 0x558BD08: QString::toLocal8Bit() const (qstring.cpp:3959)
==8489== 
==8489== This conflicts with a previous write of size 8 by thread #2
==8489==at 0x57549A2: 
QBasicAtomicPointer<QTextCodec>::storeRelease(QTextCodec*) 
(qgenericatomic.h:119)
==8489==by 0x5757D5D: QIcuCodec::defaultCodecUnlocked() (qicucodec.cpp:441)
==8489==by 0x5753A43: QTextCodec::codecForLocale() (qtextcodec.cpp:687)
==8489==by 0x558BD08: QString::toLocal8Bit() const (qstring.cpp:3959)

Can't make suppressions for these, given that the template class could use any 
type
(here QTextCodec).

The implementation of loadAcquire/storeRelease is compiler and platform 
dependent.
With C++11 it uses std::atomic, with gcc (in non-c++11 mode) it uses some 
__sync_synchronize stuff,
on ARM there's a bit of assembly, etc.

I presume all this is too low-level for helgrind? I.e. can it actually detect 
atomic operations
and therefore validate Qt's implementation, or should we rather just trust Qt 
and aim for
silencing helgrind whenever it sees QBasicAtomic{Integer,Pointer}?

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Sponsored by Nokia to work on KDE, incl. KDE Frameworks 5

