On Saturday 19 May 2012 19:05:58 Thiago Macieira wrote:
> On Saturday, 19 May 2012 18:34:30, Olivier Goffart wrote:
> > Hi,
> >
> > Regarding valgrind:
> > *) On debug build, nothing is inlined.
> > *) If we keep it inline, then we would just need a patch like this [1]
>
> -fno-inline doesn't help because of -fvisibility-inlines-hidden. The call
> cannot be rerouted to valgrind.

Visibility does not really matter for valgrind. It does address redirection
using the debug symbols. You can see that it works by trying the attached
wrapper.c.

> The annotation you added might help, but as I said, adding instructions --
> even if they produce no architectural change -- still consumes CPU
> resources. I'd like to benchmark the annotation vs the function call.

Yes, they have a cost which I am not sure we want to pay in release builds.

> > Regarding Transactional memory:
> > *) Notice the end of section 8.2.1: "Improper use of hints will not cause
> > functional bugs though it may expose latent bugs already in the
> > code.". So in other words, we can use XACQUIRE and XRELEASE without any
> > problem in inline code, without binary compatibility issue
>
> Indeed, but note what it says about transactions that abort too often.
> If the transaction aborts, then the code needs to be re-run
> non-transactionally, with the lock. That means decreased performance and
> increased power consumption.

Yes, but we are talking about the rare case in which a QMutex is shared
between two different objects compiled against different versions of Qt. In
that unlikely case, one can just recompile to fix the performance issue.

> Note also that all x87 instructions will abort, so any transactions around
> x87 code (32-bit floating point) would cause aborts.
>
> At this point, we don't know which mutex locks we should make transactional.
> As I said, neither you nor I have a Haswell prototype to test on. At the
> earliest, we'll be able to test this for Qt 5.2.

Indeed, QMutex can be used for all sorts of cases. There can also be way too
much code in the critical section to fit into the transaction cache. Or there
may be side effects:

    QMutexLocker lock(&mutex);
    qDebug() << "What now? Does it also restart the transaction?";

So it is probably a bad idea to do the lock elision within QMutex itself... We
need to test it on real hardware to see if it works.
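
To make the hint mechanism concrete, here is a minimal sketch of a spinlock
that carries the XACQUIRE/XRELEASE hints. It is not QMutex code: the names
elided_lock, elided_lock_acquire and elided_lock_release are made up for
illustration, and it assumes a compiler that provides the GCC __atomic
builtins. The __ATOMIC_HLE_ACQUIRE and __ATOMIC_HLE_RELEASE flags are only
available with GCC on x86, so the sketch falls back to a plain spinlock
everywhere else. On hardware without lock elision the prefixes are ignored,
which is why the hints do not change the functional behaviour.

```c
/* Hypothetical elided spinlock, for illustration only (not Qt code). */
#if defined(__GNUC__) && defined(__ATOMIC_HLE_ACQUIRE)
/* GCC on x86: ask for the XACQUIRE/XRELEASE prefixes. */
#  define HLE_ACQUIRE __ATOMIC_HLE_ACQUIRE
#  define HLE_RELEASE __ATOMIC_HLE_RELEASE
#else
/* Assumption: no hint support, so behave as an ordinary spinlock. */
#  define HLE_ACQUIRE 0
#  define HLE_RELEASE 0
#endif

typedef struct {
    int locked; /* 0 = free, 1 = held */
} elided_lock;

static void elided_lock_acquire(elided_lock *l)
{
    /* Exchange with the acquire hint; a previous value of 1 means
     * somebody else holds the lock. */
    while (__atomic_exchange_n(&l->locked, 1,
                               __ATOMIC_ACQUIRE | HLE_ACQUIRE)) {
        /* Wait with plain loads until the lock looks free, then retry. */
        while (__atomic_load_n(&l->locked, __ATOMIC_RELAXED))
            ;
    }
}

static void elided_lock_release(elided_lock *l)
{
    /* Store with the release hint; it writes back the value the lock
     * had before the acquire, which the elision requires. */
    __atomic_store_n(&l->locked, 0, __ATOMIC_RELEASE | HLE_RELEASE);
}
```

A caller would wrap the critical section between elided_lock_acquire(&l) and
elided_lock_release(&l), exactly as with an ordinary lock. Whether the section
actually runs as a transaction is decided by the hardware.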

But my point is that the current QMutex architecture does not keep us from
using lock elision later.

-- 
Olivier

Woboq - Qt services and support - http://woboq.com

/*
 * Wrapper for helgrind that works with Qt5 mutexes
 *
 * Compile and run with:
 *   gcc -shared -fPIC -o wrapper.so wrapper.c
 *   LD_PRELOAD=./wrapper.so valgrind --tool=helgrind <Qt5 application>
 *
 * A debug build is required
 *
 * Olivier Goffart <ogoff...@woboq.com>
 */

#include <stdio.h>
#include <valgrind/valgrind.h>
#include <valgrind/helgrind.h>

/*
 * "Za" is the Z-encoded wildcard "*", so each wrapper matches the mangled
 * symbol in whichever shared object it is found.
 */

/* QBasicMutex::lock() */
void I_WRAP_SONAME_FNNAME_ZU(Za,_ZN11QBasicMutex4lockEv)( void *mutex )
{
    OrigFn fn;
    VALGRIND_GET_ORIG_FN(fn);
    // printf("LOCK %p \n", mutex);
    DO_CREQ_v_WW(_VG_USERREQ__HG_PTHREAD_MUTEX_LOCK_PRE, void*, mutex, long, 0);
    CALL_FN_v_W(fn, mutex);
    DO_CREQ_v_W(_VG_USERREQ__HG_PTHREAD_MUTEX_LOCK_POST, void*, mutex);
}

/* QBasicMutex::unlock() */
void I_WRAP_SONAME_FNNAME_ZU(Za,_ZN11QBasicMutex6unlockEv)( void *mutex )
{
    OrigFn fn;
    VALGRIND_GET_ORIG_FN(fn);
    // printf("UNLOCK %p \n", mutex);
    DO_CREQ_v_W(_VG_USERREQ__HG_PTHREAD_MUTEX_UNLOCK_PRE, void*, mutex);
    CALL_FN_v_W(fn, mutex);
    DO_CREQ_v_W(_VG_USERREQ__HG_PTHREAD_MUTEX_UNLOCK_POST, void*, mutex);
}

/*
 * QBasicMutex::tryLock(int)
 *
 * tryLock() returns a bool, so the wrapper has to hand the result back to the
 * caller, and the lock is only reported as taken when the call succeeded.
 * Assumption: the bool comes back in the low byte of the return register, as
 * it does on x86 and x86-64.
 */
int I_WRAP_SONAME_FNNAME_ZU(Za,_ZN11QBasicMutex7tryLockEi)( void *mutex, int timeout )
{
    OrigFn fn;
    unsigned long ret;
    VALGRIND_GET_ORIG_FN(fn);
    // printf("TRYLOCK %p %d\n", mutex, timeout);
    DO_CREQ_v_WW(_VG_USERREQ__HG_PTHREAD_MUTEX_LOCK_PRE, void*, mutex, long, 1);
    CALL_FN_W_WW(ret, fn, mutex, timeout);
    if (ret & 0xff)
        DO_CREQ_v_W(_VG_USERREQ__HG_PTHREAD_MUTEX_LOCK_POST, void*, mutex);
    return (ret & 0xff) ? 1 : 0;
}