I posted this on the valgrind-dev mailing list a while back but didn't
receive a response. Hopefully some users have some input.

I'm working on a project that leverages Callgrind to generate VEX IR
traces. I'm using Valgrind 3.12.0.
I also use Callgrind's infrastructure to detect when Valgrind switches
thread contexts, but I'm getting unexpected behavior.

It looks like the best place to detect a thread context switch in Callgrind is
in CLG_(setup_bbcc) in bbcc.c  (line 561):

  /* This is needed because thread switches can not reliable be tracked
   * with callback CLG_(run_thread) only: we have otherwise no way to get
   * the thread ID after a signal handler returns.
   * This could be removed again if that bug is fixed in Valgrind.
   * This is in the hot path but hopefully not to costly.
   */
  tid = VG_(get_running_tid)();
#if 1
  /* CLG_(switch_thread) is a no-op when tid is equal to CLG_(current_tid).
   * As this is on the hot path, we only call CLG_(switch_thread)(tid)
   * if tid differs from the CLG_(current_tid).
   */
  if (UNLIKELY(tid != CLG_(current_tid)))
     CLG_(switch_thread)(tid);

The above is called every instrumented basic block.
I've noticed strange behavior, where *a thread switch is not always
detected*.

To investigate further, I modified the above as follows:
- if (UNLIKELY(tid != CLG_(current_tid)))
+ if (UNLIKELY(tid != CLG_(current_tid))) {
     CLG_(switch_thread)(tid);
+    VG_(printf)("Thread switched to: %d\n", tid);
+ }


   - With this change, I run the PARSEC 3.0 benchmark blackscholes with 4
   threads and the input_test.tar input, and expect to see 5 threads
   (numbered 1-5: 1 master and 4 worker threads) printed.
   - Under default flags, I see all 5 threads printed.
   - When I add --fair-sched=yes, the last thread (5) is often *not
   printed*.
   - I confirmed this behavior by printing VG_(get_running_tid)() on every
   instrumented basic block.
   - I also confirmed it by using --separate-threads=yes for Callgrind,
   which then writes only 4 per-thread files instead of 5.
   - I know that the thread switch happened, or else the application would
   have failed.
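
For anyone who wants to repeat the --separate-threads=yes check, here is a rough sketch of how the per-thread files could be counted. It assumes the files are written to the current directory and follow the "-threadID" suffix naming described in the Callgrind manual (callgrind.out.<pid>-<threadID>). count_thread_files is a hypothetical helper name of my own, not part of Valgrind.

```shell
# Count the Callgrind per-thread output files belonging to one run.
#   $1 = PID of the run (the number shown in the ==PID== prefix)
#   $2 = directory holding the output files (defaults to ".")
# ASSUMPTION: files are named callgrind.out.<pid>-<threadID>.
count_thread_files() {
    pid=$1
    dir=${2:-.}
    n=0
    for f in "$dir"/callgrind.out."$pid"-*; do
        # If the glob matched nothing, $f is the literal pattern and -e fails.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}
```

With a run whose PID was 3375, `count_thread_files 3375` should print 5 when every thread was picked up, and 4 in the failing case described above.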

This does not happen every time, but it happens on the majority of runs. I
also noticed that if I put a print statement in the blackscholes worker
thread, the unexpected behavior shows up far less often. I conclude it
must have something to do with the thread exiting too quickly and not
having enough work to do.

*Is this considered a bug? If not, how do I detect every time the Valgrind
thread context changes? I saw this thread
<http://valgrind-developers.narkive.com/ualztznb/thread-change-callback> from
a long time ago, but I'm not sure if there's been any progress.*

$ uname -a
Linux ubuntu-VirtualBox 3.19.0-25-generic #26~14.04.1-Ubuntu SMP Fri Jul 24
21:16:20 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

*Steps to reproduce:*
mkdir detect_thread_switch && cd detect_thread_switch
curl -L http://parsec.cs.princeton.edu/download/3.0/parsec-3.0-core.tar.gz |
tar xz
parsec-3.0/bin/parsecmgmt -a build -p blackscholes -c gcc-pthreads
tar xf parsec-3.0/pkgs/apps/blackscholes/inputs/input_test.tar

curl -L http://valgrind.org/downloads/valgrind-3.12.0.tar.bz2 | tar xj
# MAKE THE CHANGE TO bbcc.c TO PRINT THREAD ID ON THREAD SWITCH

cd valgrind-3.12.0 && ./autogen.sh && ./configure
make -j4 && cd ..

# WILL SHOW THREADS 1-5
valgrind-3.12.0/vg-in-place --tool=callgrind \
  parsec-3.0/pkgs/apps/blackscholes/inst/amd64-linux.gcc-pthreads/bin/blackscholes \
  4 in_4.txt prices.txt

# MAY HAVE TO RUN SEVERAL TIMES IN SUCCESSION, WILL EVENTUALLY BE MISSING THREAD 5
valgrind-3.12.0/vg-in-place --fair-sched=yes --tool=callgrind \
  parsec-3.0/pkgs/apps/blackscholes/inst/amd64-linux.gcc-pthreads/bin/blackscholes \
  4 in_4.txt prices.txt
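
Since the failure is intermittent, it may be easier to loop over the run than to repeat it by hand. This is only a sketch: seen_tids and repeat_runs are hypothetical helper names of my own, and it assumes the modified Callgrind prints "Thread switched to: N" lines exactly as in the bbcc.c change above, and that they end up on stdout or stderr.

```shell
# Print the distinct thread IDs found in "Thread switched to: N" lines
# read from stdin, sorted numerically and joined by spaces on one line.
seen_tids() {
    sed -n 's/.*Thread switched to: \([0-9][0-9]*\).*/\1/p' \
        | sort -nu \
        | paste -s -d ' ' -
}

# Run the --fair-sched=yes command N times (default 10) from the
# detect_thread_switch directory and report the thread IDs seen per run.
repeat_runs() {
    n=${1:-10}
    bin=parsec-3.0/pkgs/apps/blackscholes/inst/amd64-linux.gcc-pthreads/bin/blackscholes
    i=1
    while [ "$i" -le "$n" ]; do
        tids=$(valgrind-3.12.0/vg-in-place --fair-sched=yes --tool=callgrind \
            "$bin" 4 in_4.txt prices.txt 2>&1 | seen_tids)
        echo "run $i: threads seen: $tids"
        i=$((i + 1))
    done
}
```

Calling `repeat_runs 20` prints one line per run. A run that shows the problem lists only threads 1 through 4.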

--------------------------------------------------------
Some example output with default flags:

==3382== Callgrind, a call-graph generating cache profiler
==3382== Copyright (C) 2002-2015, and GNU GPL'd, by Josef Weidendorfer et
al.
==3382== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==3382== Command:
parsec-3.0/pkgs/apps/blackscholes/inst/amd64-linux.gcc-pthreads/bin/blackscholes
4 in_4.txt prices.txt
==3382==
==3382== For interactive control, run 'callgrind_control -h'.
PARSEC Benchmark Suite Version 3.0-beta-20150206
Num of Options: 4
Num of Runs: 100
Size of data: 160
*Thread switched to: 4*
*Thread switched to: 3*
*Thread switched to: 2*
*Thread switched to: 1*
*Thread switched to: 5*
*Thread switched to: 4*
*Thread switched to: 1*
==3382==
==3382== Events    : Ir
==3382== Collected : 569502
==3382==
==3382== I   refs:      569,502

With --fair-sched=yes:
==3375== Callgrind, a call-graph generating cache profiler
==3375== Copyright (C) 2002-2015, and GNU GPL'd, by Josef Weidendorfer et
al.
==3375== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==3375== Command:
parsec-3.0/pkgs/apps/blackscholes/inst/amd64-linux.gcc-pthreads/bin/blackscholes
4 in_4.txt prices.txt
==3375==
==3375== For interactive control, run 'callgrind_control -h'.
PARSEC Benchmark Suite Version 3.0-beta-20150206
Num of Options: 4
Num of Runs: 100
Size of data: 160
*Thread switched to: 2*
*Thread switched to: 1*
*Thread switched to: 3*
*Thread switched to: 2*
*Thread switched to: 1*
*Thread switched to: 4*
*Thread switched to: 3*
*Thread switched to: 1*
*Thread switched to: 4*
*Thread switched to: 2*
*Thread switched to: 1*
*Thread switched to: 2*
*Thread switched to: 1*
==3375==
==3375== Events    : Ir
==3375== Collected : 569505
==3375==
==3375== I   refs:      569,505

Thanks!
_______________________________________________
Valgrind-users mailing list
Valgrind-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/valgrind-users