Hi,

I used "valgrind --tool=drd …" to debug a multithreaded C program that uses
POSIX threads (pthreads). DRD hit an internal assertion and terminated with
the following message:

drd: drd_vc.c:96 (vgDrd_vc_increment): Assertion 'oldcount < vc->vc[i].count' failed.

host stacktrace:
==27993==    at 0x38025C68: ??? (in /usr/lib64/valgrind/drd-amd64-linux)
==27993==    by 0x38025D94: ??? (in /usr/lib64/valgrind/drd-amd64-linux)
==27993==    by 0x38025F21: ??? (in /usr/lib64/valgrind/drd-amd64-linux)
==27993==    by 0x38018317: ??? (in /usr/lib64/valgrind/drd-amd64-linux)
==27993==    by 0x380183EC: ??? (in /usr/lib64/valgrind/drd-amd64-linux)
==27993==    by 0x380187FC: ??? (in /usr/lib64/valgrind/drd-amd64-linux)
==27993==    by 0x3801D87E: ??? (in /usr/lib64/valgrind/drd-amd64-linux)
==27993==    by 0x3800967A: ??? (in /usr/lib64/valgrind/drd-amd64-linux)
==27993==    by 0x3803DB80: ??? (in /usr/lib64/valgrind/drd-amd64-linux)
==27993==    by 0x38078BDF: ??? (in /usr/lib64/valgrind/drd-amd64-linux)
==27993==    by 0x3808742A: ??? (in /usr/lib64/valgrind/drd-amd64-linux)

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable (lwpid 27993)
==27993==    at 0x4C339D3: pthread_mutex_unlock (in /usr/lib64/valgrind/vgpreload_drd-amd64-linux.so)
==27993==    by 0x47F671: lf_pthread_mutex_unlock (htab2.c:192)
==27993==    by 0x405194: prepare_to_read_n_go (ep3.c:805)
==27993==    by 0x4053C4: reading_begin (ep3.c:847)
==27993==    by 0x405CFF: start_file_loader (ep3.c:1062)
==27993==    by 0x405D4E: start_services (ep3.c:1076)
==27993==    by 0x406743: init_procs (ep3.c:1295)
==27993==    by 0x40336B: main (ep.c:89)

In my program's source code, the call to "pthread_mutex_unlock" (a library
function, not a system call) in "lf_pthread_mutex_unlock" was the last
statement executed before the termination. I am running openSUSE with the
kernel "Linux linux 4.4.76-1-default
#1 SMP Fri Jul 14 08:48:13 UTC 2017 (9a2885c) x86_64 x86_64 x86_64
GNU/Linux". The compiler is gcc-4.8.

I located the source code in drd_vc.c:

/** Increment the clock of thread 'tid' in vector clock 'vc'. */
void DRD_(vc_increment)(VectorClock* const vc, DrdThreadId const tid)
{
   unsigned i;
   for (i = 0; i < vc->size; i++)
   {
      if (vc->vc[i].threadid == tid)
      {
         typeof(vc->vc[i].count) const oldcount = vc->vc[i].count;
         vc->vc[i].count++;
         // Check for integer overflow.
         tl_assert(oldcount < vc->vc[i].count);
         return;
      }
   }
   …
}

where the statement "tl_assert(oldcount < vc->vc[i].count);" is the assertion
that failed.

I don't understand the semantics of the per-thread counter "count" or how my
program could have made it overflow. The program was processing a fairly
large volume of data (millions of records). I reran it with different
numbers of variables and the error occurred consistently, so I
guess it is not caused by a memory overwrite in my own code.
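
To check my reading of the assertion, here is a minimal sketch of the same
wrap-around test outside of Valgrind. The struct "clock_entry" and the
function "increment_and_check" are hypothetical names of mine, and the 32-bit
unsigned width of "count" is an assumption; I have not confirmed the actual
type in the DRD headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one vector clock entry. The 32-bit unsigned
   width of 'count' is an assumption, not taken from the DRD sources. */
typedef struct {
    uint32_t threadid;
    uint32_t count;
} clock_entry;

/* Increment the counter and report whether the "oldcount < count" check
   used in vc_increment would still hold after the increment. */
static bool increment_and_check(clock_entry *e)
{
    const uint32_t oldcount = e->count;
    e->count++;                 /* unsigned arithmetic wraps around to 0 */
    return oldcount < e->count; /* false only when the counter wrapped */
}
```

If that assumption holds, the check can only fail when the counter goes from
its maximum value back to 0, which for a 32-bit counter would take 2^32
(about 4.29 billion) increments for a single thread id.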

Any information about the nature of the source code drd_vc.c would be
greatly appreciated. Thank you.

Note that the same question was posted in
https://stackoverflow.com/questions/56453500/posix-pthread-counter-integer-overflow-in-valgrind-drd-vc-c

Kevin
_______________________________________________
Valgrind-users mailing list
Valgrind-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/valgrind-users
