tom fogal wrote:
> Darryl Miles <[EMAIL PROTECTED]> writes:
>> Julian Seward wrote:
>> Then on the return from the system call the current thread checks to see 
>> if it is still allowed to run, if it is not allowed to run it suspends 
>> itself.
> 
> You miss his point.  Say we have two threads:
> 
>               T1                                        T2
>     pthread_mutex_lock(&m);                   pthread_mutex_lock(&m);
>         ht->add(key, value);                      ht->remove(key);
>     pthread_mutex_unlock(&m);                 pthread_mutex_unlock(&m);
> 
> Now say thread 1 acquires the lock.  Finally, this logic of valgrind
> deciding which thread to run decides (through bug or otherwise) that
> thread 2 should run.  Both threads are currently paused.  Then:
> 
> The kernel decides to wake thread 1.  Valgrind sees `runnable thread'
> != `current thread', and does a sched_yield().  The kernel decides to
> wake thread 1.  Valgrind sees `runnable thread' != `current thread',
> and does a sched_yield().  The kernel decides to wake thread 1 ...

Remember my rule: all threads in a kernel system call are running (and 
may be put to sleep inside the kernel if it so wishes).  As soon as the 
kernel passes control back via the system call return, you always 
intercept with valgrind doing a "Can I still run?" check: yes = carry 
on, no = go to sleep under valgrind's control.
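Roughly, that check-on-return could be sketched as a gate every thread passes through. All names here (`sched_mu`, `runnable_thread`, `grant_run`) are invented for illustration and are not valgrind's real internals:

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical per-process scheduler state: which thread is currently
 * allowed to run client code.  Illustrative sketch only. */
static pthread_mutex_t sched_mu = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  sched_cv = PTHREAD_COND_INITIALIZER;
static pthread_t       runnable_thread;   /* chosen by the scheduler */

/* On every return from a kernel system call: "Can I still run?
 * Yes = carry on, No = go to sleep under valgrind's control." */
static void on_syscall_return(void)
{
    pthread_mutex_lock(&sched_mu);
    while (!pthread_equal(runnable_thread, pthread_self()))
        pthread_cond_wait(&sched_cv, &sched_mu);  /* suspend ourselves */
    pthread_mutex_unlock(&sched_mu);              /* allowed: carry on */
}

/* Scheduler side: hand control of user space to one specific thread. */
static void grant_run(pthread_t t)
{
    pthread_mutex_lock(&sched_mu);
    runnable_thread = t;
    pthread_cond_broadcast(&sched_cv);
    pthread_mutex_unlock(&sched_mu);
}
```

The point of the sketch is only that the gate sits on the syscall return path, so a thread can never re-enter client code without the scheduler's say-so.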

Why would valgrind keep waking up the wrong thread?  It won't do that 
with round-robin scheduling, since it knows that T2 is blocked in the 
kernel (waiting for the lock).

As stated in the rules in my previous email, the entry point for a 
system call into the kernel MUST wake up at least one other thread (one 
that is in user space) in the process.  The only other thread in your 
scenario is T1, so when the futex() syscall is made by T2 (during the 
lock) it will implicitly cause T1 to be woken up.  T1 may run some more 
and in turn make a syscall itself, which wakes up another thread, and so 
on: a cascade.  You get a situation where:

  * It's possible for ALL threads to be inside a system call (whether or 
not they are blocking system calls doesn't matter; a thread inside a 
system call is beyond the control of valgrind).

  * Zero or one thread is running in user space executing 
application/valgrind code at any time.  Once execution is down to one 
thread, valgrind really does have complete control of the thread 
execution of application/client code from that vantage point.  Which was 
the goal.

  * Between zero and process_thread_count - 1 threads in user space sit 
in a valgrind scheduling queue, waiting to be dispatched.  This queue is 
in effect an artificial suspension: anything in it would otherwise be 
executing application/client code right now, which is what the 
application/client code expects to be happening.
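Taken together, the bullets above amount to an invariant that can be stated as a tiny model (purely illustrative; the state names are made up):

```c
#include <assert.h>

/* Each thread is in exactly one of three places: inside a kernel
 * syscall, executing client code, or parked on valgrind's queue. */
enum thread_state { IN_SYSCALL, IN_CLIENT_CODE, ON_VG_QUEUE };

static int count_in_client(const enum thread_state *ts, int n)
{
    int running = 0;
    for (int i = 0; i < n; i++)
        if (ts[i] == IN_CLIENT_CODE)
            running++;
    return running;
}

/* The invariant: zero or one thread runs client code at any time. */
static int invariant_holds(const enum thread_state *ts, int n)
{
    return count_in_client(ts, n) <= 1;
}
```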


So, given a walk-through of my previous description of the mechanism 
against your example scenario, do you still see a problem?


We don't actually decide what runs next; the kernel still does.  Don't 
forget that all blocking waits happen inside the kernel: most 
applications that are not consuming any CPU will have all their threads 
inside a syscall, blocked on something.  This remains true in the scheme 
I propose.

What we want from valgrind is not, strictly, the ability to suspend a 
thread or to force only a certain thread to run.  We want the ability to 
make the target thread (the one whose threading issue we are 
monitoring/debugging) become the LOWEST priority thread to run, right 
after it acquires a lock/resource.

We then want to allow all other threads that can run to have a go, until 
they either block or run out of timeslices.

I already do this sort of thing artificially in my code when debugging:

pthread_mutex_lock(&m);
/* Artificially enlarge the window while the lock is held,
 * in an attempt to catch the program out. */
struct timespec ts = { 3, 0 };
nanosleep(&ts, NULL);  /* real code is more complex due to EINTR */
ht->add(key, value);
pthread_mutex_unlock(&m);
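For completeness, the EINTR handling alluded to in that comment might look like this: a hypothetical `sleep_full` helper (not part of any real API) that retries `nanosleep()` with the remaining time whenever a signal interrupts it:

```c
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Sleep for the full requested interval, restarting after EINTR.
 * Returns 0 on success, -1 on a real error. */
static int sleep_full(time_t sec, long nsec)
{
    struct timespec req = { sec, nsec }, rem;
    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;   /* a genuine failure, not an interruption */
        req = rem;       /* keep sleeping for whatever time is left */
    }
    return 0;
}
```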

What I don't get at the unlock is an audit trail of accesses to the 
memory I am interested in: a chronological record of read/write 
accesses, each with a thread id, a pointer to the start of the access, 
and its length.  This is the part valgrind does so well.


No one is saying we'd use this mechanism to stop a thread forever; these 
are HINTS that lean the scheduler a particular way during its decision 
making.  If there is nothing else to run, it doesn't matter what 
scheduling algorithm you have: when only one thread can run, that thread 
must run.


struct timeval tv = { 5, 0 };
vg_thread_control(VG_THREAD_YIELD, &tv);

What this might mean is: yield my timeslice to any/all other threads in 
preference to us; the current thread temporarily adopts a priority of 
"absolute last resort", but only for a limited time.  In so doing we are 
not handed control for at least 5 seconds, provided there is some other 
thread that can run.  If no other thread can run, we get control back 
sooner.



> The only way I can see to fix this is to have some sort of scheduling
> logic inside valgrind itself, which takes into account which threads
> hold which locks (or shmem semaphores, or anything else that might
> block the app).  Julian's argument (correct me if I'm wrong) was that
> creating such a scheduling algorithm is going to be extremely
> difficult.

Can we intercept all application/client system calls on a whim?  The 
entry points this and other threads use to make system calls would need 
to be flipped to another table.  This includes the ability to flip the 
syscall exit code of already-running threads that are currently blocked 
in the kernel.  Can that be done on a whim, or is there a minor penalty 
to memcheck users from leaving some hook in place to allow for it?  If 
so, a command line option would be needed, since I would not want to see 
any penalty when this support is not in use.

By modifying the exit code of an already-running system call, that 
thread will participate in the thread control scheme upon its next 
return to user space.
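One cheap way to get that zero-penalty property is a flippable hook table: the default entry does nothing, and enabling the API swaps in the real check. A toy sketch with entirely hypothetical names (not valgrind code):

```c
#include <assert.h>

/* A flippable syscall-exit hook.  The default entry is a no-op, so
 * users who never enable the thread-control API pay at most an
 * indirect call on the exit path. */
typedef void (*syscall_exit_hook)(void);

static int checks_run = 0;

static void hook_noop(void)  { /* zero-cost path: do nothing */ }
static void hook_check(void) { checks_run++; /* "Can I still run?" */ }

static syscall_exit_hook on_syscall_exit = hook_noop;

/* Called when the client enables explicit rescheduling: flip the table. */
static void enable_thread_control(void)
{
    on_syscall_exit = hook_check;
}
```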

This is all so that there is zero penalty for valgrind users who do not 
use any thread control API.




When I call:

/* Demarcate the start of the region of execution where I
 * want explicit control over scheduling; this allows
 * vg to init hooks if there ends up being a speed penalty
 * for using this API.
 */
vg_thread_control(VG_ENABLE_EXPLICIT_RESCHED_MODE,
                  &my_scheduler_implementation);


This has the effect of flipping all syscall entry points for all threads.

If threads are already in application/client code, can they be 
interrupted too?  I'm guessing there must be an equivalent of an IPI 
inside valgrind (http://en.wikipedia.org/wiki/Interprocessor_interrupt) 
to manage the global emulation state: a way for the emulator to be 
interrupted with world events (stuff that affects all threads), causing 
execution of application/client code to stop for a moment while the 
interrupt is processed.




>> That is the ability to control/suspend thread execution/scheduling 
>> within a process (possibly itself, possibly another process) but doing 
>> so in a way that is not observable to the application.  This might have 
>> other uses outside debugging but mainly targeted at debugging.
> 
> Could you write a wrapper application which fork/execs your chosen
> application?  This wrapper could use ptrace to start/pause the
> application, and you could communicate with it however you choose, or
> even just programmatically specify what you'd like.
> 
> .. actually, that sounds a lot like gdb, to be honest.  Maybe I'm
> missing something.

Yes, you are missing the part about valgrind being able to audit 
read/write memory accesses at the asm level with byte granularity.  That 
is the bit valgrind does well; gdb can only tell you whether a page is 
valid or not.

Also the part about the application/client code being cooperative and 
interactive in the debugging process, i.e. the ability to call functions 
that exist solely to communicate with the debugger (a la 
vg_thread_control()).  This is like dynamic breakpoints/watchpoints, 
yadda yadda: "Hey, look at me, debugger, I'm about to do something you 
should be looking at!"


Getting a full audit of all read/write accesses over your target 
locations between arbitrary points A and B is really what this is all 
about.
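The kind of audit record described here could be modeled as a chronological event log; this is purely illustrative of the data involved (thread id, start address, length, direction), not valgrind's actual format:

```c
#include <assert.h>
#include <stddef.h>

/* One entry in the chronological audit trail. */
struct access_event {
    unsigned long tid;        /* which thread made the access */
    const void   *addr;       /* start of the accessed region */
    size_t        len;        /* length of the access in bytes */
    int           is_write;   /* 0 = read, 1 = write */
};

#define LOG_CAP 1024
static struct access_event log_buf[LOG_CAP];
static size_t log_len = 0;

/* Append an event; order of calls gives the chronological order. */
static void record_access(unsigned long tid, const void *addr,
                          size_t len, int is_write)
{
    if (log_len < LOG_CAP)
        log_buf[log_len++] =
            (struct access_event){ tid, addr, len, is_write };
}
```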


 From that information the developer can then:

  * Create valgrind-specific testcases to show up the imperfections of 
their design (or to verify a particular concern).

  * Manually see what read/write events took place, relative to their 
expected access patterns.  Half the battle with debugging a 
multi-threaded application is that you simply cannot see what is going on.



Darryl
