Darryl Miles <[EMAIL PROTECTED]> writes:
> tom fogal wrote:
> > Darryl Miles <[EMAIL PROTECTED]> writes:
> >> Julian Seward wrote:
> >> Then on the return from the system call the current thread checks to see 
> >> if it is still allowed to run, if it is not allowed to run it suspends 
> >> itself.
> > 
> > You miss his point.  Say we have two threads:
> > 
> >               T1                                        T2
> >     pthread_mutex_lock(&m);                   pthread_mutex_lock(&m);
> >         ht->add(key, value);                      ht->remove(key);
> >     pthread_mutex_unlock(&m);                 pthread_mutex_unlock(&m);
> > 
> > Now say thread 1 acquires the lock.  Finally, this logic of valgrind
> > deciding which thread to run decides (through bug or otherwise) that
> > thread 2 should run.  Both threads are currently paused.  Then:
> > 
> > The kernel decides to wake thread 1.  Valgrind sees `runnable thread'
> > != `current thread', and does a sched_yield().  The kernel decides to
> > wake thread 1.  Valgrind sees `runnable thread' != `current thread',
> > and does a sched_yield().  The kernel decides to wake thread 1 ...

One thing I neglected to think about earlier is how one would do a
`Can I still run?' check.  A user process gets no notification when it
is scheduled back in after a context switch.

Remember that valgrind instruments instructions.

> Remember my rule: All threads in a kernel system call are running (and 
> maybe put to asleep inside the kernel if it so wishes).  As soon as the 
> kernel passing back control via system call return you always intercept 
> with valgrind doing a "Can I still run check?  Yes=carry on, No=go to 
> sleep under valgrind's control".
> 
> Why would valgrind keep waking up the wrong thread, it wont do that with 
> round-robin since it knows that T2 is blocked in the kernel (waiting for 
> the lock).

I had a long diatribe about how you essentially want the JVM's safe
points and how ridiculously difficult that would be, since you'd have
to write a parallel scheduler.  I still think that's true for your
general case of choosing which thread to run.

You really don't need that functionality, though.  All you want is a
guarantee that at most one thread is running, at all times.

A solution to your issue -- introduce a new mutex, within valgrind.
Wrap every function you possibly can.  Acquire the lock before the
function, and release it after the function.

An example.  User code:

         T1                                T2
    some_system_call()                acquire(&m1)
    acquire(&m1)                      sys_call();
    ... do stuff                      ... whatever
    release(&m1)                      release(&m1)

would translate to:

    acquire(&vg_runnable_thread)      acquire(&vg_runnable_thread)
    some_system_call()                acquire(&m1)
    release(&vg_runnable_thread)      release(&vg_runnable_thread)
    acquire(&vg_runnable_thread)      acquire(&vg_runnable_thread)
    acquire(&m1)                      sys_call()
    release(&vg_runnable_thread)      release(&vg_runnable_thread)
    ... do stuff                      ... whatever
    acquire(&vg_runnable_thread)      acquire(&vg_runnable_thread)
    release(&m1)                      release(&m1)
    release(&vg_runnable_thread)      release(&vg_runnable_thread)

I think that's safe.  Think.

Notes:
    * `acquire' is of course really pthread_mutex_lock, which is
      itself of course a user-available function which you must wrap.
    * the lock is not acquired for user-code: `do stuff' might be a
      for loop which sums an array, for example, and you'd probably
      want it to be able to execute concurrently with `whatever'.
        * Otherwise -- where does it stop?  You could acquire/release
          around every *instruction*, I suppose, but that's
          crazy-painful, performance wise.
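The wrapping scheme above can be sketched in C.  This is a hypothetical
illustration, not valgrind's actual machinery: `vg_runnable_thread` is the
global mutex from the example, `wrapped_call` stands in for whatever wrapper
valgrind would emit around each interceptable function, and `threads_inside`
is just instrumentation to check the at-most-one-thread invariant.  One
caveat worth thinking through: if the wrapped function can itself block on a
lock held by another thread, the global mutex must be dropped for the
duration of the blocking call, or the holder can never reach its own release.

```c
#include <assert.h>
#include <pthread.h>

/* One global mutex serializes every wrapped call, so at most one
 * thread is inside a wrapped function at any moment.  threads_inside
 * exists only to assert that invariant. */
static pthread_mutex_t vg_runnable_thread = PTHREAD_MUTEX_INITIALIZER;
static int threads_inside;

/* Sketch of the wrapper valgrind would emit around each function.
 * NB: if fn() can block on a lock held by another thread, the global
 * mutex would have to be released around it to avoid deadlock. */
static void wrapped_call(void (*fn)(void *), void *arg)
{
    pthread_mutex_lock(&vg_runnable_thread);
    threads_inside++;
    assert(threads_inside == 1);   /* the <= 1 running-thread guarantee */
    fn(arg);                       /* the real (wrapped) function */
    threads_inside--;
    pthread_mutex_unlock(&vg_runnable_thread);
}
```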

There is a bit of a race at the beginning of a thread's lifetime --
threads may execute concurrently until they call something which is
wrappable.

In this scheme, you don't get to control which thread runs.  But I
think Julian stressed, and I stress now, that you really don't want to
do that.

> pthread_mutex_lock(&m);
> // Artifically enlarge the window when the lock is held
> //  in an attempt to catch the program out
> nanosleep({3,0}, NULL);       // its more complex than this due to EINTR
> ht->add(key, value);
> pthread_mutex_unlock(&m);
> 
> What I don't get at the unlock is an audit trail of access to memory I 
> am interested in.  A chronological order of read/write access with 
> thread_id, pointer to start and length of access.  This is what valgrind 
> does so well.

It sounds like, if you wanted to do this, you *would* need to lock at
every instruction, since you'd need some kind of global hash table or
other data structure which would maintain this list.
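A minimal sketch of what that data structure might look like, assuming the
record layout Darryl described (chronological order, thread id, pointer,
length).  The names here are invented for illustration; `record_access` is
the hook that would conceptually run on every instrumented load/store, and
the mutex around it is exactly the per-instruction locking cost mentioned
above.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* One entry in the chronological audit trail. */
typedef struct {
    pthread_t tid;       /* which thread touched memory */
    uintptr_t addr;      /* start of the access */
    size_t    len;       /* length of the access, in bytes */
    int       is_write;  /* 1 = write, 0 = read */
} mem_access;

static pthread_mutex_t log_lock = PTHREAD_MUTEX_INITIALIZER;
static mem_access *log_buf;
static size_t log_len, log_cap;

/* Conceptually called on every instrumented load/store; the lock is
 * the per-instruction serialization cost discussed above. */
static void record_access(uintptr_t addr, size_t len, int is_write)
{
    pthread_mutex_lock(&log_lock);
    if (log_len == log_cap) {
        log_cap = log_cap ? log_cap * 2 : 1024;
        log_buf = realloc(log_buf, log_cap * sizeof *log_buf);
    }
    log_buf[log_len++] =
        (mem_access){ pthread_self(), addr, len, is_write };
    pthread_mutex_unlock(&log_lock);
}
```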

By the way, have you heard of software transactional memory (``STM'')?
I'm not sure whether any open-source systems implement it.  However,
an STM system must keep exactly this kind of access record, except of
course it does so to provide better parallelism than the absurdity
that is threads.

Thinking about those gives me the idea that you could probably skip
this wrapping in regions of code which provably cannot read or write a
shared variable, perhaps by knowing that the thread holds no locks.

> No one is saying we'd use this mechanism to stop a thread forever, they 
> are HINTS to the scheduling to lean a particular way during the decision 
> making process. [snip]

If you only need a hint, not a hard guarantee, change your wrappers to
set and restore the thread priority instead of grabbing a lock.  I
imagine that would be much cheaper.
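The priority-hint variant might look something like this.  It is a sketch
under assumptions: `hinted_call` is an invented name, and under the default
SCHED_OTHER policy on Linux the static priority range is a single value, so
the demotion only has a visible effect under realtime policies.  The point
is the save/demote/restore shape of the wrapper, not a tuned policy.

```c
#include <pthread.h>
#include <sched.h>

/* Drop this thread to the policy's minimum priority around the
 * wrapped call, then restore.  A scheduling hint, not a guarantee. */
static int hinted_call(void (*fn)(void *), void *arg)
{
    int policy;
    struct sched_param old, low;
    pthread_t self = pthread_self();

    if (pthread_getschedparam(self, &policy, &old) != 0)
        return -1;
    low.sched_priority = sched_get_priority_min(policy);
    pthread_setschedparam(self, policy, &low);  /* lean away from us */
    fn(arg);
    pthread_setschedparam(self, policy, &old);  /* restore */
    return 0;
}
```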

> struct timeval tv = { 5, 0 };
> vg_thread_control(VG_THREAD_YEILD, &tv);
> 
> What this might mean is yield my timeslice to any/all other threads (in 
> preference to us, we/current-thread temporarily adopts a priority of 
> "absolute last resort", but only for a limited amount of time) in so 
> doing we don't get passed control for at least 5 seconds, providing 
> there is some other thread that can run.  If there is just no other 
> thread that can run then we will get back control sooner.

Implementing a timeout would probably be pretty hard.

> > The only way I can see to fix this is to have some sort of scheduling
> > logic inside valgrind itself [snip]
>
> Can we intercept all application/client system calls on a whim ?
> The entry points this and other threads use to make system calls we
> need to flip to another table. This includes the ability to flip
> the syscall exit code of already running threads that are currently
> blocked in the kernel.
[snip]
> By modifying the exit code (of an already running system call) that 
> thread will participate in the thread control scheme upon its next 
> return to user-space.
[snip]

I am not following this at all.  Why would we ever want to modify the
return values of a system call?

> If threads are in application/client code already can they be 
> interrupted too ?  I'm guessing there must be the equivalent of IPI 
> inside valgrind http://en.wikipedia.org/wiki/Interprocessor_interrupt
> to manage the global emulation state.

Valgrind reads in instructions, translates them, does its fancy magic
on them, translates them back, and spits them back out.  Then the CPU
runs the new instruction stream.

I'm not sure how you're thinking it works, but it doesn't sound like
the instructions can `change' action based on an event, because
they're already a bunch of instructions.  Like a JIT though, there
might be some way to cause future translations to have different
output.

From a quick glance at nulgrind, I'd guesstimate the atomic unit
valgrind operates on is a basic block.

> >> That is the ability to control/suspend thread execution/scheduling 
> >> within a process (possibly itself, possibly another process) but doing 
> >> so in a way that is not observable to the application.  This might have 
> >> other uses outside debugging but mainly targeted at debugging.
> > 
> > Could you write a wrapper application which fork/execs your chosen
> > application?  This wrapper could use ptrace to start/pause the
> > application, and you could communicate with it however you choose, or
> > even just programmatically specify what you'd like.
> > 
> > .. actually, that sounds a lot like gdb, to be honest.  Maybe I'm
> > missing something.
> 
> Yes you are missing the part about valgrind being able to audit 
> read/write memory accesses at asm level in byte granularity.  The bit 
> valgrind does well, gdb can only tell you if the page is valid or not.
> 
> Also the part about the application/client code being cooperative and 
> interactive in the debugging process, i.e. the ability to call functions 
> that solely exist to communicate with the debugger (ala 
> vg_thread_control()).  This is like dynamic breakpoints/watchpoints 
> yadda yadda.  "Hey look at me debugger, I'm about to do something that 
> you should be looking at!"

Maybe looking at hacking gdb would be a quicker route to your goal.
You might be able to add code which says, `when a lock at address X
is acquired, automatically add watchpoints on address[es] ABC; when X
is unlocked, delete watchpoint[s] ABC'.


Anyway, I'm getting closer and closer to believing you want an STM-like
system, but to detect threading bugs instead of to make multithreaded
programming easier.  So I think you should google around for some of
those first and see if those ideas can be applied in VG / gdb.

(I'd be very interested in what you find.)

Best,

-tom

_______________________________________________
Valgrind-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/valgrind-users
