tom fogal wrote:
> Darryl Miles <[EMAIL PROTECTED]> writes:
>> tom fogal wrote:
> It doesn't; I wrote this before I understood your `acquire a
> VG-internal lock at syscall entry' implementation, and didn't come
> back and edit.
>
> I don't see how that's not a scheduler.  Each thread essentially has a
> valgrind-specific state -- whether or not you'd like it to artificially
> suspend or not -- and valgrind should choose at various points whether
> or not to pause or delay the thread based on that information.

System calls would be left to run, i.e. any thread that wanted to jump 
into a system call should be left to do so (T1), but in the act of doing 
this at least one thread that was artificially blocked in user-space (in 
valgrind scheduler code) would be released to run (T2).  If there are no 
blocked threads to release because they are all already running, that's 
fine too.

On the return from syscall of that first thread (T1), it would by 
default be intercepted and a simple check made.  If there is already 
another thread running in user-space, i.e. the T2 thread we unblocked, 
then T1 will be suspended by the VG scheduler.  If no T2 thread is 
running, then T1 is left to continue running.

So that's how to get thread control so that only one thread runs 
application/client code at a time.  We do allow multiple threads to be 
running inside the kernel, where "running" includes being delayed or 
blocked inside the kernel, as every syscall should be treated as if it 
will block.

I hope that clears up how thread management is possible to do.



I'm not averse to calling the valgrind thread management a "scheduler"; 
it's just not a scheduler parallel to the one in the kernel.  The 
kernel's scheduler has a different set of input events/wait-queues from 
the one valgrind has.  Nonetheless, both have influence over the 
runnability of threads, and specifically the runnability of the 
application/client code being examined.  Which at the end of the day is 
the goal.



>> Note that you could then obtain/influence threading control with the 
>> vg_scheduler_pre_client_hook() and the vg_scheduler_post_client_hook().
> 
> Yes.. but you don't need to.  The locks around the syscall already
> assure you that only one thread can run.
> 
> However, those hooks would be the logical place to modify the internal
> hash table.

The reasons for wanting only one thread to run at once:
  * To get a consistent log of memory accesses: we've now serialized 
all application/client code memory access.

  * To be able to force scheduler situations/scenarios that improve the 
chances of causing a threading bug to show up.  This is the enforced 
yield, enforced to the point of putting the current thread, which is 
happy to yield, to sleep, as nanosleep() can do.
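An enforced yield can be as simple as this (the function name is mine; 
the 1 ms duration is an arbitrary choice):

```c
/* Sketch of an "enforced yield": put the thread that is happy to yield
 * to sleep, so other threads get scheduled and ordering bugs get a
 * better chance to surface. */
#include <time.h>

static int enforced_yield(void)
{
    /* 1 ms nap; long enough for the kernel to run someone else. */
    struct timespec ts = { 0, 1000000L };
    return nanosleep(&ts, NULL);    /* 0 on success, -1 if interrupted */
}
```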


"The locks around the syscall": I'm not sure about that terminology.  I 
don't intend to make syscalls mutually exclusive, far from it; my 
design states the opposite of this.

The terminology I have used is to "intercept" syscalls, i.e. allow vg 
to do something before and after (aka a wrapper).  The design I hold up 
for discussion allows all threads to be inside the kernel at the same 
time; we deal with serializing application/client code on the "return 
from syscall".  We don't need to serialize kernel calls nor put locks 
around them; that would cause deadlocks.  I'm not sure where this 
misunderstanding comes from, but I don't think it's from me.

The mutexes you proposed were there to protect tiny fragments of 
application/client code so that only one thread of application/client 
code was running at any one time.  They are not there to do anything in 
relation to syscalls; in fact, during syscall entry you'd need to 
release that lock to allow another user-space thread to run.  I still 
maintain my technique is better, and the logical progression once you 
try your mutex approach that way and find out performance sucks too 
much.
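As I understand your mutex approach, it amounts to something like this 
(the wrapper and the fake syscall are hypothetical, written only to 
show where the lock must be dropped):

```c
/* Rough sketch of the mutex approach: one global lock serializes
 * client code, and it has to be released across every syscall.
 * wrapped_syscall and fake_getpid are illustrative only. */
#include <pthread.h>

static pthread_mutex_t client_code_lock = PTHREAD_MUTEX_INITIALIZER;

static long fake_getpid(void) { return 42; }   /* stand-in for a syscall */

/* Caller must hold client_code_lock on entry; holds it again on exit. */
static long wrapped_syscall(long (*do_syscall)(void))
{
    pthread_mutex_unlock(&client_code_lock);  /* let another user-space thread run */
    long ret = do_syscall();                  /* may block in the kernel */
    pthread_mutex_lock(&client_code_lock);    /* re-serialize before client code */
    return ret;
}
```

Note the unlock/relock pair straddling the call: miss the unlock and 
you've serialized the kernel too, which is exactly the deadlock risk I 
mentioned.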




> Valgrind already does this.  Look up function wrapping in the manual.

Absolutely.


> Well, it doesn't add it to an internal data structure, as far as I
> know .. just reports it immediately.  Yes though, seems like most of
> the pieces you want are already there.

This is where your STM suggestion sounds good: a transaction log (or 
journal) of all accesses.

I'm proposing that an internal data structure would be added for this 
functionality; I understand it may not currently work that way.
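As a sketch of what such a structure might look like (entirely 
hypothetical; nothing like this exists in valgrind today), a fixed-size 
journal recording each access to a watched location:

```c
/* Hypothetical journal of accesses to watched locations; a sketch of
 * the proposed data structure, not existing Valgrind code. */
#include <stddef.h>

typedef struct {
    const void *addr;     /* watched address that was touched */
    size_t      size;     /* access width in bytes */
    int         is_write; /* 1 = store, 0 = load */
    int         tid;      /* thread id of the accessor */
} AccessRec;

enum { JOURNAL_CAP = 1024 };
static AccessRec journal[JOURNAL_CAP];
static int journal_len = 0;

/* Append one record; silently drops entries once the journal is full. */
static void journal_append(const void *addr, size_t size,
                           int is_write, int tid)
{
    if (journal_len < JOURNAL_CAP) {
        AccessRec r = { addr, size, is_write, tid };
        journal[journal_len++] = r;
    }
}
```

Because only watched locations are recorded, the journal stays small; 
it's not a complete memory view.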

Also, I'm only interested in the specific locations I tell valgrind to 
watch.  I don't need a complete memory view, and I don't need any 
reverse engineering of what's going on.



> This is way off-topic -- but threads don't scale, && they're much too
> difficult to get right.  I think it's an `Edward Lee' that has a paper
> on how a better approach to parallelism would be to start from
> determinism, and add non-determinism, instead of the reverse like we
> currently do with threads.
> 
> Anyway with any luck most everything will be data-parallel in a
> decade, and threads will be this strange thing that only operating
> system programmers actually use.

Yes, maybe we should start another thread on this issue; it's an 
interesting discussion point.  What CPUs currently in production 
provide hardware assistance for this data access model?  What 
languages/compilers/tools?  What real-world problems need it (stuff we 
just can't do today without it)?  Maybe a decade is optimistic; that's 
about 4 iterations of language/software technology these days, or 2.5 
iterations of CPU/hardware.



> It's just that jumping to timeouts normally means mucking with
> signals, which I gather is painful in something like valgrind.  I
> could be wrong there.

futex() doesn't use signals; it allows a thread to go to sleep with the 
ability to receive a wakeup-now event.
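For reference, glibc doesn't expose a futex() function; you go through 
syscall().  A minimal sketch of the wait/wake pair (Linux-only; the 
wrapper names are mine):

```c
/* Thin wrappers over the Linux futex syscall.  FUTEX_WAIT sleeps only
 * while *addr still equals `expected`; FUTEX_WAKE delivers the
 * wakeup-now event.  No signals involved. */
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

static long futex_wait(uint32_t *addr, uint32_t expected,
                       const struct timespec *timeout)
{
    return syscall(SYS_futex, addr, FUTEX_WAIT, expected, timeout, NULL, 0);
}

static long futex_wake(uint32_t *addr, int max_waiters)
{
    return syscall(SYS_futex, addr, FUTEX_WAKE, max_waiters, NULL, NULL, 0);
}
```

The expected-value check is what makes it race-free: if the word has 
already changed, the wait returns immediately instead of sleeping.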

nanosleep() also allows for a delay, without the use of signals.

alarm() on many versions of Unix requires the use of signals, and 
SIGALRM might be something the application/client is using, so it needs 
special care.



>> Now if ALL code is instrumented under emulation then there may not be a 
>> need to intercept syscalls if we can ensure control is passed back to 
>> valgrind BEFORE application/client code.
> 
> I don't think this is possible, for the reasons mentioned at the very
> top of this email -- we can't know when/where a context switch will
> `jump back to', and the code is already translated at that point.

Ah, I don't understand the correlation between needing to know when a 
context switch occurs, ensuring valgrind has CPU control at the time we 
need it to, and the instrumentation/translation.

If it's the case that, in order to use this feature at all, we need to 
leave the hooks in place inside the translated code, then that might be 
a command-line option to enable the API.  It would add a few asm 
instructions to the normal code path: checking whether a flag is on and 
making a branch/call where the hooks are needed.


Darryl

