Julian Seward wrote:
> I can't claim to really understand this, but I do have a couple of
> questions:
> 
> * "it knows that T2 is blocked in the kernel"
>   how does V know that a syscall it is about to perform (on behalf
>      of the client application) will or will not block?
>   AFAICT that is unknowable from user space.  And how would it 
>   distinguish a block from merely a long wait?

It would work on the basis that Valgrind knows a syscall is being
performed: it is running the application/client under emulation and it
has already intercepted the entry points to all syscalls.

Valgrind already does something similar, at a higher level, for
malloc/free/realloc etc.

The application/client thinks it is calling the libc malloc, but it is
not; it is calling Valgrind's implementation.

There is nothing to stop interception of anything in this way so that 
valgrind can do some work on entry and do some more work on exit.  This 
is what I mean by syscall interception.
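
To illustrate the "work on entry, more work on exit" shape described
above, here is a minimal sketch in C.  It is not how Valgrind implements
interception internally; the names intercepted_write, calls_in_progress
and calls_completed are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical bookkeeping kept by the tool.  A real multi-threaded tool
   would need per-thread or atomic counters; plain ints keep the sketch
   short. */
static int calls_in_progress = 0;
static unsigned long calls_completed = 0;

/* Hypothetical wrapper: the client believes it is calling write(), but
   the tool arranges for this function to run instead. */
ssize_t intercepted_write(int fd, const void *buf, size_t len)
{
    /* Work on entry: from here on the thread may be blocked in the kernel. */
    calls_in_progress++;

    ssize_t rc = write(fd, buf, len);   /* the real syscall */

    /* Work on exit: the thread is back in user space. */
    calls_in_progress--;
    assert(calls_in_progress >= 0);
    calls_completed++;
    return rc;
}
```

Between the entry and exit hooks the tool cannot tell whether the
kernel has actually blocked the thread, only that the thread is
somewhere inside the call.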


> * "No=go to sleep under valgrind's control"
>   how?

Valgrind has a list of threads.  When Valgrind decides it doesn't want
to run a thread for a while, it puts that thread to sleep using the
same standard mechanisms that already exist (syscalls like futexes
etc...).
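
A minimal sketch of that idea, assuming a fixed-size thread table and
POSIX condition variables (which glibc on Linux builds on futexes).
The names sched_thread, sched_init, sched_park and sched_release are
hypothetical and are not part of Valgrind.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define MAX_THREADS 8   /* hypothetical fixed-size thread table */

struct sched_thread {
    bool may_run;            /* set when the scheduler lets this thread go */
    pthread_cond_t wake;     /* the thread sleeps here while parked */
};

static pthread_mutex_t sched_lock = PTHREAD_MUTEX_INITIALIZER;
static struct sched_thread threads[MAX_THREADS];

void sched_init(void)
{
    for (int i = 0; i < MAX_THREADS; i++) {
        threads[i].may_run = false;
        pthread_cond_init(&threads[i].wake, NULL);
    }
}

/* Called by thread `tid`: sleep until the scheduler releases it. */
void sched_park(int tid)
{
    assert(tid >= 0 && tid < MAX_THREADS);
    pthread_mutex_lock(&sched_lock);
    while (!threads[tid].may_run)               /* loop guards against spurious wakeups */
        pthread_cond_wait(&threads[tid].wake, &sched_lock);
    threads[tid].may_run = false;               /* consume the permission */
    pthread_mutex_unlock(&sched_lock);
}

/* Called by the scheduler: allow thread `tid` to proceed. */
void sched_release(int tid)
{
    assert(tid >= 0 && tid < MAX_THREADS);
    pthread_mutex_lock(&sched_lock);
    threads[tid].may_run = true;
    pthread_cond_signal(&threads[tid].wake);
    pthread_mutex_unlock(&sched_lock);
}
```

A release that arrives before the matching park is remembered in
may_run, so the parked thread does not miss its wakeup.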


> Overall I have to say my feeling is that messing with the kernel's
> scheduling is a losing proposition.  We went to those kinds of places
> in earlier years of the project, and it was always a massive PITA
> and source of fragility.

While complete, authoritative control of thread scheduling sounds nice
to have in your toolbag, it's not required.  What is really wanted is
the ability to hint at or bend the thread scheduler's decisions in ways
that don't violate basic runtime rules but can make debugging easier.


We might hint that we want the current thread to yield CPU time to all 
other threads.

One interpretation of this hint is that we want the scheduler to go out 
of its way to see to it that every other thread is given an opportunity 
to run (possibly for a number of occasions) before it returns control 
back to the current thread.

We might even be willing to take a sleep/delay hit to give other threads
even more time to wake up, should they not be in a runnable state just
now.  This is what my nanosleep() after gaining the lock is for.

That sort of hinting is useful for getting nearer the goal of proving a
multi-threaded application design.  You can make the window of time for
the worst-case scenario much larger in an attempt to trip things up when
running testcases.
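
The "nanosleep() after gaining the lock" trick can be sketched roughly
as follows.  This is a reconstruction for illustration, not the code
referred to above; the helper name lock_then_delay and its delay_ns
parameter are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical test helper: take the mutex, then sleep while holding it.
   Holding the lock for longer widens the window in which other threads
   can pile up behind it or touch unprotected data, which makes a latent
   ordering bug more likely to show up in a test run. */
int lock_then_delay(pthread_mutex_t *m, long delay_ns)
{
    assert(delay_ns >= 0 && delay_ns < 1000000000L);

    int rc = pthread_mutex_lock(m);
    if (rc != 0)
        return rc;                       /* lock not taken, nothing to widen */

    struct timespec ts = { .tv_sec = 0, .tv_nsec = delay_ns };
    nanosleep(&ts, NULL);                /* an early return on a signal is fine for a hint */
    return 0;
}
```

The caller still unlocks with pthread_mutex_unlock() as usual; only the
acquisition path changes, so the helper can be swapped in for testcases
and removed again afterwards.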




> It also seems to me that both you and Felix want to control the
> scheduling as a way of shaking out threading-related bugs in applications
> (but correct me if I'm wrong).  FWIW, if that is indeed the case,
> I would suggest that you'd be better off looking into race detection
> algorithms which are more scheduling-independent and/or can understand
> atomic instructions better (I think you mentioned something about them
> earlier in the thread).

The biggest problem is that I would have thought Valgrind is in the
ideal position to provide an audit of what's going on, if you instruct
it about what you are interested in.

Having it reverse engineer a program at runtime sounds great and all,
and for memcheck that approach works well.  But if there is some
specific threading issue that I'm looking to test, I'd prefer to
explicitly define a set of rules, then define a point A and a point B,
and then ask Valgrind to report (all of this to be done at runtime).

In some cases I want it to also report on what happened even when no
bug/problem was detected.  That way I can debug my instructions for
configuring Valgrind, and I can prove my testcase is actually testing
and doing the correct thing.



> Do you have any concise fragments of code illustrating what problem
> it is you are really trying to solve?

"Concise" might seem a little subjective.

Certainly I have code that I'd like to better prove is doing what I
expect/think it to be doing.  A simple audit of memory access would go a
long way to achieving that.

That is the main problem I'm trying to solve: a tool that I can audit
memory access with.  It doesn't matter what the code actually is; I just
want to see "thread_id, pointer, length, mode(read or write)" in a list.
I can take over from there.

The data reported needs to be the memory's view of the world, which then
raises the issue that the only way to get a consistent and 100% accurate
log is to ensure only one thread is running application/client code at
any one time.  Which brings us back to the reason we need threading
control.
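
As a sketch of what such an audit list could look like, assuming a
fixed-size in-memory log: the record fields follow the "thread_id,
pointer, length, mode" tuple above, while the names access_record,
audit_access, log_buf and LOG_CAPACITY are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

enum access_mode { ACCESS_READ, ACCESS_WRITE };

/* One audit entry: "thread_id, pointer, length, mode". */
struct access_record {
    unsigned long thread_id;
    const void *pointer;
    size_t length;
    enum access_mode mode;
};

#define LOG_CAPACITY 1024            /* hypothetical fixed capacity */
static struct access_record log_buf[LOG_CAPACITY];
static size_t log_count = 0;
static pthread_mutex_t log_lock = PTHREAD_MUTEX_INITIALIZER;

/* Append one record.  Returns 0 on success, -1 if the log is full. */
int audit_access(unsigned long thread_id, const void *pointer,
                 size_t length, enum access_mode mode)
{
    assert(length > 0);
    int rc = -1;

    pthread_mutex_lock(&log_lock);
    if (log_count < LOG_CAPACITY) {
        log_buf[log_count++] =
            (struct access_record){ thread_id, pointer, length, mode };
        rc = 0;
    }
    pthread_mutex_unlock(&log_lock);
    return rc;
}
```

The mutex only serialises appends to the log.  The order of records
matches the order in which memory saw the accesses only if each access
and its logging happen as one step, which is why only one thread may
run client code at a time.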



Speaking in relation to atomic instructions and all that:

At one layer there is the ability to prove custom threading primitives
on a given platform.  These are custom building blocks that supplement
the limited primitives pthread provides for a given specialist task.
For example, the pthread implementation on Linux didn't always have a
rwlock, so we'd have to roll our own.  Now that we have rwlock, the next
specialist primitive of the month might be a fair_double_locking_doobery
(whatever one of those is!).

Then at another level is application usage of, and interaction between,
one or more primitives and data/memory.


While making life better for debugging custom lower-layer primitives is
a worthy goal, most of my time (and I guess other people's) is spent
debugging applications using standard primitives.  So, to avoid getting
off track, feel free to stick with the goal of making life better in the
domain of "application usage of standard threading primitives".

Save the notion that valgrind can assist in debugging custom threading 
primitives for another day.


Darryl

_______________________________________________
Valgrind-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/valgrind-users
